Integrating Tez with CDH 5.9.2

1. Installing and configuring Tez

1.1 Environment requirements

  • CDH 5.9.2 (Hadoop 2.6.0)
  • Build tools: gcc, gcc-c++, make, build
  • Node.js and npm (required by tez-ui)
  • Git
  • Protocol Buffers 2.5.0
  • Maven 3
  • Tez 0.8.5
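
Before starting, it can help to confirm which of these tools are already on the PATH. The following is a minimal sketch, not part of the original procedure; the command names (g++, node, protoc, mvn) are assumptions about how the tools above are exposed on the build host.

```shell
# Print the names of the given commands that are NOT found on PATH,
# separated by spaces (prints an empty line if all are present).
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # Drop the leading space left by the accumulation above.
  echo "${missing# }"
}

# Assumed command names for the requirements listed above.
missing=$(missing_tools gcc g++ make node npm git protoc mvn)
if [ -n "$missing" ]; then
  echo "Missing build tools: $missing"
else
  echo "All build tools found"
fi
```

Anything reported as missing is covered by the installation steps in section 1.2.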

1.2 Preparing the build environment

Install gcc, gcc-c++, make, build

yum install gcc gcc-c++ libstdc++-devel make build

Install Node.js and npm

wget http://nodejs.org/dist/v0.8.14/node-v0.8.14.tar.gz
tar -xzf node-v0.8.14.tar.gz
cd node-v0.8.14
./configure
make && make install

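Install Git

Download the source from https://git-scm.com/download, unpack it, then run:

./configure
make
make install

Install Protocol Buffers 2.5.0

Download the source from https://github.com/google/protobuf/releases/tag/v2.5.0, unpack it, then run: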
./configure
make && make install
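
Since the requirement is version 2.5.0 specifically, it is worth checking what the installed compiler reports. The sketch below assumes protoc ended up on the PATH and relies on `protoc --version` printing a line of the form `libprotoc 2.5.0`; the helper name is made up for illustration.

```shell
# Succeeds only if the given `protoc --version` output reports 2.5.0.
is_protoc_250() {
  [ "$1" = "libprotoc 2.5.0" ]
}

if command -v protoc >/dev/null 2>&1; then
  version_line=$(protoc --version)
  if is_protoc_250 "$version_line"; then
    echo "protoc 2.5.0 is installed"
  else
    echo "Unexpected protoc version: $version_line"
  fi
else
  echo "protoc not found on PATH"
fi
```

If protoc is installed under /usr/local but fails to load its shared library, running ldconfig as root is a common remedy; whether that is needed depends on the host.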

1.3 Building Tez

1.3.1 Download Tez from the official website

1.3.2 Unpack the archive

1.3.3 Patch the source

/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java

diff --git a/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java b/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
index 12491ed..b4ca24c 100644
--- a/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
+++ b/tez-mapreduce/src/main/java/org/apache/tez/mapreduce/hadoop/mapreduce/JobContextImpl.java
@@ -475,5 +475,16 @@ public class JobContextImpl implements JobContext {
   public Progressable getProgressible() {
     return progress;
   }
+
+  /**
+   * Get the boolean value for the property that specifies which classpath
+   * takes precedence when tasks are launched. True - user's classes takes
+   * precedence. False - system's classes takes precedence.
+   * @return true if user's classes should take precedence
+   */
+   @Override
+  public boolean userClassesTakesPrecedence() {
+    return getJobConf().getBoolean(MRJobConfig.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, false);
+  }
    
 }

/tez-ext-service-tests/src/test/java/org/apache/tez/shufflehandler/ShuffleHandler.java

In this file, use headers().set(...) and headers().get(...) instead of setHeader() and getHeader() (a fix suggested by a coworker).
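
One way to apply this change mechanically is a sed rewrite. This is a rough sketch under the assumption that the affected calls are instance calls of the form `x.setHeader(...)` and `x.getHeader(...)`; the helper name and the sample variable names are made up for illustration, and the result should be reviewed by hand (for example with git diff) before building.

```shell
# Rewrite old-style header calls read from stdin to the headers() form.
rewrite_header_calls() {
  sed -e 's/\.setHeader(/.headers().set(/g' \
      -e 's/\.getHeader(/.headers().get(/g'
}

# Preview the rewrite on a sample line (hypothetical variable names).
echo 'response.setHeader(NAME, value);' | rewrite_header_calls
# prints: response.headers().set(NAME, value);
```

To apply it to the real file, the same two sed expressions can be run over ShuffleHandler.java, writing to a temporary file first so the original can be compared against the result.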

1.3.4 Update the configuration

vi pom.xml

<profile>
    <id>cdh5.9.2</id>
    <activation>
      <activeByDefault>false</activeByDefault>
    </activation>
    <properties>
      <hadoop.version>2.6.0-cdh5.9.2</hadoop.version>
    </properties>
    <pluginRepositories>
      <pluginRepository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
      </pluginRepository>
    </pluginRepositories>
    <repositories>
      <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
      </repository>
    </repositories>
</profile>
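
Because the profile is declared with activeByDefault set to false, Maven applies it only when it is selected explicitly with -P. The sketch below is a quick sanity check that the profile id is present in pom.xml before building; the helper name is hypothetical, and the match is a plain text search, so it would also match a repository with the same id.

```shell
# Succeeds if the given pom file contains an <id> element with the given value.
# Crude text match: it does not verify that the <id> belongs to a <profile>.
has_profile() {
  grep -qF "<id>$2</id>" "$1"
}

if [ -f pom.xml ] && has_profile pom.xml cdh5.9.2; then
  echo "Profile found; select it at build time with: mvn ... -Pcdh5.9.2"
else
  echo "Profile cdh5.9.2 not found in ./pom.xml"
fi
```

For a more reliable check, `mvn help:active-profiles -Pcdh5.9.2` lists the profiles Maven actually activates.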

vi tez-ui/pom.xml

<nodeVersion>v0.12.9</nodeVersion>
<npmVersion>2.14.9</npmVersion>

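1.3.5 Build

mvn clean package -Pcdh5.9.2 -DskipTests=true -Dmaven.javadoc.skip=true  -Dfrontend-maven-plugin.version=0.0.23

Note: if downloading the Node.js tar.gz archive fails during the build, build the UI modules manually from their directories.

Build tez-ui and tez-ui2 manually, using the Taobao npm registry: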

npm --registry=https://registry.npm.taobao.org install --verbose

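2. Hive on Tez

  1. Copy the tez-0.8.5-minimal directory to /tez-dir on HDFS (the location that tez.lib.uris points to)
hdfs dfs -put tez-dist/target/tez-0.8.5-minimal /tez-dir/
  2. Copy hadoop-mapreduce-client-common-2.6.0-cdh5.9.2.jar into the /tez-dir/tez-0.8.5-minimal directory on HDFS
  3. Copy the jars under tez-0.8.5 and under its lib directory into the lib directory of the Hive client installation, and delete hive-exec-1.1.0-cdh5.9.3-core.jar and hive-exec-core.jar from hive/auxlib; otherwise Kryo errors will occur.
  4. Create tez-site.xml and save it to /etc/hive/conf/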
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
 
    http://www.apache.org/licenses/LICENSE-2.0
 
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
 
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 
<!-- Put site-specific property overrides in this file. -->
 
<configuration>
 <property>
   <name>tez.lib.uris</name>
   <value>${fs.defaultFS}/tez-dir/tez-0.8.5-minimal,${fs.defaultFS}/tez-dir/tez-0.8.5-minimal/lib</value>
 </property>
 <property>
   <name>tez.use.cluster.hadoop-libs</name>
   <value>true</value>
 </property>
</configuration>
  5. Use Tez in Hive
set hive.execution.engine=tez;
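
The HDFS upload steps above can be sketched as a small script. This is an illustration under stated assumptions rather than the author's exact procedure: it assumes it is run from the Tez source root, that the hadoop-mapreduce-client-common jar sits in the current directory, and that /tez-dir matches the path used in tez.lib.uris. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Print each command instead of running it, unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

TEZ_VERSION=0.8.5
HDFS_DIR=/tez-dir   # must match the path used in tez.lib.uris
# Assumption: this jar has been copied into the current directory.
MR_COMMON_JAR=hadoop-mapreduce-client-common-2.6.0-cdh5.9.2.jar

deploy_tez() {
  # Create the target directory first so that -put does not fail on a missing parent.
  run hdfs dfs -mkdir -p "$HDFS_DIR"
  run hdfs dfs -put "tez-dist/target/tez-$TEZ_VERSION-minimal" "$HDFS_DIR/"
  run hdfs dfs -put "$MR_COMMON_JAR" "$HDFS_DIR/tez-$TEZ_VERSION-minimal/"
  # List the result so the uploaded layout can be checked against tez.lib.uris.
  run hdfs dfs -ls -R "$HDFS_DIR"
}

deploy_tez
```

Running it once in dry-run mode makes it easy to compare the target paths with the tez.lib.uris value before anything is written to HDFS.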

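As of 2017-09-27, this setup has been running in our production environment for more than three months. Due to time constraints, tez-ui has not been integrated successfully; I will look into the UI issue when time permits. If anyone has integrated tez-ui successfully, please share how.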
