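Requirement
Existing jobs are based on Spark1, and new jobs are planned to move to Spark2, so Oozie needs to support both versions at the same time.
Steps
1 Configure the sharelib
Following the Hortonworks documentation, create a spark2 sharelib: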
hdfs dfs -mkdir /user/oozie/share/lib/lib_<ts>/spark2
hdfs dfs -put \
/usr/hdp/<version>/spark2/jars/* \
/user/oozie/share/lib/lib_<ts>/spark2/
hdfs dfs -cp \
/user/oozie/share/lib/lib_<ts>/spark/oozie-sharelib-spark-<version>.jar \
/user/oozie/share/lib/lib_<ts>/spark2/
hdfs dfs -cp \
/user/oozie/share/lib/lib_<ts>/spark/hive-site.xml \
/user/oozie/share/lib/lib_<ts>/spark2/
hdfs dfs -put \
/usr/hdp/<version>/spark2/python/lib/py* \
/user/oozie/share/lib/lib_<ts>/spark2/
oozie admin -sharelibupdate
Check that the sharelib was created:
oozie admin -shareliblist spark2
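The lib_<ts> directory name changes whenever the sharelib is regenerated, so it can help to resolve it from a listing instead of typing it by hand. The following is only a minimal sketch: it assumes the usual `hdfs dfs -ls` layout with the path in the last column, and the function name `latest_sharelib_dir` is made up for illustration.

```shell
# Hypothetical helper: read `hdfs dfs -ls /user/oozie/share/lib` output on stdin
# and print the newest lib_<ts> directory. The timestamps have a fixed width,
# so a plain lexicographic sort puts the newest one last.
latest_sharelib_dir() {
  awk '{print $NF}' | grep '/lib_[0-9][0-9]*$' | sort | tail -n 1
}

# Intended use on a cluster node (not executed by this snippet):
#   LIB_DIR=$(hdfs dfs -ls /user/oozie/share/lib | latest_sharelib_dir)
#   hdfs dfs -ls "$LIB_DIR/spark2"
```

The "Found N items" header that `hdfs dfs -ls` prints is dropped by the `grep` filter, since its last field is not a lib_<ts> path.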
2 Configure job.properties
Set the following property:
oozie.action.sharelib.for.spark=spark2
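Forgetting this property silently sends the job back to the Spark1 sharelib, so a quick check before submitting may be useful. This is a rough sketch, not part of the original setup: the function name `uses_spark2_sharelib` is hypothetical, and it only recognizes the exact single-value form shown above.

```shell
# Hypothetical pre-submit check: succeeds only if the given properties file
# contains a line that sets oozie.action.sharelib.for.spark to exactly spark2.
# Commented-out lines do not match because the pattern is anchored at line start.
uses_spark2_sharelib() {
  grep -Eq '^[[:space:]]*oozie\.action\.sharelib\.for\.spark[[:space:]]*=[[:space:]]*spark2[[:space:]]*$' "$1"
}

# Intended use (not executed by this snippet; assumes OOZIE_URL is set):
#   uses_spark2_sharelib job.properties && oozie job -config job.properties -run
```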
3 Remove duplicate jars
With the configuration above in place, running the workflow fails with an error about a file being loaded more than once:
WARN SparkActionExecutor:523 - SERVER[] USER[admin] GROUP[-] TOKEN[] APP[Workflow2] JOB[0000012-170717153234639-oozie-oozi-W] ACTION[0000012-170717153234639-oozie-oozi-W@spark_1] Launcher ERROR, reason: Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, Attempt to add (hdfs://:8020/user/oozie/share/lib/lib_20170613110051/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
2017-07-19 12:36:53,275 WARN SparkActionExecutor:523 - SERVER[] USER[admin] GROUP[-] TOKEN[] APP[Workflow2] JOB[0000012-170717153234639-oozie-oozi-W] ACTION[0000012-170717153234639-oozie-oozi-W@spark_1] Launcher exception: Attempt to add (hdfs://:8020/user/oozie/share/lib/lib_20170613110051/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
java.lang.IllegalArgumentException: Attempt to add (hdfs://:8020/user/oozie/share/lib/lib_20170613110051/oozie/aws-java-sdk-core-1.10.6.jar) multiple times to the distributed cache.
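Cause
Jars with the same name exist under both /user/oozie/share/lib/lib_<TS>/oozie/ and /user/oozie/share/lib/lib_<TS>/spark2/.
Solution
Delete the same-named jars under spark2. Removing them there, rather than under oozie/, keeps the impact on non-spark2 jobs as small as possible.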
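Finding the same-named jars by eye is error-prone, so the comparison can be scripted. The sketch below assumes the two directory listings were saved from `hdfs dfs -ls` into local files; the function names are hypothetical, and the second function only prints the delete commands so they can be reviewed before anything is removed.

```shell
# Hypothetical helper: $1 and $2 are files holding `hdfs dfs -ls` output for
# the oozie/ and spark2/ directories. Prints jar file names present in both.
duplicate_jar_names() {
  for listing in "$1" "$2"; do
    # Keep the path column, keep only jars, strip the directory part.
    awk '{print $NF}' "$listing" | grep '\.jar$' | sed 's|.*/||' | sort -u
  done | sort | uniq -d
}

# Dry run: print (do not execute) the delete commands for the spark2 copies.
# $1 is the spark2 directory; duplicate names are read from stdin.
print_spark2_rm_commands() {
  while IFS= read -r name; do
    printf 'hdfs dfs -rm %s/%s\n' "$1" "$name"
  done
}

# Intended use on a cluster node (not executed by this snippet):
#   LIB_DIR=/user/oozie/share/lib/lib_<ts>
#   hdfs dfs -ls "$LIB_DIR/oozie"  > oozie.txt
#   hdfs dfs -ls "$LIB_DIR/spark2" > spark2.txt
#   duplicate_jar_names oozie.txt spark2.txt | print_spark2_rm_commands "$LIB_DIR/spark2"
```

Each listing is reduced to unique base names first, so `uniq -d` reports a name only when it appears in both directories.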
4 Jar compatibility between old and new versions
With the configuration above in place, running the workflow fails with the following error:
[0000591-171111105051439-oozie-oozi-W] ACTION[0000591-171111105051439-oozie-oozi-W@spark-b645] Launcher exception: com.fasterxml.jackson.databind.JavaType.isReferenceType()Z
java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.JavaType.isReferenceType()Z
at com.fasterxml.jackson.databind.ser.BasicSerializerFactory.findSerializerByLookup(BasicSerializerFactory.java:302)
Cause
spark2 loaded outdated jars from /user/oozie/share/lib/lib_<TS>/oozie/:
jackson-annotations-2.4.0.jar (?)
oozie/jackson-core-2.4.4.jar
jackson-databind-2.4.4.jar
Solution
Spark1 jobs depend on these jar versions, so they cannot simply be updated or deleted. Because this Oozie instance only runs spark actions, it is enough to move these jars to the /user/oozie/share/lib/lib_<TS>/spark/ directory.
At this point, spark1 jobs load the following from /user/oozie/share/lib/lib_<TS>/spark/:
jackson-annotations-2.4.0.jar
oozie/jackson-core-2.4.4.jar
jackson-databind-2.4.4.jar
At this point, spark2 jobs load the following from /user/oozie/share/lib/lib_<TS>/spark2/:
jackson-annotations-2.6.5.jar
oozie/jackson-core-2.6.5.jar
jackson-databind-2.6.5.jar
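The move described above could be done with `hdfs dfs -mv`. The following is a sketch under assumptions: the function name is hypothetical, the three jar names are taken from the list above (treating the core jar as jackson-core-2.4.4.jar directly under oozie/), and the commands are only printed so they can be checked first.

```shell
# Dry run: print the `hdfs dfs -mv` commands that would move the old Jackson
# jars from oozie/ to spark/. $1 is the sharelib root (the lib_<ts> directory).
print_jackson_mv_commands() {
  for jar in jackson-annotations-2.4.0.jar jackson-core-2.4.4.jar jackson-databind-2.4.4.jar; do
    printf 'hdfs dfs -mv %s/oozie/%s %s/spark/\n' "$1" "$jar" "$1"
  done
}

# Intended use on a cluster node (not executed by this snippet):
#   print_jackson_mv_commands /user/oozie/share/lib/lib_<ts>
# After reviewing and running the printed commands, refresh the sharelib:
#   oozie admin -sharelibupdate
```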
Other solutions
Hortonworks also mentions that the Oozie sharelib can be regenerated:
/usr/hdp/2.3.2.0-2950/oozie/bin/oozie-setup.sh sharelib create -locallib /usr/hdp/<version>/oozie/oozie-sharelib.tar.gz -fs hdfs://<namenode-host>:8020
oozie admin -oozie http://<oozie-host>:11000/oozie -sharelibupdate