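DataX is an offline data synchronization tool/platform widely used inside Alibaba Group. It provides efficient data synchronization between heterogeneous data sources, including MySQL, Oracle, SqlServer, Postgre, HDFS, Hive, ADS, HBase, TableStore (OTS), MaxCompute (ODPS), and DRDS.
The latest version already supports ES, so these notes record the setup steps.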
Environment requirements:
Linux
JDK1.8
Python2.6
wget http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
wget http://www.trieuvan.com/apache/maven/maven-3/3.6.3/binaries/apache-maven-3.6.3-bin.tar.gz
Extract the archive and it is ready to use. All jobs are configured under /datax/job.
Start command:
python datax.py ./job/stream2stream.json
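For reference, a job file is a JSON document that pairs one reader with one writer. The sketch below builds a minimal stream-to-stream job in Python and writes it to disk. The plugin names and parameter keys follow the stream2stream template from the DataX documentation as far as I know, so check them against the version you downloaded; the helper names build_stream_job and write_job, the record count, and the sample column values are assumptions made for illustration.

```python
import json


def build_stream_job(record_count=10, channels=1):
    """Return a dict describing a minimal streamreader -> streamwriter job.

    Assumption: plugin names and parameter keys mirror the DataX
    stream2stream template; verify them against your DataX version.
    """
    reader = {
        "name": "streamreader",
        "parameter": {
            # number of generated records per slice
            "sliceRecordCount": record_count,
            # constant sample columns (illustrative values)
            "column": [
                {"type": "long", "value": "10"},
                {"type": "string", "value": "hello, DataX"},
            ],
        },
    }
    writer = {
        "name": "streamwriter",
        # print=True asks the writer to echo each record to the console
        "parameter": {"encoding": "UTF-8", "print": True},
    }
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            "setting": {"speed": {"channel": channels}},
        }
    }


def write_job(path, job):
    """Serialize the job dict as indented JSON at the given path."""
    with open(path, "w") as fh:
        json.dump(job, fh, indent=2)
```

Calling write_job("./job/stream2stream.json", build_stream_job()) would produce a file that can be passed to datax.py as in the start command above.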
There are always plenty of lazy people, so a visual web UI integrated with DataX naturally came along.
Download and install:
git clone https://github.com/pengls/datax-admin.git
Install mvn for the build
tar zxvf apache-maven-3.6.3-bin.tar.gz && mv apache-maven-3.6.3 /usr/local/maven3
Add it to the environment variables
export M2_HOME=/usr/local/maven3
export PATH=$PATH:$JAVA_HOME/bin:$M2_HOME/bin
Test
mvn -v
Apache Maven 3.6.3 (cecedd343002696d0abb50b32b541b8a6ba2883f)
Maven home: /usr/local/maven3
Java version: 1.8.0_251, vendor: Oracle Corporation, runtime: /usr/local/jdk1.8/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-514.26.2.el7.x86_64", arch: "amd64", family: "unix"
1. Edit the resources/application.yml file under datax_admin
cat /opt/datax-web/datax-admin/src/main/resources/application.yml

# configure mybatis-plus SQL log output
logging:
  level:
    com.wugui.datax.admin.mapper: error
  path: ./data/applogs/admin  ## create this directory if it does not exist
Execute the datax_web.sql file under doc/db
2. Edit the resources/application.yml file under datax_executor
# log config
logging:
  config: classpath:logback.xml
  path: ./data/applogs/executor/jobhandler
Change the log path (path) to suit your environment
datax:
  job:
    admin:
      ### datax-web admin address
      addresses: http://127.0.0.1:8080
    executor:
      appname: datax-executor
      ip:
      port: 9999
      ### job log path
      logpath: ./data/applogs/executor/jobhandler
      ### job log retention days
      logretentiondays: 30
  executor:
    jsonpath: /Users/mac/data/applogs
  pypath: /Users/mac/tools/datax/bin/datax.py
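The jsonpath and pypath values above are local to one machine and need to point at real locations on the executor host. As a rough sketch, the hypothetical helper below checks both paths before the executor is started. check_executor_paths is not part of datax-web, and the values are passed in by hand rather than parsed from application.yml.

```python
import os


def check_executor_paths(jsonpath, pypath):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper, not part of datax-web.
    """
    problems = []
    # jsonpath should be a directory (taken from datax.executor.jsonpath)
    if not os.path.isdir(jsonpath):
        problems.append("jsonpath is not an existing directory: " + jsonpath)
    # pypath should be the DataX launcher script itself
    if not os.path.isfile(pypath):
        problems.append("pypath is not an existing file: " + pypath)
    elif os.path.basename(pypath) != "datax.py":
        problems.append("pypath does not point at datax.py: " + pypath)
    return problems
```

For example, check_executor_paths("/Users/mac/data/applogs", "/Users/mac/tools/datax/bin/datax.py") returns an empty list only when both locations exist on the current host.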
Build and deploy
1. Set up a local Maven environment (installation details omitted here)
2. Run mvn package -Dmaven.test.skip=true
3. After the build succeeds, copy datax-admin-2.1.1.jar and datax-executor-2.1.1.jar from the target directories of the datax-admin and datax-executor modules to the directory of your choice
4. Start datax-admin-2.1.1.jar and datax-executor-2.1.1.jar separately
5. Example start commands:
nohup java -Xmx1024M -Xms1024M -Xmn448M -XX:MaxMetaspaceSize=192M -XX:MetaspaceSize=192M -jar datax-admin-2.1.1.jar &
nohup java -Xmx1024M -Xms1024M -Xmn448M -XX:MaxMetaspaceSize=192M -XX:MetaspaceSize=192M -jar datax-executor-2.1.1.jar &
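The two start commands differ only in the jar name, so they can be generated from one place. The following Python 3 sketch builds the same java command line and starts it detached from the terminal, which is roughly comparable to nohup with a trailing &. The helper names java_command and start_in_background and the log file argument are assumptions for illustration, not part of datax-web.

```python
import subprocess

# JVM options copied from the example start command
JVM_OPTS = [
    "-Xmx1024M", "-Xms1024M", "-Xmn448M",
    "-XX:MaxMetaspaceSize=192M", "-XX:MetaspaceSize=192M",
]


def java_command(jar, jvm_opts=JVM_OPTS):
    """Return the argument list for launching one jar."""
    return ["java"] + list(jvm_opts) + ["-jar", jar]


def start_in_background(jar, log_path):
    """Start the jar in its own session, appending its output to log_path."""
    with open(log_path, "ab") as log:
        return subprocess.Popen(
            java_command(jar),
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # detach from the controlling terminal
        )
```

Calling start_in_background("datax-admin-2.1.1.jar", "admin.out") and then the same function for datax-executor-2.1.1.jar would mirror the two commands; the log file names are placeholders.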
Access test
http://127.0.0.1:8080/index.html#/dashboard