Installing DataHub

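Both the theory and the practice are hard. This thing throws a lot of errors, there are pitfalls everywhere, and on top of that I still have no idea how to actually use it.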

https://github.com/linkedin/datahub

Installing Docker

yum -y install docker
# If the Docker service has not been started, the following error appears
[root@localhost ~]# docker pull java:8
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
[root@localhost ~]# systemctl daemon-reload
[root@localhost ~]# systemctl restart docker.service
# Restarting the Docker service as shown above starts the daemon
# and resolves the error
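
The check-then-start step above can be sketched as a small helper. This is only a sketch: the function names are made up for illustration, and the optional argument to docker_ready is a hypothetical hook so the check can be exercised without a real daemon. It relies on "docker info" exiting non-zero when the daemon is unreachable.

```shell
# Succeed if the Docker daemon answers. "docker info" fails when the
# daemon is not running. The optional first argument (default: docker)
# is a hypothetical override used only for testing.
docker_ready() {
    "${1:-docker}" info >/dev/null 2>&1
}

# Start the service only when the daemon is unreachable.
# Defined here but not executed.
ensure_docker() {
    if ! docker_ready; then
        systemctl daemon-reload && systemctl start docker
    fi
}
```

Calling ensure_docker before docker pull avoids the "Cannot connect to the Docker daemon" message. Running systemctl enable docker once should also make the service start at boot.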

Installing Python 3.8

Download: wget https://www.python.org/ftp/python/3.8.2/Python-3.8.2.tgz


# Step 1 (very important): install libffi-devel before building, otherwise Python fails with:
# ModuleNotFoundError: No module named '_ctypes'

yum install libffi-devel -y
cd /app
wget https://www.python.org/ftp/python/3.8.2/Python-3.8.2.tgz
tar -zxvf Python-3.8.2.tgz
cd /app/Python-3.8.2
# Compile and install
./configure --prefix=/usr/local/python3
make && make install
# Create symlinks
ln -s /usr/local/python3/bin/python3 /usr/local/bin/python3
ln -s /usr/local/python3/bin/pip3 /usr/local/bin/pip3
# Verify the installation
[root@xxx Python-3.8.2]# python3 -V
Python 3.8.2
[root@xxx Python-3.8.2]# pip3 -V
pip 19.2.3 from /usr/local/python3/lib/python3.8/site-packages/pip (python 3.8)
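
On python.org the version directory and the tarball name carry the same version string, so a mismatch between the two gives a URL that does not exist. A minimal sketch that builds both from one value (python_tarball_url is a hypothetical helper name, not part of any tool):

```shell
# Build the python.org source tarball URL from a single version string,
# so the directory and the file name cannot drift apart.
python_tarball_url() {
    # $1: full version, e.g. 3.8.2
    printf 'https://www.python.org/ftp/python/%s/Python-%s.tgz\n' "$1" "$1"
}
```

Usage would be wget "$(python_tarball_url 3.8.2)". After make install, running python3 -c 'import _ctypes' is a quick way to confirm that the libffi-devel step took effect.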

Installing pip3

# Upgrade pip3
[root@artemis python3]# pip3 install --upgrade pip -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
Looking in indexes: http://pypi.douban.com/simple
Collecting pip
  Downloading http://pypi.doubanio.com/packages/54/eb/4a3642e971f404d69d4f6fa3885559d67562801b99d7592487f1ecc4e017/pip-20.3.3-py2.py3-none-any.whl (1.5MB)
     |████████████████████████████████| 1.5MB 6.3MB/s 
Installing collected packages: pip
  Found existing installation: pip 19.2.3
    Uninstalling pip-19.2.3:
      Successfully uninstalled pip-19.2.3
Successfully installed pip-20.3.3

Switching freely between python2 and python3 (this step is not actually required)

This method uses the update-alternatives tool.
Step 1
Check whether any alternatives for python already exist

update-alternatives --display python  
# Prints nothing if no alternatives exist

Step 2
Register python2 and python3 as alternatives

sudo update-alternatives --install /usr/bin/python python /usr/bin/python2.7 1
sudo update-alternatives --install /usr/bin/python python /usr/local/bin/python3 2
# The link path /usr/bin/python is the same in both commands. Set the target path (here /usr/local/bin/python3) according to where you installed Python. The trailing 1 and 2 are the priorities.

Step 3
Check the version; at this point it is Python 2

python --version
Python 2.7.5

Step 4
Switch versions

sudo update-alternatives --config python
    There are 2 programs which provide 'python'.

     Selection    Command
    -----------------------------------------------
     + 1           /usr/bin/python2.7
    *  2           /usr/local/bin/python3.6

    Enter to keep the current selection[+], or type selection number: 2 (1 is python2.7, 2 is python3.6)

Step 5
We chose option 2 above, so the version should now be Python 3

python --version
Python 3.6.5


Extra: removing alternatives

sudo update-alternatives --remove python /usr/bin/python2.7   # remove 2.7
sudo update-alternatives --remove python /usr/local/bin/python3.6   # remove 3.6

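Installing scikit-learn (sklearn)

# The PyPI package is named scikit-learn; the "sklearn" name on PyPI is a deprecated alias
pip3 install scikit-learn -i http://pypi.douban.com/simple --trusted-host pypi.douban.com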

Installing docker-compose, method 1 (leads to Error 2; not recommended)

pip3 install docker-compose  -i http://pypi.douban.com/simple --trusted-host pypi.douban.com

Installing docker-compose, method 2 (fixes Error 2)

curl -L "https://github.com/docker/compose/releases/download/1.27.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose

chmod +x /usr/local/bin/docker-compose

After installation, restart the Docker service

Using systemctl

Reload the systemd unit configuration
sudo systemctl daemon-reload
Restart the Docker service
sudo systemctl restart docker
Stop Docker
sudo systemctl stop docker

Error 1 (probably caused by the network)

[root@ares datahub]# ./docker/quickstart.sh
Pulling mysql                ... done
Pulling zookeeper            ... done
Pulling elasticsearch        ... done
Pulling elasticsearch-setup  ... done
Pulling kibana               ... done
Pulling broker               ... done
Pulling neo4j                ... done
Pulling schema-registry      ... done
Pulling schema-registry-ui   ... done
Pulling kafka-setup          ... done
Pulling datahub-mae-consumer ... done
Pulling datahub-gms          ... done
Pulling datahub-mce-consumer ... done
Pulling datahub-frontend     ... done
Pulling kafka-rest-proxy     ... done
Pulling kafka-topics-ui      ... done
Building elasticsearch-setup
Step 1/6 : FROM jwilder/dockerize:0.6.1
 ---> 849596ab86ff
Step 2/6 : RUN apk add --no-cache curl jq
 ---> Running in 8ca0149674f7
fetch http://dl-cdn.alpinelinux.org/alpine/v3.6/main/x86_64/APKINDEX.tar.gz
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.6/main/x86_64/APKINDEX.tar.gz: network error (check Internet connection and firewall)
fetch http://dl-cdn.alpinelinux.org/alpine/v3.6/community/x86_64/APKINDEX.tar.gz
WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.6/community/x86_64/APKINDEX.tar.gz: network error (check Internet connection and firewall)
ERROR: unsatisfiable constraints:
  curl (missing):
    required by: world[curl]
  jq (missing):
    required by: world[jq]
ERROR: Service 'elasticsearch-setup' failed to build : The command '/bin/sh -c apk add --no-cache curl jq' returned a non-zero code: 2

Error 2: docker-compose

[root@artemis datahub]# ./docker/quickstart.sh
/usr/lib/python2.7/site-packages/paramiko/transport.py:33: CryptographyDeprecationWarning: Python 2 is no longer supported by the Python core team. Support for it is now deprecated in cryptography, and will be removed in the next release.
  from cryptography.hazmat.backends import default_backend
Traceback (most recent call last):
  File "/usr/bin/docker-compose", line 5, in <module>
    from compose.cli.main import main
  File "/usr/lib/python2.7/site-packages/compose/cli/main.py", line 24, in <module>
    from ..config import ConfigurationError
  File "/usr/lib/python2.7/site-packages/compose/config/__init__.py", line 6, in <module>
    from .config import ConfigurationError
  File "/usr/lib/python2.7/site-packages/compose/config/config.py", line 51, in <module>
    from .validation import match_named_volumes
  File "/usr/lib/python2.7/site-packages/compose/config/validation.py", line 12, in <module>
    from jsonschema import Draft4Validator
  File "/usr/lib/python2.7/site-packages/jsonschema/__init__.py", line 21, in <module>
    from jsonschema._types import TypeChecker
  File "/usr/lib/python2.7/site-packages/jsonschema/_types.py", line 3, in <module>
    from pyrsistent import pmap
  File "/usr/lib64/python2.7/site-packages/pyrsistent/__init__.py", line 3, in <module>
    from pyrsistent._pmap import pmap, m, PMap
  File "/usr/lib64/python2.7/site-packages/pyrsistent/_pmap.py", line 98
    ) from e
         ^
SyntaxError: invalid syntax
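
The traceback shows /usr/bin/docker-compose loading modules from python2.7, while ") from e" is Python 3-only syntax, so the script is being run by the wrong interpreter. One way to confirm which interpreter a script uses is to read its first line. A minimal sketch (script_interpreter is a hypothetical helper name):

```shell
# Print the interpreter named on the shebang line of a script.
script_interpreter() {
    # $1: path to the script
    head -n 1 "$1" | sed 's/^#! *//'
}
```

For example, script_interpreter /usr/bin/docker-compose should print a python2 path on the machine from the traceback. command -v docker-compose shows which copy is found first on PATH; that matters when a second copy is installed under /usr/local/bin.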

Error 3: switch to a mirror in China. Error 5 describes another fix; use whichever fits your situation.

WARNING: Ignoring http://dl-cdn.alpinelinux.org/alpine/v3.6/main/x86_64/APKINDEX.tar.gz: network error (check Internet connection and firewall)

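The right approach is to overwrite /etc/apk/repositories entirely with a mirror in China by adding the second line below to the Dockerfile. The Alpine version in the mirror URL should match the base image; the failing build above fetches v3.6.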

FROM alpine:3.7
RUN echo -e http://mirrors.ustc.edu.cn/alpine/v3.7/main/ > /etc/apk/repositories
Files that may need to be modified:
/app/datahub/docker/datahub-mae-consumer/Dockerfile
/app/datahub/docker/datahub-gms/Dockerfile
/app/datahub/docker/datahub-frontend/Dockerfile
/app/datahub/docker/datahub-mce-consumer/Dockerfile


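I am not sure exactly which file or files need the change, so to be safe I added the line after every FROM (a single file may contain several). This seems to cause problems of its own, so I am setting it aside for now.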
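
Because the mirror path has to match the Alpine release of each base image, it can help to generate the repository lines from the release name. A minimal sketch: alpine_mirror_repos is a hypothetical helper, and the URL layout is an assumption that extends the v3.7 main line above to other releases and to the community repository.

```shell
# Print /etc/apk/repositories lines for one Alpine release on the USTC mirror.
# Assumption: the mirror uses the same <release>/<repo>/ layout for every
# release and for both main and community.
alpine_mirror_repos() {
    # $1: release directory such as v3.6
    for repo in main community; do
        printf 'http://mirrors.ustc.edu.cn/alpine/%s/%s/\n' "$1" "$repo"
    done
}
```

alpine_mirror_repos v3.6 prints the two lines that would replace the dl-cdn.alpinelinux.org entries seen in Error 1.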

http://www.itdecent.cn/p/eb34e7088c77

Official guide

https://github.com/linkedin/datahub/blob/master/docs/debugging.md#how-can-i-confirm-if-all-docker-containers-are-running-as-expected-after-a-quickstart

Checking that everything started

docker container ls

[root@ares ~]# docker container ls
CONTAINER ID   IMAGE                                   COMMAND                  CREATED        STATUS        PORTS                                                      NAMES
14ef8eedaf60   linkedin/datahub-frontend:latest        "datahub-frontend/bi…"   15 hours ago   Up 15 hours   0.0.0.0:9001->9001/tcp                                     datahub-frontend
8377aad77608   landoop/kafka-topics-ui:0.9.4           "/run.sh"                15 hours ago   Up 15 hours   0.0.0.0:18000->8000/tcp                                    kafka-topics-ui
a5ef70bde9f1   linkedin/datahub-mae-consumer:latest    "/bin/sh -c /datahub…"   15 hours ago   Up 15 hours   9090/tcp, 0.0.0.0:9091->9091/tcp                           datahub-mae-consumer
f5ab5ae53011   linkedin/datahub-gms:latest             "/bin/sh -c /datahub…"   15 hours ago   Up 15 hours   0.0.0.0:8080->8080/tcp                                     datahub-gms
f5cda035d5e5   confluentinc/cp-kafka-rest:5.4.0        "/etc/confluent/dock…"   15 hours ago   Up 15 hours   0.0.0.0:8082->8082/tcp                                     kafka-rest-proxy
2328387122a3   landoop/schema-registry-ui:latest       "/run.sh"                15 hours ago   Up 15 hours   0.0.0.0:8000->8000/tcp                                     schema-registry-ui
95acca24d698   confluentinc/cp-schema-registry:5.4.0   "/etc/confluent/dock…"   15 hours ago   Up 15 hours   0.0.0.0:8081->8081/tcp                                     schema-registry
58e7a0d307d2   confluentinc/cp-kafka:5.4.0             "/etc/confluent/dock…"   15 hours ago   Up 15 hours   0.0.0.0:9092->9092/tcp, 0.0.0.0:29092->29092/tcp           broker
1b4b6dec57e9   kibana:5.6.8                            "/docker-entrypoint.…"   15 hours ago   Up 15 hours   0.0.0.0:5601->5601/tcp                                     kibana
61a90895e756   neo4j:4.0.6                             "/sbin/tini -g -- /d…"   15 hours ago   Up 15 hours   0.0.0.0:7474->7474/tcp, 7473/tcp, 0.0.0.0:7687->7687/tcp   neo4j
5b7ab8c768c1   confluentinc/cp-zookeeper:5.4.0         "/etc/confluent/dock…"   15 hours ago   Up 15 hours   2888/tcp, 0.0.0.0:2181->2181/tcp, 3888/tcp                 zookeeper
da9188c1035f   elasticsearch:5.6.8                     "/docker-entrypoint.…"   15 hours ago   Up 15 hours   0.0.0.0:9200->9200/tcp, 9300/tcp                           elasticsearch
49252b1240b8   mysql:5.7                               "docker-entrypoint.s…"   15 hours ago   Up 15 hours   0.0.0.0:3306->3306/tcp, 33060/tcp                          mysql
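
Scanning this table by eye is error-prone, so the check can be scripted. A minimal sketch: missing_containers is a hypothetical helper that reads running container names from stdin and prints every expected name that is absent.

```shell
# Read running container names from stdin (one per line) and print each
# expected name (given as arguments) that is not among them.
missing_containers() {
    running=$(cat)
    for name in "$@"; do
        # -x: match the whole line, -F: treat the name as a literal string
        printf '%s\n' "$running" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}
```

With Docker running, the names can be fed in with docker ps --format '{{.Names}}' | missing_containers datahub-gms datahub-frontend mysql; empty output means all listed containers are up.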

Switching the apt source to a mirror and installing vim inside a container (the later error sections also show a way to change the source at image build time)


# Switch to the mirror (the URLs below point at the USTC mirror, mirrors.ustc.edu.cn)
echo -e "deb http://mirrors.ustc.edu.cn/debian buster main contrib non-free \n\
deb http://mirrors.ustc.edu.cn/debian buster-backports main contrib non-free \n\
deb http://mirrors.ustc.edu.cn/debian buster-proposed-updates main contrib non-free  \n\
deb http://mirrors.ustc.edu.cn/debian-security buster/updates main contrib non-free \n" \
> /etc/apt/sources.list
root@mysql:/# apt-get clean
root@mysql:/# apt-get update
Get:1 http://mirrors.ustc.edu.cn/debian buster InRelease [121 kB]
Get:2 http://mirrors.ustc.edu.cn/debian buster-backports InRelease [46.7 kB]    
Get:3 http://mirrors.ustc.edu.cn/debian buster-proposed-updates InRelease [54.5 kB]
Get:4 http://mirrors.ustc.edu.cn/debian-security buster/updates InRelease [65.4 kB]
Get:5 http://mirrors.ustc.edu.cn/debian buster/non-free amd64 Packages [87.7 kB]
Get:6 http://mirrors.ustc.edu.cn/debian buster/contrib amd64 Packages [50.2 kB]
Hit:7 http://repo.mysql.com/apt/debian buster InRelease              
Get:8 http://mirrors.ustc.edu.cn/debian buster/main amd64 Packages [7907 kB]
Get:9 http://mirrors.ustc.edu.cn/debian buster-backports/non-free amd64 Packages [29.0 kB]
Get:10 http://mirrors.ustc.edu.cn/debian buster-backports/contrib amd64 Packages [7816 B]
Get:11 http://mirrors.ustc.edu.cn/debian buster-backports/main amd64 Packages [410 kB]
Get:12 http://mirrors.ustc.edu.cn/debian buster-proposed-updates/main amd64 Packages [50.1 kB]
Get:13 http://mirrors.ustc.edu.cn/debian-security buster/updates/main amd64 Packages [260 kB]
Get:14 http://mirrors.ustc.edu.cn/debian-security buster/updates/non-free amd64 Packages [556 B]
Fetched 9090 kB in 4s (2509 kB/s)                        
Reading package lists... Done

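# Install vim
Run apt-get update first.
This command syncs the package indexes from the sources listed in /etc/apt/sources.list and /etc/apt/sources.list.d so that the latest packages can be found.

Once the update finishes, run apt-get install vim.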

Logging in to MySQL

docker exec -it mysql /usr/bin/mysql datahub --user=datahub --password=datahub

List all images
 docker images

1. Start all containers

docker start $(docker ps -a | awk '{ print $1}' | tail -n +2)

2. Stop all containers

docker stop $(docker ps -a | awk '{ print $1}' | tail -n +2)

3. Remove all containers

docker rm $(docker ps -a | awk '{ print $1}' | tail -n +2)

4. Remove all images (use with caution)

docker rmi $(docker images | awk '{print $3}' |tail -n +2)
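
The awk and tail pipeline used in these commands takes the first column of the table and drops the header row. A minimal sketch of that step in isolation, plus a guarded variant: ids_from_table and stop_all are hypothetical names. docker ps -aq prints the same IDs directly.

```shell
# Extract the first column from "docker ps -a" style output read on stdin,
# skipping the header row.
ids_from_table() {
    awk '{ print $1 }' | tail -n +2
}

# Stop every container, doing nothing when there are none.
# Defined here but not executed.
stop_all() {
    ids=$(docker ps -aq)
    if [ -n "$ids" ]; then
        # Word splitting is intended: one argument per container ID
        docker stop $ids
    fi
}
```

The guard is there because, as far as I can tell, docker stop with no arguments exits with a usage error, which is what happens when the command substitution comes back empty.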

systemctl status docker 

docker container ls

Check the logs of each Docker container with docker logs <container_name>.

For datahub-gms, you should see a log line like the following at the end of initialization:

docker logs datahub-gms
2020-02-06 09:20:54.870:INFO:oejs.Server:main: Started @18807ms

For datahub-frontend, you should see a log line like the following at the end of initialization:

docker logs datahub-frontend
09:20:22 [main] INFO  play.core.server.AkkaHttpServer - Listening for HTTP on /0.0.0.0:9001
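
Looking for these two lines can be automated. A minimal sketch, assuming the startup messages keep the wording shown above; gms_started and frontend_started are hypothetical helper names that read log text from stdin.

```shell
# Succeed if stdin contains the Jetty startup line logged by datahub-gms.
gms_started() {
    grep -qF 'oejs.Server:main: Started @'
}

# Succeed if stdin contains the Play startup line logged by datahub-frontend.
frontend_started() {
    grep -qF 'Listening for HTTP on'
}
```

Usage would look like docker logs datahub-gms 2>&1 | gms_started && echo "gms is up". The 2>&1 is included in case the container writes its log to stderr.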

Error 4: running ./docker/ingestion/ingestion.sh

[root@ares datahub]# ./docker/ingestion/ingestion.sh
WARNING: Native build is an experimental feature and could change at any time
WARNING: Found orphan containers (datahub-mae-consumer, schema-registry-ui, broker, kafka-setup, kafka-topics-ui, datahub-frontend, mysql, datahub-mce-consumer, kafka-rest-proxy, neo4j, kibana, elasticsearch-setup, datahub-gms, elasticsearch, schema-registry, zookeeper) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Building ingestion
[+] Building 86.4s (9/11)                                                                                                                                                                                                                
 => [internal] load build definition from Dockerfile                                                                                                                                                                                0.0s
 => => transferring dockerfile: 32B                                                                                                                                                                                                 0.0s
 => [internal] load .dockerignore                                                                                                                                                                                                   0.0s
 => => transferring context: 35B                                                                                                                                                                                                    0.0s
 => [internal] load metadata for docker.io/library/openjdk:8                                                                                                                                                                        3.7s
 => [internal] load metadata for docker.io/library/openjdk:8-jre-alpine                                                                                                                                                             4.7s
 => [prod-build 1/3] FROM docker.io/library/openjdk:8@sha256:c1dcc499d35d74a93c6cbfb1819a88bd588e06741d23f9a1962f636799d77822                                                                                                       0.0s
 => [internal] load build context                                                                                                                                                                                                   0.5s
 => => transferring context: 385.15kB                                                                                                                                                                                               0.4s
 => CACHED [base 1/1] FROM docker.io/library/openjdk:8-jre-alpine@sha256:f362b165b870ef129cbe730f29065ff37399c0aa8bcab3e44b51c302938c9193                                                                                           0.0s
 => CACHED [prod-build 2/3] COPY . datahub-src                                                                                                                                                                                      0.0s
 => ERROR [prod-build 3/3] RUN cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build                                                                                                                              81.2s
------                                                                                                                                                                                                                                   
 > [prod-build 3/3] RUN cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build:                                                                                                                                          
#9 0.579 Downloading https://services.gradle.org/distributions/gradle-5.6.4-bin.zip                                                                                                                                                      
#9 4.484 .........................................................................................                                                                                                                                       
#9 32.15                                                                                                                                                                                                                                 
#9 32.15 Welcome to Gradle 5.6.4!                                                                                                                                                                                                        
#9 32.15 
#9 32.15 Here are the highlights of this release:
#9 32.15  - Incremental Groovy compilation
#9 32.15  - Groovy compile avoidance
#9 32.15  - Test fixtures for Java projects
#9 32.15  - Manage plugin versions via settings script
#9 32.15 
#9 32.15 For more details see https://docs.gradle.org/5.6.4/release-notes.html
#9 32.15 
#9 32.34 To honour the JVM settings for this build a new JVM will be forked. Please consider using the daemon: https://docs.gradle.org/5.6.4/userguide/gradle_daemon.html.
#9 33.84 Daemon will be stopped at the end of the build stopping after processing
#9 36.74 Configuration on demand is an incubating feature.
#9 80.64 
#9 80.64 FAILURE: Build failed with an exception.
#9 80.64 
#9 80.64 * What went wrong:
#9 80.64 A problem occurred configuring root project 'datahub-src'.
#9 80.64 > Could not resolve all artifacts for configuration ':classpath'.
#9 80.64    > Could not resolve com.linkedin.pegasus:gradle-plugins:28.3.7.
#9 80.64      Required by:
#9 80.64          project :
#9 80.64       > Could not resolve com.linkedin.pegasus:gradle-plugins:28.3.7.
#9 80.64          > Could not get resource 'https://linkedin.bintray.com/maven/com/linkedin/pegasus/gradle-plugins/28.3.7/gradle-plugins-28.3.7.pom'.
#9 80.64             > Could not GET 'https://linkedin.bintray.com/maven/com/linkedin/pegasus/gradle-plugins/28.3.7/gradle-plugins-28.3.7.pom'.
#9 80.64                > linkedin.bintray.com: Temporary failure in name resolution
#9 80.64 
#9 80.64 * Try:
#9 80.64 Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
#9 80.64 
#9 80.64 * Get more help at https://help.gradle.org
#9 80.64 
#9 80.64 BUILD FAILED in 1m 20s
------
executor failed running [/bin/sh -c cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build]: exit code: 1
ERROR: Service 'ingestion' failed to build

Solution 1:

1. Make sure you are in the DataHub root directory.
Create a new file in that directory with the following content:
vim sources.list
deb http://mirrors.ustc.edu.cn/debian buster main contrib non-free 
deb http://mirrors.ustc.edu.cn/debian buster-backports main contrib non-free 
deb http://mirrors.ustc.edu.cn/debian buster-proposed-updates main contrib non-free  
deb http://mirrors.ustc.edu.cn/debian-security buster/updates main contrib non-free 


2. Edit ./docker/ingestion/Dockerfile
 cat ./docker/ingestion/Dockerfile
# Defining environment
ARG APP_ENV=prod

FROM openjdk:8-jre-alpine as base
FROM openjdk:8 as prod-build
COPY . datahub-src
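# Replace the apt sources with the mirror list created in step 1
COPY sources.list /etc/apt/sources.list
# Refresh the package index from the mirror
RUN apt-get update
# Confirm that the file is in place
RUN ls /etc/apt/sources.list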
RUN cd datahub-src && ./gradlew :metadata-ingestion-examples:mce-cli:build

FROM base as prod-install

COPY --from=prod-build datahub-src/metadata-ingestion-examples/mce-cli/build/libs/mce-cli.jar /datahub/ingestion/bin/mce-cli.jar
COPY --from=prod-build datahub-src/metadata-ingestion-examples/mce-cli/example-bootstrap.json /datahub/ingestion/example-bootstrap.json

FROM base as dev-install

# Dummy stage for development. Assumes code is built on your machine and mounted to this image.
# See this excellent thread https://github.com/docker/cli/issues/1134

FROM ${APP_ENV}-install as final

CMD java -jar /datahub/ingestion/bin/mce-cli.jar -m produce /datahub/ingestion/example-bootstrap.json

3. Run ./docker/ingestion/ingestion.sh again, repeating as many times as needed
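
The build in Error 4 stopped on a temporary name-resolution failure, so rerunning by hand can be replaced with a small retry loop. A minimal sketch: retry and the RETRY_DELAY variable are hypothetical names introduced here, not part of DataHub or Docker.

```shell
# Run a command up to $1 times, sleeping RETRY_DELAY seconds (default 5)
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
    attempts=$1
    shift
    i=1
    while :; do
        "$@" && return 0
        [ "$i" -ge "$attempts" ] && return 1
        i=$((i + 1))
        sleep "${RETRY_DELAY:-5}"
    done
}
```

For example, retry 5 ./docker/ingestion/ingestion.sh gives the build five chances. This only helps with intermittent network failures; a host that is permanently unreachable will still fail.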

Error 5

Step 1/7 : FROM jwilder/dockerize:0.6.1
 ---> 849596ab86ff
Step 2/7 : RUN apk add --no-cache curl jq
 ---> [Warning] IPv4 forwarding is disabled. Networking will not work.

Solution

Step 1: On the host, run echo "net.ipv4.ip_forward=1" >>/usr/lib/sysctl.d/00-system.conf
Step 2: Restart the network and docker services

[root@localhost /]# systemctl restart network && systemctl restart docker
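
To confirm the setting took effect, the live value can be read back from the kernel. A minimal sketch: ip_forward_enabled is a hypothetical helper, and its optional argument exists only so the check can be pointed at a different file for testing.

```shell
# Succeed when IPv4 forwarding is enabled. The kernel exposes the live
# value in /proc/sys/net/ipv4/ip_forward (1 = enabled, 0 = disabled).
ip_forward_enabled() {
    [ "$(cat "${1:-/proc/sys/net/ipv4/ip_forward}" 2>/dev/null)" = "1" ]
}
```

ip_forward_enabled && echo "forwarding on" should print the message after the restart. sysctl -w net.ipv4.ip_forward=1 switches the value on immediately for the running system, while the line written to the sysctl.d file is what keeps it across reboots.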
