Hadoop deployment modes:
1. Standalone mode: a single Java process
2. Pseudo-Distributed Mode: for development and learning; several Java processes on one machine
3. Cluster Mode: for production; several Java processes across multiple machines
Pseudo-distributed deployment: HDFS
1. Create a dedicated user for the hadoop service
[root@hadoop003 software]# useradd hadoop
[root@hadoop003 software]# id hadoop
uid=515(hadoop) gid=515(hadoop) groups=515(hadoop)
Grant the hadoop user sudo privileges
[root@hadoop003 software]# vi /etc/sudoers
hadoop ALL=(root) NOPASSWD:ALL
2. Deploy Java
Oracle JDK 1.8 (avoid OpenJDK where possible)
2.1 Extract + set environment variables
Already done in the previous session
2.2 The CDH course convention: install under /usr/java
Check the path of the java binary:
[root@hadoop003 jdk1.8.0_45]# which java
/usr/java/jdk1.8.0_45/bin/java
3. Verify that the ssh service is running
The ssh service is installed by default with the OS
[root@hadoop003 ~]# service sshd status
openssh-daemon (pid 1386) is running...
4. Extract hadoop and create a symlink
[root@hadoop003 software]# rz    # upload hadoop-2.8.1.tar.gz
[root@hadoop003 software]# tar -xzvf hadoop-2.8.1.tar.gz
[root@hadoop003 software]# ln -s /opt/software/hadoop-2.8.1 hadoop
Change ownership:
chown -R hadoop:hadoop hadoop
chown -R hadoop:hadoop hadoop/*
chown -R hadoop:hadoop /opt/software/hadoop-2.8.1
Remove unneeded files:
[root@hadoop003 software]# cd hadoop
[root@hadoop003 hadoop]# rm -f *.txt
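A wildcard delete like this is safer with a dry run first. A sketch using `find` in a throwaway directory (the directory and file names below are stand-ins for the hadoop release dir, not from the source):

```shell
# Sketch: preview, then delete, only top-level *.txt files.
tmp=$(mktemp -d)
touch "$tmp/LICENSE.txt" "$tmp/NOTICE.txt" "$tmp/README.txt"
find "$tmp" -maxdepth 1 -name '*.txt' -print    # dry run: list what would go
find "$tmp" -maxdepth 1 -name '*.txt' -delete   # then actually delete
remaining=$(ls "$tmp" | wc -l | tr -d ' ')      # should now be empty
rm -rf "$tmp"
```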
Note how chown -R interacts with symlinks:
chown -R hadoop:hadoop dir           --> changes the directory and everything inside it
chown -R hadoop:hadoop symlink       --> changes only the symlink itself, not the target's contents
chown -R hadoop:hadoop symlink/      --> leaves the symlink alone, changes the target's contents
chown -R hadoop:hadoop hadoop-2.8.1  --> changes the original directory
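The trailing-slash distinction can be seen without root; a small sketch in a throwaway directory (names are illustrative):

```shell
# Sketch: a symlink path without a trailing slash names the link itself;
# with a trailing slash it resolves into the target directory.
tmp=$(mktemp -d)
mkdir "$tmp/hadoop-2.8.1"
touch "$tmp/hadoop-2.8.1/NOTICE.txt"
ln -s "$tmp/hadoop-2.8.1" "$tmp/hadoop"
[ -L "$tmp/hadoop" ] && link_is_symlink=yes   # no slash: the link itself
inside=$(ls "$tmp/hadoop/")                   # slash: follows into the target
rm -rf "$tmp"
```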
bin/  : commands
etc/  : configuration files
sbin/ : scripts to start and stop the hadoop daemons
5. Switch to the hadoop user and configure
[root@hadoop003 hadoop]# su - hadoop
[hadoop@hadoop003 ~]$ ll
total 0
[hadoop@hadoop003 ~]$ cd /opt/software/hadoop
[hadoop@hadoop003 hadoop]$ ll
total 28
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 bin
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 etc
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 include
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 lib
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 libexec
drwxr-xr-x. 2 hadoop hadoop 4096 Dec 10 11:54 sbin
drwxr-xr-x. 3 hadoop hadoop 4096 Dec 10 11:54 share
[hadoop@rzdatahadoop002 hadoop]$ cd etc/hadoop
Notes on the files here:
hadoop-env.sh    : hadoop environment settings
core-site.xml    : hadoop core configuration
hdfs-site.xml    : HDFS settings --> its daemons get started
[mapred-site.xml : MapReduce configuration] needed only when running jar-based computations
yarn-site.xml    : YARN settings --> its daemons get started
slaves           : hostnames of the cluster machines
[hadoop@hadoop003 hadoop]$ vi core-site.xml
Add:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
[hadoop@hadoop003 hadoop]$ vi hdfs-site.xml
Add:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
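A quick way to read a value back out of these *-site.xml files is a grep/sed one-liner. A sketch that writes a sample hdfs-site.xml to a temp file so it runs anywhere; on a real node, point it at etc/hadoop/hdfs-site.xml instead:

```shell
# Sketch: extract the <value> that follows a given <name> in a site file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
replication=$(grep -A1 '<name>dfs.replication</name>' "$sample" \
  | sed -n 's:.*<value>\(.*\)</value>.*:\1:p')
rm -f "$sample"
echo "dfs.replication = $replication"
```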
6. Set up passwordless ssh trust for the hadoop user
[hadoop@hadoop003 ~]$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
[hadoop@hadoop003 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@hadoop003 ~]$ chmod 0600 ~/.ssh/authorized_keys
The first connection asks for a yes:
[hadoop@hadoop003 .ssh]$ ssh hadoop003 date
The authenticity of host 'hadoop003 (192.168.137.201)' can't be established.
RSA key fingerprint is 9a:ea:f5:06:bf:de:ca:82:66:51:81:fe:bf:8a:62:36.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop003,192.168.137.201' (RSA) to the list of known hosts.
Wed Dec 13 22:15:01 CST 2017
[hadoop@hadoop003 .ssh]$ ll
total 16
-rw-------. 1 hadoop hadoop 404 Dec 13 22:13 authorized_keys
-rw-------. 1 hadoop hadoop 1675 Dec 13 22:13 id_rsa
-rw-r--r--. 1 hadoop hadoop 404 Dec 13 22:13 id_rsa.pub
-rw-r--r--. 1 hadoop hadoop 413 Dec 13 22:15 known_hosts
The second connection does not:
[hadoop@hadoop003 .ssh]$ ssh hadoop003 date
Wed Dec 13 22:15:07 CST 2017
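Step 6 can be made idempotent (safe to re-run). A sketch that exercises the same three commands in a throwaway directory rather than the real ~/.ssh:

```shell
# Sketch: generate a key, authorize it, lock down permissions.
# $sshdir stands in for ~/.ssh so this never touches real keys.
sshdir=$(mktemp -d)
[ -f "$sshdir/id_rsa" ] || ssh-keygen -t rsa -P '' -f "$sshdir/id_rsa" -q
grep -qxF "$(cat "$sshdir/id_rsa.pub")" "$sshdir/authorized_keys" 2>/dev/null \
  || cat "$sshdir/id_rsa.pub" >> "$sshdir/authorized_keys"
chmod 700 "$sshdir"
chmod 600 "$sshdir/authorized_keys"   # sshd rejects group/world-writable key files
perms=$(stat -c %a "$sshdir/authorized_keys" 2>/dev/null \
  || stat -f %Lp "$sshdir/authorized_keys")
rm -rf "$sshdir"   # cleanup; drop this and the mktemp when using the real ~/.ssh
```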
7. Format the NameNode
[hadoop@hadoop003 hadoop]$ bin/hdfs namenode -format
8. Start the HDFS service
[hadoop@hadoop003 sbin]$ ./start-dfs.sh
Starting namenodes on [localhost]
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is 9a:ea:f5:06:bf:de:ca:82:66:51:81:fe:bf:8a:62:36.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
localhost: Error: JAVA_HOME is not set and could not be found.
localhost: Error: JAVA_HOME is not set and could not be found.
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is 9a:ea:f5:06:bf:de:ca:82:66:51:81:fe:bf:8a:62:36.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: Error: JAVA_HOME is not set and could not be found.
[hadoop@hadoop003 sbin]$ ps -ef|grep hadoop
root 11292 11085 0 21:59 pts/1 00:00:00 su - hadoop
hadoop 11293 11292 0 21:59 pts/1 00:00:00 -bash
hadoop 11822 11293 0 22:34 pts/1 00:00:00 ps -ef
hadoop 11823 11293 0 22:34 pts/1 00:00:00 grep hadoop
[hadoop@hadoop003 sbin]$ echo $JAVA_HOME
/usr/java/jdk1.8.0_45
JAVA_HOME is set in the shell, yet the HDFS service fails to start.
Fix: set JAVA_HOME in etc/hadoop/hadoop-env.sh
[hadoop@hadoop003 sbin]$ vi ../etc/hadoop/hadoop-env.sh
Under the line "The java implementation to use." add:
export JAVA_HOME=/usr/java/jdk1.8.0_45
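The root cause: start-dfs.sh launches each daemon over ssh, and a non-interactive shell does not source the login profile where JAVA_HOME was exported, so hardcoding it in hadoop-env.sh sidesteps that. The difference can be simulated locally (JDK path as in this document):

```shell
# Sketch: a variable exported in the calling shell is visible to a child shell,
# but a clean environment (like a non-interactive ssh session) loses it.
export JAVA_HOME=/usr/java/jdk1.8.0_45
with_env=$(/bin/sh -c 'echo "$JAVA_HOME"')          # inherited: set
without_env=$(env -i /bin/sh -c 'echo "$JAVA_HOME"') # clean env: empty
```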
[hadoop@hadoop003 sbin]$ ./start-dfs.sh
Starting namenodes on [localhost]
localhost: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-hadoop003.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-hadoop003.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-hadoop003.out
namenode           : localhost
datanode           : localhost
secondary namenode : 0.0.0.0
Web UI: http://localhost:50070/ (default port 50070)
Note: localhost:9000 is the NameNode RPC port (fs.defaultFS), not a web port
9. Use the commands (hadoop, hdfs)
[hadoop@hadoop003 bin]$ ./hdfs dfs -mkdir /user
[hadoop@hadoop003 bin]$ ./hdfs dfs -mkdir /user/hadoop
[hadoop@hadoop003 bin]$ echo "000000" > lizhigang.log
[hadoop@hadoop003 bin]$ ./hadoop fs -put lizhigang.log hdfs://localhost:9000/
[hadoop@hadoop003 bin]$
[hadoop@hadoop003 bin]$ ./hadoop fs -ls hdfs://localhost:9000/
Found 2 items
-rw-r--r-- 1 hadoop supergroup 7 2017-12-13 22:56 hdfs://localhost:9000/rz.log
drwxr-xr-x - hadoop supergroup 0 2017-12-13 22:55 hdfs://localhost:9000/user
[hadoop@hadoop003 bin]$ ./hadoop fs -ls /
Found 2 items
-rw-r--r-- 1 hadoop supergroup 7 2017-12-13 22:56 hdfs://localhost:9000/rz.log
drwxr-xr-x - hadoop supergroup 0 2017-12-13 22:55 hdfs://localhost:9000/user
10. Change hdfs://localhost:9000 to hdfs://192.168.137.200:9000
[hadoop@hadoop003 bin]$ ../sbin/stop-dfs.sh    # stop the service first
[hadoop@hadoop003 bin]$ vi ../etc/hadoop/core-site.xml
Change to:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.137.200:9000</value>
</property>
</configuration>
[hadoop@hadoop003 bin]$ ./hdfs namenode -format
[hadoop@hadoop003 bin]$ ../sbin/start-dfs.sh
Starting namenodes on [hadoop003]
hadoop003: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-hadoop003.out
localhost: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-hadoop003.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-hadoop003.out
[hadoop@hadoop003 bin]$ netstat -nlp|grep 9000
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 192.168.137.200:9000 0.0.0.0:* LISTEN
14974/java
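The netstat check can be turned into a scripted pass/fail test. A sketch that parses a sample line (hardcoded here so it runs anywhere; on the server, feed it `netstat -nlp | grep 9000` instead):

```shell
# Sketch: confirm the NameNode RPC port is bound to the expected address.
line='tcp        0      0 192.168.137.200:9000        0.0.0.0:*                   LISTEN      14974/java'
bind_addr=$(printf '%s\n' "$line" | awk '{print $4}')   # local address:port
state=$(printf '%s\n' "$line" | awk '{print $6}')       # socket state
echo "bound on $bind_addr ($state)"
```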
11. Make the HDFS daemons start on hadoop003
For the datanode, change slaves:
[hadoop@hadoop003 hadoop]$ vi slaves
Change to:
hadoop003
For the secondarynamenode, edit hdfs-site.xml:
[hadoop@hadoop003 hadoop]$ vi hdfs-site.xml
Add:
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop003:50090</value>
</property>
<property>
<name>dfs.namenode.secondary.https-address</name>
<value>hadoop003:50091</value>
</property>
[hadoop@hadoop003 hadoop]$ cd ../../sbin
[hadoop@hadoop003 sbin]$ ./stop-dfs.sh
[hadoop@hadoop003 sbin]$ ./start-dfs.sh
Starting namenodes on [hadoop003]
hadoop003: starting namenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-namenode-hadoop003.out
hadoop003: starting datanode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-datanode-hadoop003.out
Starting secondary namenodes [hadoop003]
hadoop003: starting secondarynamenode, logging to /opt/software/hadoop-2.8.1/logs/hadoop-hadoop-secondarynamenode-hadoop003.out
12. Deploy Yarn
Go to the directory /opt/software/hadoop/etc/hadoop
[hadoop@hadoop003 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@hadoop003 hadoop]$ vi mapred-site.xml
Add:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
[hadoop@hadoop003 hadoop]$ vi yarn-site.xml
Add:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Fix ownership of mapred-site.xml:
[hadoop@hadoop003 hadoop]$ chown hadoop:hadoop *
Start the service:
[hadoop@hadoop003 hadoop]$ sbin/start-yarn.sh
Check the web UI:
http://192.168.137.200:8088/
http://localhost:8088/
13. MR Job test
Locate the MapReduce examples jar:
[hadoop@hadoop003 hadoop]$ find ./ -name "*examples*"
Run:
[hadoop@hadoop003 hadoop]$ bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.1.jar pi 5 10
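The example job estimates π by sampling points in the unit square and counting how many fall inside the quarter circle (here, 5 map tasks with 10 samples each). The same idea, sketched locally in awk with a deterministic grid in place of Hadoop's sampler (no cluster needed):

```shell
# Sketch: midpoint-grid estimate of pi = 4 * (points with x^2+y^2 <= 1) / total.
pi_est=$(awk 'BEGIN {
  nx = 1000; ny = 100; inside = 0
  for (i = 0; i < nx; i++)
    for (j = 0; j < ny; j++) {
      x = (i + 0.5) / nx; y = (j + 0.5) / ny
      if (x*x + y*y <= 1) inside++
    }
  printf "%.3f", 4 * inside / (nx * ny)
}')
echo "pi is approximately $pi_est"
```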
Note: if HDFS has been formatted multiple times, the job may fail with: 17/12/19 00:19:51 WARN hdfs.DataStreamer: DataStreamer Exception
See https://blog.csdn.net/yu0_zhang0/article/details/78841623 for the fix
Stop Yarn:
[hadoop@hadoop003 sbin]$ ./stop-yarn.sh