Setting up Hadoop on Ubuntu

Environment

Servers (virtual machines):

  • vm-master 10.211.55.23
  • vm-slave1 10.211.55.25
  • vm-slave2 10.211.55.24

Software:

  • Hadoop 2.7
  • JDK 1.8
  • Ubuntu 14.04

Step 1: Create the account and grant privileges

As root, create a hadoop user and set its password to 111111:

adduser hadoop
Enter password: 111111
Confirm password: 111111
Press Enter to accept the defaults for the remaining prompts...

As root, grant the hadoop user sudo privileges:

vim /etc/sudoers
Below the line "root ALL=(ALL:ALL) ALL", add:
hadoop  ALL=(ALL:ALL) ALL

The result should look like:
# User privilege specification
root    ALL=(ALL:ALL) ALL
hadoop  ALL=(ALL:ALL) ALL

Note: /etc/sudoers is read-only, so save and quit with :wq!

Step 2: Edit the hosts file

vim /etc/hosts
10.211.55.23  vm-master
10.211.55.25  vm-slave1
10.211.55.24  vm-slave2

Note: use ifconfig to look up each machine's IP address.
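The entries above can also be added with a small idempotent helper, so repeating the setup does not duplicate lines. This is a sketch; add_host is a hypothetical name, and it edits a copy so the real file is only touched with sudo at the end:

```shell
# Hypothetical helper: append an IP/hostname pair only if the name
# is not already mapped in the given hosts file.
add_host() {
  local ip="$1" name="$2" file="$3"
  grep -q "[[:space:]]${name}\$" "$file" || printf '%s  %s\n' "$ip" "$name" >> "$file"
}

# Work on a copy first, then move it into place with sudo on the real box.
cp /etc/hosts hosts.new
add_host 10.211.55.23 vm-master hosts.new
add_host 10.211.55.25 vm-slave1 hosts.new
add_host 10.211.55.24 vm-slave2 hosts.new
# sudo cp hosts.new /etc/hosts
```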

Step 3: Install SSH and set up passwordless login

If the SSH server is not installed yet, run: apt-get install ssh

As the hadoop user, generate a public/private key pair:

su hadoop

cd ~

ssh-keygen -t rsa -P ""

Press Enter at every prompt...

This generates id_rsa (private key) and id_rsa.pub (public key) in /home/hadoop/.ssh.

Append the public key to authorized_keys (this file holds the public keys of users allowed to log in over SSH as the current user):

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Restrict the file's permissions:
chmod 600 ~/.ssh/authorized_keys

Edit the SSH daemon configuration:
sudo vim /etc/ssh/sshd_config, and uncomment the following lines:

RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile  .ssh/authorized_keys

Restart the SSH service:
sudo service ssh restart

As the hadoop user, test passwordless login to localhost:
ssh hadoop@localhost (answer yes to the host-key confirmation prompt)
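If key-based login still prompts for a password, overly loose permissions are a common cause, since sshd refuses key files that others can read. A quick check, sketched with a hypothetical helper name (assumes GNU coreutils stat):

```shell
# Hypothetical helper: warn if a key file's permissions are looser than 600.
check_key_perms() {
  local f="$1" mode
  if [ ! -e "$f" ]; then
    echo "missing: $f"
    return 0
  fi
  mode=$(stat -c '%a' "$f")
  if [ "$mode" = "600" ]; then
    echo "ok: $f is 600"
  else
    echo "bad: $f is $mode (run: chmod 600 $f)"
  fi
}

check_key_perms ~/.ssh/authorized_keys
```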

Step 4: Install the JDK

Update the package index:
sudo apt-get update

Install the helper package:
sudo apt-get install software-properties-common

Add the PPA:
sudo add-apt-repository ppa:webupd8team/java

Update again:
sudo apt-get update

Install the JDK:
sudo apt-get install oracle-java8-installer

Verify the installation:
java -version

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

Note: if this is too slow, you can instead download the JDK from http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html and install it manually.

Step 5: Set up vm-slave1 and vm-slave2

Repeat the steps above on both slave machines.

Step 6: Allow vm-master to log in to vm-slave1 and vm-slave2 without a password

On vm-master:

su hadoop

ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@vm-slave1

ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@vm-slave2

Verify the setup:
On vm-master, switch to the hadoop user and run
ssh hadoop@vm-slave1 to confirm that no password is required.

Step 7: Download and install Hadoop

wget http://statics.charlesjiang.com/hadoop-2.7.3.tar.gz

tar zxvf hadoop-2.7.3.tar.gz

sudo mv hadoop-2.7.3 /usr/local/hadoop

Change the owner of the hadoop directory to the hadoop user:
sudo chown -R hadoop:hadoop /usr/local/hadoop

Step 8: Configure Hadoop

Files to edit:

/usr/local/hadoop/etc/hadoop/slaves
/usr/local/hadoop/etc/hadoop/core-site.xml
/usr/local/hadoop/etc/hadoop/hdfs-site.xml
/usr/local/hadoop/etc/hadoop/mapred-site.xml
/usr/local/hadoop/etc/hadoop/yarn-site.xml
/usr/local/hadoop/etc/hadoop/hadoop-env.sh
  • Edit core-site.xml
<configuration>
        <property>
                <name>fs.default.name</name>
                <value>hdfs://vm-master:9000</value>
        </property>
</configuration>
Note: the value here must not be localhost.
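Note also that fs.default.name is the deprecated name for this key; on Hadoop 2.x the preferred key is fs.defaultFS, so the equivalent setting would be:

```xml
<property>
        <name>fs.defaultFS</name>
        <value>hdfs://vm-master:9000</value>
</property>
```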
  • Edit mapred-site.xml
cp mapred-site.xml.template  ./mapred-site.xml

vim mapred-site.xml

<configuration>
        <property>
                <name>mapred.job.tracker</name>
                <value>hdfs://vm-master:9001</value>
        </property>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
</configuration>
  • Edit hdfs-site.xml
<configuration>
        <property>
                <name>dfs.name.dir</name>
                <value>/usr/local/hadoop/namenode</value>
        </property>
        <property>
                <name>dfs.data.dir</name>
                <value>/usr/local/hadoop/datanode</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
</configuration>

  • Edit yarn-site.xml
<configuration>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>vm-master:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>vm-master:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>vm-master:8088</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>vm-master:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>vm-master:8033</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
</configuration>

  • Edit slaves
vm-master
vm-slave1
vm-slave2
  • Edit hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

Note: use the absolute path to your JDK here.

Step 9: Configure the hadoop user's environment variables

su hadoop
vim /home/hadoop/.bash_profile

Add the following:
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export SCALA_HOME=/usr/local/scala
export SPARK_HOME=/usr/local/spark

JAVA_HOME=/usr/lib/jvm/java-8-oracle
JRE_HOME=/usr/lib/jvm/java-8-oracle/jre
CLASSPATH=.:$JAVA_HOME/lib/tools.jar

PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin:$PATH

export JAVA_HOME CLASSPATH PATH USER LOGNAME MAIL HOSTNAME

Apply the same configuration on vm-slave1 and vm-slave2.

Load the configuration:
source /home/hadoop/.bash_profile
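After sourcing the profile, a quick sanity check catches typos before anything is started. A minimal sketch using bash indirect expansion; check_env is a hypothetical helper name:

```shell
# Hypothetical helper: report any of the named variables that are unset
# or empty, so a typo in .bash_profile is caught before starting Hadoop.
check_env() {
  local missing=0 v
  for v in "$@"; do
    if [ -z "${!v}" ]; then
      echo "unset: $v"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "environment OK"
  fi
}

check_env HADOOP_HOME HADOOP_CONF_DIR JAVA_HOME
```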

Step 10: Copy Hadoop from vm-master to vm-slave1 and vm-slave2

scp -r /usr/local/hadoop/ hadoop@vm-slave1:/home/hadoop

scp -r /usr/local/hadoop/ hadoop@vm-slave2:/home/hadoop

On each of vm-slave1 and vm-slave2, move the hadoop directory to /usr/local/hadoop:

sudo mv hadoop /usr/local/hadoop

Then, still on vm-slave1 and vm-slave2, change the owner of /usr/local/hadoop to the hadoop user:

sudo chown -R hadoop:hadoop  /usr/local/hadoop/

Step 11: Format HDFS

On vm-master:

cd /usr/local/hadoop

./bin/hdfs namenode -format


Output similar to the following means the format succeeded:

17/03/29 15:14:36 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
17/03/29 15:14:36 INFO util.ExitUtil: Exiting with status 0
17/03/29 15:14:36 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at vm-master.localdomain/127.0.1.1
************************************************************/

Step12:?jiǎn)?dòng)Hadoop

在vm-master上操作

sbin/start-dfs.sh
啟動(dòng)后,輸入jps

類似如下信息則成功:
2403 DataNode
3188 Jps
3079 SecondaryNameNode
2269 NameNode

Note: answer yes to any host-key confirmation prompts.
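The jps check can be scripted. check_daemons below is a hypothetical helper that reads jps output on stdin and confirms that every required daemon is present:

```shell
# Hypothetical helper: read `jps` output on stdin and confirm that each
# daemon named in the arguments appears in it.
check_daemons() {
  local out d
  out=$(cat)
  for d in "$@"; do
    if ! printf '%s\n' "$out" | grep -qw "$d"; then
      echo "missing: $d"
      return 1
    fi
  done
  echo "all daemons running"
}

# On vm-master, after start-dfs.sh:
# jps | check_daemons NameNode DataNode SecondaryNameNode
```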

Step12:?jiǎn)?dòng)Yarn

在vm-master上操作

sbin/start-yarn.sh 
啟動(dòng)后,輸入jps

類似如下信息則成功:
3667 Jps
2403 DataNode
3237 ResourceManager
3079 SecondaryNameNode
2269 NameNode
3391 NodeManager

Then run jps on vm-slave1 or vm-slave2.
It should show something like:

2777 Jps
2505 DataNode
2654 NodeManager

Step 14: Verify the installation

  1. Check the cluster status:
bin/hdfs dfsadmin -report

Example output:

Configured Capacity: 198576648192 (184.94 GB)
Present Capacity: 180282531840 (167.90 GB)
DFS Remaining: 180282449920 (167.90 GB)
DFS Used: 81920 (80 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: 10.211.55.24:50010 (vm-slave2)
Hostname: vm-master
Decommission Status : Normal
Configured Capacity: 66192216064 (61.65 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 6213672960 (5.79 GB)
DFS Remaining: 59978518528 (55.86 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.61%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Mar 29 16:38:34 CST 2017


Name: 10.211.55.25:50010 (vm-slave1)
Hostname: vm-master
Decommission Status : Normal
Configured Capacity: 66192216064 (61.65 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 6213672960 (5.79 GB)
DFS Remaining: 59978518528 (55.86 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.61%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Mar 29 16:38:34 CST 2017


Name: 10.211.55.23:50010 (vm-master)
Hostname: vm-master
Decommission Status : Normal
Configured Capacity: 66192216064 (61.65 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 5866770432 (5.46 GB)
DFS Remaining: 60325412864 (56.18 GB)
DFS Used%: 0.00%
DFS Remaining%: 91.14%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Mar 29 16:38:34 CST 2017
  2. Open the HDFS web UI (by default at http://vm-master:50070/).
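The report above can also be checked mechanically, e.g. from a monitoring script. This is a sketch assuming the 2.7.x report format, where each live node's section begins with a "Name:" line:

```shell
# Hypothetical helper: count live DataNodes in `hdfs dfsadmin -report`
# output read from stdin (each live node's section starts with "Name:").
count_live_datanodes() {
  grep -c '^Name: '
}

# On vm-master:
# live=$(bin/hdfs dfsadmin -report | count_live_datanodes)
# [ "$live" -eq 3 ] || echo "expected 3 datanodes, found $live"
```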

Common problems

  1. Garbled text (mojibake) in terminal output.

Solution:

sudo vim /etc/environment
Add:
LANG="en_US.UTF-8"
LANGUAGE="en_US:en"

sudo vim /var/lib/locales/supported.d/local
Add:
en_US.UTF-8 UTF-8

sudo vim /etc/default/locale
Change to:
LANG="en_US.UTF-8"
LANGUAGE="en_US:en"

Reboot:
sudo reboot
  2. Network interface not working after cloning a virtual machine.
    Solution:
vim /etc/udev/rules.d/70-persistent-net.rules

Delete the line containing eth0, and rename eth1 to eth0.

Then reboot.

  3. JAVA_HOME environment variable not found:

vm-slave2: Error: JAVA_HOME is not set and could not be found.
vm-slave1: Error: JAVA_HOME is not set and could not be found.

Solution:

On every server, set export JAVA_HOME= in hadoop-env.sh to the absolute path of the JDK.
  4. Cannot open http://vm-master:8088/
Check the hosts file:

127.0.0.1       localhost
#127.0.1.1      vm-master.localdomain   vm-master  [comment this line out]

# The following lines are desirable for IPv6 capable hosts
::1     localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

10.211.55.23  vm-master
10.211.55.25  vm-slave1
10.211.55.24  vm-slave2
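The edit can be scripted with GNU sed. fix_hosts is a hypothetical helper, written against a file argument so it can be tried on a copy before touching /etc/hosts with sudo:

```shell
# Hypothetical helper: comment out the 127.0.1.1 mapping that shadows
# the real vm-master address (GNU sed, in-place edit).
fix_hosts() {
  sed -i 's/^127\.0\.1\.1/#127.0.1.1/' "$1"
}

# On the real machine, back up first:
# sudo cp /etc/hosts /etc/hosts.bak && sudo sed -i 's/^127\.0\.1\.1/#127.0.1.1/' /etc/hosts
```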

Stop YARN:   sbin/stop-yarn.sh
Stop HDFS:   sbin/stop-dfs.sh
Start HDFS:  sbin/start-dfs.sh
Start YARN:  sbin/start-yarn.sh

Blog: http://www.charlesjiang.com/archives/45.html

