1. Overview
In the previous article we compiled the Hadoop 3.1.1 source code. Here we deploy that build as the target cluster for my later remote-debugging articles. I will walk through the important deployment steps, hoping they help newcomers; readers already familiar with Hadoop can skip this.
Cluster nodes:

IP              Hostname              Role
192.168.0.101   master.hadoop.ljs     master node
192.168.0.102   worker1.hadoop.ljs    worker1 node
192.168.0.103   worker2.hadoop.ljs    worker2 node
Software versions:
Apache Hadoop 3.1.1
JDK 1.8
CentOS 7.2
2. Installation and Deployment
1. Cluster initialization. Passwordless SSH, disabling the firewall, JDK installation, and so on are covered in detail in my earlier article "Spark2.x入門:叢集(Standalone)安裝、配置、啟動指令碼詳解", so they are not repeated here.
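As a quick reminder, the initialization boils down to something like the sketch below (hostnames taken from the node table above; run as root, and adjust to your environment):

```shell
# Run on every node: stop the firewall now and keep it off after reboot,
# so the cluster ports (8020, 50070, 19888, ...) are reachable
systemctl stop firewalld
systemctl disable firewalld

# On the master node: generate an SSH key pair once, then push the
# public key to every node (including master itself) for passwordless SSH
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
for host in master.hadoop.ljs worker1.hadoop.ljs worker2.hadoop.ljs; do
  ssh-copy-id root@"$host"
done
```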
2. Modify the configuration files. Configure them on the master node, then simply copy them to the other two worker nodes:
1) Edit hadoop-env.sh and add the following. I installed as the root user; if you installed as another user, set these variables to that user instead:
export JAVA_HOME=/opt/jdk1.8.0_112

# Location of Hadoop. By default, Hadoop will attempt to determine
# this location based upon its execution path.
# export HADOOP_HOME=

export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
2) Edit hdfs-site.xml; the file content is as follows:
<configuration>
  <!-- NameNode metadata directory -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/data/app/dataDir/dfs/name</value>
    <description>Path on the local filesystem where the NameNode stores the namespace and transaction logs persistently.</description>
  </property>
  <!-- DataNode data directory, i.e. where your data blocks live -->
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/app/dataDir/dfs/data</value>
    <description>Comma-separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
  </property>
  <!-- NameNode web UI, conventionally port 50070 -->
  <property>
    <name>dfs.namenode.http-address</name>
    <value>master.hadoop.ljs:50070</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master.hadoop.ljs:50090</value>
  </property>
  <!-- three replicas -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- disable file permission checks -->
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
    <description>need not permissions</description>
  </property>
</configuration>
3) Edit core-site.xml; the file content is as follows:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master.hadoop.ljs:8020</value>
  </property>
  <!-- temporary file path -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/app/dataDir/tmp</value>
  </property>
</configuration>
4) Edit yarn-site.xml. To make logs easy to inspect, log aggregation is enabled here, and the memory and vcores allocated to each NodeManager are configured. The file content is as follows:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master.hadoop.ljs</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>2592000</value>
  </property>
  <property>
    <name>yarn.log.server.url</name>
    <value>http://master.hadoop.ljs:19888/jobhistory/logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/data/app/dataDir/yarn/local</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/data/app/dataDir/yarn/log</value>
  </property>
  <property>
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>604800</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
    <value>logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
  </property>
  <property>
    <name>yarn.nodemanager.localizer.cache.target-size-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.nodemanager.localizer.cache.cleanup.interval-ms</name>
    <value>60000</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
</configuration>
5) Edit mapred-site.xml. yarn-site.xml above already points at the history server, so the addresses here must match it:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master.hadoop.ljs:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master.hadoop.ljs:19888</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>
  </property>
</configuration>
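One pitfall worth mentioning: on Hadoop 3, MapReduce jobs can fail to launch the MRAppMaster unless the MapReduce classpath is made visible to YARN. A commonly used fix (assuming your install lives at /data/app/hadoop-3.1.1, as in this article) is to also add HADOOP_MAPRED_HOME environment settings to mapred-site.xml:

<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/data/app/hadoop-3.1.1</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/data/app/hadoop-3.1.1</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/data/app/hadoop-3.1.1</value>
</property>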
6) Edit the workers file. Since three replicas were configured above, you need at least three DataNodes; if you configured a single replica, one or more DataNodes is enough. The file content is as follows:
[root@master hadoop]# cat workers
master.hadoop.ljs
worker1.hadoop.ljs
worker2.hadoop.ljs
3. With the configuration files done, copy the installation to the worker1 and worker2 nodes:
[root@master hadoop]# scp -r /data/app/hadoop-3.1.1 worker1:/data/app/
[root@master hadoop]# scp -r /data/app/hadoop-3.1.1 worker2:/data/app/
4. For convenience, add the following to /etc/profile:
export HADOOP_HOME=/data/app/hadoop-3.1.1
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Then run the following to apply the change:

source /etc/profile
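Before the very first start of a brand-new cluster, the NameNode metadata directory must be initialized, otherwise the NameNode will not come up. On the master node:

```shell
# One-time initialization of the NameNode metadata directory
# (only on a new cluster -- this wipes any existing HDFS metadata!)
hdfs namenode -format
```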
5. Start the cluster. The commonly used commands are listed below. (Note that in Hadoop 3 the per-daemon scripts hadoop-daemon.sh and yarn-daemon.sh still work but print deprecation warnings; the replacements are `hdfs --daemon start|stop ...` and `yarn --daemon start|stop ...`.)
1) Start the whole cluster; run on the namenode node:
/data/app/hadoop-3.1.1/sbin/start-all.sh
2) Stop the whole cluster; run on the namenode node:
/data/app/hadoop-3.1.1/sbin/stop-all.sh
3) Start/stop the namenode alone; run only on the namenode node:
/data/app/hadoop-3.1.1/sbin/hadoop-daemon.sh start namenode
/data/app/hadoop-3.1.1/sbin/hadoop-daemon.sh stop namenode
4) Start/stop a single datanode; run on each datanode node:
/data/app/hadoop-3.1.1/sbin/hadoop-daemon.sh start datanode
/data/app/hadoop-3.1.1/sbin/hadoop-daemon.sh stop datanode
5) Start/stop all datanodes; run on the namenode node:
/data/app/hadoop-3.1.1/sbin/hadoop-daemons.sh start datanode
/data/app/hadoop-3.1.1/sbin/hadoop-daemons.sh stop datanode
6) Start/stop the whole YARN service; run on the namenode node:
/data/app/hadoop-3.1.1/sbin/start-yarn.sh
/data/app/hadoop-3.1.1/sbin/stop-yarn.sh
7) Start/stop the YARN resourcemanager; run on the namenode node:
/data/app/hadoop-3.1.1/sbin/yarn-daemon.sh start resourcemanager
/data/app/hadoop-3.1.1/sbin/yarn-daemon.sh stop resourcemanager
8) Start/stop a single YARN nodemanager; run on each nodemanager node:
/data/app/hadoop-3.1.1/sbin/yarn-daemon.sh start nodemanager
/data/app/hadoop-3.1.1/sbin/yarn-daemon.sh stop nodemanager
9) Start/stop all YARN nodemanagers; run on the namenode node:
/data/app/hadoop-3.1.1/sbin/yarn-daemons.sh start nodemanager
/data/app/hadoop-3.1.1/sbin/yarn-daemons.sh stop nodemanager
10) Start/stop the historyserver:
/data/app/hadoop-3.1.1/sbin/mr-jobhistory-daemon.sh start historyserver
/data/app/hadoop-3.1.1/sbin/mr-jobhistory-daemon.sh stop historyserver
6. Once the cluster is up, open master.hadoop.ljs:50070 (the NameNode web UI address configured in hdfs-site.xml above) to verify it.
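Beyond the web UI, a few command-line checks can confirm the cluster is healthy (a quick sketch; all of these commands ship with Hadoop, and the expected daemons assume the three-node layout from this article):

```shell
# List the running Java daemons on each node; expect NameNode, SecondaryNameNode
# and ResourceManager on master, DataNode and NodeManager on the workers
jps

# Summarize HDFS capacity and the live DataNodes (expect 3 here)
hdfs dfsadmin -report

# List the NodeManagers registered with the ResourceManager
yarn node -list

# Smoke-test HDFS writes and reads
hdfs dfs -mkdir -p /tmp/smoke
hdfs dfs -put /etc/hosts /tmp/smoke/
hdfs dfs -cat /tmp/smoke/hosts
```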