Big Data Hadoop Series: Hadoop Distributed Cluster Deployment

分手后的思念是犯贱 · 2022-05-25 09:27

# **I. Deployment Plan** #

## **1. Deployment Environment** ##

* Software deployed on each node:

![Per-node software deployment](/images/20220525/142e32b8bf9b4aa885f17aa8c3a940fb.png)

--------------------

# **II. Environment Preparation** #

## **1. Set the Hostnames** ##

```shell
[root@VM1 ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=master60

[root@VM2 ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave61

[root@VM3 ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave62

[root@VM4 ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=slave63
```

--------------------

## **2. Configure the Host Mappings** ##

* On all hosts:

```shell
# vim /etc/hosts
192.168.9.60 master60
192.168.9.61 slave61
192.168.9.62 slave62
192.168.9.63 slave63
```

--------------------

## **3. Disable the Firewall** ##

* On all hosts:

```shell
# service iptables stop && chkconfig iptables off && chkconfig --list | grep iptables
```

--------------------

## **4. Disable SELinux** ##

* On all hosts:

```shell
# vi /etc/selinux/config
SELINUX=disabled
```

--------------------

## **5. Disable IPv6** ##

* On all hosts:

```shell
# echo " " >> /etc/modprobe.d/dist.conf
# echo "alias net-pf-10 off" >> /etc/modprobe.d/dist.conf
# echo "alias ipv6 off" >> /etc/modprobe.d/dist.conf
```

--------------------

## **6. Reboot** ##

* On all hosts:

```shell
# reboot
```

--------------------

## **7. Configure Passwordless SSH Login** ##

* On master60:

### **7.1 Generate the Public/Private Key Pair** ###

```shell
# ssh-keygen
```

### **7.2 Copy the Public Key to the Target Servers** ###

```shell
# ssh-copy-id -i master60
# ssh-copy-id -i slave61
# ssh-copy-id -i slave62
# ssh-copy-id -i slave63
```

--------------------

## **8. Synchronize the Cluster Time** ##

### **8.1 Start the ntpd Service** ###

* On all hosts:

```shell
# service ntpd start && chkconfig ntpd on && chkconfig --list | grep ntpd
```

### **8.2 Update the Time** ###

* On master60:

```shell
# ntpdate -u ntp.sjtu.edu.cn
```

### **8.3 Write the System Time to the Hardware Clock** ###

* On master60:

```shell
# hwclock --localtime
# hwclock --localtime -w
```

### **8.4 Auto-sync the Hardware Clock to the System Time** ###

* On master60:

```shell
# vi /etc/sysconfig/ntpd
SYNC_HWCLOCK=yes
```

### **8.5 Auto-update the Hardware Clock from the System Time** ###

* On master60:

```shell
# vi /etc/sysconfig/ntpdate
SYNC_HWCLOCK=yes
```

### **8.6 Sync the Updated Time to the Other Cluster Hosts** ###

* On slave61, slave62, and slave63, pull the time from master60 (`ntpdate -u <host>` syncs *from* the named host):

```shell
# ntpdate -u master60
```

### **8.7 Restart the ntpd Service** ###

* On all hosts:

```shell
# service ntpd restart && service crond restart
```

--------------------

## **9. Raise the Per-User Open-File and Process Limits** ##

* On all hosts:

```shell
# vi /etc/security/limits.conf
# soft = soft limit (warning only); hard = hard limit (enforced)
*   soft   nofile    32728
*   hard   nofile    1024567
*   soft   nproc     65535
*   hard   nproc     unlimited
*   soft   memlock   unlimited
*   hard   memlock   unlimited
```

--------------------

## **10. Create a Dedicated hadoop User** ##

* On all hosts:

### **10.1 Create the hadoop User and Set Its Password** ###

```shell
# useradd hadoop
# passwd hadoop
```

### **10.2 Create the hadoop Group and Add the hadoop User to It** ###

```shell
# groupadd hadoop
# usermod -a -G hadoop hadoop
```

### **10.3 Create the Working Directories** ###

```shell
# mkdir /apps && cd /apps && mkdir lib logs run sh sharedstorage svr
# chown -R hadoop:hadoop /apps/*
```

--------------------

***#################### All remaining steps are performed as the hadoop user ####################***

# **III. Hadoop Cluster Deployment** #

## **1. Install the JDK** ##

* Download the Linux JDK 1.8 installer from the [Java website](http://www.oracle.com/technetwork/java/javase/overview/index.html) and upload it to the servers;
* This platform uses JDK 8u172: jdk-8u172-linux-x64.tar.gz;
* Download link: [jdk-8u172-linux-x64.tar.gz](http://download.oracle.com/otn-pub/java/jdk/8u172-b11/a58eab1ec242421181065cdc37240b08/jdk-8u172-linux-x64.tar.gz?AuthParam=1526031622_e7d7b8c755b98e2dd9545d5a1b5bf17a).

### **1.1 Create the JDK Directory** ###

* On all hosts:

```shell
$ mkdir -p /apps/svr/java/
```

### **1.2 Upload and Extract the JDK, Then Copy It to the Other Hosts** ###

* On master60:

```shell
$ tar -zxvf ~/jdk-8u172-linux-x64.tar.gz -C /apps/svr/java/
$ scp -r /apps/svr/java/jdk1.8.0_172/ slave61:/apps/svr/java/
$ scp -r /apps/svr/java/jdk1.8.0_172/ slave62:/apps/svr/java/
$ scp -r /apps/svr/java/jdk1.8.0_172/ slave63:/apps/svr/java/
```

### **1.3 Configure the JDK Environment Variables** ###

* On all hosts:

```shell
$ vim ~/.bash_profile
# JAVA_HOME
export JAVA_HOME=/apps/svr/java/jdk1.8.0_172
export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar
export PATH=$PATH:$JAVA_HOME/bin
```

### **1.4 Apply Immediately** ###

* On all hosts:

```shell
$ source ~/.bash_profile
```

--------------------

## **2. Deploy Hadoop** ##

* Download the Hadoop 2.x tarball for Linux from the [Hadoop archive](http://archive.apache.org/dist/hadoop) and upload it to the servers;
* This platform uses hadoop-2.7.3: hadoop-2.7.3.tar.gz;
* Download link: [hadoop-2.7.3.tar.gz](http://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz).

### **2.1 Preparation** ###

* On master60:

**a. Create the Hadoop working directories and extract the tarball**

```shell
$ mkdir -p /apps/svr/hadoop/
$ cd /apps/svr/hadoop/
$ mkdir conf data1 data2 lib logs run
$ mkdir -p /apps/svr/hadoop/data1/dfs/dn
$ mkdir -p /apps/svr/hadoop/data2/dfs/dn
$ mkdir -p /apps/svr/hadoop/data1/dfs/nn
$ mkdir -p /apps/svr/hadoop/data2/dfs/nn
$ tar -zxvf ~/hadoop-2.7.3.tar.gz -C /apps/svr/hadoop/
```

**b. Configure hadoop-env.sh**

```shell
$ vim /apps/svr/hadoop/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/apps/svr/java/jdk1.8.0_172/
```

--------------------

### **2.2 Deploy HDFS** ###

* On master60:

**a. Configure core-site.xml**

```shell
$ vim /apps/svr/hadoop/hadoop-2.7.3/etc/hadoop/core-site.xml
```

```xml
<configuration>
    <!-- Host the NameNode runs on -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master60:9000</value>
    </property>
    <!-- Base directory the Hadoop filesystem depends on -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/apps/svr/hadoop/run/</value>
    </property>
    <!-- Hadoop user proxy (impersonation) settings -->
    <property>
        <name>hadoop.proxyuser.hadoop.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hadoop.groups</name>
        <value>*</value>
    </property>
</configuration>
```

--------------------

**b. Configure hdfs-site.xml**

```shell
$ vim /apps/svr/hadoop/hadoop-2.7.3/etc/hadoop/hdfs-site.xml
```

```xml
<configuration>
    <!-- DataNode data directories -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/apps/svr/hadoop/data1/dfs/dn,/apps/svr/hadoop/data2/dfs/dn</value>
    </property>
    <!-- NameNode metadata directories -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/apps/svr/hadoop/data1/dfs/nn,/apps/svr/hadoop/data2/dfs/nn</value>
    </property>
    <!-- HDFS block replication factor -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Whether HDFS permission checking is enabled -->
    <property>
        <name>dfs.permissions</name>
        <value>true</value>
    </property>
</configuration>
```

--------------------

**c. Configure slaves: the hosts that run DataNodes**

```shell
$ vim /apps/svr/hadoop/hadoop-2.7.3/etc/hadoop/slaves
slave61
slave62
slave63
```

--------------------

### **2.3 Deploy YARN** ###

* On master60:

**a. Configure yarn-site.xml**

```shell
$ vim /apps/svr/hadoop/hadoop-2.7.3/etc/hadoop/yarn-site.xml
```

```xml
<configuration>
    <!-- Address the ResourceManager exposes to clients -->
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master60:8032</value>
    </property>
    <!-- Address the ResourceManager exposes to ApplicationMasters -->
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master60:8030</value>
    </property>
    <!-- Address the ResourceManager exposes to NodeManagers -->
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master60:8031</value>
    </property>
    <!-- Address the ResourceManager exposes to administrators -->
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master60:8033</value>
    </property>
    <!-- ResourceManager web UI address -->
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master60:8038</value>
    </property>
    <!-- MapReduce needs the shuffle auxiliary service -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
```

--------------------

**b. Configure mapred-site.xml**

```shell
$ vim /apps/svr/hadoop/hadoop-2.7.3/etc/hadoop/mapred-site.xml
```

```xml
<configuration>
    <!-- Run MapReduce on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```

--------------------

### **2.4 Copy Hadoop to the Other Hosts** ###

* On master60:

```shell
$ scp -r /apps/svr/hadoop/ slave61:/apps/svr/
$ scp -r /apps/svr/hadoop/ slave62:/apps/svr/
$ scp -r /apps/svr/hadoop/ slave63:/apps/svr/
```

--------------------

### **2.5 Configure the Hadoop Environment Variables** ###

* On all hosts:

```shell
$ vim ~/.bash_profile
# HADOOP_HOME
export HADOOP_HOME=/apps/svr/hadoop/hadoop-2.7.3
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/tools/lib/*
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
```

* Apply immediately:

```shell
$ source ~/.bash_profile
```

--------------------

### **2.6 Start and Test Hadoop** ###

* On master60:

**a. Format the NameNode**

```shell
$ hadoop namenode -format
```

**b. Start the daemons**

```shell
$ start-dfs.sh
$ start-yarn.sh
```

**c. Check the result**

```shell
$ yarn node -list
```

**d. Verify via the web UIs**

HDFS: [http://192.168.9.60:50070](http://192.168.9.60:50070)
YARN: [http://192.168.9.60:8038](http://192.168.9.60:8038)

--------------------

### **2.7 Job History Server** ###

* On slave62 (192.168.9.62):

**a. Configure mapred-site.xml**

```shell
$ cd /apps/svr/hadoop/hadoop-2.7.3/etc/hadoop/
$ vim mapred-site.xml
```

```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>192.168.9.62:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>192.168.9.62:19888</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.staging-dir</name>
        <value>/tmp/hadoop-yarn/staging</value>
    </property>
</configuration>
```

--------------------

**b. Start the history server**

```shell
$ mr-jobhistory-daemon.sh start historyserver
```

**c. Verify via the web UI**

JobHistory: [http://192.168.9.62:19888](http://192.168.9.62:19888)
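The per-slave `scp` and `ssh-copy-id` lines in sections II.7.2, III.1.2, and III.2.4 repeat one command per host. As an optional sketch (not part of the original procedure), a small helper can drive all of them from a single slave list; the `dispatch` function, `{}` placeholder, and `DRY_RUN` switch are hypothetical names, and the host list is taken from the deployment plan above.

```shell
#!/bin/sh
# Hypothetical helper: run one command template against every slave.
# "{}" in the template is replaced by each hostname in SLAVES.
SLAVES="slave61 slave62 slave63"   # from the deployment plan above
DRY_RUN=${DRY_RUN:-1}              # 1 = only print; set to 0 to execute

dispatch() {
    template=$1
    for h in $SLAVES; do
        # substitute the current hostname into the template
        cmd=$(printf '%s' "$template" | sed "s/{}/$h/g")
        if [ "$DRY_RUN" = "1" ]; then
            echo "WOULD RUN: $cmd"
        else
            eval "$cmd"
        fi
    done
}

# Example: the three JDK copies from section III.1.2 become a single call.
dispatch "scp -r /apps/svr/java/jdk1.8.0_172/ {}:/apps/svr/java/"
```

With the default `DRY_RUN=1` the script prints one `WOULD RUN: scp ...` line per slave, which makes it easy to review the expansion before executing anything.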
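The same four-node membership table appears in `/etc/hosts` (section II.2), the `ssh-copy-id` list (II.7.2), and the `slaves` file (III.2.2c). One way to keep them consistent is to derive all of them from a single list. This is a sketch under that assumption; the `NODES` variable and both helper functions are hypothetical, not part of the original steps.

```shell
#!/bin/sh
# Hypothetical single source of truth for cluster membership.
NODES="192.168.9.60 master60
192.168.9.61 slave61
192.168.9.62 slave62
192.168.9.63 slave63"

# Lines to append to /etc/hosts on every node (section II.2).
hosts_entries() { printf '%s\n' "$NODES"; }

# Contents of etc/hadoop/slaves (section III.2.2c): every host but the master.
slaves_file() { printf '%s\n' "$NODES" | awk '$2 != "master60" { print $2 }'; }

hosts_entries
slaves_file
```

Redirecting `hosts_entries` into `/etc/hosts` and `slaves_file` into `etc/hadoop/slaves` keeps both files in lockstep when a node is added or renamed.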
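Beyond `yarn node -list` and the web UIs in section III.2.6, submitting a MapReduce job exercises HDFS and YARN end to end. The examples jar ships inside the hadoop-2.7.3 tarball under `share/hadoop/mapreduce/`. This sketch only prints the submit command by default, since running it needs the live cluster; the `DRY_RUN` switch is a hypothetical convenience, not an original step.

```shell
#!/bin/sh
# Optional end-to-end smoke test: submit the bundled "pi" example to YARN.
HADOOP_HOME=${HADOOP_HOME:-/apps/svr/hadoop/hadoop-2.7.3}   # as set in section III.2.5
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command

# 4 map tasks, 100 samples each; exercises HDFS staging plus the YARN shuffle.
pi_cmd="hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 4 100"
echo "$pi_cmd"
[ "$DRY_RUN" = "1" ] || $pi_cmd
```

A successful run should finish with an estimated value of pi, and the completed job should then appear in the JobHistory UI configured in section III.2.7.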