Setting Up a Hadoop 2.7 Cluster on CentOS 7 from Scratch

太过爱你忘了你带给我的痛 · 2022-09-26 02:18 · 281 views · 0 likes
  • Preface
  • Preparing the Files
  • Fixing Permissions
  • Configuring the System Environment
  • Configuring the Hadoop Cluster
  • Configuring Passwordless Login
  • Starting Hadoop
  • Running the Default Example

Preface

  • Download the software and tools
  1. pscp.exe: used to transfer files from the local machine to the target machines
  2. hadoop-2.7.3.tar.gz: the Hadoop 2.7 distribution
  3. JDK 1.8: the Java runtime environment
  • Prepare four machines with CentOS Minimal installed and the network already configured. (For now only the four IP addresses matter; the host names are set later.)
  1. Machine 1, host name node, IP: 192.168.169.131
  2. Machine 2, host name node1, IP: 192.168.169.133
  3. Machine 3, host name node2, IP: 192.168.169.132
  4. Machine 4, host name node3, IP: 192.168.169.134

Preparing the Files

  1. Add the user group and user

    1. groupadd hadoop
    2. useradd -d /home/hadoop -g hadoop hadoop
  2. Copy the files from the local machine to the target machine

    1. pscp.exe -pw 12345678 hadoop-2.7.3.tar.gz root@192.168.169.131:/usr/local
    2. pscp.exe -pw 12345678 jdk-8u101-linux-x64.tar.gz root@192.168.169.131:/usr/local
  3. Extract the archives and move them into place

    1. tar -zxvf /usr/local/jdk-8u101-linux-x64.tar.gz
    2. mv /usr/local/jdk1.8.0_101 /usr/local/jdk1.8 # rename
    3. tar -zxvf /usr/local/hadoop-2.7.3.tar.gz
    4. mv /usr/local/hadoop-2.7.3 /home/hadoop/hadoop2.7

Fixing Permissions

  1. Change the owner of the directory

    1. chown -R hadoop:hadoop /home/hadoop/hadoop2.7
  2. Grant the group read/write/execute permissions

    1. chmod -R g=rwx /home/hadoop/hadoop2.7
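The effect of the group-permission change can be sanity-checked on a throwaway directory first; this is a sketch that uses a temp directory as a stand-in for /home/hadoop/hadoop2.7 (the chown step is skipped here because it needs root):

```shell
# Create a throwaway directory standing in for /home/hadoop/hadoop2.7
demo_dir=$(mktemp -d)
# Apply the same group-permission change as above
chmod -R g=rwx "$demo_dir"
# Show the resulting mode string, e.g. drwxrwx---
perms=$(stat -c '%A' "$demo_dir")
echo "$perms"
rm -rf "$demo_dir"
```

On the real tree, `stat -c '%A %U:%G' /home/hadoop/hadoop2.7` after the chown and chmod should show group bits rwx and owner hadoop:hadoop.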

Configuring the System Environment

  1. Set the system environment variables

    1. echo 'export JAVA_HOME=/usr/local/jdk1.8' >> /etc/profile
    2. echo 'export JRE_HOME=$JAVA_HOME/jre' >> /etc/profile
    3. echo 'export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar' >> /etc/profile
    4. echo 'export HADOOP_HOME=/home/hadoop/hadoop2.7' >> /etc/profile
    5. echo 'export PATH=$HADOOP_HOME/bin:$PATH' >> /etc/profile
    6. source /etc/profile
  2. Configure the host names

    1. hostname node # name of the current machine
    2. echo NETWORKING=yes >> /etc/sysconfig/network
    3. echo HOSTNAME=node >> /etc/sysconfig/network # name of the current machine, so it survives a reboot
    4. echo '192.168.169.131 node' >> /etc/hosts
    5. echo '192.168.169.133 node1' >> /etc/hosts
    6. echo '192.168.169.132 node2' >> /etc/hosts
    7. echo '192.168.169.134 node3' >> /etc/hosts
  3. Disable the firewall

    1. systemctl stop firewalld.service
    2. systemctl disable firewalld.service
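After sourcing /etc/profile, the variables from step 1 should resolve as sketched below (the paths are the ones used throughout this guide):

```shell
# The same exports written to /etc/profile in step 1
export JAVA_HOME=/usr/local/jdk1.8
export HADOOP_HOME=/home/hadoop/hadoop2.7
export PATH=$HADOOP_HOME/bin:$PATH
# The first PATH entry should now be the Hadoop bin directory,
# so `hadoop`, `hdfs`, etc. resolve from there
echo "hadoop binaries resolved from: ${PATH%%:*}"
```

A quick interactive check on the real machine is `echo $JAVA_HOME` and `which hadoop` after logging back in.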

Configuring the Hadoop Cluster

  1. Edit the configuration files

    1. sed -i 's|\${JAVA_HOME}|/usr/local/jdk1.8|' $HADOOP_HOME/etc/hadoop/hadoop-env.sh
    2. sed -i 's|# export JAVA_HOME=/home/y/libexec/jdk1.6.0/|export JAVA_HOME=/usr/local/jdk1.8|' $HADOOP_HOME/etc/hadoop/yarn-env.sh
    3. sed -i 's|# export JAVA_HOME=/home/y/libexec/jdk1.6.0/|export JAVA_HOME=/usr/local/jdk1.8|' $HADOOP_HOME/etc/hadoop/mapred-env.sh
  2. List the worker host names

    1. echo node1 > $HADOOP_HOME/etc/hadoop/slaves
    2. echo node2 >> $HADOOP_HOME/etc/hadoop/slaves
    3. echo node3 >> $HADOOP_HOME/etc/hadoop/slaves
  3. Copy in the following files, overwriting the existing ones

    • /home/hadoop/hadoop2.7/etc/hadoop/core-site.xml

      <configuration>
          <property>
              <name>fs.defaultFS</name>
              <value>hdfs://node:9000/</value>
              <description>namenode settings</description>
          </property>
          <property>
              <name>hadoop.tmp.dir</name>
              <value>/home/hadoop/tmp/hadoop-${user.name}</value>
              <description>temp folder</description>
          </property>
          <property>
              <name>hadoop.proxyuser.hadoop.hosts</name>
              <value>*</value>
          </property>
          <property>
              <name>hadoop.proxyuser.hadoop.groups</name>
              <value>*</value>
          </property>
      </configuration>
    • /home/hadoop/hadoop2.7/etc/hadoop/hdfs-site.xml

      <configuration>
          <property>
              <name>dfs.namenode.http-address</name>
              <value>node:50070</value>
              <description>fetch NameNode images and edits; note the host name</description>
          </property>
          <property>
              <name>dfs.namenode.secondary.http-address</name>
              <value>node1:50090</value>
              <description>fetch SecondaryNameNode fsimage</description>
          </property>
          <property>
              <name>dfs.replication</name>
              <value>3</value>
              <description>replica count</description>
          </property>
          <property>
              <name>dfs.namenode.name.dir</name>
              <value>file:///home/hadoop/hadoop2.7/hdfs/name</value>
              <description>namenode</description>
          </property>
          <property>
              <name>dfs.datanode.data.dir</name>
              <value>file:///home/hadoop/hadoop2.7/hdfs/data</value>
              <description>DataNode</description>
          </property>
          <property>
              <name>dfs.namenode.checkpoint.dir</name>
              <value>file:///home/hadoop/hadoop2.7/hdfs/namesecondary</value>
              <description>check point</description>
          </property>
          <property>
              <name>dfs.webhdfs.enabled</name>
              <value>true</value>
          </property>
          <property>
              <name>dfs.stream-buffer-size</name>
              <value>131072</value>
              <description>buffer</description>
          </property>
          <property>
              <name>dfs.namenode.checkpoint.period</name>
              <value>3600</value>
              <description>duration</description>
          </property>
      </configuration>
    • /home/hadoop/hadoop2.7/etc/hadoop/mapred-site.xml

      <configuration>
          <property>
              <name>mapreduce.framework.name</name>
              <value>yarn</value>
          </property>
          <property>
              <name>mapreduce.jobtracker.address</name>
              <value>hdfs://node:9001</value>
          </property>
          <property>
              <name>mapreduce.jobhistory.address</name>
              <value>node:10020</value>
              <description>MapReduce JobHistory Server host:port; default port is 10020.</description>
          </property>
          <property>
              <name>mapreduce.jobhistory.webapp.address</name>
              <value>node:19888</value>
              <description>MapReduce JobHistory Server Web UI host:port; default port is 19888.</description>
          </property>
      </configuration>
    • /home/hadoop/hadoop2.7/etc/hadoop/yarn-site.xml

      <configuration>
          <property>
              <name>yarn.resourcemanager.hostname</name>
              <value>node</value>
          </property>
          <property>
              <name>yarn.nodemanager.aux-services</name>
              <value>mapreduce_shuffle</value>
          </property>
          <property>
              <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
              <value>org.apache.hadoop.mapred.ShuffleHandler</value>
          </property>
          <property>
              <name>yarn.resourcemanager.address</name>
              <value>node:8032</value>
          </property>
          <property>
              <name>yarn.resourcemanager.scheduler.address</name>
              <value>node:8030</value>
          </property>
          <property>
              <name>yarn.resourcemanager.resource-tracker.address</name>
              <value>node:8031</value>
          </property>
          <property>
              <name>yarn.resourcemanager.admin.address</name>
              <value>node:8033</value>
          </property>
          <property>
              <name>yarn.resourcemanager.webapp.address</name>
              <value>node:8088</value>
          </property>
      </configuration>
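The sed edits in step 1 can be tried out without touching the real files. The sketch below reproduces the hadoop-env.sh substitution on a throwaway file, assuming the line shipped with Hadoop 2.7 is `export JAVA_HOME=${JAVA_HOME}`:

```shell
# Throwaway stand-in for $HADOOP_HOME/etc/hadoop/hadoop-env.sh
tmp=$(mktemp)
echo 'export JAVA_HOME=${JAVA_HOME}' > "$tmp"   # the line as shipped
# The same substitution as in step 1 (| as delimiter avoids escaping slashes)
sed -i 's|\${JAVA_HOME}|/usr/local/jdk1.8|' "$tmp"
result=$(cat "$tmp")
echo "$result"
rm -f "$tmp"
```

After running the real commands, `grep JAVA_HOME $HADOOP_HOME/etc/hadoop/*-env.sh` should show the jdk1.8 path in all three files.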

Configuring Passwordless Login

  1. On all hosts, create the directory and set its permissions

    1. mkdir /home/hadoop/.ssh
    2. chmod 700 /home/hadoop/.ssh
  2. Generate the RSA key pair on the node host

    1. ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  3. Create and distribute the authorized_keys file

    1. cp /home/hadoop/.ssh/id_rsa.pub /home/hadoop/.ssh/authorized_keys
    2. scp /home/hadoop/.ssh/authorized_keys node1:/home/hadoop/.ssh
    3. scp /home/hadoop/.ssh/authorized_keys node2:/home/hadoop/.ssh
    4. scp /home/hadoop/.ssh/authorized_keys node3:/home/hadoop/.ssh
  4. On all hosts, fix the owner and permissions

    1. chmod 600 .ssh/authorized_keys
    2. chown -R hadoop:hadoop .ssh
  5. Edit the sshd configuration file

    1. In /etc/ssh/sshd_config, make sure this line is present and uncommented:
    2. AuthorizedKeysFile .ssh/authorized_keys
  6. Restart sshd

    1. service sshd restart

    Note: a password is still required the first time you connect.
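Once the keys are distributed, key-based login can be checked non-interactively; this sketch (run as the hadoop user on node) uses BatchMode so ssh fails instead of prompting when a host still requires a password:

```shell
# Report whether key-based login works for each given host.
check_key_login() {
  for h in "$@"; do
    # BatchMode=yes: never prompt; fail if a password would be needed
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
      echo "$h: key login ok"
    else
      echo "$h: no key login yet"
    fi
  done
}
check_key_login node1 node2 node3
```

If a host reports "no key login yet", recheck the .ssh directory mode (700), the authorized_keys mode (600), and their owner on that host.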

Starting Hadoop

  1. Log in to the node host and switch to the hadoop account

    1. su hadoop
  2. Format the namenode

    1. /home/hadoop/hadoop2.7/bin/hdfs namenode -format
  3. Start HDFS

    1. /home/hadoop/hadoop2.7/sbin/start-dfs.sh
  4. Verify the HDFS status
    (screenshot omitted)
  5. Start YARN

    1. /home/hadoop/hadoop2.7/sbin/start-yarn.sh
  6. Verify the YARN status
    (screenshot omitted)
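As a reference for the two verification steps, this sketch lists which daemons `jps` (shipped with the JDK) should report on each host, given the configuration files above; the SecondaryNameNode placement on node1 follows hdfs-site.xml:

```shell
# Expected Hadoop daemons per host, per this guide's configuration.
expected_daemons() {
  case "$1" in
    node)  echo "NameNode ResourceManager" ;;
    node1) echo "DataNode NodeManager SecondaryNameNode" ;;  # secondary NN per hdfs-site.xml
    *)     echo "DataNode NodeManager" ;;                    # node2, node3
  esac
}
expected_daemons node
```

On each machine, run `jps` as the hadoop user and compare; the HDFS web UI at http://node:50070 and the YARN UI at http://node:8088 should also show three live nodes.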

Running the Default Example

  1. Create the directories

    1. /home/hadoop/hadoop2.7/bin/hadoop fs -mkdir -p /data/wordcount
    2. /home/hadoop/hadoop2.7/bin/hadoop fs -mkdir -p /output/
  2. Upload the input files

    1. hadoop fs -put /home/hadoop/hadoop2.7/etc/hadoop/*.xml /data/wordcount/
    2. hadoop fs -ls /data/wordcount
  3. Run the Map-Reduce job

    1. hadoop jar /home/hadoop/hadoop2.7/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /data/wordcount /output/wordcount
  4. Check the job status

    1. http://192.168.169.131:8088/cluster
  5. Browse the result

    1. hadoop fs -cat /output/wordcount/part-r-00000 | more
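For intuition, what the wordcount job computes can be sketched locally with plain shell tools (this `local_wordcount` helper is hypothetical, not part of Hadoop):

```shell
# Split input on whitespace, then count occurrences of each word,
# mimicking the word/count pairs in part-r-00000.
local_wordcount() {
  tr -s '[:space:]' '\n' | sort | uniq -c | awk '{print $2, $1}'
}
printf 'foo bar foo\nbar foo\n' | local_wordcount
```

The real job does the same over the uploaded XML files, with the map phase splitting lines into words and the reduce phase summing the counts per word.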