E-MapReduce

Set Up a Gateway to Submit Jobs to a Cluster

Last updated: 2018-01-09 13:54:10

Gateway

Some users need to set up their own Gateway to submit jobs to an E-MapReduce cluster. There are currently two ways to create a Gateway:

  1. Recommended: create it directly in the EMR console. Click Create.
  2. Set it up manually.

Manual Gateway Setup Guide

Network

First, make sure the Gateway machine is in the security group of the target EMR cluster, so that the Gateway node can reach the EMR cluster. To configure a machine's security group, see the ECS security group documentation.
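Before running the deploy script, it can save time to verify that the master node is actually reachable from the Gateway. A minimal sketch (the `MASTER_IP` value is a hypothetical placeholder; substitute your cluster master's IP):

```shell
# MASTER_IP is a placeholder - replace it with your master node's IP.
MASTER_IP=${MASTER_IP:-192.168.0.1}
# Probe the master's SSH port; a timeout here usually means the Gateway
# is not in the cluster's security group.
if timeout 3 bash -c "exec 3<>/dev/tcp/$MASTER_IP/22" 2>/dev/null; then
  echo "ssh port on $MASTER_IP is reachable"
else
  echo "cannot reach $MASTER_IP:22 - check security group rules"
fi
```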

Environment

  • OS: CentOS 7.2 or later is recommended
  • Java: JDK 1.7 or later is required; openjdk version 1.8.0 is recommended
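You can check the current Java installation before deploying. The deploy scripts below install openjdk 1.8.0 automatically when `java` is missing, so this check is purely informational:

```shell
# Report the JDK already present on the Gateway, if any.
if type java >/dev/null 2>&1; then
  java -version 2>&1 | head -n 1
else
  echo "java not found - deploy.sh will install java-1.8.0-openjdk"
fi
```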

Setup Steps

EMR 2.x series, version 2.7 and later; 3.x series, version 3.2 and later

Note: for these versions, create the Gateway directly from the EMR console instead.

Copy the script below to the Gateway machine and run it. Example: sh deploy.sh master_ip master_password_file

  • deploy.sh is the script name; its content is listed below
  • master_ip is the IP address of the cluster's master node, and it must be reachable from the Gateway
  • master_password_file is a file holding the master node's password; write the password directly into this file
    #!/usr/bin/bash
    if [ $# != 2 ]
    then
       echo "Usage: $0 master_ip master_password_file"
       exit 1;
    fi
    masterip=$1
    masterpwdfile=$2
    # Install dependencies if missing
    if ! type sshpass >/dev/null 2>&1; then
       yum install -y sshpass
    fi
    if ! type java >/dev/null 2>&1; then
       yum install -y java-1.8.0-openjdk
    fi
    mkdir -p /opt/apps
    mkdir -p /etc/ecm
    # Copy the Hadoop/Hive/Spark packages from the master node
    echo "Start to copy package from $masterip to local gateway(/opt/apps)"
    echo " -copying hadoop-2.7.2"
    sshpass -f $masterpwdfile scp -r -o 'StrictHostKeyChecking no' root@$masterip:/usr/lib/hadoop-current /opt/apps/
    echo " -copying hive-2.0.1"
    sshpass -f $masterpwdfile scp -r root@$masterip:/usr/lib/hive-current /opt/apps/
    echo " -copying spark-2.1.1"
    sshpass -f $masterpwdfile scp -r root@$masterip:/usr/lib/spark-current /opt/apps/
    # Recreate the /usr/lib/<app>-current symlinks used by the cluster layout
    echo "Start to link /usr/lib/\${app}-current to /opt/apps/\${app}"
    if [ -L /usr/lib/hadoop-current ]
    then
       unlink /usr/lib/hadoop-current
    fi
    ln -s /opt/apps/hadoop-current /usr/lib/hadoop-current
    if [ -L /usr/lib/hive-current ]
    then
       unlink /usr/lib/hive-current
    fi
    ln -s /opt/apps/hive-current /usr/lib/hive-current
    if [ -L /usr/lib/spark-current ]
    then
       unlink /usr/lib/spark-current
    fi
    ln -s /opt/apps/spark-current /usr/lib/spark-current
    # Copy the cluster configuration
    echo "Start to copy conf from $masterip to local gateway(/etc/ecm)"
    sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/hadoop-conf /etc/ecm/hadoop-conf
    sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/hive-conf /etc/ecm/hive-conf
    sshpass -f $masterpwdfile scp -r root@$masterip:/etc/ecm/spark-conf /etc/ecm/spark-conf
    # Copy the environment scripts
    echo "Start to copy environment from $masterip to local gateway(/etc/profile.d)"
    sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hdfs.sh /etc/profile.d/
    sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/yarn.sh /etc/profile.d/
    sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hive.sh /etc/profile.d/
    sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/spark.sh /etc/profile.d/
    if [ -L /usr/lib/jvm/java ]
    then
       unlink /usr/lib/jvm/java
    fi
    # Point JAVA_HOME at the local JRE
    echo "" >>/etc/profile.d/hdfs.sh
    echo export JAVA_HOME=/usr/lib/jvm/jre-1.8.0 >>/etc/profile.d/hdfs.sh
    # Copy the cluster's host entries
    echo "Start to copy host info from $masterip to local gateway(/etc/hosts)"
    sshpass -f $masterpwdfile scp root@$masterip:/etc/hosts /etc/hosts_bak
    cat /etc/hosts_bak | grep emr | grep cluster >>/etc/hosts
    # Create the hadoop user for submitting jobs
    if ! id hadoop >& /dev/null
    then
       useradd hadoop
    fi

EMR 2.x series, versions below 2.7; 3.x series, versions below 3.2

Copy the script below to the Gateway machine and run it. Example: sh deploy.sh master_ip master_password_file

  • deploy.sh is the script name; its content is listed below
  • master_ip is the IP address of the cluster's master node, and it must be reachable from the Gateway
  • master_password_file is a file holding the master node's password; write the password directly into this file
    #!/usr/bin/bash
    if [ $# != 2 ]
    then
       echo "Usage: $0 master_ip master_password_file"
       exit 1;
    fi
    masterip=$1
    masterpwdfile=$2
    # Install dependencies if missing
    if ! type sshpass >/dev/null 2>&1; then
       yum install -y sshpass
    fi
    if ! type java >/dev/null 2>&1; then
       yum install -y java-1.8.0-openjdk
    fi
    mkdir -p /opt/apps
    mkdir -p /etc/emr
    # Copy the Hadoop/Hive/Spark packages from the master node
    echo "Start to copy package from $masterip to local gateway(/opt/apps)"
    echo " -copying hadoop-2.7.2"
    sshpass -f $masterpwdfile scp -r -o 'StrictHostKeyChecking no' root@$masterip:/usr/lib/hadoop-current /opt/apps/
    echo " -copying hive-2.0.1"
    sshpass -f $masterpwdfile scp -r root@$masterip:/usr/lib/hive-current /opt/apps/
    echo " -copying spark-2.1.1"
    sshpass -f $masterpwdfile scp -r root@$masterip:/usr/lib/spark-current /opt/apps/
    # Recreate the /usr/lib/<app>-current symlinks used by the cluster layout
    echo "Start to link /usr/lib/\${app}-current to /opt/apps/\${app}"
    if [ -L /usr/lib/hadoop-current ]
    then
       unlink /usr/lib/hadoop-current
    fi
    ln -s /opt/apps/hadoop-current /usr/lib/hadoop-current
    if [ -L /usr/lib/hive-current ]
    then
       unlink /usr/lib/hive-current
    fi
    ln -s /opt/apps/hive-current /usr/lib/hive-current
    if [ -L /usr/lib/spark-current ]
    then
       unlink /usr/lib/spark-current
    fi
    ln -s /opt/apps/spark-current /usr/lib/spark-current
    # Copy the cluster configuration
    echo "Start to copy conf from $masterip to local gateway(/etc/emr)"
    sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/hadoop-conf /etc/emr/hadoop-conf
    sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/hive-conf /etc/emr/hive-conf
    sshpass -f $masterpwdfile scp -r root@$masterip:/etc/emr/spark-conf /etc/emr/spark-conf
    # Copy the environment script
    echo "Start to copy environment from $masterip to local gateway(/etc/profile.d)"
    sshpass -f $masterpwdfile scp root@$masterip:/etc/profile.d/hadoop.sh /etc/profile.d/
    # Point /usr/lib/jvm/java at the local JRE
    if [ -L /usr/lib/jvm/java ]
    then
       unlink /usr/lib/jvm/java
    fi
    ln -s /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.131-3.b12.el7_3.x86_64/jre /usr/lib/jvm/java
    # Copy the cluster's host entries
    echo "Start to copy host info from $masterip to local gateway(/etc/hosts)"
    sshpass -f $masterpwdfile scp root@$masterip:/etc/hosts /etc/hosts_bak
    cat /etc/hosts_bak | grep emr | grep cluster >>/etc/hosts
    # Create the hadoop user for submitting jobs
    if ! id hadoop >& /dev/null
    then
       useradd hadoop
    fi
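Both variants of deploy.sh validate their arguments before touching the cluster. That check can be exercised locally without a master node; `check_args` below is an illustrative stand-in, not part of the script itself:

```shell
# Mirror of the argument check at the top of deploy.sh.
check_args() {
  if [ $# != 2 ]; then
    echo "Usage: deploy.sh master_ip master_password_file"
    return 1
  fi
  echo "ok: master=$1 pwdfile=$2"
}
check_args || true               # wrong argument count: prints the usage line
check_args 10.0.0.1 /tmp/pwd     # correct count: proceeds
```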

Test

  • Hive

        [hadoop@iZ23bc05hrvZ ~]$ hive
        hive> show databases;
        OK
        default
        Time taken: 1.124 seconds, Fetched: 1 row(s)
        hive> create database school;
        OK
        Time taken: 0.362 seconds
        hive>
  • Run a Hadoop job

        [hadoop@iZ23bc05hrvZ ~]$ hadoop jar /usr/lib/hadoop-current/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 10 10
        Number of Maps  = 10
        Samples per Map = 10
        Wrote input for Map #0
        Wrote input for Map #1
        Wrote input for Map #2
        Wrote input for Map #3
        Wrote input for Map #4
        Wrote input for Map #5
        Wrote input for Map #6
        Wrote input for Map #7
        Wrote input for Map #8
        Wrote input for Map #9
        File Input Format Counters
            Bytes Read=1180
        File Output Format Counters
            Bytes Written=97
        Job Finished in 29.798 seconds
        Estimated value of Pi is 3.20000000000000000000