Shell

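Zeppelin supports Shell scripts, which begin with %sh. Unlike open source Zeppelin, the Shell interpreter in E-MapReduce (EMR) data development clusters can switch between different EMR cluster environments. This topic uses examples to show how to use Shell in Zeppelin.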

Examples

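  • Run a hadoop command
    The following command lists all files and directories in the root directory of the current EMR cluster. If you switch to a different EMR cluster, the output reflects that cluster instead.
    %sh
    hadoop fs -ls /
    The following output is returned: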
    hadoop fs -ls /
    Found 7 items
    drwxr-xr-x   - hadoop     hadoop          0 2021-01-27 14:25 /apps
    drwxrwxrwx   - flowagent  hadoop          0 2021-01-27 14:25 /emr-flow
    drwxr-x--x   - root       hadoop          0 2021-01-27 14:25 /emr-sparksql-udf
    drwxr-x--x   - hadoop     hadoop          0 2021-02-09 12:06 /flink
    drwxr-x--x   - hadoop     hadoop          0 2021-02-23 10:40 /spark-history
    drwxrwxrwt   - hive       hadoop          0 2021-02-23 10:40 /tmp
    drwxr-x--t   - hadoop     hadoop          0 2021-02-12 08:28 /user
  • Run the spark-submit command to submit a Spark job. For example:
    %sh
    spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster oss://xxx/spark-examples_2.11-2.4.6.jar
    The following output is returned:
    Warning: Master yarn-cluster is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
    21/02/10 11:38:17 WARN [main] NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    21/02/10 11:38:17 WARN [main] DependencyUtils: Skip remote jar oss://xxx/spark-examples_2.11-2.4.6.jar.
    21/02/10 11:38:17 WARN [main] DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
    21/02/10 11:38:17 INFO [main] RMProxy: Connecting to ResourceManager at emr-header-1.cluster-2xxx:8032
    21/02/10 11:38:18 INFO [main] Client: Requesting a new application from cluster with 4 NodeManagers
    21/02/10 11:38:18 INFO [main] Client: Verifying our application has not requested more than the maximum memory capability of the cluster (26144 MB per container)
    21/02/10 11:38:18 INFO [main] Client: Will allocate AM container, with 3072 MB memory including 1024 MB overhead
    21/02/10 11:38:18 INFO [main] Client: Setting up container launch context for our AM
    21/02/10 11:38:18 INFO [main] Client: Setting up the launch environment for our AM container
    21/02/10 11:38:18 INFO [main] Client: Preparing resources for our AM container
    21/02/10 11:38:18 WARN [main] Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
    21/02/10 11:38:21 INFO [main] Client: Uploading resource file:/tmp/spark-bab232ea-978f-48f0-8378-36e97fb40866/__spark_libs__4172798070712495432.zip -> hdfs://emr-header-1.cluster-203701:9000/user/hadoop/.sparkStaging/application_1612885840821_0009/__spark_libs__4172798070712495432.zip
    Warn: cannot load bigboot.cfg from B2SDK_CONF_DIR <null>
    21/02/10 11:38:23 INFO [main] OssNativeStore: Filesystem support for magic committers is enabled, write buffer size 1048576
    21/02/10 11:38:23 INFO [main] Client: Uploading resource oss://xxx/spark-examples_2.11-2.4.6.jar -> hdfs://emr-header-1.cluster-203701:9000/user/hadoop/.sparkStaging/application_1612885840821_0009/spark-examples_2.11-2.4.6.jar
    ......
    Pi is roughly 3.1415926535897934
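The first warning in the spark-submit output above notes that the yarn-cluster master is deprecated since Spark 2.0 and suggests using master "yarn" with an explicit deploy mode. The following is a minimal sketch of that form, not a definitive script. It assumes the same elided JAR path as the example (oss://xxx/... is a placeholder, not a real location), and the variable names APP_JAR and SUBMIT_CMD are hypothetical. In a Zeppelin paragraph, %sh would go on the first line, above the script.

```shell
# Zeppelin note: put %sh on the first line of the paragraph, above this script.

# Placeholder path copied from the example above; replace xxx with your own bucket.
APP_JAR="oss://xxx/spark-examples_2.11-2.4.6.jar"

# Store the arguments as positional parameters so each one stays a separate word.
# "--master yarn --deploy-mode cluster" replaces the deprecated "--master yarn-cluster".
set -- \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \
  "$APP_JAR"

# Keep a printable copy of the full command for review.
SUBMIT_CMD="spark-submit $*"
echo "$SUBMIT_CMD"

# Submit only when spark-submit is actually available on this machine.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit "$@"
else
  echo "spark-submit not found on PATH; skipping submission" >&2
fi
```

Printing the command before running it makes it easier to confirm which arguments were submitted when you switch between EMR clusters.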