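Zeppelin supports Shell scripts, which start with %sh. Compared with open-source Zeppelin, the Shell interpreter in an E-MapReduce (EMR) data development cluster can switch between different EMR cluster environments. This topic uses examples to describe how to use Shell in Zeppelin.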
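One way to confirm which cluster environment a paragraph is bound to is to print the default file system from the Hadoop client configuration. The following is a minimal sketch, not a documented EMR procedure. It assumes that switching clusters changes the client configuration seen by the Shell interpreter, which this topic does not confirm. The `show_default_fs` helper is a hypothetical name introduced for illustration.

```shell
# %sh   (in Zeppelin, place this directive uncommented on the first line of the paragraph)

# Hypothetical helper: prints the default file system URI of the active Hadoop
# client configuration, or a notice when the hdfs client is not installed.
show_default_fs() {
  if command -v hdfs >/dev/null 2>&1; then
    # "hdfs getconf -confKey" reads a single key from the client configuration.
    hdfs getconf -confKey fs.defaultFS
  else
    echo "hdfs client not found on PATH"
  fi
}

show_default_fs
```

If the assumption holds, running the paragraph before and after switching clusters should print different URIs.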
Examples
- Run a hadoop command
  The following command lists all files in the root directory of the current EMR cluster. After you switch to a different EMR cluster, the output reflects that cluster. Run the command in a paragraph that starts with %sh.
  hadoop fs -ls /
  The following information is returned:
  Found 7 items
  drwxr-xr-x   - hadoop    hadoop          0 2021-01-27 14:25 /apps
  drwxrwxrwx   - flowagent hadoop          0 2021-01-27 14:25 /emr-flow
  drwxr-x--x   - root      hadoop          0 2021-01-27 14:25 /emr-sparksql-udf
  drwxr-x--x   - hadoop    hadoop          0 2021-02-09 12:06 /flink
  drwxr-x--x   - hadoop    hadoop          0 2021-02-23 10:40 /spark-history
  drwxrwxrwt   - hive      hadoop          0 2021-02-23 10:40 /tmp
  drwxr-x--t   - hadoop    hadoop          0 2021-02-12 08:28 /user
- Run a spark-submit command to submit a Spark job. Example:
  %sh
  spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster oss://xxx/spark-examples_2.11-2.4.6.jar
  Output similar to the following is returned:
  Warning: Master yarn-cluster is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
  21/02/10 11:38:17 WARN [main] NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  21/02/10 11:38:17 WARN [main] DependencyUtils: Skip remote jar oss://xxx/spark-examples_2.11-2.4.6.jar.
  21/02/10 11:38:17 WARN [main] DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
  21/02/10 11:38:17 INFO [main] RMProxy: Connecting to ResourceManager at emr-header-1.cluster-2xxx:8032
  21/02/10 11:38:18 INFO [main] Client: Requesting a new application from cluster with 4 NodeManagers
  21/02/10 11:38:18 INFO [main] Client: Verifying our application has not requested more than the maximum memory capability of the cluster (26144 MB per container)
  21/02/10 11:38:18 INFO [main] Client: Will allocate AM container, with 3072 MB memory including 1024 MB overhead
  21/02/10 11:38:18 INFO [main] Client: Setting up container launch context for our AM
  21/02/10 11:38:18 INFO [main] Client: Setting up the launch environment for our AM container
  21/02/10 11:38:18 INFO [main] Client: Preparing resources for our AM container
  21/02/10 11:38:18 WARN [main] Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
  21/02/10 11:38:21 INFO [main] Client: Uploading resource file:/tmp/spark-bab232ea-978f-48f0-8378-36e97fb40866/__spark_libs__4172798070712495432.zip -> hdfs://emr-header-1.cluster-203701:9000/user/hadoop/.sparkStaging/application_1612885840821_0009/__spark_libs__4172798070712495432.zip
  Warn: cannot load bigboot.cfg from B2SDK_CONF_DIR <null>
  21/02/10 11:38:23 INFO [main] OssNativeStore: Filesystem support for magic committers is enabled, write buffer size 1048576
  21/02/10 11:38:23 INFO [main] Client: Uploading resource oss://xxx/spark-examples_2.11-2.4.6.jar -> hdfs://emr-header-1.cluster-203701:9000/user/hadoop/.sparkStaging/application_1612885840821_0009/spark-examples_2.11-2.4.6.jar
  ......
  Pi is roughly 3.1415926535897934
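The first line of the Spark output warns that `--master yarn-cluster` is deprecated since Spark 2.0 and suggests master `yarn` with an explicit deploy mode. The sketch below shows one way the same SparkPi job could be submitted with `--master yarn --deploy-mode cluster`. The `SPARK_JAR` variable and the `submit_pi` helper are hypothetical names introduced for illustration, and the JAR path is the placeholder from the example above. When `spark-submit` is not on the PATH, the script only prints the command it would run.

```shell
# %sh   (in Zeppelin, place this directive uncommented on the first line of the paragraph)

# Placeholder JAR path copied from the example; replace it with your own location.
SPARK_JAR="oss://xxx/spark-examples_2.11-2.4.6.jar"

# Hypothetical helper: submits SparkPi in YARN cluster mode without the
# deprecated "yarn-cluster" master value.
submit_pi() {
  # $1 is the JAR location; rebuild the argument list for spark-submit.
  set -- --class org.apache.spark.examples.SparkPi \
         --master yarn --deploy-mode cluster "$1"
  if command -v spark-submit >/dev/null 2>&1; then
    spark-submit "$@"
  else
    # Dry run where the Spark client is not installed.
    echo "dry run: spark-submit $*"
  fi
}

submit_pi "$SPARK_JAR"
```

With this form, the deprecation warning should no longer appear. The rest of the log is expected to look similar, although that has not been verified on an EMR cluster here.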