Spark on MaxCompute is a MaxCompute computing service that is compatible with open source Spark. It provides the Spark computing framework on a unified platform for computing resources and dataset permissions. This lets you submit and run Spark jobs in a familiar way to meet a wider range of data processing and analysis needs.
Limits
Spark on MaxCompute supports the following scenarios:
Offline computing scenarios, such as GraphX, MLlib, RDD, Spark SQL, and PySpark.
Reading from and writing to MaxCompute tables.
Referencing file resources in MaxCompute.
Reading from and writing to services in a Virtual Private Cloud (VPC) environment, such as ApsaraDB RDS, Redis, HBase, and services deployed on Elastic Compute Service (ECS).
Reading from and writing to unstructured storage in Object Storage Service (OSS).
Reading from foreign tables in OSS, Hologres, and HBase.
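As a rough illustration of the table read and write scenario listed above, the following Python sketch runs a small read-aggregate-write job through Spark SQL. It assumes a SparkSession that is already configured for a MaxCompute project. The table names demo_src and demo_summary, the column names, and the helpers build_statements and run_job are hypothetical and exist only for this example.

```python
from typing import List

# Hypothetical MaxCompute table names, used only for illustration.
SOURCE_TABLE = "demo_src"
TARGET_TABLE = "demo_summary"


def build_statements(source: str, target: str) -> List[str]:
    """Return the Spark SQL statements for a simple read-aggregate-write job."""
    return [
        # Create the output table if it does not exist yet.
        f"CREATE TABLE IF NOT EXISTS {target} (category STRING, cnt BIGINT)",
        # Read from the source table and overwrite the output table.
        f"INSERT OVERWRITE TABLE {target} "
        f"SELECT category, COUNT(*) AS cnt FROM {source} GROUP BY category",
    ]


def run_job(spark, source: str = SOURCE_TABLE, target: str = TARGET_TABLE):
    """Run the statements on any object that exposes SparkSession.sql()."""
    for statement in build_statements(source, target):
        spark.sql(statement)
    # Read the result back as a DataFrame.
    return spark.sql(f"SELECT * FROM {target}")
```

In a submitted job, you would typically obtain the session with SparkSession.builder.getOrCreate() from pyspark.sql and pass it to run_job. Keeping the statements in a separate function makes it possible to check them locally before you submit the job.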
Spark on MaxCompute does not currently support the following scenarios:
Interactive and stream computing, such as Spark shell, Spark SQL shell, PySpark shell, and Spark Streaming.
Accessing MaxCompute built-in functions, MaxCompute user-defined functions (UDFs), or foreign tables other than those in OSS, Hologres, and HBase.
Running Spark jobs in projects that use pay-as-you-go developer edition resources. The pay-as-you-go developer edition supports only MaxCompute SQL (with UDF support) and PyODPS jobs.
Using the checkpoint feature.
Key features
Support for native multi-version Spark jobs
MaxCompute supports native open source Spark, is fully compatible with Spark APIs, and lets you run multiple Spark versions simultaneously. Spark on MaxCompute also provides the native Spark web UI.
Unified computing resources
Like other job types, such as MaxCompute SQL and MapReduce, Spark jobs run on the unified computing resources provisioned for a MaxCompute project.
Unified data and permission management
Spark on MaxCompute fully adheres to the permission system of MaxCompute projects. This ensures that data is queried securely and within the scope of user access permissions.
User experience consistent with open source systems
Spark on MaxCompute provides the same user experience as open source Spark, including familiar features such as application UIs and online interactions. To debug applications, you can use the native open source UI in real time and query historical logs. Some applications also support an interactive experience, which lets you interact with them in real time after the backend engine starts.
System structure
The Spark on MaxCompute solution from Alibaba Cloud allows native Spark to run on MaxCompute.

The diagram on the left shows the native Spark architecture. The diagram on the right shows Spark on MaxCompute, which runs on Cupid, a platform developed by Alibaba Cloud. Cupid natively supports the computing frameworks that open source YARN supports.