Java/Scala class conflicts

This topic describes common Java/Scala class conflicts in MaxCompute Spark jobs and how to resolve them.

Overview of class conflicts

  • Symptom: Class conflicts typically surface as java.lang.NoClassDefFoundError or as method-not-found exceptions such as java.lang.NoSuchMethodError. Check your POM and exclude the conflicting dependencies.

  • Cause: Some dependencies in your custom JAR package may differ in version from the same dependencies in the Spark client jars directory. During class loading, the JVM may load the classes from your JAR package first, which causes the conflict.

Differences between provided and compile scopes

  • provided: The dependency is required at compile time but not packaged for runtime. The cluster supplies the JAR package at runtime, mainly from the Spark client jars directory. If you do not set these dependencies to provided, class conflicts or class/method-not-found errors may occur.

  • compile: The dependency is required at both compile time and runtime. These are typically third-party libraries related to your code logic that do not exist in the cluster and must be included in your JAR package.
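
As a minimal sketch of how the two scopes look in a POM, the fragment below marks a Spark dependency as provided and a MySQL driver as compile. The artifact IDs and version numbers are illustrative assumptions, not values taken from this topic; match them to the Spark client you actually use.

```xml
<dependencies>
  <!-- Supplied by the cluster at runtime: compile against it, but do not package it. -->
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.12</artifactId>
    <!-- Example version (assumption); use the version of your Spark client. -->
    <version>3.1.1</version>
    <scope>provided</scope>
  </dependency>

  <!-- Not available in the cluster: must be packaged into your JAR. -->
  <dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <!-- Example version (assumption). -->
    <version>8.0.28</version>
    <scope>compile</scope>
  </dependency>
</dependencies>
```

compile is Maven's default scope, so leaving out the scope element on the second dependency would have the same effect.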

Important

The main JAR package must be a fat JAR that includes all compile-scoped dependencies so that the required classes can be loaded at runtime.
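
One common way to build such a fat JAR is the Maven Shade Plugin. This topic does not prescribe a specific plugin, so treat the fragment below as a sketch under that assumption; the plugin version is only an example.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- Example version (assumption). -->
      <version>3.5.1</version>
      <executions>
        <execution>
          <!-- Build the fat JAR as part of "mvn package". -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <filters>
              <filter>
                <artifact>*:*</artifact>
                <excludes>
                  <!-- Drop signature files copied from signed dependencies;
                       they no longer match the merged JAR. -->
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                  <exclude>META-INF/*.RSA</exclude>
                </excludes>
              </filter>
            </filters>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

The Shade Plugin bundles compile-scoped and runtime-scoped dependencies and leaves provided-scoped ones out, which matches the packaging rule described above.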

POM self-check

JAR packages that must be set to provided

  • JAR packages with groupId org.apache.spark:

    These are community Spark JAR packages already available in the Spark client jars directory. They do not need to be included in your JAR package and are automatically uploaded to the MaxCompute cluster when the Spark client submits a job.

  • cupid-sdk:

    Automatically uploaded to the MaxCompute cluster during job submission.

  • odps-sdk:

    Automatically uploaded to the MaxCompute cluster during job submission.

  • hadoop-yarn-client:

    Used for job uploads. This package may be pulled in as a transitive dependency, so check and exclude it before packaging.
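
The fragment below sketches both cases from this list: a dependency declared as provided, and hadoop-yarn-client excluded from a library that pulls it in transitively. The com.example:some-hadoop-lib coordinates are hypothetical, and the com.aliyun.odps group ID and the ${cupid.sdk.version} property are assumptions that should be checked against your own POM.

```xml
<dependencies>
  <!-- Uploaded automatically at job submission, so keep it out of your JAR. -->
  <dependency>
    <!-- Group ID is an assumption; verify it against your project. -->
    <groupId>com.aliyun.odps</groupId>
    <artifactId>cupid-sdk</artifactId>
    <!-- Assumes this property is defined in the properties section of the POM. -->
    <version>${cupid.sdk.version}</version>
    <scope>provided</scope>
  </dependency>

  <!-- Hypothetical library that brings in hadoop-yarn-client transitively. -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>some-hadoop-lib</artifactId>
    <version>1.0.0</version>
    <exclusions>
      <!-- Keep the transitive copy out of the packaged JAR. -->
      <exclusion>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-yarn-client</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```

Running mvn dependency:tree shows which of your dependencies introduces hadoop-yarn-client, which tells you where the exclusion belongs.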

JAR packages that must not be set to provided

  • JAR packages used to access external services, such as MySQL or other third-party services.