Integrate tracing analysis
In distributed microservices environments, end-to-end tracing analysis helps you query job execution durations and analyze exceptions. This topic describes how to integrate it with scheduled jobs.
Prerequisites
-
The agent version is 1.7.0 or later.
-
The agent depends on the Trace plug-in. For more information, see Integration configuration.
-
You must upgrade your application to the Professional Edition. For more information, see How to upgrade to Professional Edition.
Integrate tracing analysis
Integration configuration
The following example shows the dependencies to add to the pom.xml file of a Spring Boot application.
<dependency>
    <groupId>com.aliyun.schedulerx</groupId>
    <artifactId>schedulerx2-spring-boot-starter</artifactId>
    <version>{latest_version}</version>
    <!-- If you use Logback, you must exclude Log4j and Log4j2. -->
    <exclusions>
        <exclusion>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
        </exclusion>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<!-- Add the dependency for the tracing analysis extension plug-in. -->
<dependency>
    <groupId>com.aliyun.schedulerx</groupId>
    <artifactId>schedulerx-plugin-trace-opentelemetry</artifactId>
    <version>{latest_version}</version>
</dependency>
<!-- Use this plug-in for SkyWalking. -->
<!--<dependency>
    <groupId>com.aliyun.schedulerx</groupId>
    <artifactId>schedulerx-plugin-trace-skywalking</artifactId>
    <version>{latest_version}</version>
</dependency>-->
Deployment configuration
-
Option 1: Deploy the application in EDAS (Recommended)
-
If you already use EDAS to deploy your applications, you have the required integration capabilities and can skip the deployment steps in this section.
-
New users can activate EDAS and deploy their applications on the platform. EDAS automatically enables end-to-end tracing analysis, requiring no extra configuration. For more information, see Get started with EDAS.
-
-
Option 2: Integrate with Application Real-Time Monitoring Service (ARMS) for standalone deployment
Follow the ARMS integration process: download the required JAR packages as prompted, configure the application information, and add the ARMS javaagent configuration to your startup script. For more information, see Manually install an agent.
-
Option 3: Integrate with a self-hosted platform
You can also integrate with a self-hosted end-to-end tracing analysis platform. The following steps use SkyWalking as an example.
-
Download and configure the SkyWalking agent installation package.
-
Add the following JVM parameter to the Java application startup script:
-javaagent:{agent.path}/skywalking-agent.jar
-
Switch the SchedulerX Trace plug-in dependency in your Java application to the SkyWalking dependency, as shown below.
<dependency>
    <groupId>com.aliyun.schedulerx</groupId>
    <artifactId>schedulerx-plugin-trace-skywalking</artifactId>
    <version>{latest_version}</version>
</dependency>
-
Data collection uses a default sampling rate, so not all execution paths are captured. You can adjust the sampling rate based on your business scenario.
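To illustrate why a sampling rate leaves some job executions without a trace, the following sketch shows one common way a ratio-based sampler can make its decision. It is only an illustration, not the implementation used by SchedulerX, ARMS, or SkyWalking. The class name RatioSampler, the method shouldSample, the CRC32-based bucketing, and the 10% rate are all hypothetical choices made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustrative ratio-based sampler; NOT the actual agent implementation. */
public class RatioSampler {
    // Fraction of traces to keep, between 0.0 (none) and 1.0 (all).
    private final double ratio;

    public RatioSampler(double ratio) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be between 0.0 and 1.0");
        }
        this.ratio = ratio;
    }

    /**
     * Decides from the trace ID alone, so the same trace ID always gets
     * the same answer and a trace is kept or dropped as a whole.
     */
    public boolean shouldSample(String traceId) {
        CRC32 crc = new CRC32();
        crc.update(traceId.getBytes(StandardCharsets.UTF_8));
        // CRC32 yields a value in [0, 2^32); scale it to [0.0, 1.0).
        double bucket = crc.getValue() / 4294967296.0;
        return bucket < ratio;
    }

    public static void main(String[] args) {
        RatioSampler sampler = new RatioSampler(0.1); // hypothetical 10% sampling rate
        int kept = 0;
        for (int i = 0; i < 10_000; i++) {
            if (sampler.shouldSample("trace-" + i)) {
                kept++;
            }
        }
        System.out.println("kept " + kept + " of 10000 traces");
    }
}
```

With a rate below 1.0, a given job execution may simply not be recorded, so a missing TraceId does not by itself indicate a misconfiguration.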
View traces
After completing the configuration and application deployment, you can use end-to-end tracing analysis to visualize the execution of scheduled jobs. This feature currently supports standalone jobs (including HTTP jobs), broadcast jobs, and visual MapReduce jobs.
-
Log in to the Distributed Task Scheduling platform.
-
In the top navigation bar, select a region.
-
In the left-side navigation pane, click Execution List and select the Task Instance List tab.
-
In the Actions column, click Details.
-
Standalone job
Click the TraceId in the Details panel to view the corresponding execution trace. This feature also applies to standalone HTTP jobs because the host application supports end-to-end tracing analysis with OpenTelemetry. On the Basic Information tab within the Details panel, find the job's TraceId (for example, 7bf737****) to correlate it with your tracing system. On the trace details page, if a trace's duration exceeds 3.00s, a yellow notification bar appears on the Trace tab, allowing you to identify the root cause with a single click. For example, the diagnostic summary might identify the root-cause application as schedulerx-example-trace and the problematic span as com.alibaba.schedulerx.example.trace.processor.SimpleHelloProcessor.process. This span's duration is 3.00s, accounting for 95.40% of the total time. The timeline would show two spans: one for the SchedulerX standalone job call (3150 ms) and another for the /hello HTTP service entry point (3 ms).
-
Broadcast job
Click Details. On the Task Instance Details page, click Current execution details. Each machine has a corresponding TraceId that you can click to view its trace. After a broadcast job runs, you can view its status in the task instance list. Click a task instance to open the details pane. The Current execution details tab shows the overall job progress, including the number and percentage of completed subtasks. The Statistics by machine table lists details for each machine, including IP, subtask counts (total, queued, running, succeeded, failed), the TraceId, and an Operation column. You can use the TraceId to track the job execution path. In the Operation column, click stack trace to view the details for the corresponding subtask.
-
Visual MapReduce job
Click Details. On the Task Instance Details page, click Subtask list. For visual MapReduce jobs, you can query the execution path for each subtask from the execution records. From the subtask list in the execution record details, you can click the TraceId of a subtask to view its trace. In the task instance list, click a task to open its details, then select the Subtask list tab. This list includes columns for Subtask ID, Subtask Name, Status, worker, TraceId, and Operation. The TraceId column provides the trace for each subtask. The Operation column contains options to Rerun the subtask or view its Logs.
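The percentage in the standalone job diagnostic example is the share of the total trace duration taken by one span. The following minimal sketch shows that arithmetic. The class name SpanShare and the method sharePercent are hypothetical, and the inputs are the rounded durations from the example (a 3.00s span within a 3150 ms job call), so the result comes out at about 95.2%, close to but not exactly the 95.40% shown, presumably because the console computes it from unrounded values.

```java
import java.util.Locale;

/** Illustrative helper; the names are hypothetical and not part of any SchedulerX API. */
public class SpanShare {

    /** Returns the percentage of the total trace duration taken by one span. */
    public static double sharePercent(long spanMillis, long totalMillis) {
        if (totalMillis <= 0) {
            throw new IllegalArgumentException("totalMillis must be positive");
        }
        return 100.0 * spanMillis / totalMillis;
    }

    public static void main(String[] args) {
        // Rounded durations taken from the standalone job example.
        long processorSpanMillis = 3000; // 3.00s span
        long jobCallMillis = 3150;       // SchedulerX standalone job call
        double share = sharePercent(processorSpanMillis, jobCallMillis);
        System.out.println(String.format(Locale.ROOT, "%.2f%%", share));
    }
}
```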
-