This topic walks through a common scenario in which Java thread stack analysis is used to find the cause of sustained high CPU utilization.
Symptoms
The business team reported the following issues:
1. During an increase in business volume, CPU utilization continuously increased for an unknown reason.
2. The system code was primarily waiting for responses from downstream services and did not have complex local processing logic.
Thread stack analysis
The business team saved the on-site jstack log (thread stack log). You can upload the thread stack log and analyze it using ATP thread stack analysis. Then, open the Method Hotspots view. This view aggregates the stacks of all threads in the Java process at a specific point in time and shows the hottest methods, which are the methods that appear on the most thread stacks. Because a thread dump is a snapshot, the view reflects where threads were at that moment, not how often each method was called.
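The aggregation idea behind such a view can be sketched in a few lines of Java. This is only an illustration and not ATP's actual implementation, which is not described here. The class `MethodHotspots` and its methods are hypothetical names, and the sketch assumes that a frame is counted once per thread whose stack contains it.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

// Hypothetical helper: a simplified model of a "method hotspots" aggregation.
final class MethodHotspots {

    /** Counts, for each frame, how many thread stacks contain it at least once. */
    static Map<String, Integer> countThreadsPerFrame(Collection<List<String>> stacks) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> stack : stacks) {
            // Deduplicate first so that recursion does not inflate one thread's contribution.
            for (String frame : new LinkedHashSet<>(stack)) {
                counts.merge(frame, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Takes a snapshot of the current JVM, rendering frames as "Class.method". */
    static Collection<List<String>> snapshotCurrentJvm() {
        List<List<String>> stacks = new ArrayList<>();
        for (StackTraceElement[] trace : Thread.getAllStackTraces().values()) {
            List<String> frames = new ArrayList<>();
            for (StackTraceElement e : trace) {
                frames.add(e.getClassName() + "." + e.getMethodName());
            }
            stacks.add(frames);
        }
        return stacks;
    }

    public static void main(String[] args) {
        // Print the ten frames that appear on the most thread stacks.
        countThreadsPerFrame(snapshotCurrentJvm()).entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(10)
                .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```

Running `main` prints the ten frames that appear on the most thread stacks of the current JVM. In the case described here, the input would be the frames parsed from the saved jstack log instead of a live snapshot.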
Select the hottest method (the deepest bar).
The method names show that the hot path is deserialization, during which URLClassLoader is used to load classes. The relevant frames, listed from caller to callee, are:
Hessian2Input.readObject();
...
ClassLoader.loadClass();
URLClassLoader.loadClass();
URLClassPath.getResource();
URLClassPath.getNextLoader();
URLClassPath.getLoader();
URLClassLoader contains a URLClassPath (ucp) object that records the JAR files on the class loader's search path. To load a class, URLClassLoader searches these JAR files in order, opening each one as needed, until it finds the class. If no JAR file contains the class, every JAR file is checked. The business team reported that there were about 500 JAR files.
Therefore, the initial hypothesis from the thread stack analysis was as follows: During deserialization, the system encountered a class that was not yet loaded. This triggered URLClassLoader to search for the class across about 500 JAR files, and this intensive search caused the sustained high CPU utilization.
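The cost of a repeated miss can be reproduced with a small experiment. The following sketch counts how often a URLClassLoader falls through to its own classpath search when the same missing class is requested repeatedly. It assumes that empty temporary directories can stand in for the JAR files of a real classpath. The class `CountingUrlClassLoader` and the class name `com.example.MissingDto` are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;

// Hypothetical helper: counts classpath searches performed by a URLClassLoader.
final class CountingUrlClassLoader extends URLClassLoader {
    private int searches = 0;

    CountingUrlClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        searches++;                   // reached only after parent delegation has failed
        return super.findClass(name); // searches the URL entries; throws if none has the class
    }

    int searches() {
        return searches;
    }

    /** Requests a missing class `attempts` times and returns the number of searches. */
    static int countSearchesForMissingClass(int entries, int attempts) {
        try {
            URL[] urls = new URL[entries];
            for (int i = 0; i < entries; i++) {
                // Assumption: empty temp directories stand in for JAR files.
                urls[i] = Files.createTempDirectory("entry" + i).toUri().toURL();
            }
            try (CountingUrlClassLoader loader = new CountingUrlClassLoader(urls, null)) {
                for (int i = 0; i < attempts; i++) {
                    try {
                        loader.loadClass("com.example.MissingDto"); // hypothetical class name
                    } catch (ClassNotFoundException expected) {
                        // The failed lookup is not remembered by the loader.
                    }
                }
                return loader.searches();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countSearchesForMissingClass(5, 100));
    }
}
```

Every attempt reaches `findClass`, because the loader keeps track of classes it has already loaded but not of names it failed to find. With about 500 JAR files, each such miss means checking every one of them again.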
Using the method names in this stack, you can find the corresponding thread in the jstack log. In this case, it is thread 1068.
In the stack of this thread, lock entries appear in three places during class loading.
However, these entries refer to only two distinct locks, one of which is acquired reentrantly. More importantly, thread 1068 holds both locks, and no other thread is waiting for either of them, which indicates that there is no lock contention. The class search is performed by thread 1068 alone.
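A check like this can also be scripted. The following minimal sketch scans jstack text for `- locked <...>` and `- waiting to lock <...>` lines and groups thread names by lock address. The class `JstackLockSummary` is a hypothetical helper, and the embedded sample dump is fabricated and heavily shortened for illustration. It is not the log from this case.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: summarizes lock holders and waiters in jstack text.
final class JstackLockSummary {
    private static final Pattern THREAD_HEADER = Pattern.compile("^\"([^\"]*)\"");
    private static final Pattern LOCK_LINE =
            Pattern.compile("^\\s*- (locked|waiting to lock) <(0x[0-9a-fA-F]+)>");

    // Fabricated sample: three "locked" entries that refer to two distinct locks.
    static final String SAMPLE =
            "\"example-worker\" #42 daemon prio=5 runnable\n"
          + "   java.lang.Thread.State: RUNNABLE\n"
          + "\tat example.Loader.find(Loader.java)\n"
          + "\t- locked <0x00000000c0000010> (a java.lang.Object)\n"
          + "\tat example.Loader.load(Loader.java)\n"
          + "\t- locked <0x00000000c0000020> (a java.lang.Object)\n"
          + "\tat example.Loader.load(Loader.java)\n"
          + "\t- locked <0x00000000c0000020> (a java.lang.Object)\n";

    final Map<String, Set<String>> holders = new TreeMap<>(); // lock address -> holding threads
    final Map<String, Set<String>> waiters = new TreeMap<>(); // lock address -> blocked threads
    int lockedLines = 0;                                       // raw count of "- locked" entries

    static JstackLockSummary parse(String dump) {
        JstackLockSummary summary = new JstackLockSummary();
        String thread = "?";
        for (String line : dump.split("\\R")) {
            Matcher header = THREAD_HEADER.matcher(line);
            if (header.find()) {
                thread = header.group(1); // following lock lines belong to this thread
                continue;
            }
            Matcher lock = LOCK_LINE.matcher(line);
            if (lock.find()) {
                boolean held = lock.group(1).equals("locked");
                if (held) {
                    summary.lockedLines++;
                }
                Map<String, Set<String>> target = held ? summary.holders : summary.waiters;
                target.computeIfAbsent(lock.group(2), k -> new TreeSet<>()).add(thread);
            }
        }
        return summary;
    }

    public static void main(String[] args) {
        JstackLockSummary s = parse(SAMPLE);
        System.out.println("locked entries: " + s.lockedLines);
        System.out.println("distinct locks held: " + s.holders);
        System.out.println("threads waiting: " + s.waiters);
    }
}
```

If the same address appears in several `- locked` entries of one thread, the lock is held reentrantly. If the waiters map is empty for a lock, no thread is blocked on it, which is the situation described above.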
Conclusion
The lock analysis shows that only one thread was performing the high-frequency operation of traversing JAR files to find a class. Under the hypothesis, CPU utilization would be expected to drop as soon as the class is found or deserialization finishes. The business team confirmed that deserialization was continuous. This supports the hypothesis, because CPU utilization could continue to rise only if deserialization, and with it the class search, kept running.
When this hypothesis was shared with the business team, they checked the logs and confirmed that a specific class could not be found. The system attempted to find the class again during each deserialization process. If the class was not found, the system only recorded a log entry. This concluded the analysis. The business team's remaining task was to optimize the code and fix the issue that caused the missing class.
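One possible mitigation on the application side, while the missing class itself is being fixed, is to remember class names that could not be found so that the expensive search runs only once per name. The source does not state which fix the business team chose. The following is only a sketch, and `MissingClassCache` is a hypothetical helper. The trade-off is that a class that becomes available later is not picked up until the cache is cleared.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: a negative cache for class names that failed to load.
final class MissingClassCache {
    // Assumption: one cache instance is used with one class loader, so names are a sufficient key.
    private final Set<String> knownMissing = ConcurrentHashMap.newKeySet();
    private final AtomicLong delegatedLookups = new AtomicLong();

    /** Returns the class, or null if it cannot be found or was previously not found. */
    Class<?> tryLoad(String name, ClassLoader loader) {
        if (knownMissing.contains(name)) {
            return null; // skip the classpath search for a known miss
        }
        delegatedLookups.incrementAndGet();
        try {
            return Class.forName(name, false, loader); // load without initializing the class
        } catch (ClassNotFoundException e) {
            // Concurrent first misses may each search once; later calls hit the cache.
            knownMissing.add(name);
            return null;
        }
    }

    /** Number of lookups that were actually passed on to the class loader. */
    long delegatedLookups() {
        return delegatedLookups.get();
    }

    /** Forgets all recorded misses, for example after new JAR files are deployed. */
    void clear() {
        knownMissing.clear();
    }

    public static void main(String[] args) {
        MissingClassCache cache = new MissingClassCache();
        ClassLoader loader = MissingClassCache.class.getClassLoader();
        for (int i = 0; i < 100; i++) {
            cache.tryLoad("com.example.MissingDto", loader); // hypothetical class name
        }
        System.out.println("lookups delegated to the class loader: " + cache.delegatedLookups());
    }
}
```

In this sketch, 100 requests for the same missing class result in a single search of the classpath. Whether returning null is acceptable depends on how the deserialization code handles unknown types.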
