SocketTimeoutException with the Tablestore Java SDK

A client may time out when accessing Tablestore due to network failures, network jitter, high server load, or client-side Full Garbage Collection (GC) events. If a client times out, troubleshoot the issue by checking network connectivity, server latency, and client-side Full GC problems.

Symptoms

When you use the Java SDK to access Tablestore, a request may fail with the error Unexpected error: java.net.SocketTimeoutException. The following is a sample error message:

content: 11:56:48.072 WARN  com.alicloud.openservices.tablestore.core.utils.LogUtil - TraceId:4bc30ca1-f112-2d52-d8b1-61a95072eda5	Failed	RetriedCount:1	com.alicloud.openservices.tablestore.ClientException: Unexpected error: java.net.SocketTimeoutException

Possible causes

This error occurs when the request-to-response time exceeds the socketTimeoutInMillisecond value. The default value for this parameter is 30,000 milliseconds.
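
The general mechanism can be reproduced with plain java.net sockets. The following is a minimal sketch, not the SDK's internal HTTP client: a local server accepts the connection but never responds, so the client's read exceeds its socket timeout. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class: shows a read timing out when no response arrives.
class SocketTimeoutDemo {

    // Returns true if a read from a silent server times out after timeoutMillis.
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        // The server socket queues the connection in its backlog but never writes a byte.
        try (ServerSocket silentServer = new ServerSocket(0);
             Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                        silentServer.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // read timeout on the client socket
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks, because the server sends nothing
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the wait for a response exceeded the timeout
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

In the SDK, the same kind of exception surfaces wrapped in a ClientException, as in the sample log above.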

Note

Avoid setting socketTimeoutInMillisecond too low, because a small value causes slow but otherwise successful requests to time out. You can configure this value in ClientConfiguration.
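
As a rough sketch of where this setting lives, the following example creates a client with an explicit socket timeout. It requires the Tablestore Java SDK on the classpath. The environment variable names are hypothetical, and the 30,000 ms value simply restates the default mentioned above.

```java
import com.alicloud.openservices.tablestore.ClientConfiguration;
import com.alicloud.openservices.tablestore.SyncClient;

// Hypothetical example class: configures the socket timeout for a Tablestore client.
class TablestoreTimeoutConfig {
    public static void main(String[] args) {
        // Assumption: connection details are supplied through these
        // hypothetical environment variables.
        String endpoint = System.getenv("TABLESTORE_ENDPOINT");
        String accessKeyId = System.getenv("TABLESTORE_ACCESS_KEY_ID");
        String accessKeySecret = System.getenv("TABLESTORE_ACCESS_KEY_SECRET");
        String instanceName = System.getenv("TABLESTORE_INSTANCE_NAME");

        ClientConfiguration config = new ClientConfiguration();
        // Maximum time to wait for a response, in milliseconds.
        config.setSocketTimeoutInMillisecond(30000);

        SyncClient client = new SyncClient(
                endpoint, accessKeyId, accessKeySecret, instanceName, config);
        try {
            // Issue requests with the client here.
        } finally {
            client.shutdown(); // release the client's background threads
        }
    }
}
```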

Common reasons for exceeding this timeout include the following:

  • The client experiences a Full GC event.

  • Network issues, such as network failures or jitter.

  • High server-side latency that exceeds the socketTimeoutInMillisecond value.

Solution

  1. Check for client-side Full GC problems.

    Use tools such as jmap and jcmd to inspect JVM memory usage and check for out-of-memory (OOM) errors.

    If an OOM error occurs, the HttpClient's background I/O thread may exit unexpectedly. To resolve this:

    • If memory usage is abnormal or a memory leak exists, optimize the code.

    • If memory usage is reasonable but machine resources are insufficient, increase the available memory.

    • If the machine is idle and memory usage is low, increase the JVM heap memory to reduce the likelihood of Full GC events.

    A SocketTimeoutException can also occur under high machine load, high CPU utilization, or a high network error rate. In these cases, the request might time out before it is even sent.
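
Alongside jmap and jcmd, GC activity can also be sampled from inside the client process. The following is a minimal sketch that uses the standard java.lang.management API. The class and method names are hypothetical. Collector names vary by JVM and GC algorithm, so the sketch reports every collector instead of trying to single out Full GC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: samples GC and heap statistics of the current JVM.
class GcSnapshot {

    // Total milliseconds spent in GC across all collectors since JVM start.
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 if the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    // Fraction of the maximum heap currently in use, or NaN if the maximum is undefined.
    static double heapUsedRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 if undefined
        return max > 0 ? (double) heap.getUsed() / max : Double.NaN;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, timeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("total GC time: %d ms, heap used: %.1f%%%n",
                totalGcTimeMillis(), heapUsedRatio() * 100);
    }
}
```

Logging these values periodically and comparing them with the timestamps of timed-out requests can indicate whether long GC pauses coincide with the errors.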

  2. Check the network connectivity between the client and the server.

    If all requests fail with a SocketTimeoutException, a network failure is the likely cause. Use the ping or curl command to test for network issues.

    The following examples show how to run the test. Replace myinstance with the name of your Tablestore instance.

    ping myinstance.cn-hangzhou.ots.aliyuncs.com
    curl myinstance.cn-hangzhou.ots.aliyuncs.com

    • If you find a network failure, you might be using an internal endpoint in a non-ECS environment. Ensure you use the correct endpoint. For more information about endpoints, see Endpoints.

      If your client accesses Tablestore from an ECS instance, connect through a VPC or the classic network.

    • If there is network connectivity but timeouts persist, network jitter might be causing high latency. Check for high traffic, insufficient bandwidth, or a high packet retransmission rate. If you detect significant network jitter, contact network support.
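
The connectivity test can also be run from the client JVM itself, which exercises the same host and network path as the application. The following is a minimal sketch that times a TCP connection to the endpoint. The class name, the method name, port 80 (the default for a plain curl request), and the 3-second connect timeout are assumptions. Replace myinstance with the name of your Tablestore instance.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: measures how long a TCP connection to an endpoint takes.
class EndpointProbe {

    // Returns the connect time in milliseconds, or -1 if the connection
    // fails or does not complete within timeoutMillis.
    static long connectMillis(String host, int port, int timeoutMillis) {
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return (System.nanoTime() - start) / 1_000_000;
        } catch (IOException e) {
            // Covers refused connections, unresolved hosts, and connect timeouts.
            return -1;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "myinstance.cn-hangzhou.ots.aliyuncs.com";
        // Repeat the probe: widely varying times suggest jitter,
        // and consistent failures suggest a connectivity problem.
        for (int i = 0; i < 5; i++) {
            System.out.println("connect ms: " + connectMillis(host, 80, 3000));
        }
    }
}
```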

  3. Check the server-side latency in the Tablestore console.

    1. Log on to the Tablestore console.

    2. In the top navigation bar, select a resource group and a region.

    3. On the Overview page, click the name of your instance in the instance list.

    4. On the Instance Details tab, click the data table name in the Tables area.

    5. On the Manage Table page, click the Monitoring Indicators tab. Then, select a table or index, specify a time range, and set Metric Group to Average Latency to view the average latency for different operation types.

      If server-side latency exceeds the socketTimeoutInMillisecond value, submit a ticket to Tablestore technical support.
