System configuration optimization - Alibaba Cloud Linux (Alinux)

Alibaba Cloud Linux 3 ships with a set of pre-tuned kernel parameters optimized for cloud workloads. This document describes those defaults and the common parameters you may need to adjust for your specific workload.

Important

Adjust kernel parameters only when you have observed data that justifies the change. Understand what each parameter does before modifying it — parameter behavior can vary across kernel versions and environment types.

Optimized configurations for Alibaba Cloud Linux 3

These parameters are pre-configured in Alibaba Cloud Linux 3. The values listed are the optimized defaults applied by the OS.

Performance improvement

Parameter	Value	Description
`net.ipv4.tcp_timeout_init`	1000	Initial TCP retransmission timeout, in milliseconds. Minimum value: 2 HZ. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed. Deprecated in Alibaba Cloud Linux 4 and later.
`net.ipv4.tcp_synack_timeout_init`	1000	Initial timeout for SYN-ACK retransmission, in milliseconds. Minimum value: 2 HZ. After the first retransmission, the timeout doubles. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed. Deprecated in Alibaba Cloud Linux 4 and later.
`net.ipv4.tcp_synack_timeout_max`	120000	Maximum SYN-ACK retransmission timeout, in milliseconds. Minimum value: 2 HZ. Each retransmission doubles the timeout, starting from `tcp_synack_timeout_init`, up to this cap. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed. Deprecated in Alibaba Cloud Linux 4 and later.
`net.ipv4.tcp_ato_min`	40	Minimum ACK timeout, in milliseconds. Valid values: 4–200 ms. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed. Deprecated in Alibaba Cloud Linux 4 and later.
`net.ipv4.tcp_init_cwnd`	10	Initial TCP congestion window size. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed. Deprecated in Alibaba Cloud Linux 4 and later.
`net.ipv4.tcp_synack_retries`	2	Number of SYN-ACK retransmissions when the server does not receive the final ACK. With an initial SYN-ACK timeout of 1 second, the two retransmissions are sent at about 1 and 3 seconds, and the half-open connection is dropped after approximately 7 seconds.
`net.ipv4.tcp_slow_start_after_idle`	0	Controls whether slow start restarts after a TCP connection becomes idle. `0` disables restart, preserving the congestion window across idle periods. `1` enables restart. For long-lived connections with intermittent traffic bursts, set this to `0` to avoid throughput penalties after short idle gaps.
`/sys/kernel/mm/transparent_hugepage/hugetext_enabled`	0	Controls the Hugetext feature, which maps code segments of binaries and dynamic libraries using huge pages to reduce iTLB misses. Valid values: `0` = disabled; `1` = huge pages for binaries and dynamic libraries only; `2` = executable anonymous huge pages only; `3` = both. Enable Hugetext for workloads with large code segments, such as databases and large applications, to reduce iTLB misses and improve performance. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed.

Resource utilization improvement

Parameter	Value	Description
`net.ipv4.tcp_syn_retries`	4	Number of SYN retransmissions when the client does not receive a SYN-ACK. With an initial retransmission timeout (RTO) of 1 second, four retransmissions take approximately 15 seconds and the connection times out after about 31 seconds.
`net.ipv4.tcp_retries2`	8	Maximum retransmissions for an active TCP connection that stops receiving ACKs. With an initial RTO of 200 ms, eight retransmissions take approximately 51 seconds and the final timeout occurs after about 102 seconds.
`net.ipv4.tcp_tw_timeout`	60	Timeout for a TCP socket in TIME_WAIT state, in seconds. Valid values: 1–600 seconds. For more information, see Modify the TCP TIME-WAIT timeout period. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed. Deprecated in Alibaba Cloud Linux 4 and later.
`net.ipv4.tcp_max_tw_buckets`	5000	Maximum number of TCP connections allowed in TIME_WAIT state simultaneously. When TIME_WAIT connections exhaust the port range defined by `net.ipv4.ip_local_port_range`, new `connect()` calls fail. Increase this value if you see `TCP: time wait bucket table overflow` errors. For details, see Why do many "TCP: time wait bucket table overflow" errors occur on a Linux ECS instance?
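
The retransmission timings quoted for `net.ipv4.tcp_syn_retries` and `net.ipv4.tcp_retries2` follow from exponential backoff: each retransmission doubles the timeout. The sketch below reproduces that arithmetic. It is a simplified model, not the kernel's exact logic: the function name `backoff_schedule` is hypothetical, and the 120-second cap on the doubled timeout is an assumption based on the upstream maximum RTO.

```python
def backoff_schedule(initial_rto, retries, max_rto=120.0):
    """Model exponential backoff for TCP retransmissions.

    Returns (send_times, give_up), both in seconds measured from the
    original transmission. `max_rto` is an assumed cap on the doubled timeout.
    """
    send_times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto              # wait one timeout, then retransmit
        send_times.append(elapsed)
        rto = min(rto * 2, max_rto)  # the next wait is twice as long
    # After the last retransmission, one more (doubled) timeout must expire.
    give_up = elapsed + rto
    return send_times, give_up


if __name__ == "__main__":
    # tcp_syn_retries = 4 with a 1-second initial RTO
    print(backoff_schedule(1.0, 4))   # ([1.0, 3.0, 7.0, 15.0], 31.0)
    # tcp_retries2 = 8 with a 200 ms initial RTO (rounded for display)
    times, give_up = backoff_schedule(0.2, 8)
    print(round(times[-1], 1), round(give_up, 1))
```

The same function applied to `net.ipv4.tcp_synack_retries` (2 retransmissions, 1-second initial timeout) gives a give-up time of 7 seconds.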

Network security

Parameter	Value	Description
`net.ipv4.conf.all.rp_filter`	0	Reverse path filtering for all current network interface cards (NICs). Valid values: `0` = disabled; `1` = strict (discard packet if its reverse path does not match the receiving interface); `2` = loose (discard only if the source address is unreachable via any interface). > Warning Setting this to `1` causes packet loss in multi-NIC systems where inbound and outbound traffic uses different NICs. Do not enable strict mode in multi-NIC environments.
`net.ipv4.conf.default.rp_filter`	0	Reverse path filtering applied to newly added NICs. Same valid values and warning as `net.ipv4.conf.all.rp_filter`.
`net.ipv4.conf.default.arp_announce`	2	Source IP selection for ARP requests sent from newly added NICs. Valid values: `0` = any local address on any interface; `1` = prefer a source IP in the same subnet as the destination; `2` = must use the IP of the outbound interface (no ARP sent if no suitable address exists).
`net.ipv4.conf.all.arp_announce`	2	Source IP selection for ARP requests sent from all current NICs. Same valid values as `net.ipv4.conf.default.arp_announce`.
`net.ipv4.tcp_syncookies`	1	SYN flood protection. Valid values: `0` = disabled; `1` = enabled (activates only when the SYN backlog is full); `2` = unconditionally enabled (testing only). > Important SYN cookies are a fallback mechanism, not a solution for overloaded servers. If SYN flood warnings appear in your logs but the source is legitimate traffic rather than an attack, tune `net.core.somaxconn`, `net.ipv4.tcp_max_syn_backlog`, and `net.ipv4.tcp_synack_retries` instead. Note that SYN cookies disable TCP options such as window scaling and timestamps, which can degrade performance for some services.
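
Before changing any of the values above, it can help to confirm what the running kernel actually uses. The sketch below reads `net.*` parameters through `/proc/sys` and reports any that differ from the values in these tables. The helper names (`sysctl_path`, `read_sysctl`, `find_drift`) are hypothetical, only a subset of parameters is listed, and the script assumes a Linux host where `/proc/sys` is mounted.

```python
from pathlib import Path

# A subset of the values documented in the tables above.
DOCUMENTED_DEFAULTS = {
    "net.ipv4.tcp_synack_retries": "2",
    "net.ipv4.tcp_slow_start_after_idle": "0",
    "net.ipv4.tcp_syn_retries": "4",
    "net.ipv4.tcp_retries2": "8",
    "net.ipv4.tcp_max_tw_buckets": "5000",
    "net.ipv4.tcp_syncookies": "1",
}


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under /proc/sys."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the current value as a string, or None if the key is unavailable."""
    try:
        # Multi-value keys are whitespace-separated; normalise to single spaces.
        return " ".join(sysctl_path(name, root).read_text().split())
    except OSError:
        return None


def find_drift(expected, root="/proc/sys"):
    """Return {name: (expected, actual)} for keys that differ or are missing."""
    drift = {}
    for name, want in expected.items():
        got = read_sysctl(name, root)
        if got != want:
            drift[name] = (want, got)
    return drift


if __name__ == "__main__":
    for name, (want, got) in find_drift(DOCUMENTED_DEFAULTS).items():
        print(f"{name}: documented {want}, running {got}")
```

A key reported with a running value of `None` does not exist on that kernel, which is expected for the custom Alibaba Cloud Linux 3 parameters on other distributions.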

Other common system configurations for Alibaba Cloud Linux 3

These parameters ship with upstream defaults. Use them as a reference when diagnosing performance or resource issues — adjust only when you have observed data that justifies a change.

Performance improvement

Parameter	Default value	Description
`net.ipv4.ip_local_port_range`	32768 60999	Ephemeral port range for outbound TCP/UDP connections. When most ports in this range are in use, the kernel's linear search for a free port increases CPU utilization. Widen this range if you observe high CPU from port exhaustion, or if `connect()` calls start returning `EADDRNOTAVAIL`.
`net.ipv4.tcp_rmem`	4096 131072 6291456	Per-TCP-socket receive buffer size, in bytes: minimum, default, and maximum. The default is independent of instance type. Increase these values on high-memory instances with sustained high-bandwidth connections. > Important Setting a very large maximum can consume significant memory. Each socket can use up to the maximum value — for example, 1 million sockets at 6 MiB each could require up to 6 TiB of buffer space.
`net.ipv4.tcp_wmem`	4096 16384 4194304	Per-TCP-socket send buffer size, in bytes: minimum, default, and maximum. Same tuning guidance as `net.ipv4.tcp_rmem`.
`net.core.netdev_max_backlog`	1000	Maximum length of the per-CPU socket buffer (skb) queue used for receive packet steering (RPS) and loopback or veth traffic. Increase this if you see dropped packets on high-throughput loopback or veth interfaces.
`net.core.somaxconn`	4096	Maximum listen backlog queue length per socket. For applications like NGINX that handle large numbers of short-lived connections, increase this value. To check whether tuning is needed, run `ss -ntl` and compare the Recv-Q (current backlog) against the Send-Q (socket backlog limit). If Recv-Q approaches Send-Q, increase this parameter.
`net.core.rmem_max`	212992	Maximum receive socket buffer size, in bytes. For TCP, this cap applies only when an application calls `setsockopt(SO_RCVBUF)` explicitly; otherwise `net.ipv4.tcp_rmem` controls the limit. For UDP with many connections on a single socket, increase this value.
`net.core.wmem_max`	212992	Maximum send socket buffer size, in bytes. For TCP, this cap applies only when an application calls `setsockopt(SO_SNDBUF)` explicitly; otherwise `net.ipv4.tcp_wmem` controls the limit.
`/sys/block/<device>/queue/nomerges`	0	Controls I/O merge behavior for the device. Valid values: `0` = all merge types enabled; `1` = only simple one-shot merges (disables complex merges); `2` = all merges disabled. Most workloads benefit from merging. For workloads with purely random I/O where the chance of mergeable requests is low, set to `2` to save the CPU cycles spent checking for merges.
`/sys/block/<device>/queue/read_ahead_kb`	4096	Read-ahead size for sequential reads, in KB. The kernel default is 128 KB; the `tuned` service increases it to 4,096 KB. For sequential workloads (large file reads, log processing), keep the higher value or increase further. For random I/O workloads, reduce to 128 KB to avoid prefetching data that will not be used.
`/sys/block/<device>/queue/rq_affinity`	1	Controls which CPU handles I/O completion.

For `rq_affinity`, the trade-offs between values are:

Value	Behavior	Best for
`0`	Completion runs on the CPU that receives the completion interrupt	Lowest latency for interrupt-heavy workloads
`1`	Completion runs on any CPU in the same socket as the submitter (cache-friendly, but the first CPU in the group gets higher load)	Most workloads — default and recommended
`2`	Completion runs on the exact CPU that submitted the I/O (balanced CPU load, slightly lower efficiency than `1`)	High-concurrency workloads with many cores
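
The `net.core.somaxconn` check described earlier, comparing Recv-Q with Send-Q in `ss -ntl` output, can be automated. The sketch below parses captured `ss -ntl` text and flags listeners whose accept queue is close to its limit. The function name `listeners_near_limit`, the 80% threshold, and the sample output are illustrative assumptions, not values from this document.

```python
# Illustrative `ss -ntl` output; the addresses and queue depths are made up.
SAMPLE_SS_OUTPUT = """\
State   Recv-Q  Send-Q   Local Address:Port   Peer Address:Port  Process
LISTEN  0       4096     0.0.0.0:80           0.0.0.0:*
LISTEN  3900    4096     0.0.0.0:8080         0.0.0.0:*
LISTEN  12      128      127.0.0.1:6379       0.0.0.0:*
"""


def listeners_near_limit(ss_output, threshold=0.8):
    """Return (local_address, recv_q, send_q) for nearly full accept queues.

    For LISTEN sockets, Recv-Q is the current backlog and Send-Q is the
    backlog limit. `threshold` is the fill ratio that counts as "near".
    """
    flagged = []
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a listening socket.
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        recv_q, send_q = int(fields[1]), int(fields[2])
        if send_q > 0 and recv_q / send_q >= threshold:
            flagged.append((fields[3], recv_q, send_q))
    return flagged


if __name__ == "__main__":
    for address, recv_q, send_q in listeners_near_limit(SAMPLE_SS_OUTPUT):
        print(f"{address}: backlog {recv_q} of {send_q}")
```

A single snapshot can miss short bursts, so sampling repeatedly under peak load gives a more reliable picture before raising `net.core.somaxconn`.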

Parameter	Default value	Description
`/sys/block/<device>/queue/scheduler`	`mq-deadline` (single queue) or `none` (multiple queues)	I/O scheduler. Alibaba Cloud Linux 3 supports `mq-deadline`, `kyber`, `bfq`, and `none`. The blk-mq layer selects `mq-deadline` for single-queue devices and `none` for multi-queue devices. For workloads that need low read latency, switch to `kyber` and configure the target latency value.
`/sys/kernel/mm/pagecache_limit/enabled`	0	Enables or disables the page cache limit feature system-wide. `0` = disabled; `1` = enabled. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed.
`/sys/fs/cgroup/memory/memory.pagecache_limit.enable`	0	Enables or disables the page cache limit feature for a specific memcg. `0` = disabled for this memcg; `1` = enabled.
`/sys/fs/cgroup/memory/memory.pagecache_limit.size`	0	Page cache usage cap for the current memcg tree, in bytes. Valid values: `0` to the value of `memory.limit_in_bytes` for the current memcg. Setting to `0` disables the page cache limit feature for this memcg regardless of the global or per-memcg switch. A non-zero value sets the upper limit of page cache usage for the memcg tree.
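
Two of the sizing questions in this section come down to simple arithmetic: how many ephemeral ports `net.ipv4.ip_local_port_range` provides, and how much memory socket buffers could consume in the worst case for a given `net.ipv4.tcp_rmem` maximum. The sketch below works through both using the default values from the table. The function names are hypothetical, and the buffer figure is an upper bound that assumes every socket fills its buffer at the same time.

```python
def ephemeral_port_count(port_range):
    """Number of ports in a 'low high' range string (both ends inclusive)."""
    low, high = (int(part) for part in port_range.split())
    return high - low + 1


def worst_case_buffer_tib(sockets, max_buffer_bytes):
    """Upper bound on buffer memory, in TiB, if every socket uses its maximum."""
    return sockets * max_buffer_bytes / 2**40


if __name__ == "__main__":
    # Default net.ipv4.ip_local_port_range
    print(ephemeral_port_count("32768 60999"))                   # 28232
    # 1 million sockets at the default tcp_rmem maximum of 6291456 bytes
    print(round(worst_case_buffer_tib(1_000_000, 6291456), 2))   # 5.72
```

The second result, about 5.7 TiB, is in line with the rough 6 TiB figure quoted for `net.ipv4.tcp_rmem`.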

Network security

Parameter	Default value	Description
`net.ipv4.conf.all.arp_ignore`	0	Controls ARP reply behavior for all current NICs. Valid values: `0` = reply to ARP requests for any local IP, including loopback addresses, regardless of which NIC receives the request; `1` = reply only if the target IP is configured on the receiving NIC; `2` = reply only if the target IP is on the receiving NIC and the source IP is in the same subnet. For example, if `eth0` receives an ARP request for the IP of `eth1`: with value `0`, `eth0` replies; with value `1` or `2`, it does not.
`net.ipv4.conf.default.arp_ignore`	0	ARP reply behavior for newly added NICs. Same valid values and behavior as `net.ipv4.conf.all.arp_ignore`.
`net.ipv4.ip_forward`	0	Enables or disables IPv4 packet forwarding. `0` = disabled; `1` = enabled. Enable this when the instance acts as a router or NAT gateway.

Resource utilization

Parameter	Default value	Description
`net.ipv4.tcp_fin_timeout`	60	Duration a TCP connection stays in FIN_WAIT2 state after the local side initiates a close, in seconds. The default of 60 seconds is appropriate for most workloads. If you observe a large number of FIN_WAIT2 connections (check with `netstat -ant | grep FIN_WAIT2 | wc -l`), reduce this value to reclaim ports faster. For more information, see Why does a Linux ECS instance have many TCP connections in the FIN_WAIT2 state?
`net.ipv4.tcp_tw_reuse`	2	Controls reuse of TIME_WAIT sockets for new connections. `0` = disabled; `1` = globally enabled; `2` = enabled for loopback only.
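`net.ipv4.tcp_keepalive_time`	7200	Idle time before the kernel sends the first keepalive probe on a connection that has TCP keepalive enabled, in seconds. The interval between subsequent probes is set separately by `net.ipv4.tcp_keepalive_intvl`. Keepalive probes confirm that the remote end of an idle connection is still reachable.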
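
To decide whether `net.ipv4.tcp_fin_timeout` or the TIME_WAIT settings need attention, it helps to count connections per TCP state. The sketch below does this by parsing text in the format of `/proc/net/tcp`, where the fourth column is the state as a hexadecimal code. The function name `count_tcp_states` and the sample rows are illustrative assumptions. IPv6 sockets are listed separately in `/proc/net/tcp6`.

```python
from collections import Counter

# State codes used in the `st` column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# Made-up rows in /proc/net/tcp format, for illustration only.
SAMPLE_PROC_NET_TCP = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1001
   1: 0A00000A:1F90 0A000014:C350 05 00000000:00000000 00:00000000 00000000     0        0 0
   2: 0A00000A:1F90 0A000015:C351 05 00000000:00000000 00:00000000 00000000     0        0 0
   3: 0A00000A:1F90 0A000016:C352 06 00000000:00000000 00:00000000 00000000     0        0 0
"""


def count_tcp_states(proc_net_tcp_text):
    """Return a Counter mapping TCP state name to number of sockets."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        code = fields[3].upper()
        # Unknown codes are kept as-is rather than dropped.
        counts[TCP_STATES.get(code, code)] += 1
    return counts


if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE_PROC_NET_TCP).most_common():
        print(f"{state}: {count}")
```

On a live system, pass the contents of `/proc/net/tcp` instead of the sample text, for example `count_tcp_states(open("/proc/net/tcp").read())`.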

System limits

Parameter	Default value	Description
`fs.aio-max-nr`	65536	Maximum number of concurrent asynchronous I/O (AIO) requests system-wide. The kernel accumulates the `nr_events` argument of each `io_setup()` call into `aio-nr`. If `aio-nr + nr_events > aio-max-nr`, `io_setup()` returns `-EAGAIN`. Increase this for database or search workloads that rely heavily on Linux AIO. Monitor `aio-nr` to determine the right value for your environment.
`fs.file-max`	Set based on reserved memory at boot	Maximum number of file handles the kernel allows system-wide. Up to 10% of reserved memory can be used for file handles. The minimum is 8,192 (the `NR_FILE` value). Increase this only if processes fail with "too many open files" at the system level.
`fs.nr_open`	1048576	Maximum number of open file handles per process. The per-process limit set by `ulimit -n` (RLIMIT_NOFILE) cannot exceed this value. Increase `fs.nr_open` before raising `ulimit -n` beyond 1,048,576.
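
The `fs.aio-max-nr` admission check described above can be expressed as a one-line inequality. The sketch below mirrors it so you can estimate how much headroom remains before `io_setup()` starts failing. The function names are hypothetical, and the model covers only the documented condition `aio-nr + nr_events > aio-max-nr`, not any other reason `io_setup()` might fail.

```python
def io_setup_would_fail(aio_nr, nr_events, aio_max_nr=65536):
    """True if a new io_setup(nr_events) call would exceed fs.aio-max-nr."""
    return aio_nr + nr_events > aio_max_nr


def aio_headroom(aio_nr, aio_max_nr=65536):
    """Largest nr_events that still fits under fs.aio-max-nr."""
    return max(aio_max_nr - aio_nr, 0)


if __name__ == "__main__":
    current = 65000  # example value of fs.aio-nr, chosen for illustration
    print(aio_headroom(current))               # 536
    print(io_setup_would_fail(current, 536))   # False
    print(io_setup_would_fail(current, 537))   # True
```

The current value of `aio-nr` is available from `/proc/sys/fs/aio-nr`.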

Monitoring

Parameter	Default value	Description
`net.netfilter.nf_conntrack_max`	262144	Maximum number of connection tracking entries in the nf_conntrack hash table. Calculated as `4 × net.netfilter.nf_conntrack_buckets`. Increase this if applications experience intermittent packet loss and the kernel log shows `nf_conntrack: table full, dropping packet`. For more information, see What do I do if applications on an ECS instance occasionally experience packet loss and the kernel log contains the "kernel: nf_conntrack: table full, dropping packet" error?
`net.netfilter.nf_conntrack_tcp_timeout_time_wait`	120	How long nf_conntrack tracks a TCP connection in TIME_WAIT state, in seconds.
`net.netfilter.nf_conntrack_tcp_timeout_established`	432000	How long nf_conntrack keeps an idle established TCP connection in the tracking table before removing its entry, in seconds. The default of 432000 seconds is 5 days.
`fs.inotify.max_queued_events`	16384	Maximum number of events that can queue for an inotify instance before events are dropped. inotify is the kernel subsystem for monitoring file and directory events. Use the default unless your application processes file events in large batches.
`fs.inotify.max_user_instances`	128	Maximum number of inotify instances a user can create. This limit prevents runaway processes from consuming excessive memory by creating many monitoring instances. Use the default unless your application requires more instances.
`fs.inotify.max_user_watches`	8192	Maximum number of watches a user can add across all inotify instances. A watch is a (path, event mask) pair that tells inotify which events to report for a specific file or directory. Increase this if your application monitors a large number of files or directories.
`/sys/block/<device>/queue/hang_threshold`	5000	I/O hang detection threshold, in milliseconds. The kernel flags an I/O operation as hung if it does not complete within this time. Adjust this based on your storage and workload characteristics. For more information, see Detect I/O hangs in the file system and block layer. > Important This is a custom feature in Alibaba Cloud Linux 3. Long-term maintenance is not guaranteed.
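
For `net.netfilter.nf_conntrack_max`, the relationship to the bucket count and the current table usage can be checked with a few lines of arithmetic. The sketch below assumes the `4 × nf_conntrack_buckets` rule stated above and uses `net.netfilter.nf_conntrack_count`, a counter that is not listed in the table, as the number of entries currently tracked. The function names and the 80% warning threshold are illustrative assumptions.

```python
def conntrack_max_from_buckets(buckets, factor=4):
    """Default nf_conntrack_max derived from the hash bucket count."""
    return buckets * factor


def conntrack_utilization(count, maximum):
    """Fraction of the conntrack table currently in use (0.0 to 1.0)."""
    return count / maximum


def conntrack_needs_attention(count, maximum, threshold=0.8):
    """True if table usage has reached the (assumed) warning threshold."""
    return conntrack_utilization(count, maximum) >= threshold


if __name__ == "__main__":
    maximum = conntrack_max_from_buckets(65536)
    print(maximum)                                      # 262144
    print(conntrack_utilization(196608, maximum))       # 0.75
    print(conntrack_needs_attention(196608, maximum))   # False
```

Sustained usage near the limit is a stronger signal than a single reading, since the `table full, dropping packet` message appears only once the table is already exhausted.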