DNS is one of the most critical foundational services in a Kubernetes cluster. Improper client-side configuration or a large cluster scale can cause DNS resolution to time out or fail. This topic describes best practices for DNS in Kubernetes clusters to help you avoid these issues.
Important notes
This topic does not apply to managed CoreDNS or ACK clusters with Auto Mode enabled. Managed CoreDNS automatically scales based on workload, so manual adjustments are unnecessary.
Contents
DNS best practices cover both the client side and the server side:
-
On the client side, you can reduce resolution latency by optimizing domain name resolution requests. You can also minimize resolution anomalies using appropriate container images, suitable node operating systems, and NodeLocal DNSCache.
-
On the CoreDNS server side, you can monitor the CoreDNS runtime status to detect DNS anomalies and quickly identify root causes. You can also improve CoreDNS high availability and queries per second (QPS) throughput by adjusting CoreDNS deployment settings.
For more information about CoreDNS, see the official CoreDNS documentation.
Optimize domain name resolution requests
DNS domain name resolution is one of the most frequent network activities in Kubernetes. Many of these requests can be optimized or avoided. You can optimize domain name resolution requests in the following ways:
-
(Recommended) Use connection pools. If a containerized application frequently accesses another service, use a connection pool. Connection pools cache upstream service connections in memory, which avoids the overhead of DNS resolution and TCP connection setup for each access.
-
Use asynchronous or long polling modes to obtain the IP address that corresponds to a DNS domain name.
-
Use DNS caching:
-
(Recommended) If your application cannot be modified to use connection pools, consider caching DNS resolution results on the application side. For more information, see Use NodeLocal DNSCache.
-
If NodeLocal DNSCache is unavailable, you can cache DNS queries inside the container using Name Service Cache Daemon (NSCD). For more information, see Use NSCD in Kubernetes clusters.
-
Optimize the resolv.conf file: The ndots and search parameters in the resolv.conf file affect the efficiency of domain name resolution based on how you write domain names in container configurations. For more information about how the ndots and search parameters work, see DNS policy configuration and domain name resolution.
-
Optimize the domain name configuration. When a containerized application accesses a domain name, configure the domain name as follows to minimize resolution attempts and reduce resolution time:
-
For pods that access a Service in the same namespace, use <service-name>, where service-name is the name of the Service.
-
For pods that access a Service across namespaces, use <service-name>.<namespace-name>, where namespace-name is the namespace of the Service.
-
When you access external domains, use fully qualified domain names (FQDNs). Append a trailing dot (.) to common domain names to specify them as absolute addresses. This practice avoids multiple invalid searches caused by search domain concatenation. For example, when you access www.aliyun.com, use the FQDN www.aliyun.com. (note the trailing dot).
-
In clusters of version 1.33 or later, you can configure the search domain as a single "." (see related issue: 125883) to achieve a similar effect:
dnsPolicy: None
dnsConfig:
  nameservers: ["192.168.0.10"] # Replace 192.168.0.10 with the actual CoreDNS service clusterIP
  searches:
  - .
  - default.svc.cluster.local # Replace "default" with your namespace name
  - svc.cluster.local
  - cluster.local
After you apply this configuration, the /etc/resolv.conf file in the pod appears as follows:
search . default.svc.cluster.local svc.cluster.local cluster.local
nameserver 192.168.0.10
The "." as the first search domain ensures that all domain requests are treated as FQDNs and resolved directly without unnecessary search attempts.
Important
Note that this configuration requires you to set dnsPolicy to None to take effect.
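The effect of ndots and search on the number of lookups can be modeled with a short script. The following is a minimal sketch of glibc-style search-list expansion, not the actual resolver code. The function name expand_queries is hypothetical, and ndots is assumed to be 5, the usual value for pods that use the ClusterFirst DNS policy. A real resolver stops at the first candidate that returns an answer.

```bash
#!/usr/bin/env bash
# Illustrative model only: prints the candidate names a resolver would try, in order.
# Usage: expand_queries NAME NDOTS SEARCH_DOMAIN...
expand_queries() {
  local name=$1 ndots=$2
  shift 2
  # A trailing dot marks an absolute name: it is queried once, exactly as written.
  if [[ $name == *. ]]; then
    echo "$name"
    return
  fi
  # Count the dots in the name by deleting every other character.
  local only_dots=${name//[^.]/}
  local dots=${#only_dots}
  # A name with at least ndots dots is tried as-is first; otherwise it is tried last.
  if (( dots >= ndots )); then
    echo "$name."
  fi
  local domain
  for domain in "$@"; do
    if [[ $domain == . ]]; then
      echo "$name."            # the "." search domain yields the absolute name
    else
      echo "$name.$domain."
    fi
  done
  if (( dots < ndots )); then
    echo "$name."
  fi
}

expand_queries www.aliyun.com 5 default.svc.cluster.local svc.cluster.local cluster.local
```

With these inputs, the script prints three names under the cluster search domains before www.aliyun.com. itself, which is why an external name without a trailing dot can cost several failed lookups. Passing www.aliyun.com. (with the trailing dot) prints a single candidate.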
Understand DNS configuration in containers
-
Different DNS resolvers may behave slightly differently due to implementation differences. You might observe cases where dig <domain> resolves successfully but ping <domain> fails.
-
Avoid using Alpine base images. The musl libc library in Alpine container images differs from the standard glibc and can cause issues such as the following:
-
Alpine 3.18 and earlier versions do not fall back to TCP when a DNS response is truncated (that is, when the TC flag is set in the response).
-
Alpine versions 3.3 and earlier do not support the search parameter or search domains, which prevents service discovery.
-
Concurrent queries to multiple DNS servers that are configured in /etc/resolv.conf can invalidate NodeLocal DNSCache optimizations.
-
Concurrent A and AAAA record queries that use the same socket can trigger conntrack source port conflicts on older kernels, which causes packet loss.
For more information, see musl libc.
-
If you use Go applications, make sure that you understand the differences between CGO and Pure Go DNS resolver implementations.
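To compare the two Go resolvers in practice, you can select one explicitly with the GODEBUG environment variable, which the Go runtime reads at startup: netdns=go forces the pure Go resolver and netdns=cgo forces the libc-based resolver (available only if the binary was built with cgo). The following is a minimal sketch. The wrapper name run_with_resolver and the binary ./my-go-app are hypothetical.

```bash
#!/usr/bin/env bash
# Run a command with a specific Go DNS resolver selected through GODEBUG.
# Usage: run_with_resolver go|cgo COMMAND [ARGS...]
run_with_resolver() {
  local mode=$1
  shift
  # The assignment applies only to the environment of the command being run.
  GODEBUG="netdns=$mode" "$@"
}

# Example (./my-go-app is a hypothetical binary):
#   run_with_resolver go  ./my-go-app
#   run_with_resolver cgo ./my-go-app
```

Running the same workload under both settings is a quick way to tell whether a resolution problem depends on the resolver implementation rather than on CoreDNS.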
Avoid probabilistic DNS resolution timeouts caused by IPVS defects
If you use IPVS as the kube-proxy load balancing mode, you may encounter intermittent DNS resolution timeouts during a CoreDNS scale-in or restart. This issue is caused by a defect in the Linux kernel. For more information, see IPVS.
You can reduce the impact of IPVS defects using one of the following methods:
-
Use NodeLocal DNSCache. For more information, see Use NodeLocal DNSCache.
-
Modify the IPVS UDP session persistence timeout in kube-proxy. For more information, see How do I modify the IPVS UDP session persistence timeout in kube-proxy?.
Use NodeLocal DNSCache
In some scenarios, CoreDNS may encounter the following issues:
-
Rarely, concurrent A and AAAA queries may cause packet loss, which leads to DNS resolution failures.
-
Full conntrack tables on nodes may cause packet loss, which results in DNS resolution failures.
To improve the stability and performance of the DNS service in your cluster, you can install the NodeLocal DNSCache component. This component runs a DNS cache on each node in the cluster to improve DNS performance. For more information about NodeLocal DNSCache and how to deploy it in ACK clusters, see Use the NodeLocal DNSCache component.
After you install NodeLocal DNSCache, you must inject the DNS cache configuration into your pods. You can run the following command to add a label to a namespace. New pods that are created in this namespace automatically receive the DNS cache configuration. For information about other injection methods, see the document linked in the previous paragraph.
kubectl label namespace default node-local-dns-injection=enabled
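After you label the namespace, it can help to confirm that the label is present and that a newly created pod actually received the injected resolver configuration. The following is a minimal sketch that wraps two standard kubectl calls. The function names and the pod name my-app-pod are hypothetical.

```bash
#!/usr/bin/env bash
# Show the labels on a namespace; node-local-dns-injection=enabled should be listed.
show_injection_label() {
  kubectl get namespace "$1" --show-labels
}

# Print the resolver configuration that a running pod actually received.
show_pod_resolv_conf() {
  kubectl -n "$1" exec "$2" -- cat /etc/resolv.conf
}

# Example (my-app-pod is a hypothetical pod created after the label was added):
#   show_injection_label default
#   show_pod_resolv_conf default my-app-pod
```

Pods that existed before the label was added keep their old configuration, because the injection applies only to new pods.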
Use an appropriate CoreDNS version
CoreDNS maintains good backward compatibility with different Kubernetes versions. We recommend that you keep CoreDNS at a recent, stable version. The Component Management page in the ACK console provides features to install, upgrade, and configure CoreDNS. You can monitor the status of your components on the Component Management page. If an upgrade is available for CoreDNS, perform the upgrade during off-peak hours.
-
For instructions about how to perform an upgrade, see Automatic upgrade for non-managed CoreDNS.
-
For the release notes of CoreDNS, see CoreDNS.
CoreDNS versions earlier than v1.7.0 have known risks, including but not limited to the following:
-
If connectivity between CoreDNS and the API server fails, for example, during an API server restart, migration, or due to network jitter, CoreDNS may restart because it fails to write error logs. For more information, see Set klog's logtostderr flag.
-
CoreDNS consumes extra memory during startup. The default memory limit may trigger out-of-memory (OOM) issues in large clusters. This can cause CoreDNS pods to repeatedly restart without automatic recovery. For more information, see CoreDNS uses a lot of memory during initialization phase.
-
CoreDNS has known issues that affect Headless Service domains and external domain resolution. For more information, see plugin/kubernetes: handle tombstones in default processor and Data is not synced when CoreDNS reconnects to kubernetes api server after protracted disconnection.
-
If a node is abnormal, outdated CoreDNS versions may deploy pods on the abnormal node due to default toleration policies. These pods may not be automatically evicted, which causes domain resolution failures.
The recommended minimum CoreDNS version varies based on the Kubernetes cluster version, as shown in the following table:
| Cluster version | Recommended minimum CoreDNS version |
| --- | --- |
| Below 1.14.8 | v1.6.2 (no longer maintained) |
| 1.14.8 and later, below 1.20.4 | v1.7.0.0-f59c03d-aliyun |
| 1.20.4 and later, below 1.21.0 | v1.8.4.1-3a376cc-aliyun |
| 1.21.0 and later | v1.11.3.2-f57ea7ed6-aliyun |
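To compare your cluster against the table above, you can read the image tag from the coredns Deployment and compare it with the recommended minimum. The following is a minimal sketch. The helper names are hypothetical, the comparison relies on the version sort (sort -V) of GNU coreutils, and the script assumes that CoreDNS is the first container in the pod template.

```bash
#!/usr/bin/env bash
# Print the image used by the CoreDNS Deployment in kube-system.
coredns_image() {
  kubectl -n kube-system get deployment coredns \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}

# Print the tag of an image reference, that is, everything after the last colon.
image_tag() {
  echo "${1##*:}"
}

# Succeed when version $1 sorts strictly before version $2.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example:
#   current=$(image_tag "$(coredns_image)")
#   if version_lt "$current" v1.11.3.2-f57ea7ed6-aliyun; then
#     echo "CoreDNS $current is older than the recommended minimum"
#   fi
```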
Monitor CoreDNS runtime status
Monitoring metrics
CoreDNS exposes health metrics, such as resolution results, through a standard Prometheus interface. This helps you detect anomalies in CoreDNS or upstream DNS servers.
Prometheus for ACK includes built-in CoreDNS monitoring metrics and alerting rules. You can enable the Prometheus and Dashboard features in the Container Service for Kubernetes console. For more information, see CoreDNS component monitoring.
If you run a self-managed Prometheus instance for your Kubernetes cluster, you can observe the relevant metrics and set alerts for critical ones. For more information, see the official CoreDNS Prometheus documentation.
Operational logs
When DNS anomalies occur, you can use CoreDNS logs to quickly diagnose the root causes. You can enable CoreDNS domain resolution logs and SLS log collection. For more information, see Analyze and monitor CoreDNS logs.
Kubernetes event delivery
In CoreDNS v1.9.3.6-32932850-aliyun and later, you can enable the k8s_event plugin to deliver critical CoreDNS logs as Kubernetes events to the Event Hub. For more information about the k8s_event plugin, see k8s_event.
Newly deployed CoreDNS instances have this feature enabled by default. If you are upgrading from an older version of CoreDNS, you must manually modify the configuration file to enable the feature.
-
Run the following command to open the CoreDNS configuration file.
kubectl -n kube-system edit configmap/coredns
-
Add the kubeapi and k8s_event plugins.
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 15s
        }
        # Begin addition (ignore other differences).
        kubeapi
        k8s_event {
            # Deliver critical logs with info, error, or warning levels.
            level info error warning
        }
        # End addition.
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods verified
            fallthrough in-addr.arpa ip6.arpa
        }
        # Omitted below.
    }
-
Check the CoreDNS pod status and logs. If the logs contain the word reload, the modification is successful.
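The log check in the last step can be scripted. The following is a minimal sketch that assumes the CoreDNS pods carry the label k8s-app=kube-dns, which is the common default but should be verified in your cluster. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Print recent CoreDNS log lines that mention a reload.
# Returns a non-zero status when no such line is found.
coredns_reload_lines() {
  kubectl -n kube-system logs -l k8s-app=kube-dns --tail=200 | grep -i reload
}

# Example:
#   coredns_reload_lines || echo "no reload seen yet; wait and retry"
```

CoreDNS picks up ConfigMap changes with a delay, so an empty result shortly after the edit does not necessarily mean that the change failed.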
Ensure CoreDNS high availability
CoreDNS serves as the authoritative DNS for the cluster. A CoreDNS failure can cause internal Service access to fail, which can disrupt large portions of your business. You can ensure CoreDNS high availability using the following measures:
Assess CoreDNS component pressure
You can run DNS stress tests in your cluster to evaluate component pressure. Many open-source tools, such as DNSPerf, can help with this. If you cannot accurately assess the DNS pressure, follow these recommendations:
-
Always deploy at least two CoreDNS pods, with each pod having resource limits of at least 1 CPU core and 1 GB of memory.
-
The QPS capacity of CoreDNS is directly correlated with CPU usage. With NodeLocal DNSCache, each CPU core can support more than 10,000 QPS. The DNS QPS requirements of business workloads vary significantly. You should monitor the peak CPU usage of each CoreDNS pod. If the CPU usage exceeds one core during peak business periods, scale out the CoreDNS replicas. If the peak CPU usage is unknown, as a conservative measure, deploy one CoreDNS pod for every eight nodes in the cluster.
Adjust the number of CoreDNS pods
The number of CoreDNS pods directly determines the available compute resources. You can adjust this number based on your assessment.
Because UDP does not have retransmission mechanisms, a CoreDNS pod scale-in or restart may cause cluster-wide DNS resolution timeouts or anomalies that last for up to five minutes if IPVS UDP defects exist on the cluster nodes. For solutions to IPVS-related resolution issues, see Troubleshoot DNS resolution anomalies.
-
Automatically adjust the number of pods based on recommended policies
You can deploy dns-autoscaler. It automatically adjusts the number of CoreDNS pods in real time based on the recommended ratio of one pod for every eight cluster nodes. The formula for the number of replicas is `replicas = max(ceil(cores × 1/coresPerReplica), ceil(nodes × 1/nodesPerReplica))`, which is constrained by the max and min limits.
-
Manual adjustments
You can manually adjust the number of CoreDNS pods by running the following command.
kubectl scale --replicas={target} deployment/coredns -n kube-system # Replace {target} with the desired pod count
-
Do not use workload autoscaling
Although workload autoscaling mechanisms such as horizontal pod autoscaling (HPA) or CronHPA can automatically adjust the number of pods, they trigger frequent scaling operations. Due to the resolution anomalies that can occur during a pod scale-in as described earlier, do not use workload autoscaling to control the number of CoreDNS pods.
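The replica formula used by dns-autoscaler can be checked with integer arithmetic before you commit to a configuration. The following is a minimal sketch of that formula. The function names are hypothetical, nodesPerReplica is set to 8 as recommended above, and the values 64 for coresPerReplica, 2 for min, and 100 for max are assumptions chosen only for illustration.

```bash
#!/usr/bin/env bash
# Integer ceiling division: ceil(a / b) for positive integers.
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# replicas = max(ceil(cores / coresPerReplica), ceil(nodes / nodesPerReplica)),
# then clamped to the range [min, max].
# Usage: coredns_replicas CORES NODES CORES_PER_REPLICA NODES_PER_REPLICA MIN MAX
coredns_replicas() {
  local cores=$1 nodes=$2 cores_per_replica=$3 nodes_per_replica=$4 min=$5 max=$6
  local by_cores by_nodes replicas
  by_cores=$(ceil_div "$cores" "$cores_per_replica")
  by_nodes=$(ceil_div "$nodes" "$nodes_per_replica")
  replicas=$(( by_cores > by_nodes ? by_cores : by_nodes ))
  if (( replicas < min )); then replicas=$min; fi
  if (( replicas > max )); then replicas=$max; fi
  echo "$replicas"
}

# A cluster with 160 cores on 20 nodes.
coredns_replicas 160 20 64 8 2 100
```

For this example, both terms round up to 3 (160/64 = 2.5 and 20/8 = 2.5), so the script prints 3. The result can be passed to the kubectl scale command shown above if you adjust the replica count manually.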
Adjust CoreDNS pod specifications
Another way to adjust CoreDNS resources is to modify the pod specifications. In ACK managed clusters Pro, CoreDNS pods have a default memory limit of 2Gi and no CPU limit. We recommend that you set the CPU limit to 4096m, with a minimum of 1024m. You can adjust the CoreDNS pod configuration in the console.
Schedule CoreDNS pods
Incorrect scheduling configurations may prevent CoreDNS pods from being deployed, which can cause CoreDNS to fail. Before you proceed, make sure that you fully understand scheduling.
Deploy CoreDNS pods across different zones and cluster nodes to avoid single-node or single-zone failures. CoreDNS versions earlier than v1.8.4.3 use weak node anti-affinity by default. This may cause some or all pods to be deployed on the same node due to insufficient node resources. If this occurs, you can delete the pods to trigger rescheduling or upgrade to the latest component version. CoreDNS versions earlier than v1.8 are no longer maintained. We recommend that you upgrade them as soon as possible.
Avoid deploying CoreDNS on cluster nodes that have full CPU or memory utilization because this affects DNS QPS and response latency. When possible, use custom parameters to schedule CoreDNS on dedicated cluster nodes to ensure stable domain name resolution.
Optimize CoreDNS configuration
Container Service for Kubernetes (ACK) provides only default CoreDNS configurations. You should review all parameters and optimize them to ensure that CoreDNS properly serves your application containers. The CoreDNS configuration is highly flexible. For more information, see DNS policy configuration and domain name resolution and the official CoreDNS documentation.
Default CoreDNS configurations that are deployed with older Kubernetes versions may have risks. You can check for and optimize these configurations in the following ways:
You can also use the scheduled inspection and fault diagnosis features in Container Intelligence Operations to check the CoreDNS configuration files. If Container Intelligence Operations reports CoreDNS ConfigMap configuration anomalies, review the items mentioned above.
CoreDNS may consume extra memory when it reloads configurations. After you modify CoreDNS configuration items, monitor the pod status. If the pods experience memory shortages, promptly increase the memory limit in the CoreDNS Deployment. We recommend that you set the memory to 2 GB.
Disable affinity settings for kube-dns service
Affinity settings may cause significant load imbalances among CoreDNS replicas. You can disable them in one of the following ways:
Console method
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click .
-
In the kube-system namespace, click Edit YAML to the right of the kube-dns service.
-
If the value of the sessionAffinity field is None, you can skip the following steps.
-
If the value of sessionAffinity is ClientIP, proceed with the following steps.
-
Delete the sessionAffinity and sessionAffinityConfig fields and all their subkeys. Then, click Update.
# Delete all the following content.
sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 10800
-
Click Edit YAML again to the right of the kube-dns service and verify that the value of the sessionAffinity field is None. If the value of the field is None, the kube-dns service has been successfully updated.
Command-line method
-
Run the following command to view the kube-dns service configuration.
kubectl -n kube-system get svc kube-dns -o yaml
-
If the value of the sessionAffinity field is None, you can skip the following steps.
-
If the value of sessionAffinity is ClientIP, proceed with the following steps.
-
Run the following command to open and edit the kube-dns service.
kubectl -n kube-system edit service kube-dns
-
Delete all sessionAffinity-related settings (sessionAffinity, sessionAffinityConfig, and all subkeys). Then, save the change and exit.
# Delete all the following content.
sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 10800
-
After the modification, run the following command again to verify that the value of the sessionAffinity field is None. If it is None, the kube-dns service update is successful.
kubectl -n kube-system get svc kube-dns -o yaml
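If you prefer a non-interactive change, the same result can likely be achieved with a single merge patch instead of an editor session. The following is a minimal sketch under that assumption. The function names are hypothetical, and the patch sets sessionAffinityConfig to null because the API server is expected to reject a Service that keeps that block while sessionAffinity is None.

```bash
#!/usr/bin/env bash
# Print the current sessionAffinity value of the kube-dns Service.
kube_dns_affinity() {
  kubectl -n kube-system get service kube-dns -o jsonpath='{.spec.sessionAffinity}'
}

# Reset sessionAffinity to None and drop sessionAffinityConfig in one merge patch.
disable_kube_dns_affinity() {
  kubectl -n kube-system patch service kube-dns --type merge \
    -p '{"spec":{"sessionAffinity":"None","sessionAffinityConfig":null}}'
}

# Example:
#   if [ "$(kube_dns_affinity)" = "ClientIP" ]; then
#     disable_kube_dns_affinity
#   fi
```

Test the patch in a non-production cluster first, and verify the outcome with the kubectl get command from the last step.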
Disable the Autopath plugin
Some older versions of CoreDNS have the Autopath plugin enabled. This plugin may produce incorrect resolution results in extreme scenarios. You can verify whether the plugin is enabled and disable it by editing the configuration file. For more information, see Autopath.
After you disable the Autopath plugin, the client-side DNS query QPS may increase by up to three times, and the single-domain resolution time may increase by up to three times. You should monitor the CoreDNS load and the impact on your business.
-
Run the kubectl -n kube-system edit configmap coredns command to open the CoreDNS configuration file.
-
Delete the autopath @kubernetes line and save the change.
-
Check the CoreDNS pod status and logs. If the logs contain the word reload, the modification is successful.
Configure graceful shutdown for CoreDNS
The lameduck mechanism in CoreDNS enables graceful shutdown. When CoreDNS stops or restarts, this mechanism ensures that ongoing requests are completed normally without being abruptly interrupted. The lameduck mechanism works as follows:
-
When CoreDNS terminates, it enters lameduck mode.
-
In lameduck mode, CoreDNS stops accepting new requests but continues to process existing requests until they are complete or until the lameduck timeout period expires.
Console method
Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click .
-
In the kube-system namespace, click Edit YAML to the right of the coredns configuration item.
-
Refer to the following CoreDNS configuration file. Make sure that the health plugin is enabled and set the lameduck timeout to 15s. Then, click OK.
.:53 {
errors
# The health plugin may have different default settings in various CoreDNS versions.
# Case 1: health plugin disabled by default.
# Case 2: health plugin enabled by default but lameduck time not set.
# health
# Case 3: health plugin enabled by default with lameduck time set to 5s.
# health {
# lameduck 5s
# }
# For all three cases, modify uniformly as follows to set lameduck to 15s.
health {
lameduck 15s
}
# Do not modify other plugins; omitted here.
}
If the CoreDNS pods run as normal, the graceful shutdown configuration is successfully updated. If the pods become abnormal, check the pod events and logs to identify the cause.
Command-line method
-
Run the following command to open the CoreDNS configuration file.
kubectl -n kube-system edit configmap/coredns
-
Refer to the following Corefile. Make sure that the health plugin is enabled and set the lameduck parameter to 15s.
.:53 {
errors
# The health plugin may have different default settings in various CoreDNS versions.
# Case 1: health plugin disabled by default.
# Case 2: health plugin enabled by default but lameduck time not set.
# health
# Case 3: health plugin enabled by default with lameduck time set to 5s.
# health {
# lameduck 5s
# }
# For all three cases, modify uniformly as follows to set lameduck to 15s.
health {
lameduck 15s
}
# Do not modify other plugins; omitted here.
}
-
Save the change and exit after you modify the CoreDNS configuration file.
-
If CoreDNS runs as normal, the graceful shutdown configuration is successfully updated. If the pods become abnormal, check the pod events and logs to identify the cause.
Set the default protocol for the Forward plugin to communicate with upstream VPC DNS servers
NodeLocal DNSCache communicates with CoreDNS using TCP. CoreDNS uses the same protocol as the incoming request when it communicates with upstream DNS servers. By default, external domain resolution requests from application containers pass through NodeLocal DNSCache and CoreDNS, and then reach the VPC DNS servers (100.100.2.136 and 100.100.2.138) using TCP.
VPC DNS servers have limited support for TCP. If you use NodeLocal DNSCache, you should modify the CoreDNS configuration to always prefer UDP when communicating with upstream DNS servers to avoid resolution anomalies. You can modify the ConfigMap named coredns in the kube-system namespace as follows. For more information, see Manage ConfigMaps. In the forward plugin, specify prefer_udp as the upstream protocol. After this modification, CoreDNS prefers to use UDP for upstream communication:
# Before modification
forward . /etc/resolv.conf
# After modification
forward . /etc/resolv.conf {
prefer_udp
}
Configure the Ready readiness probe plugin
CoreDNS versions later than 1.5.0 require the ready plugin to enable readiness probes.
-
Run the following command to open the CoreDNS configuration file.
kubectl -n kube-system edit configmap/coredns
-
Check for the ready line. If it is missing, add ready. Press Esc, type :wq!, and then press Enter to save the change and exit.
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 15s
        }
        ready # Add this line if missing, with the same indentation as the kubernetes line.
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods verified
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
            prefer_udp
        }
        cache 30
        loop
        log
        reload
        loadbalance
    }
-
Check the CoreDNS pod status and logs. If the logs contain the word reload, the modification is successful.
Configure the multisocket plugin to enhance CoreDNS resolution performance
CoreDNS v1.12.1 introduced the multisocket plugin. If you enable this plugin, CoreDNS can use multiple sockets to listen on the same port, which enhances performance in high-CPU scenarios. For more information about the plugin, see the community documentation.
You can enable multisocket in the coredns ConfigMap:
.:53 {
...
prometheus :9153
multisocket [NUM_SOCKETS]
forward . /etc/resolv.conf
...
}
NUM_SOCKETS specifies the number of sockets that listen on the same port.
Configuration recommendations: We recommend that you align the value of NUM_SOCKETS with the estimated CPU usage, CPU resource limits, and available resources in the cluster. For example:
-
If CoreDNS consumes 4 cores at its peak and 8 cores are available, set NUM_SOCKETS to 2.
-
If CoreDNS consumes 8 cores at its peak and 64 cores are available, set NUM_SOCKETS to 8.
To determine the optimal configuration, you can test different settings and measure the QPS and load.
If you do not specify NUM_SOCKETS, CoreDNS uses GOMAXPROCS by default. The value of `GOMAXPROCS` is equal to the CPU limit of the CoreDNS pod. If no CPU limit is set, the value is equal to the number of CPU cores on the node.