DNS best practices - Container Service for Kubernetes (ACK) - Alibaba Cloud Help Center

Notes

This topic does not apply to the managed edition of CoreDNS or ACK clusters that have Auto Mode enabled. The managed edition of CoreDNS automatically scales based on load, requiring no manual adjustment.

Optimize domain name resolution requests

DNS resolution is one of the most frequent network operations in a Kubernetes cluster. Many of these requests can be optimized or avoided to reduce latency and load on the DNS infrastructure. You can optimize domain name resolution requests in the following ways:

(Recommended) Use connection pools: When a containerized application frequently requests another service, use a connection pool to cache active connections to upstream services in memory. This eliminates the overhead of DNS resolution and TCP handshakes for each request.
Use an asynchronous or long-polling mode to obtain the IP addresses for a domain name.
Use DNS caching:
- (Recommended) If your application cannot be modified to use a connection pool, consider caching DNS resolution results on the application side. For more information, see Use NodeLocal DNSCache.
- If you cannot use NodeLocal DNSCache, you can use the built-in Name Service Cache Daemon (NSCD) cache in your containers. For more information, see Use NSCD in Kubernetes clusters.
Optimize the resolv.conf file: Because of the mechanisms of the ndots and search parameters in the resolv.conf file, the way you write domain names in a container determines the efficiency of domain name resolution. For more information about the mechanisms of the ndots and search parameters, see DNS Policy Configuration and Domain Name Resolution.
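
For the application-side caching option in the list above, a minimal sketch might look like the following. The class name DnsCache, the 30-second TTL, and the injectable resolve and clock parameters are assumptions made for illustration; they are not part of ACK or of any library. A fixed TTL also ignores the TTL carried in the DNS record, so treat this as a starting point only.

```python
import socket
import time
from typing import Callable, Dict, List, Optional, Tuple


class DnsCache:
    """Tiny in-process DNS cache with a fixed TTL (hypothetical helper)."""

    def __init__(
        self,
        ttl_seconds: float = 30.0,
        resolve: Optional[Callable[[str], List[str]]] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._resolve = resolve or self._system_resolve
        self._clock = clock
        # host -> (expiry time, addresses)
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    @staticmethod
    def _system_resolve(host: str) -> List[str]:
        # Ask the system resolver and keep the unique IP addresses, in order.
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
        return list(dict.fromkeys(info[4][0] for info in infos))

    def lookup(self, host: str) -> List[str]:
        now = self._clock()
        cached = self._entries.get(host)
        if cached is not None and cached[0] > now:
            return cached[1]  # still fresh: no DNS query is sent
        addresses = self._resolve(host)
        self._entries[host] = (now + self._ttl, addresses)
        return addresses
```

Repeated calls to lookup() for the same host within the TTL are served from memory; only the first call and calls after expiry reach the resolver.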

Optimize domain name configurations: When an application in a container accesses a domain name, configure it as follows to minimize resolution attempts and reduce resolution latency.

To access a Service in the same namespace from a pod, use <service-name>, where service-name is the name of the Service.
To access a Service in a different namespace from a pod, use <service-name>.<namespace-name>, where namespace-name is the namespace where the Service resides.

When a pod accesses an external domain name, use a fully qualified domain name (FQDN), which ends with a trailing dot (.), to prevent the multiple invalid DNS lookups caused by appending domains from the search list. For example, to access www.aliyun.com, use the FQDN www.aliyun.com. (with the trailing dot).
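
To see why the trailing dot matters, the following sketch lists the names a glibc-style resolver may try, in order, when every attempt fails. The function name candidate_queries is hypothetical, the default of ndots=5 is an assumption based on the usual ClusterFirst pod configuration, and the search list is the one a pod in the default namespace would typically have. Real resolvers stop at the first successful answer and may differ in detail.

```python
from typing import List


def candidate_queries(name: str, search: List[str], ndots: int = 5) -> List[str]:
    """List the names a glibc-style resolver may try, in order (worst case)."""
    if name.endswith("."):
        return [name]  # FQDN: the search list is skipped entirely

    def qualify(domain: str) -> str:
        domain = domain.strip(".")
        # A search entry of "." is the root, so the name is tried as-is.
        return f"{name}.{domain}." if domain else f"{name}."

    expanded = [qualify(domain) for domain in search]
    as_is = f"{name}."
    if name.count(".") >= ndots:
        ordered = [as_is] + expanded   # enough dots: try the name as-is first
    else:
        ordered = expanded + [as_is]   # too few dots: walk the search list first
    return list(dict.fromkeys(ordered))  # drop repeats, keep order


search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
print(candidate_queries("www.aliyun.com", search))   # four candidates
print(candidate_queries("www.aliyun.com.", search))  # one candidate
```

Without the trailing dot, the external name is only tried as-is after three search-domain expansions have failed; with the trailing dot, a single name is queried.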

In clusters that run Kubernetes 1.22 or later, you can configure the search domain as a single period (.) to achieve a similar effect (see Issue 125883):

dnsPolicy: None
dnsConfig:
  nameservers: ["192.168.0.10"]  ## Replace with the actual ClusterIP of your CoreDNS Service.
  searches:
  - .
  - default.svc.cluster.local  ## Note: Replace default with the actual namespace.
  - svc.cluster.local
  - cluster.local

After you apply the preceding configuration, the /etc/resolv.conf file in the pod is configured as follows:

search . default.svc.cluster.local svc.cluster.local cluster.local
nameserver 192.168.0.10

The first search domain is ".", which makes the resolver treat the target domain as an FQDN. The resolver first attempts to resolve the domain name as-is, skipping unnecessary search domain expansions.

Important

You must set dnsPolicy to None for the preceding configuration to take effect.

Complete workload example

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx
  name: nginx
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - image: registry.openanolis.cn/openanolis/nginx:1.14.1-8.6
        imagePullPolicy: Always
        name: nginx
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: None
      dnsConfig:
        nameservers: ["192.168.0.10"]  ## Replace with the actual ClusterIP of your CoreDNS Service.
        searches:
        - .
        - default.svc.cluster.local
        - svc.cluster.local
        - cluster.local
      hostname: nginx
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      subdomain: subdomain
      terminationGracePeriodSeconds: 30

Understand DNS configurations in containers

Different DNS resolvers may behave differently due to implementation variations. You might encounter cases where dig <domain> succeeds but ping <domain> fails.
Avoid using Alpine as the base image. Use other base images, such as Debian or CentOS, instead. The musl libc library built into Alpine container images has several implementation differences compared to the standard glibc, which can lead to issues that include but are not limited to the following:
- TCP fallback: Alpine 3.18 and earlier do not support fallback to TCP when a truncated (TC) flag is returned.
- Search domains: Alpine 3.3 and earlier do not support the search parameter, which breaks service discovery.
- Optimization conflicts: Alpine concurrently queries all DNS servers that are configured in /etc/resolv.conf, which can bypass and invalidate NodeLocal DNSCache optimizations.
- Conntrack race conditions: Concurrent A and AAAA record requests that use the same socket can trigger conntrack source port conflicts in older Linux kernels, which results in packet loss.
For more information about these issues, see musl libc.
If you use a Go application, be aware of the differences between the DNS resolvers in the cgo and pure Go implementations.

Avoid DNS timeouts caused by IPVS defects

When a cluster uses IPVS as the kube-proxy load balancing mode, you may encounter intermittent DNS resolution timeouts when CoreDNS is scaled in or restarted. This issue is caused by a defect in the community Linux kernel. For more information, see IPVS.

You can use one of the following methods to mitigate the impact of the IPVS defect:

Use the NodeLocal DNSCache. For more information, see Use NodeLocal DNSCache.
Modify the timeout period for IPVS UDP session persistence in kube-proxy. For more information, see How do I modify the timeout period for IPVS UDP session persistence in kube-proxy?.

Use NodeLocal DNSCache

CoreDNS may experience the following issues:

In rare cases, concurrent A and AAAA queries can cause packet loss, which leads to DNS resolution failures.
A full conntrack table on a node can cause packet loss, which leads to DNS resolution failures.

To improve DNS stability and performance in your cluster, install the NodeLocal DNSCache component. It enhances cluster DNS performance by running a DNS cache on each cluster node. For more information about NodeLocal DNSCache and how to deploy it in an ACK cluster, see Use the NodeLocal DNSCache component.

Important

After you install NodeLocal DNSCache, you must inject the DNS cache configuration into your pods. You can run the following command to add a label to a specific namespace. New pods created in this namespace will automatically have the DNS cache configuration injected. For more information about other injection methods, see the documentation referenced in the previous paragraph.

kubectl label namespace default node-local-dns-injection=enabled

Use a suitable CoreDNS version

CoreDNS offers good backward compatibility with Kubernetes versions. Keep CoreDNS updated to the latest stable version. The Add-ons page in the ACK console allows you to install, upgrade, and configure CoreDNS. Check the status of the CoreDNS component on the Add-ons page. If an upgrade is available, schedule the upgrade during off-peak hours.

For more information about how to upgrade CoreDNS, see Automatic upgrade for unmanaged CoreDNS.
For the release notes of CoreDNS, see CoreDNS.

CoreDNS versions earlier than v1.7.0 have several potential risks, including:

When connectivity between CoreDNS and the API server is abnormal, for example, due to API server restarts, migrations, or network jitter, CoreDNS may restart because it fails to write error logs. For more information, see Set klog's logtostderr flag.
CoreDNS consumes extra memory at startup. The default memory limit may trigger out-of-memory (OOM) issues in large-scale clusters. In severe cases, this can cause CoreDNS pods to enter a restart loop and fail to recover. For more information, see CoreDNS uses a lot memory during initialization phase.
CoreDNS has several issues that can affect the resolution of headless Service domain names and domain names outside the cluster. For more information, see plugin/kubernetes: handle tombstones in default processor and Data is not synced when CoreDNS reconnects to kubernetes api server after protracted disconnection.
If a cluster node becomes abnormal, the default toleration policy in some earlier CoreDNS versions may cause CoreDNS pods to be scheduled onto the abnormal node. These pods cannot be automatically evicted, leading to DNS resolution failures.

The recommended minimum CoreDNS version varies depending on the Kubernetes version of the cluster.

Cluster version	Minimum CoreDNS version
Earlier than 1.14.8	v1.6.2 (End of Life)
1.14.8 or later, but earlier than 1.20.4	v1.7.0.0-f59c03d-aliyun
1.20.4 or later, but earlier than 1.21.0	v1.8.4.1-3a376cc-aliyun
1.21.0 and later	v1.11.3.2-f57ea7ed6-aliyun

Monitor the operational status of CoreDNS

Metrics

CoreDNS exposes health metrics, including resolution results, through a standard Prometheus interface. This helps detect anomalies on the CoreDNS server and even upstream DNS servers.

Managed Service for Prometheus provides built-in metrics monitoring dashboards and alerting rules for CoreDNS. You can enable Prometheus and its dashboard features in the ACK console. For more information, see Monitor the CoreDNS component.

If you use a self-managed Prometheus instance to monitor your Kubernetes cluster, you can observe the relevant metrics in Prometheus and set up alerts for key indicators. For more information, see the official CoreDNS documentation for Prometheus.

Logs

In the event of a DNS anomaly, CoreDNS logs can help you quickly diagnose the root cause. We recommend that you enable CoreDNS domain name resolution logging and collect its logs with Log Service. For more information, see Analyze and monitor CoreDNS logs.

Kubernetes event delivery

In CoreDNS v1.9.3.6-32932850-aliyun and later, you can enable the k8s_event plugin to deliver critical CoreDNS logs as Kubernetes events to the Event Center. For more information about the k8s_event plugin, see k8s_event.

This feature is enabled by default in new CoreDNS deployments. If you upgrade from an earlier version to CoreDNS v1.9.3.6-32932850-aliyun or later, you need to manually modify the configuration file to enable it.

Run the following command to open the CoreDNS configuration file.
```
kubectl -n kube-system edit configmap/coredns
```

Add the kubeapi and k8s_event plugins.

apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 15s
        }
        # Start of addition (ignore other differences).
        kubeapi
        k8s_event {
          # Deliver critical logs with info, error, and warning statuses.
          level info error warning
        }
        # End of addition.
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods verified
            fallthrough in-addr.arpa ip6.arpa
        }
        # ... (remaining content omitted)
    }

Check the operational status and logs of the CoreDNS pods. If the logs contain the word reload, the modification is successful.

Ensure CoreDNS high availability

CoreDNS is the authoritative DNS for the cluster. A failure in CoreDNS can cause Service access within the cluster to fail, potentially leading to widespread service unavailability. You can take the following measures to ensure the high availability of CoreDNS:

Assess CoreDNS component pressure

You can perform a DNS stress test in the cluster to assess component pressure. Many open-source tools, including DNSPerf, can help you with this. If you cannot accurately assess the DNS pressure in your cluster, follow these recommendations.

Always set the number of CoreDNS pods to at least 2, with a resource limit of at least 1 core and 1 GiB for a single pod.
CoreDNS's domain name resolution QPS is positively correlated with its CPU consumption. With NodeLocal DNSCache enabled, each CPU core can support over 10,000 QPS. The QPS demand for domain name requests varies significantly across different types of services. You can observe the peak CPU usage of each CoreDNS pod. If a pod uses more than one CPU core during peak hours, we recommend that you scale out the CoreDNS replicas. If you cannot determine the peak CPU usage, you can conservatively use a 1:8 ratio of pods to cluster nodes. That is, for every 8 cluster nodes that you add, add one CoreDNS pod.

Adjust CoreDNS pod count

The number of CoreDNS pods directly determines the computing resources that CoreDNS can use. You can adjust the number of CoreDNS pods based on your assessment.

Important

Because UDP has no retransmission mechanism, if cluster nodes are at risk of packet loss from the IPVS UDP defect, scaling in or restarting CoreDNS pods can cause cluster-wide DNS resolution timeouts or exceptions for up to five minutes. For solutions to resolution exceptions that are caused by the IPVS defect, see Troubleshoot DNS resolution issues.

Automatically adjust based on the recommended policy

You can deploy the following dns-autoscaler. It automatically adjusts the number of CoreDNS pods in real time based on the recommended policy (a 1:8 ratio of pods to cluster nodes). The number of pods is calculated by using the following formula: replicas = max(ceil(cores × 1/coresPerReplica), ceil(nodes × 1/nodesPerReplica)), and is limited by the max and min parameters.

dns-autoscaler

apiVersion: apps/v1
kind: Deployment
metadata:
  name: dns-autoscaler
  namespace: kube-system
  labels:
    k8s-app: dns-autoscaler
spec:
  selector:
    matchLabels:
      k8s-app: dns-autoscaler
  template:
    metadata:
      labels:
        k8s-app: dns-autoscaler
    spec:
      serviceAccountName: admin
      containers:
      - name: autoscaler
        image: registry.cn-hangzhou.aliyuncs.com/acs/cluster-proportional-autoscaler:1.8.4
        resources:
          requests:
            cpu: "200m"
            memory: "150Mi"
        command:
        - /cluster-proportional-autoscaler
        - --namespace=kube-system
        - --configmap=dns-autoscaler
        - --nodelabels=type!=virtual-kubelet
        - --target=Deployment/coredns
        - --default-params={"linear":{"coresPerReplica":64,"nodesPerReplica":8,"min":2,"max":100,"preventSinglePointFailure":true}}
        - --logtostderr=true
        - --v=9
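
The replica formula used by this autoscaler can be checked by hand with a short sketch. The function name coredns_replicas is hypothetical; the default values mirror the --default-params shown in the manifest. The sketch leaves out preventSinglePointFailure, which has no additional effect here because min is already 2.

```python
import math


def coredns_replicas(
    cores: int,
    nodes: int,
    cores_per_replica: int = 64,
    nodes_per_replica: int = 8,
    min_replicas: int = 2,
    max_replicas: int = 100,
) -> int:
    """max(ceil(cores/coresPerReplica), ceil(nodes/nodesPerReplica)), clamped to [min, max]."""
    by_cores = math.ceil(cores / cores_per_replica)
    by_nodes = math.ceil(nodes / nodes_per_replica)
    return max(min_replicas, min(max(by_cores, by_nodes), max_replicas))


print(coredns_replicas(cores=160, nodes=40))  # ceil(160/64)=3, ceil(40/8)=5 -> 5
```

With these defaults, the node term dominates unless nodes average more than 8 cores each, which is why the policy is described as a 1:8 ratio of pods to nodes.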

Manually adjust

You can run the following command to manually adjust the number of CoreDNS pods.

kubectl scale --replicas={target} deployment/coredns -n kube-system # Replace {target} with the desired number of pods.

Do not use workload auto-scaling

Although workload auto-scaling features like Horizontal Pod Autoscaler (HPA) and CronHPA can also automatically adjust the number of pods, they perform frequent scaling operations. Due to the resolution exceptions that occur when pods are scaled in, do not use workload auto-scaling to control the number of CoreDNS pods.

Adjust CoreDNS pod specifications

Another way to adjust CoreDNS resources is to modify the pod specifications. In an ACK managed Pro cluster, CoreDNS pods have a default memory limit of 2Gi and no CPU limit. We recommend that you set the CPU limit to 4096m, and to no less than 1024m. You can adjust the CoreDNS pod configuration in the console.

Modify the CoreDNS configuration in the console

Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click Add-ons.
Click the Network tab and find the CoreDNS card. On the card, click Configuration.
Modify the CoreDNS configuration, and then click OK.

The dialog box shows a message that updating parameters regenerates the component's YAML template and may overwrite changes made via kubectl. Configurable parameters include MemoryRequest, CpuRequest, MemoryLimit (setting this too low may trigger OOMKilled), CpuLimit (recommended to be twice the CpuRequest), and NodeSelector (in key-value pairs).

Schedule CoreDNS pods

Important

An incorrect scheduling configuration may prevent CoreDNS pods from being deployed, leading to CoreDNS failure. Before you perform this operation, make sure that you are familiar with scheduling.

We recommend that you deploy CoreDNS pods across different availability zones and cluster nodes to avoid single-node or single-availability-zone failures. CoreDNS component versions earlier than v1.8.4.3 have a default soft anti-affinity policy at the node level, which may cause some or all pods to be deployed on the same node if resources are insufficient. If this occurs, delete the pods to trigger rescheduling, or upgrade the component to the latest version. CoreDNS component versions earlier than v1.8 are no longer maintained and should be upgraded as soon as possible.

The cluster nodes where CoreDNS runs should not have their CPU or memory fully utilized, because this affects the QPS and response latency of domain name resolution. When cluster node conditions permit, consider using custom parameters to schedule CoreDNS to dedicated cluster nodes to provide a stable domain name resolution service.

Use custom parameters to deploy CoreDNS on dedicated nodes

Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click Nodes > Nodes.
On the Nodes page, click Manage Labels and Taints.
On the Manage Labels and Taints page, select the target nodes and click Add Label.

Note
The number of nodes should be greater than the number of CoreDNS replicas to avoid running multiple CoreDNS replicas on a single node.
In the Add dialog box, set the following parameters and click OK.
- Name: node-role-type
- Value: coredns
In the left-side navigation pane of the cluster management page, choose Operations > Add-ons, and then search for CoreDNS.
On the CoreDNS card, click Configuration. In the Configuration dialog box, click + Add to the right of NodeSelector, set the following parameters, and then click OK.
- Key: node-role-type
- Value: coredns
CoreDNS is rescheduled to the nodes with the specified label.

Optimize CoreDNS configurations

ACK provides a default configuration for CoreDNS. You should review and optimize these parameters to ensure that CoreDNS can provide proper DNS services for your business containers. CoreDNS configuration is highly flexible. For more information, see Configure DNS policies and resolve domain names and the official CoreDNS documentation.

The default CoreDNS configurations deployed with earlier Kubernetes cluster versions may have some risks. Check and optimize them as follows:

Disable session affinity for the kube-dns Service
Disable the autopath plugin
Configure graceful shutdown for CoreDNS
Configure the default protocol for the forward plugin and upstream VPC DNS servers
Configure the ready plugin for readiness probes

You can also use the scheduled inspection and fault diagnosis features of Container Intelligence Service to check CoreDNS configuration files. If the inspection result from Container Intelligence Service indicates a CoreDNS ConfigMap configuration exception, check each of the preceding items.
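
As a rough self-check of the Corefile-related items above, the following sketch scans a Corefile for the directives discussed in this topic. The function name corefile_findings and its messages are hypothetical, and the line-based matching is a simplification that assumes one directive per line; it does not replace the inspection feature and does not cover session affinity, which is a property of the kube-dns Service rather than of the Corefile.

```python
from typing import List


def corefile_findings(corefile: str) -> List[str]:
    """Flag the Corefile items discussed in this topic (rough line-based sketch)."""
    directives = set()
    for line in corefile.splitlines():
        tokens = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if tokens:
            directives.add(tokens[0])  # first word of each line, for example "ready"

    findings = []
    if "autopath" in directives:
        findings.append("autopath plugin is enabled")
    if "lameduck" not in directives:
        findings.append("health plugin has no lameduck duration")
    if "prefer_udp" not in directives:
        findings.append("forward plugin does not set prefer_udp")
    if "ready" not in directives:
        findings.append("ready plugin is missing")
    return findings
```

You could feed it the Corefile text taken from the coredns ConfigMap; an empty list means none of the four items was flagged.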

Note

CoreDNS may consume extra memory when it refreshes its configuration. After you modify a CoreDNS ConfigMap, observe the pod status. If a pod runs out of memory, promptly increase the container memory limit in the CoreDNS Deployment. We recommend a memory limit of 2 GB.

Disable session affinity for kube-dns

Session affinity can lead to significant load imbalances between CoreDNS replicas. Disable it by following these steps:

Console

Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click Network > Services.
In the kube-system namespace, click Edit YAML to the right of the kube-dns Service.
- If the sessionAffinity field is set to None, no further action is needed.
- If the sessionAffinity field is set to ClientIP, proceed with the following steps.

Delete the sessionAffinity and sessionAffinityConfig fields and all their sub-keys, and then click Update.

# Delete all of the following content.
sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 10800

Click Edit YAML to the right of the kube-dns Service again and verify that the sessionAffinity field is set to None. A value of None indicates that the kube-dns Service was modified successfully.

CLI

Run the following command to view the configuration information of the kube-dns Service.
```
kubectl -n kube-system get svc kube-dns -o yaml
```
- If the sessionAffinity field is set to None, no further action is needed.
- If the sessionAffinity field is set to ClientIP, proceed with the following steps.
Run the following command to open and edit the Service named kube-dns.
```
kubectl -n kube-system edit service kube-dns
```
Delete the sessionAffinity-related settings (sessionAffinity, sessionAffinityConfig, and all their sub-keys), and then save and exit.
```
# Delete all of the following content.
sessionAffinity: ClientIP
sessionAffinityConfig:
  clientIP:
    timeoutSeconds: 10800
```
After the modification is complete, run the following command again to check whether the sessionAffinity field is set to None. If the value is None, the kube-dns Service was modified successfully.
```
kubectl -n kube-system get svc kube-dns -o yaml
```

Disable the autopath plugin

Some earlier versions of CoreDNS enabled the autopath plugin, which can cause resolution errors in some edge cases. Check if it is enabled and edit the configuration file to disable it. For more information, see Autopath.

Note

After you disable the autopath plugin, the client-side QPS can increase by up to three times and the time taken to resolve a single domain name can also increase by up to three times. Monitor the CoreDNS load and business impact.

Run the kubectl -n kube-system edit configmap coredns command to open the CoreDNS configuration file.
Delete the autopath @kubernetes line and save the file.
Check the operational status and logs of the CoreDNS pods. If the logs contain the word reload, the modification is successful.

Configure graceful shutdown

lameduck is a mechanism in CoreDNS that enables graceful shutdown. It ensures that when CoreDNS needs to stop or restart, ongoing requests are completed without being abruptly interrupted. lameduck works as follows:

When a CoreDNS process is about to terminate, it enters lameduck mode.
In lameduck mode, CoreDNS stops accepting new requests but continues to process existing requests until they are all completed or the lameduck timeout period is exceeded.

Console

Log on to the ACK console. In the left navigation pane, click Clusters.
On the Clusters page, click the name of your cluster. In the left navigation pane, click Configurations > ConfigMaps.
In the kube-system namespace, click Edit YAML to the right of the coredns ConfigMap.
In the CoreDNS configuration file, ensure that the health plugin is enabled and set the lameduck timeout to 15s. Then, click OK.

.:53 {
        errors
        # The health plugin may have different settings in different CoreDNS versions.
        # Scenario 1: The health plugin is not enabled by default.
        # Scenario 2: The health plugin is enabled, but no lameduck duration is set.
        # health
        # Scenario 3: The health plugin is enabled, and the lameduck duration is set to 5s.
        # health {
        #     lameduck 5s
        # }
        # For all three scenarios, modify the configuration as follows to set the lameduck parameter to 15s.
        health {
            lameduck 15s
        }
        # Other plugins do not need to be modified and are omitted here.
}

If the CoreDNS pods run normally, the change was successful. If a CoreDNS pod becomes abnormal, you can identify the cause by viewing its events and logs.

CLI

Run the following command to open the CoreDNS configuration file.

kubectl -n kube-system edit configmap/coredns

In the Corefile, ensure that the health plugin is enabled and set the lameduck parameter to 15s.

.:53 {
        errors
        # The health plugin may have different settings in different CoreDNS versions.
        # Scenario 1: The health plugin is not enabled by default.
        # Scenario 2: The health plugin is enabled, but no lameduck duration is set.
        # health
        # Scenario 3: The health plugin is enabled, and the lameduck duration is set to 5s.
        # health {
        #     lameduck 5s
        # }
        # For all three scenarios, modify the configuration as follows to set the lameduck parameter to 15s.
        health {
            lameduck 15s
        }
        # Other plugins do not need to be modified and are omitted here.
}

Save and exit after you modify the CoreDNS configuration file.
If CoreDNS runs normally, the change was successful. If a CoreDNS pod becomes abnormal, you can identify the cause by viewing its events and logs.

Configure default protocol for the forward plugin

NodeLocal DNSCache uses TCP to communicate with CoreDNS. CoreDNS then communicates with upstream DNS servers by using the same protocol as the incoming request. Therefore, by default, requests from business containers to resolve domain names outside the cluster pass through NodeLocal DNSCache and CoreDNS, and finally reach the VPC DNS servers (by default, 100.100.2.136 and 100.100.2.138 on ECS instances) over TCP.

VPC DNS servers have limited support for TCP. If you use NodeLocal DNSCache, you need to modify the CoreDNS configuration to prioritize UDP for communication with upstream DNS servers to avoid resolution exceptions. We recommend that you modify the CoreDNS configuration file, which is the ConfigMap named coredns in the kube-system namespace. For more information, see Manage ConfigMaps. In the forward plugin, specify the protocol for upstream requests as prefer_udp. After this modification, CoreDNS prioritizes UDP to communicate with upstream servers. The modification is as follows:

# Before modification
forward . /etc/resolv.conf
# After modification
forward . /etc/resolv.conf {
  prefer_udp
}

Configure the ready plugin

CoreDNS versions later than 1.5.0 must have the ready plugin configured to enable readiness probes.

Run the following command to open the CoreDNS configuration file.
```
kubectl -n kube-system edit configmap/coredns
```

Check if the file contains the ready line. If not, add the ready line, press Esc, enter :wq!, and then press Enter to save the modified configuration file and exit edit mode.

apiVersion: v1
data:
 Corefile: |
  .:53 {
    errors
    health {
      lameduck 15s
    }
    ready # If this line does not exist, add it. Keep its indentation consistent with the kubernetes line below.
    kubernetes cluster.local in-addr.arpa ip6.arpa {
      pods verified
      fallthrough in-addr.arpa ip6.arpa
    }
    prometheus :9153
    forward . /etc/resolv.conf {
      max_concurrent 1000
      prefer_udp
    }
    cache 30
    loop
    log
    reload
    loadbalance
  }

Check the operational status and logs of the CoreDNS pods. If the logs contain the word reload, the modification is successful.

Enhance performance with the multisocket plugin

CoreDNS v1.12.1 introduced the multisocket plugin. Enabling this plugin allows CoreDNS to use multiple sockets to listen on the same port, enhancing CoreDNS performance in high-CPU scenarios. For a detailed description of the plugin, see the community documentation.

You need to enable multisocket by using the coredns ConfigMap:

.:53 {
        ...
        prometheus :9153
        multisocket [NUM_SOCKETS]
        forward . /etc/resolv.conf
        ...
}

NUM_SOCKETS specifies the number of sockets that listen on the same port.

Recommended configuration: Align NUM_SOCKETS with the estimated CPU utilization, CPU resource limits, and available cluster resources. For example:

If CoreDNS consumes 4 cores at peak and 8 cores are available, set NUM_SOCKETS to 2.
If CoreDNS consumes 8 cores at peak and 64 cores are available, set NUM_SOCKETS to 8.

To determine the optimal configuration, we recommend that you test the QPS and load with different settings.

If you do not specify NUM_SOCKETS, the default value is GOMAXPROCS, which is equal to the CPU limit of the CoreDNS pod. If the pod's CPU limit is not set, the value is equal to the number of CPU cores on the node where the pod is running.
