Configure DNS Cache metric collection and monitoring dashboards

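This topic describes how to configure Prometheus metric collection and import a prebuilt Grafana dashboard, so that you can quickly monitor the core performance metrics of DNS Cache.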

Applicability

  • The acs-virtual-node component must be v2.16.0 or later.

  • The Alibaba DNS Cache component is installed.

Procedure

  1. Enable DNS Cache metric collection.

    Cluster level

    Set enabledMetrics to true in the configuration of the Alibaba DNS Cache component:

    kubectl patch configmap blazing-dns -n kube-system --type merge -p '{"data":{"config":"enabled: true\nenabledMetrics: true\n"}}'
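The value of the config key is a small YAML document carried inside a JSON string, so the \n escapes are easy to get wrong when the command is edited by hand. As a minimal sketch (not part of the official procedure), the following Python snippet builds the same merge patch with json.dumps and prints the equivalent kubectl command. It assumes that the config key needs only the two settings shown above. Because a merge patch replaces the whole string value, any other settings already stored under config would have to be included as well.

```python
import json
import shlex

# Desired content of the `config` key, written as plain YAML text.
# These two settings are the ones used in the command above.
config_yaml = "enabled: true\nenabledMetrics: true\n"

# JSON merge patch body: the whole `config` string is replaced by this value.
patch = json.dumps({"data": {"config": config_yaml}})

# Assemble the same kubectl invocation; shlex.quote takes care of shell quoting.
command = " ".join([
    "kubectl", "patch", "configmap", "blazing-dns",
    "-n", "kube-system", "--type", "merge",
    "-p", shlex.quote(patch),
])
print(command)
```

Generating the payload this way keeps the YAML readable in the script and leaves the JSON escaping to the library.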

    Pod level

    Add the enable-dns-cache-metrics annotation to the Pod:

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
        network.alibabacloud.com/enable-dns-cache-metrics: "true"
        ...
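  2. Add a Prometheus scrape configuration.

    Open source Prometheus

    Locate the Prometheus configuration file (usually /etc/prometheus/prometheus.yml, or a file in a custom configuration directory) and add the scrape job shown in the scrape_configs block below.

    Community Prometheus Operator

    For information about the community Prometheus Operator solution and the ack-prometheus-operator component in the ACK marketplace, see Open source Prometheus monitoring. For custom scrape configuration, see Configure data collection with Prometheus Operator.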

    scrape_configs:
    # ...other job configurations.
    - job_name: _arms-prom/virtual-node/blazing-dns-cache
      honor_labels: true
      scrape_interval: 15s
      scrape_timeout: 15s
      metrics_path: /metrics/cadvisor
      scheme: https
      kubernetes_sd_configs:
      - role: node
        follow_redirects: true
      authorization:
        type: Bearer
        credentials_file: /var/run/secrets/target.kubernetes.io/serviceaccount/token
      tls_config:
        insecure_skip_verify: true
      metric_relabel_configs:
      - source_labels:
        - __name__
        regex: dns_metric_cache_expired_count|dns_metric_cache_hit_count|dns_metric_cache_miss_count|dns_metric_cache_updated_count|dns_metric_request_count|dns_metric_response_count
        action: keep
      relabel_configs:
      - source_labels: [ __meta_kubernetes_node_name ]
        regex: (^virtual-kubelet.*)
        target_label: __param_nodeName
        action: replace
      - separator: ;
        regex: (.*)
        target_label: job
        replacement: _arms/kubelet/cadvisor
        action: replace
      - source_labels:
        - __meta_kubernetes_node_name
        separator: ;
        regex: (.*)
        target_label: node
        replacement: ${1}
        action: replace
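To make the relabeling rules above easier to reason about, here is a small Python sketch of two of them: the keep rule on __name__ and the rule that copies the node name into __param_nodeName. It is an illustration only, not how Prometheus is implemented. It relies on the documented Prometheus behavior that relabel regexes are anchored at both ends and that a replace rule leaves the target label untouched when the regex does not match. The metric names other than the six in the job, and the node names, are hypothetical.

```python
import re

# The keep regex from metric_relabel_configs above.
KEEP_REGEX = (
    "dns_metric_cache_expired_count|dns_metric_cache_hit_count|"
    "dns_metric_cache_miss_count|dns_metric_cache_updated_count|"
    "dns_metric_request_count|dns_metric_response_count"
)

# The node-name regex from relabel_configs above.
NODE_REGEX = "(^virtual-kubelet.*)"


def is_kept(metric_name):
    """Return True if the `keep` rule on __name__ would retain this metric.

    Prometheus anchors relabel regexes at both ends; re.fullmatch mirrors that.
    """
    return re.fullmatch(KEEP_REGEX, metric_name) is not None


def node_param(node_name):
    """Return the value written to __param_nodeName, or None if the rule does not match."""
    match = re.fullmatch(NODE_REGEX, node_name)
    return match.group(1) if match else None


# Hypothetical mix of metric names that the endpoint might expose.
scraped = [
    "dns_metric_cache_hit_count",
    "dns_metric_request_count",
    "container_cpu_usage_seconds_total",
    "dns_metric_request_count_total",  # dropped: anchoring rejects the extra suffix
]
kept = [name for name in scraped if is_kept(name)]
print(kept)

# Hypothetical node names: one virtual node and one regular node.
for node in ["virtual-kubelet-cn-hangzhou-k", "cn-hangzhou.192.168.0.1"]:
    print(node, "->", node_param(node))
```

Only the six DNS Cache series survive the keep rule, which limits what this job stores to the metrics that the dashboard queries.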
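  3. Import the prebuilt Grafana dashboard.

    1. In the Grafana instance that is connected to the Prometheus data source, import the prebuilt DNS Cache dashboard template and select that Prometheus data source. For detailed steps, see import-dashboards.

      Dashboard configuration JSON file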

      {
        "__inputs": [
          {
            "name": "DS_PROM-C22490DD721CC4DE09F97506062E6508D",
            "label": "prom-c22490dd721cc4de09f97506062e6508d",
            "description": "",
            "type": "datasource",
            "pluginId": "prometheus",
            "pluginName": "Prometheus"
          }
        ],
        "__elements": {},
        "__requires": [
          {
            "type": "grafana",
            "id": "grafana",
            "name": "Grafana",
            "version": "9.0.8"
          },
          {
            "type": "datasource",
            "id": "prometheus",
            "name": "Prometheus",
            "version": "1.0.0"
          },
          {
            "type": "panel",
            "id": "timeseries",
            "name": "Time series",
            "version": ""
          }
        ],
        "annotations": {
          "list": [
            {
              "builtIn": 1,
              "datasource": {
                "type": "prometheus",
                "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
              },
              "enable": true,
              "hide": true,
              "iconColor": "rgba(0, 211, 255, 1)",
              "name": "Annotations & Alerts",
              "target": {
                "limit": 100,
                "matchAny": false,
                "tags": [],
                "type": "dashboard"
              },
              "type": "dashboard"
            }
          ]
        },
        "description": "A dashboard for the CoreDNS DNS server.",
        "editable": true,
        "fiscalYearStartMonth": 0,
        "gnetId": 7279,
        "graphTooltip": 0,
        "id": null,
        "links": [
          {
            "icon": "external link",
            "tags": [
              "arms-k8s",
              "coreddns"
            ],
            "targetBlank": true,
            "title": "DNS Cache Doc",
            "type": "link",
            "url": "https://help.aliyun.com/zh/cs/user-guide/use-alibaba-dns-cache-to-improve-dns-performance"
          }
        ],
        "liveNow": false,
        "panels": [
          {
            "datasource": {
              "type": "prometheus",
              "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
            },
            "fieldConfig": {
              "defaults": {
                "color": {
                  "mode": "palette-classic"
                },
                "custom": {
                  "axisLabel": "",
                  "axisPlacement": "auto",
                  "axisSoftMax": 1,
                  "barAlignment": 0,
                  "drawStyle": "line",
                  "fillOpacity": 10,
                  "gradientMode": "opacity",
                  "hideFrom": {
                    "graph": false,
                    "legend": false,
                    "tooltip": false,
                    "viz": false
                  },
                  "lineInterpolation": "linear",
                  "lineWidth": 2,
                  "pointSize": 5,
                  "scaleDistribution": {
                    "type": "linear"
                  },
                  "showPoints": "never",
                  "spanNulls": true,
                  "stacking": {
                    "group": "A",
                    "mode": "none"
                  },
                  "thresholdsStyle": {
                    "mode": "off"
                  }
                },
                "mappings": [],
                "min": 0,
                "noValue": "0",
                "thresholds": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "red",
                      "value": null
                    },
                    {
                      "color": "green",
                      "value": 0.8
                    }
                  ]
                },
                "unit": "percentunit"
              },
              "overrides": []
            },
            "gridPos": {
              "h": 12,
              "w": 12,
              "x": 0,
              "y": 0
            },
            "id": 21,
            "links": [],
            "options": {
              "graph": {},
              "legend": {
                "calcs": [
                  "lastNotNull",
                  "max",
                  "mean"
                ],
                "displayMode": "table",
                "placement": "right",
                "showLegend": true
              },
              "tooltip": {
                "mode": "multi",
                "sort": "asc"
              }
            },
            "pluginVersion": "7.5.6",
            "targets": [
              {
                "datasource": {
                  "type": "prometheus",
                  "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
                },
                "editorMode": "code",
                "exemplar": true,
                "expr": "avg(max(dns_metric_cache_hit_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod) / max(dns_metric_request_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod))",
                "format": "time_series",
                "hide": false,
                "interval": "",
                "intervalFactor": 2,
                "legendFormat": "Avg Cache hit rate",
                "range": true,
                "refId": "B",
                "step": 60
              }
            ],
            "title": "Avg DNSCache Cache hit rate",
            "type": "timeseries"
          },
          {
            "datasource": {
              "type": "prometheus",
              "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
            },
            "fieldConfig": {
              "defaults": {
                "color": {
                  "mode": "palette-classic"
                },
                "custom": {
                  "axisLabel": "",
                  "axisPlacement": "auto",
                  "barAlignment": 0,
                  "drawStyle": "line",
                  "fillOpacity": 10,
                  "gradientMode": "opacity",
                  "hideFrom": {
                    "graph": false,
                    "legend": false,
                    "tooltip": false,
                    "viz": false
                  },
                  "lineInterpolation": "linear",
                  "lineWidth": 2,
                  "pointSize": 5,
                  "scaleDistribution": {
                    "type": "linear"
                  },
                  "showPoints": "never",
                  "spanNulls": true,
                  "stacking": {
                    "group": "A",
                    "mode": "none"
                  },
                  "thresholdsStyle": {
                    "mode": "off"
                  }
                },
                "mappings": [],
                "min": 0,
                "noValue": "0",
                "thresholds": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "green",
                      "value": null
                    }
                  ]
                },
                "unit": "none"
              },
              "overrides": []
            },
            "gridPos": {
              "h": 12,
              "w": 12,
              "x": 12,
              "y": 0
            },
            "id": 22,
            "links": [],
            "options": {
              "graph": {},
              "legend": {
                "calcs": [
                  "lastNotNull",
                  "max",
                  "mean"
                ],
                "displayMode": "table",
                "placement": "right",
                "showLegend": true
              },
              "tooltip": {
                "mode": "multi",
                "sort": "desc"
              }
            },
            "pluginVersion": "7.5.6",
            "targets": [
              {
                "datasource": {
                  "type": "prometheus",
                  "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
                },
                "editorMode": "code",
                "exemplar": true,
                "expr": "avg(max(dns_metric_request_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod))",
                "format": "time_series",
                "hide": false,
                "interval": "",
                "intervalFactor": 2,
                "legendFormat": "Sum DNS Request Count",
                "range": true,
                "refId": "B",
                "step": 60
              }
            ],
            "title": "Total DNSCache Request",
            "type": "timeseries"
          },
          {
            "datasource": {
              "type": "prometheus",
              "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
            },
            "fieldConfig": {
              "defaults": {
                "color": {
                  "mode": "palette-classic"
                },
                "custom": {
                  "axisLabel": "",
                  "axisPlacement": "auto",
                  "axisSoftMax": 1,
                  "barAlignment": 0,
                  "drawStyle": "line",
                  "fillOpacity": 10,
                  "gradientMode": "opacity",
                  "hideFrom": {
                    "graph": false,
                    "legend": false,
                    "tooltip": false,
                    "viz": false
                  },
                  "lineInterpolation": "linear",
                  "lineWidth": 2,
                  "pointSize": 5,
                  "scaleDistribution": {
                    "type": "linear"
                  },
                  "showPoints": "never",
                  "spanNulls": true,
                  "stacking": {
                    "group": "A",
                    "mode": "none"
                  },
                  "thresholdsStyle": {
                    "mode": "off"
                  }
                },
                "mappings": [],
                "min": 0,
                "noValue": "0",
                "thresholds": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "red",
                      "value": null
                    },
                    {
                      "color": "green",
                      "value": 0.8
                    }
                  ]
                },
                "unit": "percentunit"
              },
              "overrides": []
            },
            "gridPos": {
              "h": 9,
              "w": 12,
              "x": 0,
              "y": 12
            },
            "id": 23,
            "links": [],
            "options": {
              "graph": {},
              "legend": {
                "calcs": [
                  "lastNotNull",
                  "max",
                  "mean"
                ],
                "displayMode": "table",
                "placement": "right",
                "showLegend": true
              },
              "tooltip": {
                "mode": "multi",
                "sort": "asc"
              }
            },
            "pluginVersion": "7.5.6",
            "targets": [
              {
                "datasource": {
                  "type": "prometheus",
                  "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
                },
                "editorMode": "code",
                "exemplar": true,
                "expr": "topk($topk, max(dns_metric_cache_hit_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod) / max(dns_metric_request_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod)) ",
                "format": "time_series",
                "hide": false,
                "interval": "",
                "intervalFactor": 2,
                "legendFormat": "[NS]{{namespace}} - [Pod]{{pod}}",
                "range": true,
                "refId": "B",
                "step": 60
              }
            ],
            "title": "TopK DNSCache Cache hit rate",
            "type": "timeseries"
          },
          {
            "datasource": {
              "type": "prometheus",
              "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
            },
            "fieldConfig": {
              "defaults": {
                "color": {
                  "mode": "palette-classic"
                },
                "custom": {
                  "axisLabel": "",
                  "axisPlacement": "auto",
                  "barAlignment": 0,
                  "drawStyle": "line",
                  "fillOpacity": 10,
                  "gradientMode": "opacity",
                  "hideFrom": {
                    "graph": false,
                    "legend": false,
                    "tooltip": false,
                    "viz": false
                  },
                  "lineInterpolation": "linear",
                  "lineWidth": 2,
                  "pointSize": 5,
                  "scaleDistribution": {
                    "type": "linear"
                  },
                  "showPoints": "never",
                  "spanNulls": true,
                  "stacking": {
                    "group": "A",
                    "mode": "none"
                  },
                  "thresholdsStyle": {
                    "mode": "off"
                  }
                },
                "mappings": [],
                "min": 0,
                "noValue": "0",
                "thresholds": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "green",
                      "value": null
                    }
                  ]
                },
                "unit": "none"
              },
              "overrides": []
            },
            "gridPos": {
              "h": 9,
              "w": 12,
              "x": 12,
              "y": 12
            },
            "id": 25,
            "links": [],
            "options": {
              "graph": {},
              "legend": {
                "calcs": [
                  "lastNotNull",
                  "max",
                  "mean"
                ],
                "displayMode": "table",
                "placement": "right",
                "showLegend": true
              },
              "tooltip": {
                "mode": "multi",
                "sort": "desc"
              }
            },
            "pluginVersion": "7.5.6",
            "targets": [
              {
                "datasource": {
                  "type": "prometheus",
                  "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
                },
                "editorMode": "code",
                "exemplar": true,
                "expr": "topk($topk, max(dns_metric_request_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod))",
                "format": "time_series",
                "hide": false,
                "interval": "",
                "intervalFactor": 2,
                "legendFormat": "[NS]{{namespace}} - [Pod]{{pod}}",
                "range": true,
                "refId": "B",
                "step": 60
              }
            ],
            "title": "Topk DNSCache Request",
            "type": "timeseries"
          },
          {
            "datasource": {
              "type": "prometheus",
              "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
            },
            "fieldConfig": {
              "defaults": {
                "color": {
                  "mode": "palette-classic"
                },
                "custom": {
                  "axisLabel": "",
                  "axisPlacement": "auto",
                  "axisSoftMax": 1,
                  "barAlignment": 0,
                  "drawStyle": "line",
                  "fillOpacity": 10,
                  "gradientMode": "opacity",
                  "hideFrom": {
                    "graph": false,
                    "legend": false,
                    "tooltip": false,
                    "viz": false
                  },
                  "lineInterpolation": "linear",
                  "lineWidth": 2,
                  "pointSize": 5,
                  "scaleDistribution": {
                    "type": "linear"
                  },
                  "showPoints": "never",
                  "spanNulls": true,
                  "stacking": {
                    "group": "A",
                    "mode": "none"
                  },
                  "thresholdsStyle": {
                    "mode": "off"
                  }
                },
                "mappings": [],
                "min": 0,
                "noValue": "0",
                "thresholds": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "red",
                      "value": null
                    },
                    {
                      "color": "green",
                      "value": 0.8
                    }
                  ]
                },
                "unit": "percentunit"
              },
              "overrides": []
            },
            "gridPos": {
              "h": 9,
              "w": 12,
              "x": 0,
              "y": 21
            },
            "id": 24,
            "links": [],
            "options": {
              "graph": {},
              "legend": {
                "calcs": [
                  "lastNotNull",
                  "max",
                  "mean"
                ],
                "displayMode": "table",
                "placement": "right",
                "showLegend": true
              },
              "tooltip": {
                "mode": "multi",
                "sort": "asc"
              }
            },
            "pluginVersion": "7.5.6",
            "targets": [
              {
                "datasource": {
                  "type": "prometheus",
                  "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
                },
                "editorMode": "code",
                "exemplar": true,
                "expr": "bottomk($topk, max(dns_metric_cache_hit_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod) / max(dns_metric_request_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod)) ",
                "format": "time_series",
                "hide": false,
                "interval": "",
                "intervalFactor": 2,
                "legendFormat": "[NS]{{namespace}} - [Pod]{{pod}}",
                "range": true,
                "refId": "B",
                "step": 60
              }
            ],
            "title": "BottomK DNSCache Cache hit rate",
            "type": "timeseries"
          },
          {
            "datasource": {
              "type": "prometheus",
              "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
            },
            "fieldConfig": {
              "defaults": {
                "color": {
                  "mode": "palette-classic"
                },
                "custom": {
                  "axisLabel": "",
                  "axisPlacement": "auto",
                  "barAlignment": 0,
                  "drawStyle": "line",
                  "fillOpacity": 10,
                  "gradientMode": "opacity",
                  "hideFrom": {
                    "graph": false,
                    "legend": false,
                    "tooltip": false,
                    "viz": false
                  },
                  "lineInterpolation": "linear",
                  "lineWidth": 2,
                  "pointSize": 5,
                  "scaleDistribution": {
                    "type": "linear"
                  },
                  "showPoints": "never",
                  "spanNulls": true,
                  "stacking": {
                    "group": "A",
                    "mode": "none"
                  },
                  "thresholdsStyle": {
                    "mode": "off"
                  }
                },
                "mappings": [],
                "min": 0,
                "noValue": "0",
                "thresholds": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "green",
                      "value": null
                    }
                  ]
                },
                "unit": "none"
              },
              "overrides": []
            },
            "gridPos": {
              "h": 9,
              "w": 12,
              "x": 12,
              "y": 21
            },
            "id": 26,
            "links": [],
            "options": {
              "graph": {},
              "legend": {
                "calcs": [
                  "lastNotNull",
                  "max",
                  "mean"
                ],
                "displayMode": "table",
                "placement": "right",
                "showLegend": true
              },
              "tooltip": {
                "mode": "multi",
                "sort": "desc"
              }
            },
            "pluginVersion": "7.5.6",
            "targets": [
              {
                "datasource": {
                  "type": "prometheus",
                  "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
                },
                "editorMode": "code",
                "exemplar": true,
                "expr": "bottomk($topk, max(dns_metric_request_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod))",
                "format": "time_series",
                "hide": false,
                "interval": "",
                "intervalFactor": 2,
                "legendFormat": "[NS]{{namespace}} - [Pod]{{pod}}",
                "range": true,
                "refId": "B",
                "step": 60
              }
            ],
            "title": "Bottomk DNSCache Request",
            "type": "timeseries"
          }
        ],
        "refresh": "",
        "schemaVersion": 36,
        "showAgsLink": "cn",
        "style": "dark",
        "tags": [],
        "templating": {
          "list": [
            {
              "current": {
                "selected": false,
                "text": "prom-c22490dd721cc4de09f97506062e6508d",
                "value": "prom-c22490dd721cc4de09f97506062e6508d"
              },
              "hide": 2,
              "includeAll": false,
              "multi": false,
              "name": "datasource",
              "options": [],
              "query": "prometheus",
              "refresh": 1,
              "regex": "/^prom-c22490dd721cc4de09f97506062e6508d/",
              "skipUrlSync": false,
              "type": "datasource"
            },
            {
              "allValue": ".*",
              "current": {},
              "datasource": {
                "type": "prometheus",
                "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
              },
              "definition": "label_values({__name__=~ \"dns_metric_request_count\"},namespace)",
              "hide": 0,
              "includeAll": true,
              "label": "Namespace",
              "multi": true,
              "name": "Namespace",
              "options": [],
              "query": {
                "query": "label_values({__name__=~ \"dns_metric_request_count\"},namespace)",
                "refId": "StandardVariableQuery"
              },
              "refresh": 1,
              "regex": "",
              "skipUrlSync": false,
              "sort": 3,
              "tagValuesQuery": "",
              "tagsQuery": "",
              "type": "query",
              "useTags": false
            },
            {
              "allValue": ".*",
              "current": {},
              "datasource": {
                "type": "prometheus",
                "uid": "${DS_PROM-C22490DD721CC4DE09F97506062E6508D}"
              },
              "definition": "label_values({__name__=~\"dns_metric_request_count\",namespace=~\"$Namespace\"}, pod)",
              "hide": 0,
              "includeAll": true,
              "label": "Pod",
              "multi": true,
              "name": "Pod",
              "options": [],
              "query": {
                "query": "label_values({__name__=~\"dns_metric_request_count\",namespace=~\"$Namespace\"}, pod)",
                "refId": "StandardVariableQuery"
              },
              "refresh": 1,
              "regex": "",
              "skipUrlSync": false,
              "sort": 0,
              "type": "query"
            },
            {
              "current": {
                "selected": false,
                "text": "100",
                "value": "100"
              },
              "hide": 0,
              "name": "topk",
              "options": [
                {
                  "selected": true,
                  "text": "100",
                  "value": "100"
                }
              ],
              "query": "100",
              "skipUrlSync": false,
              "type": "textbox"
            }
          ]
        },
        "time": {
          "from": "now-30m",
          "to": "now"
        },
        "timepicker": {
          "now": true,
          "refresh_intervals": [
            "5s",
            "10s",
            "30s",
            "1m",
            "5m",
            "15m",
            "30m",
            "1h",
            "2h",
            "1d"
          ],
          "time_options": [
            "5m",
            "15m",
            "1h",
            "6h",
            "12h",
            "24h",
            "2d",
            "7d",
            "30d"
          ]
        },
        "timezone": "",
        "title": "DNSCache",
        "uid": "tLUYqpIvz",
        "version": 5,
        "weekStart": ""
      }
    2. In the data source drop-down list, select the actual Prometheus instance, and then click Import.

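Before or after importing, it can be useful to list the queries that the dashboard runs and confirm that they reference only the metrics collected in step 2. The following Python sketch does this for a trimmed two-panel excerpt of the dashboard JSON, embedded so that the snippet runs on its own. The helper names are hypothetical, and the check only looks at names with the dns_metric_ prefix, so treat it as a rough sanity check rather than a PromQL parser.

```python
import json
import re

# Metric names kept by the scrape job in step 2.
COLLECTED = {
    "dns_metric_cache_expired_count",
    "dns_metric_cache_hit_count",
    "dns_metric_cache_miss_count",
    "dns_metric_cache_updated_count",
    "dns_metric_request_count",
    "dns_metric_response_count",
}

# Trimmed excerpt of the dashboard JSON (two panels, unrelated fields omitted).
# In practice, load the full exported file with json.load instead.
DASHBOARD_EXCERPT = r'''
{
  "title": "DNSCache",
  "panels": [
    {
      "title": "Avg DNSCache Cache hit rate",
      "targets": [
        {"expr": "avg(max(dns_metric_cache_hit_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod) / max(dns_metric_request_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod))"}
      ]
    },
    {
      "title": "Total DNSCache Request",
      "targets": [
        {"expr": "avg(max(dns_metric_request_count{namespace=~\"$Namespace\", pod=~\"$Pod\"}) by (namespace, pod))"}
      ]
    }
  ]
}
'''


def panel_queries(dashboard):
    """Return (panel title, PromQL expression) pairs for every panel target."""
    return [
        (panel["title"], target["expr"])
        for panel in dashboard.get("panels", [])
        for target in panel.get("targets", [])
    ]


def uncollected_metrics(expr):
    """Return dns_metric_* names used in expr that the scrape job does not keep."""
    return set(re.findall(r"\bdns_metric_\w+", expr)) - COLLECTED


for title, expr in panel_queries(json.loads(DASHBOARD_EXCERPT)):
    missing = uncollected_metrics(expr)
    status = "ok" if not missing else "missing: " + ", ".join(sorted(missing))
    print(f"{title}: {status}")
```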

Metric descriptions and sample data

# HELP dns_metric_cache_expired_count request handler found an expired cache
# TYPE dns_metric_cache_expired_count gauge
dns_metric_cache_expired_count{namespace="default",pod="test-pod-1"} 552
dns_metric_cache_expired_count{namespace="default",pod="test-pod-metrics"} 17376
# HELP dns_metric_cache_hit_count request handler respond dns request with dns cache
# TYPE dns_metric_cache_hit_count gauge
dns_metric_cache_hit_count{namespace="default",pod="test-pod-1"} 2
dns_metric_cache_hit_count{namespace="default",pod="test-pod-metrics"} 1
# HELP dns_metric_cache_miss_count request handler unable to find a cache
# TYPE dns_metric_cache_miss_count gauge
dns_metric_cache_miss_count{namespace="default",pod="test-pod-1"} 27
dns_metric_cache_miss_count{namespace="default",pod="test-pod-metrics"} 73
# HELP dns_metric_cache_updated_count response handler updated a local cache
# TYPE dns_metric_cache_updated_count gauge
dns_metric_cache_updated_count{namespace="default",pod="test-pod-1"} 579
dns_metric_cache_updated_count{namespace="default",pod="test-pod-metrics"} 17449
# HELP dns_metric_request_count request handlers handles dns requests
# TYPE dns_metric_request_count gauge
dns_metric_request_count{namespace="default",pod="test-pod-1"} 581
dns_metric_request_count{namespace="default",pod="test-pod-metrics"} 17450
# HELP dns_metric_response_count response handlers handles dns responses
# TYPE dns_metric_response_count gauge
dns_metric_response_count{namespace="default",pod="test-pod-1"} 579
dns_metric_response_count{namespace="default",pod="test-pod-metrics"} 17449
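The dashboard's hit-rate panels divide dns_metric_cache_hit_count by dns_metric_request_count for each Pod. As a worked example, the Python sketch below parses the sample values above and computes the same ratio. The parser is deliberately minimal and assumes that every series carries exactly the namespace and pod labels, in that order, as in the sample. In this sample, hit + miss + expired equals the request count for both Pods (for test-pod-1, 2 + 27 + 552 = 581), which suggests, but does not prove, that each request is counted under exactly one of those three outcomes.

```python
import re
from collections import defaultdict

# Value lines copied from the sample above (HELP and TYPE comments omitted).
SAMPLE = """\
dns_metric_cache_expired_count{namespace="default",pod="test-pod-1"} 552
dns_metric_cache_expired_count{namespace="default",pod="test-pod-metrics"} 17376
dns_metric_cache_hit_count{namespace="default",pod="test-pod-1"} 2
dns_metric_cache_hit_count{namespace="default",pod="test-pod-metrics"} 1
dns_metric_cache_miss_count{namespace="default",pod="test-pod-1"} 27
dns_metric_cache_miss_count{namespace="default",pod="test-pod-metrics"} 73
dns_metric_request_count{namespace="default",pod="test-pod-1"} 581
dns_metric_request_count{namespace="default",pod="test-pod-metrics"} 17450
"""

# Matches: name{namespace="...",pod="..."} value
LINE = re.compile(r'^(\w+)\{namespace="([^"]*)",pod="([^"]*)"\}\s+(\S+)$')


def parse(text):
    """Group metric values by (namespace, pod)."""
    values = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE.match(line)
        if match:
            name, namespace, pod, value = match.groups()
            values[(namespace, pod)][name] = float(value)
    return values


def hit_rates(values):
    """Compute cache hits divided by requests for each Pod that has requests."""
    rates = {}
    for key, metrics in values.items():
        requests = metrics.get("dns_metric_request_count", 0.0)
        hits = metrics.get("dns_metric_cache_hit_count", 0.0)
        if requests > 0:
            rates[key] = hits / requests
    return rates


rates = hit_rates(parse(SAMPLE))
for (namespace, pod), rate in sorted(rates.items()):
    print(f"{namespace}/{pod}: hit rate {rate:.4%}")
```

Both sample Pods have a hit rate well below 1%, so they would sit near the bottom of the BottomK hit-rate panel and far under the 0.8 threshold at which the dashboard switches from red to green.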