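Manage PodMonitors for environment instances with Terraform

In Kubernetes, a PodMonitor resource specifies which Pods Prometheus should monitor. You can define a PodMonitor resource in Terraform and add it to Prometheus so that the Prometheus instance automatically discovers the target Pods and scrapes their metrics, which can help improve the stability and reliability of your system.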

Prerequisites

An environment of the Container Service type has been created. For details, see Manage environment instances with Terraform.

Limits

Only environments of the Container Service type are supported.

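Add a PodMonitor to a Container Service environment instance

  1. Create a working directory, and in it create a configuration file named main.tf that defines the PodMonitor resource.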

    provider "alicloud" {
      # access_key = "************"
      # secret_key = "************"
      # region = "cn-beijing"
    }
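    # PodMonitor configuration for the Container Service environment instance.
    resource "alicloud_arms_env_pod_monitor" "my-pod-monitor1" {
      environment_id = "env-xxxxx" # Replace with the ID of your Container Service environment.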
      config_yaml = <<-EOT
        apiVersion: monitoring.coreos.com/v1
        kind: PodMonitor
        metadata:
          name: my-pod-monitor1 # Name of the PodMonitor
          namespace: arms-prom # Namespace in which the PodMonitor resides
          annotations:
            arms.prometheus.io/discovery: 'true' # Annotation required by arms-prometheus: 'true' marks this PodMonitor as active, 'false' marks it as inactive.
        spec:
          selector:
            matchLabels:
              app: xxx
              release: yyy
          namespaceSelector:
            any: true
          podMetricsEndpoints:
            - interval: 30s
              targetPort: 9335
              path: /metrics
            - interval: 10s
              targetPort: 9335
              path: /metrics1
      EOT
    }
    
  2. Run the following command to initialize the Terraform working directory.

    terraform init

    Expected output:

    Initializing the backend...
    
    Initializing provider plugins...
    - Checking for available provider plugins...
    - Downloading plugin for provider "alicloud" (hashicorp/alicloud) 1.90.1...
    ...
    
    You may now begin working with Terraform. Try running "terraform plan" to see
    any changes that are required for your infrastructure. All Terraform commands
    should now work.
    
    If you ever set or change modules or backend configuration for Terraform,
    rerun this command to reinitialize your working directory. If you forget, other
    commands will detect it and remind you to do so if necessary.
  3. Run the following command to generate an execution plan.

    terraform plan

    Expected output:

    Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
     + create
    
    Terraform will perform the following actions:
    
     # alicloud_arms_env_pod_monitor.my-pod-monitor1 will be created
     + resource "alicloud_arms_env_pod_monitor" "my-pod-monitor1" {
         environment_id = "xxx"
         config_yaml = ....
     }
    
    Plan: 1 to add, 0 to change, 0 to destroy.
  4. Run the following command to create the PodMonitor.

    terraform apply

    Expected output:

    Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
     + create
    
    Terraform will perform the following actions:
    
     # alicloud_arms_env_pod_monitor.my-pod-monitor1 will be created
     + resource "alicloud_arms_env_pod_monitor" "my-pod-monitor1" {
         environment_id = "xxx"
         config_yaml = ....
     }
    
    Plan: 1 to add, 0 to change, 0 to destroy.
    
    Do you want to perform these actions?
     Terraform will perform the actions described above.
     Only 'yes' will be accepted to approve.
    
     Enter a value: yes

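    When prompted with Enter a value, enter yes. The PodMonitor is then created for the current environment instance.

  5. Log on to the Prometheus console.

  6. In the left-side navigation pane, click Integration Management. In the environment list on the Integration Management page, click the name of the target environment to open its details page.

  7. On the environment details page, click the Metric Scraping tab and check that the new PodMonitor configuration appears in the PodMonitor list.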
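
The arms.prometheus.io/discovery annotation from step 1 takes 'true' or 'false'. As a minimal sketch, assuming you want to switch the PodMonitor between active and inactive without editing the YAML by hand, the flag and the environment ID can be lifted into Terraform variables. The variable names environment_id and discovery_enabled and the output name pod_monitor_id are hypothetical; the resource type and its two arguments are the ones used in main.tf above. This block is a variant of the resource in main.tf and would replace it, not sit alongside it.

```hcl
# Hypothetical input variables; the names are illustrative, not part of the provider.
variable "environment_id" {
  type        = string
  description = "ID of the Container Service environment, for example env-xxxxx"
}

variable "discovery_enabled" {
  type        = bool
  default     = true
  description = "Value written to the arms.prometheus.io/discovery annotation"
}

resource "alicloud_arms_env_pod_monitor" "my-pod-monitor1" {
  environment_id = var.environment_id
  config_yaml    = <<-EOT
    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: my-pod-monitor1
      namespace: arms-prom
      annotations:
        # Rendered as 'true' or 'false'; the quotes keep the value a YAML string.
        arms.prometheus.io/discovery: '${var.discovery_enabled}'
    spec:
      selector:
        matchLabels:
          app: xxx
          release: yyy
      namespaceSelector:
        any: true
      podMetricsEndpoints:
        - interval: 30s
          targetPort: 9335
          path: /metrics
  EOT
}

# Prints the ID that Terraform recorded for the PodMonitor after apply.
output "pod_monitor_id" {
  value = alicloud_arms_env_pod_monitor.my-pod-monitor1.id
}
```

With this variant, the values could be supplied in a terraform.tfvars file or on the command line with the -var option of terraform apply. Setting discovery_enabled to false writes 'false' into the annotation, which, as noted in step 1, marks the PodMonitor as inactive.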

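Delete the PodMonitor from a Container Service environment instance

  1. Run the following command to delete the PodMonitor that was created with Terraform.

    terraform destroy

    Expected output: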

    ...
    Do you really want to destroy all resources?
     Terraform will destroy all your managed infrastructure, as shown above.
     There is no undo. Only 'yes' will be accepted to confirm.
    
     Enter a value: yes
    ...
    Destroy complete! Resources: 1 destroyed.

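    When prompted with Enter a value, enter yes.

  2. Log on to the Prometheus console.

  3. In the left-side navigation pane, click Integration Management. In the environment list on the Integration Management page, click the name of the target environment to open its details page.

  4. On the environment details page, click the Metric Scraping tab and check that the PodMonitor configuration no longer appears in the PodMonitor list.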
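
As the confirmation prompt states, terraform destroy removes every resource managed in the working directory. If the same directory also manages other resources, one common Terraform pattern is to gate the PodMonitor behind a count flag instead. The following is a sketch of that pattern under stated assumptions, not the procedure documented above: the variable name create_pod_monitor is hypothetical, and the YAML is abbreviated to the placeholder values from main.tf.

```hcl
# Hypothetical flag: set it to false and run terraform apply to delete only the PodMonitor.
variable "create_pod_monitor" {
  type    = bool
  default = true
}

resource "alicloud_arms_env_pod_monitor" "my-pod-monitor1" {
  # A count of 0 tells Terraform that no instance of this resource should exist.
  count = var.create_pod_monitor ? 1 : 0

  environment_id = "env-xxxxx" # Placeholder, as in main.tf
  config_yaml    = <<-EOT
    apiVersion: monitoring.coreos.com/v1
    kind: PodMonitor
    metadata:
      name: my-pod-monitor1
      namespace: arms-prom
      annotations:
        arms.prometheus.io/discovery: 'true'
    spec:
      selector:
        matchLabels:
          app: xxx
      namespaceSelector:
        any: true
      podMetricsEndpoints:
        - interval: 30s
          targetPort: 9335
          path: /metrics
  EOT
}
```

With the flag set to false, terraform plan should list the PodMonitor as the only resource to destroy. Review the plan before you apply it.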