Manage ServiceMonitors of environment instances with Terraform

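By defining a ServiceMonitor resource in Terraform and adding it to Prometheus, you can discover and collect performance metrics from specific services, so that your operations team can monitor system performance and health precisely.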

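Prerequisites

An environment of the container service type has been created. For details, see Manage environment instances with Terraform.

Limits

Only environments of the container service type are supported.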

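Add a ServiceMonitor to a container service environment instance

  1. Create a working directory, and in it create a configuration file named main.tf to define the ServiceMonitor resource.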

    provider "alicloud" {
      # access_key = "************"
      # secret_key = "************"
      # region     = "cn-beijing"
    }

    # ServiceMonitor configuration for the container service environment instance.
    resource "alicloud_arms_env_service_monitor" "myServiceMonitor1" {
      environment_id = "<ID of the container service environment, for example env-xxxxx>"
      config_yaml    = <<-EOT
        apiVersion: monitoring.coreos.com/v1
        kind: ServiceMonitor
        metadata:
          name: my-service-monitor1 # Name of the ServiceMonitor
          namespace: arms-prom      # Namespace of the ServiceMonitor
          annotations:
            # Annotation required by arms-prometheus: 'true' marks this
            # ServiceMonitor as active, 'false' marks it as inactive.
            arms.prometheus.io/discovery: 'true'
        spec:
          endpoints:
          - interval: 30s
            port: operator
            path: /metrics
          - interval: 10s
            port: operator1
            path: /metrics
          namespaceSelector:
            any: true
          selector:
            matchLabels:
              app: xxx
      EOT
    }
     
  2. Run the following command to initialize the Terraform runtime environment.

    terraform init

    Expected output:

    Initializing the backend...
    
    Initializing provider plugins...
    - Checking for available provider plugins...
    - Downloading plugin for provider "alicloud" (hashicorp/alicloud) 1.90.1...
    ...
    
    You may now begin working with Terraform. Try running "terraform plan" to see
    any changes that are required for your infrastructure. All Terraform commands
    should now work.
    
    If you ever set or change modules or backend configuration for Terraform,
    rerun this command to reinitialize your working directory. If you forget, other
    commands will detect it and remind you to do so if necessary.
  3. Run the following command to generate an execution plan.

    terraform plan

    Expected output:

    Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
     + create
    
    Terraform will perform the following actions:
    
     # alicloud_arms_env_service_monitor.myServiceMonitor1 will be created
     + resource "alicloud_arms_env_service_monitor" "myServiceMonitor1" {
         environment_id = "xxx"
         config_yaml    = ....
     }
    
    Plan: 1 to add, 0 to change, 0 to destroy.
  4. Run the following command to create the ServiceMonitor.

    terraform apply

    Expected output:

    Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
     + create
    
    Terraform will perform the following actions:
    
     # alicloud_arms_env_service_monitor.myServiceMonitor1 will be created
     + resource "alicloud_arms_env_service_monitor" "myServiceMonitor1" {
         environment_id = "xxx"
         config_yaml    = ....
     }
    
    Plan: 1 to add, 0 to change, 0 to destroy.
    
    Do you want to perform these actions?
     Terraform will perform the actions described above.
     Only 'yes' will be accepted to approve.
    
     Enter a value: yes

    When prompted with Enter a value, enter yes. The ServiceMonitor configuration for the current environment instance is then created.

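  5. Log on to the ARMS console.

  6. In the left-side navigation pane, click Integration Management. In the environment list on the Integration Management page, click the name of the target environment to open its details page.

  7. On the environment details page, click the Metric Scraping tab, and check in the ServiceMonitor list whether the ServiceMonitor configuration has been created.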
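
The configuration in step 1 hardcodes the environment ID and embeds the YAML inline. As a minimal sketch of an alternative layout, the fragment below passes the environment ID in through an input variable, reads the ServiceMonitor YAML from a separate file, and exposes the resource ID as an output. The resource label `example`, the variable name `environment_id`, the file name `service-monitor.yaml`, and the output name `service_monitor_id` are all hypothetical and chosen only for illustration. The fragment is meant to stand in for the resource block of step 1, not to be added alongside it.

```hcl
# Hypothetical input variable; supply it with -var or a .tfvars file.
variable "environment_id" {
  type        = string
  description = "ID of the container service environment, for example env-xxxxx"
}

# The label "example" is an assumption; use whichever label your configuration uses.
resource "alicloud_arms_env_service_monitor" "example" {
  environment_id = var.environment_id

  # Assumes the ServiceMonitor YAML from step 1 was saved next to main.tf
  # under the hypothetical file name service-monitor.yaml.
  config_yaml = file("${path.module}/service-monitor.yaml")
}

# Prints the ID that Terraform recorded for the ServiceMonitor after apply.
output "service_monitor_id" {
  value = alicloud_arms_env_service_monitor.example.id
}
```

Keeping the YAML in its own file lets it be checked with ordinary Kubernetes tooling, and the variable lets the same configuration be reused across environments.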

Delete the ServiceMonitor of a container service environment instance

  1. Run the following command to delete the ServiceMonitor created with Terraform.

    terraform destroy

    Expected output:

    ...
    Do you really want to destroy all resources?
     Terraform will destroy all your managed infrastructure, as shown above.
     There is no undo. Only 'yes' will be accepted to confirm.
    
     Enter a value: yes
    ...
    Destroy complete! Resources: 1 destroyed.

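    When prompted with Enter a value, enter yes.

  2. Log on to the ARMS console.

  3. In the left-side navigation pane, click Integration Management. In the environment list on the Integration Management page, click the name of the target environment to open its details page.

  4. On the environment details page, click the Metric Scraping tab, and check in the ServiceMonitor list whether the ServiceMonitor configuration has been deleted.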
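
If the goal is only to stop collection for a while rather than remove the resource, the `arms.prometheus.io/discovery` annotation described in the configuration comments may be an alternative to `terraform destroy`. The sketch below sets the annotation to `'false'` in the existing resource block. It assumes that the provider accepts an in-place change to `config_yaml` when you rerun `terraform apply`, which is worth confirming with `terraform plan` first. The resource label `example` is hypothetical and stands in for the label used in your main.tf.

```hcl
# Sketch only: edit the existing resource block instead of destroying it.
resource "alicloud_arms_env_service_monitor" "example" {
  environment_id = "<ID of the container service environment, for example env-xxxxx>"
  config_yaml    = <<-EOT
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: my-service-monitor1
      namespace: arms-prom
      annotations:
        # 'false' marks this ServiceMonitor as inactive.
        arms.prometheus.io/discovery: 'false'
    spec:
      endpoints:
      - interval: 30s
        port: operator
        path: /metrics
      namespaceSelector:
        any: true
      selector:
        matchLabels:
          app: xxx
  EOT
}
```

Setting the annotation back to `'true'` and applying again should make the ServiceMonitor active again under the same assumption.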