Python application monitoring

For Python applications deployed in Container Service for Kubernetes (ACK) clusters, such as web applications built with Django, Flask, or FastAPI, or AI and LLM applications built with LlamaIndex or LangChain, you can use Application Real-Time Monitoring Service (ARMS) to monitor application performance. To enable monitoring, install the ack-onepilot component and modify your Dockerfile. Once monitoring is enabled, ARMS provides application topology, tracing, API call analysis, anomaly detection, and detailed tracking of large model interactions.

Prerequisites

Step 1: Install the ack-onepilot component

Important

The legacy arms-pilot component is no longer maintained. Install the upgraded ack-onepilot component to monitor your applications instead. ack-onepilot is fully compatible with arms-pilot, so you can migrate without modifying your application configurations. For more information, see How do I uninstall arms-pilot and install ack-onepilot?

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Components and Add-ons.

  3. In the Logs and Monitoring section, find the ack-onepilot add-on and click Install on its card. In the dialog box, configure the parameters and click OK to complete the installation. We recommend that you keep the default values.

    Note

    Make sure that the ack-onepilot version is 3.2.4 or later. With its default resources, ack-onepilot supports up to 1,000 pods. For every additional 1,000 pods beyond this limit, increase the CPU allocated to ack-onepilot by 0.5 cores and the memory by 512 MB.

    After installation, you can upgrade, configure, or uninstall the ack-onepilot add-on on the Add-ons page.
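
The sizing rule in the note above can be expressed as a small calculation. The following is a minimal sketch, not an official sizing tool: the function name is hypothetical, and it assumes that a partial block of 1,000 extra pods is rounded up to a full block, which the note does not state explicitly.

```python
import math

BASE_PODS = 1000        # pods supported with the default resources (from the note)
EXTRA_CPU_CORES = 0.5   # CPU cores to add per additional 1,000 pods
EXTRA_MEMORY_MB = 512   # memory (MB) to add per additional 1,000 pods


def extra_onepilot_resources(pod_count):
    """Return (extra CPU cores, extra memory in MB) for a cluster with pod_count pods.

    Assumption: a partial block of 1,000 extra pods counts as a full block.
    """
    if pod_count < 0:
        raise ValueError("pod_count must be non-negative")
    extra_blocks = math.ceil(max(0, pod_count - BASE_PODS) / BASE_PODS)
    return extra_blocks * EXTRA_CPU_CORES, extra_blocks * EXTRA_MEMORY_MB


if __name__ == "__main__":
    for pods in (800, 1500, 3000):
        cpu, memory = extra_onepilot_resources(pods)
        print(f"{pods} pods: add {cpu} cores and {memory} MB")
```

For example, under the rounding assumption a cluster with 1,500 pods needs one extra block (0.5 cores and 512 MB), and a cluster with 3,000 pods needs two.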

Step 2: Grant permissions to access ARMS resources

  • To monitor applications in an ACK cluster that does not have the addon.arms.token secret, you must manually grant permissions to the cluster to access ARMS resources. If the addon.arms.token secret already exists, you can skip this step.

    Note

    When addon.arms.token exists in an ACK cluster, ARMS can automatically complete the password-free authorization process. Typically, ACK managed clusters include addon.arms.token by default. However, some older ACK managed clusters may be missing addon.arms.token.

    1. Check whether the addon.arms.token secret exists.

      1. Log on to the ACK console. In the left navigation pane, click Clusters.

      2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Configurations > Secrets.

      3. On the Secrets page, select the namespace from the Namespace drop-down list (typically kube-system) and check whether addon.arms.token is in the list.

    2. If the addon.arms.token secret exists, no further action is required. Otherwise, manually grant permissions to the cluster to access ARMS resources.

      1. Log on to the ACK console. In the left navigation pane, click Clusters.

      2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Cluster Information.

      3. On the Basic Information tab, click the link next to Worker RAM Role in the Cluster Resources section.

      4. On the Role page, on the Permissions tab, click Add Permissions.

      5. Select the AliyunARMSFullAccess policy and click OK.

  • If you want to monitor applications in an ACK managed cluster that is integrated with Elastic Container Instance (ECI), go to the RAM Quick Authorization page to complete the authorization. Then, restart all pods of the ack-onepilot add-on.
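
The secret check in this step can also be scripted instead of done in the console. The following is a minimal sketch under several assumptions: kubectl is installed and configured for the target cluster, the secret lives in the kube-system namespace, and the function name and its `run` parameter are hypothetical helpers introduced here for illustration.

```python
import subprocess


def secret_exists(name="addon.arms.token", namespace="kube-system", run=subprocess.run):
    """Return True if `kubectl get secret` succeeds for the given secret.

    Assumptions: kubectl is on PATH and points at the target cluster, and the
    secret is stored in kube-system. A non-zero exit code is treated as
    "not found", although it can also mean that kubectl could not reach the cluster.
    The `run` parameter only exists so the command runner can be replaced in tests.
    """
    result = run(
        ["kubectl", "get", "secret", name, "--namespace", namespace],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

If `secret_exists()` returns False, check the kubectl error output before concluding that the secret is missing, then grant the AliyunARMSFullAccess policy as described above.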

Step 3: Integrate the ARMS Python agent into your Dockerfile

Modify your Dockerfile to integrate the ARMS Python agent and use it to start your Python application.

  1. Download the agent installer from the PyPI repository.

    pip3 install aliyun-bootstrap

  2. Use aliyun-bootstrap to install the agent.

    # Replace xxx with your region ID, for example cn-hangzhou.
    ARMS_REGION_ID=xxx aliyun-bootstrap -a install

    Note

    To install a specific version of the Python agent, run the following command:

    # Replace ${version} with the actual version number.
    aliyun-bootstrap -a install -v ${version}

    For a list of all released Python agent versions, see Release notes of the Python agent.

  3. Start the application with the ARMS Python agent.

    aliyun-instrument python app.py

  4. Build the image.

The following code shows a sample Dockerfile before and after modification:

    Original Dockerfile

    # Use the Python 3.10 base image
    FROM docker.m.daocloud.io/python:3.10
    
    # Set the working directory
    WORKDIR /app
    
    # Copy the requirements.txt file to the working directory
    COPY requirements.txt .
    
    # Install dependencies using pip
    RUN pip install --no-cache-dir -r requirements.txt
    
    COPY ./app.py /app/app.py
    # Expose port 8000
    EXPOSE 8000
    CMD ["python","app.py"]

    Modified Dockerfile

    # Use the Python 3.10 base image
    FROM docker.m.daocloud.io/python:3.10
    
    # Set the working directory
    WORKDIR /app
    
    # Copy the requirements.txt file to the working directory
    COPY requirements.txt .
    
    # Install dependencies using pip
    RUN pip install --no-cache-dir -r requirements.txt
    
    # Install the ARMS Python agent. Replace xxx with your region ID.
    RUN pip3 install aliyun-bootstrap && ARMS_REGION_ID=xxx aliyun-bootstrap -a install
    
    COPY ./app.py /app/app.py
    
    # Expose port 8000
    EXPOSE 8000
    
    # Start the application through the ARMS Python agent
    CMD ["aliyun-instrument","python","app.py"]
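
After building the image, it can help to confirm that the agent commands used above are actually available inside the container. The following is a minimal sketch that assumes aliyun-bootstrap and aliyun-instrument are installed as commands on PATH, as the commands in this step suggest. The function names are hypothetical.

```python
import shutil

# Command names taken from the steps above.
AGENT_COMMANDS = ("aliyun-bootstrap", "aliyun-instrument")


def find_agent_commands(commands=AGENT_COMMANDS):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


def missing_agent_commands(commands=AGENT_COMMANDS):
    """Return the command names that could not be found on PATH."""
    return [name for name, path in find_agent_commands(commands).items() if path is None]


if __name__ == "__main__":
    missing = missing_agent_commands()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All agent commands are available.")
```

Run the script with the Python interpreter inside the built image. A missing command usually means the `RUN pip3 install aliyun-bootstrap ...` line did not execute in the image that is being run.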

Step 4: Enable ARMS monitoring

  1. Log on to the ACK console. In the left navigation pane, click Clusters.

  2. On the Clusters page, click the name of your cluster. In the left navigation pane, click Workloads > Deployments.

  3. On the Deployments page, find your application, open the more-actions menu in the Actions column, and click Edit YAML.

  4. In the YAML file, add the following labels under spec.template.metadata.

    labels:
      aliyun.com/app-language: python # Required. Specifies the application language as Python.
      armsPilotAutoEnable: 'on'
      armsPilotCreateAppName: "deployment-name"    # The application's display name in the ARMS console.

    Important

    If the version of the ack-onepilot component is 5.0.0 or later, the component automatically downloads and injects the Python agent package when you add these labels. This enables fully non-intrusive monitoring for Python applications, so you do not need to modify the startup command in the Dockerfile. If you do not want to use this feature, or if you have already installed the Python agent manually in your container, disable it by adding the armsAutoInstrumentationEnable label:

    labels:
      aliyun.com/app-language: python # Required. Specifies the application language as Python.
      armsPilotAutoEnable: 'on'
      armsPilotCreateAppName: "deployment-name"    # The application's display name in the ARMS console.
      armsAutoInstrumentationEnable: "off"  # Disables non-intrusive agent injection.

    The following code provides a complete YAML template to create a Deployment and enable application monitoring:

    Complete YAML sample

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: arms-python-client
      name: arms-python-client
      namespace: arms-demo
    spec:
      progressDeadlineSeconds: 600
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app: arms-python-client
      strategy:
        rollingUpdate:
          maxSurge: 25%
          maxUnavailable: 25%
        type: RollingUpdate
      template:
        metadata:
          labels:
            app: arms-python-client
            aliyun.com/app-language: python # Required. Specifies the application language as Python.
            armsPilotAutoEnable: 'on'
            armsPilotCreateAppName: "arms-python-client"    # The application's display name in the ARMS console.
        spec:
          containers:
            - image: registry.cn-hangzhou.aliyuncs.com/arms-default/python-agent:arms-python-client
              imagePullPolicy: Always
              name: client
              resources:
                requests:
                  cpu: 250m
                  memory: 300Mi
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: arms-python-server
      name: arms-python-server
      namespace: arms-demo
    spec:
      progressDeadlineSeconds: 600
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app: arms-python-server
      strategy:
        rollingUpdate:
          maxSurge: 25%
          maxUnavailable: 25%
        type: RollingUpdate
      template:
        metadata:
          labels:
            app: arms-python-server
            aliyun.com/app-language: python # Required. Specifies the application language as Python.
            armsPilotAutoEnable: 'on'
            armsPilotCreateAppName: "arms-python-server"    # The application's display name in the ARMS console.
        spec:
          containers:
            - env:
              - name: CLIENT_URL
                value: 'http://arms-python-client-svc:8000'
              image: registry.cn-hangzhou.aliyuncs.com/arms-default/python-agent:arms-python-server
              imagePullPolicy: Always
              name: server
              resources:
                requests:
                  cpu: 250m
                  memory: 300Mi
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: arms-python-server
      name: arms-python-server-svc
      namespace: arms-demo
    spec:
      internalTrafficPolicy: Cluster
      ipFamilies:
        - IPv4
      ipFamilyPolicy: SingleStack
      ports:
        - name: http
          port: 8000
          protocol: TCP
          targetPort: 8000
      selector:
        app: arms-python-server
      sessionAffinity: None
      type: ClusterIP
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: arms-python-client-svc
      namespace: arms-demo
    spec:
      internalTrafficPolicy: Cluster
      ipFamilies:
        - IPv4
      ipFamilyPolicy: SingleStack
      ports:
        - name: http
          port: 8000
          protocol: TCP
          targetPort: 8000
      selector:
        app: arms-python-client
      sessionAffinity: None
      type: ClusterIP
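
If Deployment manifests are generated or patched by a script rather than edited in the console, the labels from this step can be added programmatically. The following is a minimal sketch that works on a manifest already loaded into a Python dict. The function name and its parameters are hypothetical, while the label keys and values are the ones listed above.

```python
import copy
import json


def add_arms_labels(deployment, app_name, disable_auto_injection=False):
    """Return a copy of a Deployment manifest (a dict) with the ARMS labels
    added under spec.template.metadata.labels. The input is not modified."""
    patched = copy.deepcopy(deployment)
    labels = (
        patched.setdefault("spec", {})
        .setdefault("template", {})
        .setdefault("metadata", {})
        .setdefault("labels", {})
    )
    labels["aliyun.com/app-language"] = "python"  # Required: application language
    labels["armsPilotAutoEnable"] = "on"          # Must stay a string, not a boolean
    labels["armsPilotCreateAppName"] = app_name   # Display name in the ARMS console
    if disable_auto_injection:
        # Only relevant when the agent is already installed in the image.
        labels["armsAutoInstrumentationEnable"] = "off"
    return patched


if __name__ == "__main__":
    manifest = {"spec": {"template": {"metadata": {"labels": {"app": "arms-python-client"}}}}}
    print(json.dumps(add_arms_labels(manifest, "arms-python-client"), indent=2))
```

When the patched dict is written back to YAML, keep 'on' and "off" quoted. YAML 1.1 parsers read the unquoted words as booleans, and Kubernetes label values must be strings.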
    

Step 5: View monitoring details

  1. After about one minute, log on to the ARMS console. In the left navigation pane, choose Application Monitoring > Applications to view your Python application and its reported data.

    In the application list, you can see the connected arms-python-client application and its metrics, such as requests per second, error rate, and average response time.

  2. Click the application name to open its monitoring page in the ARMS console, where you can view detailed monitoring information. For more information, see Application overview.
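
Metrics such as requests per second and average response time are derived from requests, so a freshly deployed application that receives no traffic may show little data. The following is a minimal sketch for sending a few test requests. The function name is hypothetical, and it assumes the application answers HTTP GET requests at the URL you pass in.

```python
import urllib.error
import urllib.request


def send_requests(url, count=10, timeout=5.0):
    """Send `count` GET requests to `url` and return (succeeded, failed).

    HTTP error statuses (4xx/5xx) and connection errors both count as failed.
    """
    succeeded = failed = 0
    for _ in range(count):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                response.read()
            succeeded += 1
        except (urllib.error.URLError, OSError):
            failed += 1
    return succeeded, failed
```

With the sample manifests above, a call such as `send_requests("http://arms-python-server-svc:8000/")` would have to run from a pod in the arms-demo namespace, because the Service name resolves only inside the cluster. The root path `/` is an assumption about the sample application's routes.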

(Optional) Step 6: Release resources

If you no longer need to monitor your Python application with ARMS, you can uninstall the ARMS Python agent to stop monitoring. For more information, see Uninstall the Python agent.