Enable control plane log collection and log-based alerting (legacy)

Service Mesh (ASM) supports control plane log collection and log-based alerting. For example, you can collect logs about configuration pushes from the ASM control plane to the sidecars on the data plane. This topic describes how to enable control plane log collection and log-based alerting.

Background information

One of the main functions of the ASM control plane is to push mesh rule configurations to the Sidecar proxies and gateways on the data plane. If a mesh rule configuration contains conflicts that cause the push to fail, the proxy or gateway does not receive the latest configuration and continues to run, without restarting, on the last successfully pushed configuration. However, if the pod of that proxy or gateway restarts, the Sidecar proxy or gateway might fail to start. In practice, misconfigurations are a common cause of unavailable gateways and proxies, so enable control plane log-based alerting to find and resolve these issues promptly.

Prerequisites

  • You have activated Simple Log Service (SLS) for your Alibaba Cloud account. For more information, see Activate Simple Log Service.

    Important

    ASM does not charge extra fees for log collection. However, SLS charges for data writes and for the features that you use. For more information about SLS billing, see Billing overview.

Enable control plane log collection

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Basic Information.

  3. On the Basic Information page, in the Configuration Information section, click Enable to the right of Control Plane Log Collection.

    Important

    The control plane Logstore saves log data for the last 30 days. Logs older than 30 days are discarded.

    • If you enable control plane log collection for the first time, the Enable Control Plane Logs dialog box appears. You can select Create New Project or Use Existing Project and then click Confirm.

      If you select Create New Project, you can use the default project name or specify a custom project name.

    • If you previously enabled and disabled control plane log collection, an Important Note dialog box appears when you enable the feature again. Click OK, and the previously specified project is automatically selected.

    After you enable control plane log collection, on the Basic Information page, click View Log to the right of Control Plane Log Collection to view the detailed control plane logs on the project page.

Enable control plane log-based alerting

Important

You must enable control plane log collection before you can enable control plane log-based alerting. Otherwise, this feature will not be available.

When the control plane sends an xDS request to the data plane and the data plane rejects the request, a data plane synchronization failure alert is triggered. In this case, the Sidecar proxy or ASM gateway on your data plane cannot receive the latest configuration information. The following two scenarios are possible:

  • If the Sidecar on the data plane has previously received a successful configuration push, it retains the last successfully pushed configuration.

  • If the Sidecar on the data plane has not previously received a successful configuration push, the Sidecar has no configuration information. As a result, the proxy might have no listeners and cannot process requests or apply routing rules.
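The two scenarios above can be sketched as a small state model. This is only an illustration: `ProxyState`, `receive_push`, and `can_serve` are hypothetical names, not part of ASM, Istio, or Envoy, and the `is_valid` flag stands in for whatever check causes the data plane to reject a push. The model also simplifies "might not have any listeners" to "cannot serve".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProxyState:
    """Hypothetical model of one Sidecar proxy or gateway; not an ASM API."""

    active_config: Optional[str] = None  # last configuration the proxy accepted

    def receive_push(self, config: str, is_valid: bool) -> bool:
        """Apply a pushed configuration; return False if the proxy rejects it."""
        if is_valid:
            self.active_config = config
            return True
        # Rejected push: the previous configuration (if any) stays active.
        return False

    def can_serve(self) -> bool:
        """A proxy that never accepted a configuration has nothing to serve with."""
        return self.active_config is not None


# Scenario 1: an earlier push succeeded, so the proxy keeps that configuration.
warm = ProxyState()
warm.receive_push("v1", is_valid=True)
warm.receive_push("v2-broken", is_valid=False)

# Scenario 2: no push ever succeeded, so the proxy has no configuration at all.
cold = ProxyState()
cold.receive_push("v2-broken", is_valid=False)

print(warm.active_config, warm.can_serve())
print(cold.active_config, cold.can_serve())
```

In the first scenario the rejected push is invisible to traffic until the pod restarts, which is why an alert on the rejection is the practical way to notice it.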

To enable control plane log-based alerting, perform the following steps:

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Basic Information.

  3. On the Basic Information page, in the Configuration Information section, click Alerting Settings to the right of Control Plane Log Collection.

  4. In the target action policy area of the Control Plane Log-based Alerting Settings dialog box, select Service Mesh ASM Built-in Action Policy (Recommended) or Custom Action Policy, and click Enable Alerting.

    An action policy defines the behavior when an alert is triggered. You can create and edit action policies in your SLS project. For more information, see Action policies.

  5. In the Important dialog box, click OK.

Configure alert notification recipients

You can configure the built-in ASM action policy in SLS to set alert notification recipients, notification templates, and other settings.

  1. Log on to the Simple Log Service console.

  2. In the Project List area, click the target project name, and then in the navigation pane on the left, click Alerting.

  3. On the Alert Center page, click Notification Recipient > User Group Management.

  4. On the User Group Management tab, click Modify in the Actions column of the SLS Service Mesh Built-in User Group.

  5. In the Modify User Group dialog box, select the target member in the Available Members area, click the Add icon to add the member to the Added Members area, and then click Confirm.

Example of triggering an alert notification

Note

This topic does not cover every alert type. The following example uses an incorrect configuration to trigger the "Failed to push configurations from the service mesh control plane" alert.

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Gateways > Gateway. On the page that appears, click Create from YAML.

  3. On the Create page, select the target namespace and any scenario template, and configure the YAML.

    This example creates a Gateway in the default namespace. The following code provides a YAML example:

    apiVersion: networking.istio.io/v1beta1
    kind: Gateway
    metadata:
      name: gateway-test
      namespace: default
    spec:
      selector:
        istio: ingressgateway
      servers:
        - hosts:
            - '*console.aliyun.com'
          port:
            name: https
            number: 443
            protocol: HTTPS
          tls:
            # Deliberately incorrect: this credential name is meant to trigger the alert.
            credentialName: not-existing-credential
            mode: SIMPLE

  4. View the alert notification.

    1. In the navigation pane on the left of the instance details page, choose ASM Instance > Basic Information.

    2. On the Basic Information page, in the Configuration Information section, click View Log to the right of Control Plane Log Collection.

    3. In the Simple Log Service console, search for 'ACK ERROR' to view the alert information.

      The following figure shows a sample output:

      If you have configured an email address for the notification recipient, you can view the alert notification in your mailbox.
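The Gateway in step 3 presumably fails because its `credentialName` points at a credential that does not exist. A pre-check along the following lines could catch that kind of mistake before the configuration is applied. This is a minimal sketch under stated assumptions: the Gateway is written as a Python dict to avoid a YAML dependency, `missing_credentials` is a hypothetical helper, and `known_secrets` is a made-up set that in practice would have to come from the cluster.

```python
from typing import Dict, List, Set

# The Gateway from step 3, transcribed as a Python dict.
gateway = {
    "metadata": {"name": "gateway-test", "namespace": "default"},
    "spec": {
        "servers": [
            {
                "hosts": ["*console.aliyun.com"],
                "port": {"name": "https", "number": 443, "protocol": "HTTPS"},
                "tls": {"credentialName": "not-existing-credential", "mode": "SIMPLE"},
            }
        ]
    },
}


def missing_credentials(gw: Dict, known_secrets: Set[str]) -> List[str]:
    """Return the credentialName values in gw that are not in known_secrets."""
    missing = []
    for server in gw["spec"].get("servers", []):
        name = server.get("tls", {}).get("credentialName")
        if name and name not in known_secrets:
            missing.append(name)
    return missing


# Hypothetical secret names; a real check would read them from the cluster.
known_secrets = {"example-credential"}
print(missing_credentials(gateway, known_secrets))
```

An empty result means every referenced credential name was found in the supplied set; it says nothing about whether the certificate itself is valid.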

Reference solutions for handling alerts

Diagnosed mesh configurations with warnings

ASM mesh diagnosis has found potentially risky mesh configurations in your cluster. These configurations might cause ASM to produce unexpected results. You can view the alert content on the mesh diagnosis page and follow the prompts to correct the configuration.

Diagnosed mesh configurations with errors

ASM mesh diagnosis has found incorrect mesh configurations in your cluster. These configurations have a high risk of causing unexpected behavior. You should view the alert content on the mesh diagnosis page as soon as possible and follow the prompts to correct the configuration.

Failed to push configurations from the control plane because mesh rule configurations do not meet specifications

The following entries list common data plane synchronization failure error messages and the suggested action for each. If you cannot find your error message here, submit a ticket.

Error message: Internal:Error adding/updating listener(s) 0.0.0.0_443: Failed to load certificate chain from <inline>, only P-256 ECDSA certificates are supported

Suggested action: The data plane does not support the certificate that you configured. Only P-256 ECDSA certificates are supported. Reconfigure the certificate. For more information, see Enable HTTPS security services using an ASM gateway.

Error message: Internal:Error adding/updating listener(s) 0.0.0.0_443: Invalid path: ****

Suggested action: The certificate path that you configured for the data plane is incorrect, or the certificate does not exist. Check whether the certificate mount path matches the path that is configured in the Gateway. For more information, see Enable HTTPS security services using an ASM gateway.

Error message: Internal:Error adding/updating listener(s) 0.0.0.0_xx: duplicate listener 0.0.0.0_xx found

Suggested action: A listening port is configured more than once for the gateway. Check your Gateway configuration and delete the duplicate port.

Error message: Internal:Error adding/updating listener(s) 192.168.33.189_15021: Didn't find a registered implementation for name: '***'

Suggested action: The implementation '***' that is referenced in the EnvoyFilter patch for the 15021 listener cannot be found in the Sidecar proxy or ingress gateway. Delete this reference.

Error message: Internal:Error adding/updating listener(s) 0.0.0.0_80: V2 (and AUTO) xDS transport protocol versions are deprecated in grpc_service ***

Suggested action: The xDS v2 protocol used on your data plane is deprecated. This issue usually occurs because the Sidecar version on the data plane does not match the control plane version. To resolve it, upgrade the Sidecar: delete the pod, and when the pod is automatically recreated, the latest Sidecar version is injected.
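The error messages and suggested actions above lend themselves to a simple lookup when triaging many alerts. The sketch below is illustrative only: `suggest_action` and `KNOWN_ERRORS` are hypothetical names rather than anything provided by ASM or SLS, the patterns are substrings taken from the messages listed above, and the action strings are shortened paraphrases.

```python
import re
from typing import Optional

# Patterns are taken from the error messages listed above; the helper itself
# is hypothetical and not part of ASM or SLS.
KNOWN_ERRORS = [
    (re.compile(r"only P-256 ECDSA certificates are supported"),
     "Reconfigure the gateway with a P-256 ECDSA certificate."),
    (re.compile(r"Invalid path:"),
     "Check that the certificate mount path matches the path in the Gateway."),
    (re.compile(r"duplicate listener \S+ found"),
     "Delete the duplicate port from the Gateway configuration."),
    (re.compile(r"Didn't find a registered implementation for name:"),
     "Delete the EnvoyFilter reference to the missing implementation."),
    (re.compile(r"xDS transport protocol versions are deprecated"),
     "Upgrade the Sidecar by deleting the pod so that it is recreated."),
]


def suggest_action(error_message: str) -> Optional[str]:
    """Return the suggested action for a known error message, else None."""
    for pattern, action in KNOWN_ERRORS:
        if pattern.search(error_message):
            return action
    return None  # Not listed: submit a ticket.


message = ("Internal:Error adding/updating listener(s) 0.0.0.0_80: "
           "duplicate listener 0.0.0.0_80 found")
print(suggest_action(message))
```

Matching on a stable fragment of each message, rather than the whole line, keeps the lookup independent of the listener address and port that appear in a particular alert.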

Related operations

Change the control plane log project

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose ASM Instance > Basic Information.

  3. On the Basic Information page, in the Configuration Information section, click Change Log Project to the right of Control Plane Log Collection. In the Change Log Project dialog box, make the required changes, and then click Confirm.