Use p2p acceleration

Peer-to-peer (P2P) acceleration speeds up image pulls and reduces application deployment time. It is particularly effective when many nodes in a large-scale container cluster pull images in batches. This topic describes how to use the p2p acceleration feature to accelerate image pulls.

Background

When many nodes in a large-scale container cluster pull an image at the same time, the network bandwidth of the container image registry can become a bottleneck, which slows down image pulls. The p2p acceleration feature uses the bandwidth of compute nodes to distribute images. This reduces pressure on the container image registry, significantly accelerates image pulls, and shortens application deployment time. In a test on a 1,000-node cluster pulling a 1 GB image, p2p acceleration reduced the image pull time by more than 95% compared to standard image pulls over a 10 Gbps network. Additionally, the new p2p acceleration solution is 30% to 50% faster than the previous version. By default, the new solution supports using p2p acceleration when you load container images on demand. For more information, see Load resources of a container image on demand.

You can use p2p acceleration in the following scenarios:

  • ACK clusters

  • On-premises clusters or clusters from third-party cloud service providers

Prerequisites

A p2p acceleration agent is installed.

Limitations

When you enable p2p acceleration, the p2p acceleration agent uses a webhook to replace your container image address with a p2p image address. For example, if your original image address is test****vpc.cn-hangzhou.cr.aliyuncs.com/docker-builder/nginx:latest, the new p2p-accelerated image address is test****vpc.distributed.cn-hangzhou.cr.aliyuncs.com:65001/docker-builder/nginx:latest.

The webhook also automatically generates an image pull secret for the accelerated image address by copying the original image pull secret. The creation of the p2p image pull secret and the replacement of the image address are asynchronous processes. To prevent image pull failures, apply the required image pull secret before you deploy the workload. Alternatively, you can manually create an image pull secret for p2p image pulls using the domain test-registry-vpc.distributed.cn-hangzhou.cr.aliyuncs.com:65001, and then deploy the workload.
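The address replacement described above can be sketched as follows. This is only an illustration of the naming pattern visible in the example addresses (a distributed label inserted after the first host label, plus port 65001). The function name to_p2p_address is hypothetical, and the actual webhook logic may differ.

```python
def to_p2p_address(image: str, port: int = 65001) -> str:
    """Rewrite an image address into the p2p form shown in this topic.

    Hypothetical helper: it mirrors the pattern in the documented example,
    not the real webhook implementation.
    """
    host, sep, path = image.partition("/")
    if not sep or "." not in host:
        raise ValueError(f"expected <registry-host>/<repository>:<tag>, got {image!r}")
    # Split the instance label (for example, test-registry-vpc) from the rest of the host.
    instance, rest = host.split(".", 1)
    return f"{instance}.distributed.{rest}:{port}/{path}"


original = "test-registry-vpc.cn-hangzhou.cr.aliyuncs.com/docker-builder/nginx:latest"
print(to_p2p_address(original))
```

The host part of the result (including the port) is the domain to use if you create the p2p image pull secret manually.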

Enable p2p acceleration

You can enable p2p acceleration by adding a label to an application workload, such as a pod or Deployment, or to a namespace of an ACK cluster. After you add the p2p acceleration label to a namespace, p2p acceleration is enabled for all eligible application workloads in the namespace. You do not need to modify the YAML files of the application workloads. Choose one of the following methods to add the p2p acceleration label:

Note

The label name is k8s.aliyun.com/image-accelerate-mode and the value is p2p.

  • Add the p2p acceleration label to an application workload.

    The following example shows how to add a label to a Deployment. Run the following command to edit the Deployment:

    kubectl edit deploy <Deployment-Name>

    Add the k8s.aliyun.com/image-accelerate-mode: p2p label to the pod template of the Deployment (spec.template.metadata.labels), as in the following example.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: test
      labels:
        app: nginx
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            # Enable p2p.
            k8s.aliyun.com/image-accelerate-mode: p2p
            app: nginx
        spec:
          # Your ACR instance image pull secret.
          imagePullSecrets:
          - name: test-registry
          containers:
          # Your ACR instance image.
          - image: test-registry-vpc.cn-hangzhou.cr.aliyuncs.com/docker-builder/nginx:latest
            name: test
            command: ["sleep", "3600"]

  • Add the p2p acceleration label to a namespace.

    • Add the p2p acceleration label by using the console.

      1. Log on to the ACK console. In the left-side navigation pane, choose Clusters.

      2. On the Clusters page, click the name of your cluster. In the left-side navigation pane, click Namespaces and Quotas.

      3. On the Namespace page, find the target namespace and click Edit in the Actions column.

      4. In the Edit the namespace dialog box, click +Labels, set Variable Key to k8s.aliyun.com/image-accelerate-mode, set Variable Value to p2p, and then click Confirm.

    • Add the p2p acceleration label by using the CLI.

      kubectl label namespaces <YOUR-NAMESPACE> k8s.aliyun.com/image-accelerate-mode=p2p

Verify p2p acceleration

After you enable p2p acceleration, the p2p component automatically injects the p2p-related annotation, the p2p-accelerated image address, and the corresponding image pull credential into the pod.

Important

The p2p image pull credential is the same as your original image pull credential, except for the image registry domain name. Therefore, if the user information in your original image pull credential is incorrect, the p2p image pull will also fail.

Run the following command to view the pod:

kubectl get pod <Pod-Name> -o yaml

Expected output:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    # Injects p2p annotations automatically.
    k8s.aliyun.com/image-accelerate-mode: p2p
    k8s.aliyun.com/p2p-config: '...'
spec:
  containers:
  # Injects the image to the p2p endpoint.
  - image: test-registry-vpc.distributed.cn-hangzhou.cr.aliyuncs.com:65001/docker-builder/nginx:latest
  imagePullSecrets:
  - name: test-registry
  # Injects the image pull secret for the p2p endpoint.
  - name: acr-credential-test-registry-p2p

The pod is injected with the p2p-related annotation, p2p-accelerated image address, and the corresponding image pull credential. This confirms that p2p acceleration is enabled.
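The check above can also be scripted. The following is a minimal sketch, assuming the pod manifest has already been loaded into a Python dictionary (for example, from the JSON output of kubectl). The function name p2p_injected is hypothetical, and the test for ".distributed." in the image address is a heuristic based on the example addresses in this topic.

```python
P2P_KEY = "k8s.aliyun.com/image-accelerate-mode"


def p2p_injected(pod: dict) -> bool:
    """Return True if the pod carries the p2p annotation and a p2p image address."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    containers = pod.get("spec", {}).get("containers") or []
    has_annotation = annotations.get(P2P_KEY) == "p2p"
    # Heuristic: p2p addresses in this topic contain a ".distributed." host label.
    has_p2p_image = any(".distributed." in c.get("image", "") for c in containers)
    return has_annotation and has_p2p_image


# Pod manifest reduced to the fields shown in the expected output.
pod = {
    "metadata": {"annotations": {P2P_KEY: "p2p", "k8s.aliyun.com/p2p-config": "..."}},
    "spec": {
        "containers": [
            {"image": "test-registry-vpc.distributed.cn-hangzhou.cr.aliyuncs.com:65001/docker-builder/nginx:latest"}
        ],
        "imagePullSecrets": [
            {"name": "test-registry"},
            {"name": "acr-credential-test-registry-p2p"},
        ],
    },
}
print(p2p_injected(pod))
```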

(Optional) Enable client metric collection

P2P metrics

Enable metrics

To enable metrics, set 'exporter.enable' to 'true' in the agent's YAML configuration.

p2p:

  v2:
    # Component for P2P v2.
    image: registry-vpc.__ACK_REGION_ID__.aliyuncs.com/acs/dadi-agent
    imageTag: v0.1.2-72276d4-aliyun

    # The maximum number of layers that can be concurrently downloaded by each node proxy.
    proxyConcurrencyLimit: 128

    # The server port to communicate with P2P nodes.
    p2pPort: 65002

    cache:
      # The disk cache capacity in bytes. Default value: 4 GB.
      capacity: 4294967296
      # Set this parameter to 1 if you are using high-performance disks, such as ESSD PL2/PL3, on your ECS instance.
      aioEnable: 0
    exporter:
      # Set this parameter to true if you want to collect component metrics.
      enable: false
      port: 65003

    # The limit for downstream throughput.
    throttleLimitMB: 512

Access method

The ExporterConfig section in the p2p YAML file defines the metric port.

ExporterConfig:
  enable: true # Specifies whether to enable metrics.
  port: 65006  # The listening port.
  standaloneExporterPort: true # Specifies whether to expose a standalone port. If this parameter is set to false, metrics are exported through the HTTP service port.

Run curl 127.0.0.1:$port/metrics, where $port is the configured metric port, to view the available metrics.

# HELP DADIP2P_Alive 
# TYPE DADIP2P_Alive gauge
DADIP2P_Alive{node="192.168.69.172:65005",mode="agent"} 1.000000 1692156721833

# HELP DADIP2P_Read_Throughtput Bytes / sec
# TYPE DADIP2P_Read_Throughtput gauge
DADIP2P_Read_Throughtput{node="192.168.69.172:65005",type="pread",mode="agent"} 0.000000 1692156721833
DADIP2P_Read_Throughtput{node="192.168.69.172:65005",type="download",mode="agent"} 0.000000 1692156721833
DADIP2P_Read_Throughtput{node="192.168.69.172:65005",type="peer",mode="agent"} 0.000000 1692156721833
DADIP2P_Read_Throughtput{node="192.168.69.172:65005",type="disk",mode="agent"} 0.000000 1692156721833
DADIP2P_Read_Throughtput{node="192.168.69.172:65005",type="http",mode="agent"} 0.000000 1692156721833

# HELP DADIP2P_QPS 
# TYPE DADIP2P_QPS gauge
DADIP2P_QPS{node="192.168.69.172:65005",type="pread",mode="agent"} 0.000000 1692156721833
DADIP2P_QPS{node="192.168.69.172:65005",type="download",mode="agent"} 0.000000 1692156721833
DADIP2P_QPS{node="192.168.69.172:65005",type="peer",mode="agent"} 0.000000 1692156721833
DADIP2P_QPS{node="192.168.69.172:65005",type="disk",mode="agent"} 0.000000 1692156721833
DADIP2P_QPS{node="192.168.69.172:65005",type="http",mode="agent"} 0.000000 1692156721833

# HELP DADIP2P_MaxLatency us
# TYPE DADIP2P_MaxLatency gauge
DADIP2P_MaxLatency{node="192.168.69.172:65005",type="pread",mode="agent"} 0.000000 1692156721833
DADIP2P_MaxLatency{node="192.168.69.172:65005",type="download",mode="agent"} 0.000000 1692156721833
DADIP2P_MaxLatency{node="192.168.69.172:65005",type="peer",mode="agent"} 0.000000 1692156721833
DADIP2P_MaxLatency{node="192.168.69.172:65005",type="disk",mode="agent"} 0.000000 1692156721833
DADIP2P_MaxLatency{node="192.168.69.172:65005",type="http",mode="agent"} 0.000000 1692156721833

# HELP DADIP2P_Count Bytes
# TYPE DADIP2P_Count gauge
DADIP2P_Count{node="192.168.69.172:65005",type="pread",mode="agent"} 0.000000 1692156721833
DADIP2P_Count{node="192.168.69.172:65005",type="download",mode="agent"} 0.000000 1692156721833
DADIP2P_Count{node="192.168.69.172:65005",type="peer",mode="agent"} 0.000000 1692156721833
DADIP2P_Count{node="192.168.69.172:65005",type="disk",mode="agent"} 0.000000 1692156721833
DADIP2P_Count{node="192.168.69.172:65005",type="http",mode="agent"} 0.000000 1692156721833

# HELP DADIP2P_Cache 
# TYPE DADIP2P_Cache gauge
DADIP2P_Cache{node="192.168.69.172:65005",type="allocated",mode="agent"} 4294967296.000000 1692156721833
DADIP2P_Cache{node="192.168.69.172:65005",type="used",mode="agent"} 4294971392.000000 1692156721833

# HELP DADIP2P_Label 
# TYPE DADIP2P_Label gauge

Metrics

Metric names

  • DADIP2P_Alive: Indicates whether the service is alive.

  • DADIP2P_Read_Throughtput: The throughput of the p2p service. Unit: bytes/s.

  • DADIP2P_QPS: Queries per second (QPS).

  • DADIP2P_MaxLatency: Latency statistics. Unit: μs.

  • DADIP2P_Count: Traffic statistics. Unit: bytes.

  • DADIP2P_Cache: The cache usage of a single node. Unit: bytes.

Tags

  • node: The service IP address and port of the p2p agent or root.

  • type: The metric type.

    • pread: Processes downstream requests.

    • download: Back-to-origin requests.

    • peer: P2P network distribution.

    • disk: Processes disk operations.

    • http: Processes HTTP requests.

    • allocated: The allocated cache space.

    • used: The used cache space.

Metric examples

DADIP2P_Count{node="11.238.108.XXX:9877",type="http",mode="agent"} 4248808352.000000 1692157615810
Total HTTP request traffic processed by the agent service: 4,248,808,352 bytes.

DADIP2P_Cache{node="11.238.108.XXX:9877",type="used",mode="agent"} 2147487744.000000 1692157615810
Current cache usage of the agent: 2,147,487,744 bytes.
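To work with these values programmatically, the metrics output can be parsed line by line. The following is a minimal sketch that handles only the sample layout shown in this topic (name, labels in braces, value, timestamp). The function name parse_metrics is hypothetical, and the two sample lines are taken from different examples above purely for illustration.

```python
import re

# One sample line: name{labels} value [timestamp]
SAMPLE_RE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)(?:\s+\d+)?$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text: str) -> list:
    """Parse metric lines into dictionaries; comment and blank lines are skipped."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # Ignore lines that do not follow the expected layout.
        samples.append({
            "name": match.group("name"),
            "labels": dict(LABEL_RE.findall(match.group("labels"))),
            "value": float(match.group("value")),
        })
    return samples


text = """\
# TYPE DADIP2P_Cache gauge
DADIP2P_Cache{node="192.168.69.172:65005",type="allocated",mode="agent"} 4294967296.000000 1692156721833
DADIP2P_Cache{node="11.238.108.XXX:9877",type="used",mode="agent"} 2147487744.000000 1692157615810
"""

cache = {s["labels"]["type"]: s["value"] for s in parse_metrics(text) if s["name"] == "DADIP2P_Cache"}
print(f"cache usage: {cache['used'] / cache['allocated']:.1%}")
```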

Audit logs

Enable audit logs

In the p2p ConfigMap, set the logAudit field to true.

DeployConfig:
  mode: agent
  logDir: /dadi-p2p/log
  logAudit: true
  logAuditMode: stdout # Sends logs to the console. Set the value to file to write logs to /dadi-p2p/log/audit.log.

Audit log format

Each entry records the processing time from when a request is received to when the result is returned. Unit: microseconds (μs).

2022/08/30 15:44:52|AUDIT|th=00007FBA247C5280|download[pathname=/https://cri-pi840la*****-registry.oss-cn-hangzhou.aliyuncs.com/docker/registry/v2/blobs/sha256/dd/dd65726c224b09836aeb6ecebd6baf58c96be727ba86da14e62835569896008a/data][offset=125829120][size=2097152][latency=267172]
....
2022/08/30 15:44:55|AUDIT|th=00007FBA2EFEAEC0|http:pread[pathname=/https://cri-pi840lacia*****-registry.oss-cn-hangzhou.aliyuncs.com/docker/registry/v2/blobs/sha256/dd/dd65726c224b09836aeb6ecebd6baf58c96be727ba86da14e62835569896008a/data][offset=127467520][size=65536][latency=21]

Each entry consists of fields separated by |: the time, the AUDIT marker, the thread pointer, and the operation code followed by bracketed parameters such as [pathname=], [offset=], [size=], and [latency=].

In most cases, you can ignore the AUDIT and thread pointer fields. size is the data size of a single request, and a negative value indicates an exception. latency is the latency of a single request. Unit: microseconds (μs).

Common operation codes:

  • http:pread: The HTTP proxy processes a downstream data request.

  • rpc:stat: The p2p agent obtains the file size.

  • rpc:pread: The p2p agent processes a downstream data request.

  • download: The p2p agent downloads data from an upstream source.

  • filewrite: The p2p agent writes the current data chunk to the cache.

  • fileread: The p2p agent reads a data chunk from the cache.

Log examples

download[pathname=mytest][offset=0][size=65536][latency=26461]
  ## The latency when the p2p agent downloads the data segment [0,65536) of the mytest file from the upstream source is 26,461 μs.
rpc:pread[pathname=mytest][offset=0][size=65536][latency=2]
  ## The latency when the p2p agent returns the data segment [0,65536) of the mytest file to the downstream client is 2 μs.
http:pread[pathname=mytest][offset=0][size=65536][latency=26461]
  ## The latency when the proxy downloads the data segment [0,65536) of the mytest file from the upstream source is 26,461 μs.
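Audit entries in this format are straightforward to parse for further analysis. The following is a minimal sketch, assuming the field layout described above. The function name parse_audit_line is hypothetical, and the sample line combines the prefix of a real entry with the simplified mytest example.

```python
import re

# Bracketed parameters such as [offset=0] or [latency=26461].
FIELD_RE = re.compile(r"\[(\w+)=([^\]]*)\]")


def parse_audit_line(line: str) -> dict:
    """Split one audit entry into its time, thread, operation code, and parameters."""
    timestamp, marker, thread, operation = line.strip().split("|", 3)
    if marker != "AUDIT":
        raise ValueError("not an audit entry")
    opcode = operation.partition("[")[0]
    fields = dict(FIELD_RE.findall(operation))
    size = int(fields["size"])
    return {
        "time": timestamp,
        "thread": thread,
        "opcode": opcode,
        "pathname": fields.get("pathname"),
        "offset": int(fields["offset"]),
        "size": size,
        "latency_us": int(fields["latency"]),
        "failed": size < 0,  # A negative size indicates an exception.
    }


line = "2022/08/30 15:44:52|AUDIT|th=00007FBA247C5280|download[pathname=mytest][offset=0][size=65536][latency=26461]"
entry = parse_audit_line(line)
print(entry["opcode"], entry["size"], entry["latency_us"], entry["failed"])
```

Collecting latency_us per opcode makes it easier to see whether time is spent on upstream downloads, cache access, or serving downstream requests.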

(Optional) Disable p2p for on-demand loading

Note

Subsequent node maintenance operations may overwrite this configuration.

  1. Log on to the ACK console. In the left-side navigation pane, choose Clusters.

  2. On the Clusters page, click the name of the target cluster. In the left-side navigation pane, choose Nodes > Nodes.

  3. On the Node page, click the Instance ID that corresponds to the IP address of the target node.

  4. On the instance details page, use Connect to log on to the node.

  5. Use the vi command to edit the /etc/overlaybd/overlaybd.json file. In the p2pConfig section, change the value of enable to false.

    {
        "p2pConfig": {
            "enable": false,
            "address": "https://localhost:6****/accelerator"
        },
        ... ...
    }

  6. Run the following command to restart the load-on-demand component:

    service overlaybd-tcmu restart

Appendix

Performance reference

Image pull performance

Test image specifications:

  • 4 GB (512 MB * 8 layers)

  • 10 GB (10 GB * 1 layer)

  • 20 GB (4 GB * 5 layers, 10 GB * 2 layers, 512 MB * 40 layers, 20 GB * 1 layer, 2 GB * 10 layers)

Test environment

  • ACK cluster: 1,000 nodes

  • ECS specification: 4 vCPUs, 8 GB memory

  • Cloud disk specification: 200 GB ESSD PL1

  • p2p acceleration agent specification: 1 vCPU, 1 GB memory, 4 GB cache

Test scenario

1,000 nodes pulling the same image (including image download and decompression).

Test results (P95 time consumption)

Image specification | Time                     | Peak source throughput (Gbps)
512 MB * 8 layers   | 116 seconds              | 2
10 GB * 1 layer     | 6 minutes and 20 seconds | 1.2
4 GB * 5 layers     | 9 minutes and 15 seconds | 5.1
10 GB * 2 layers    | 9 minutes and 50 seconds | 6.7
512 MB * 40 layers  | 7 minutes and 55 seconds | 3.8
20 GB * 1 layer     | 11 minutes               | 2.5
2 GB * 10 layers    | 8 minutes and 13 seconds | 3.2
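One way to read these results is to compare the average rate at which the cluster as a whole received data with the peak throughput of the source. The following is a rough sketch for the 512 MB * 8 layers row, assuming binary units (1 MB = 2^20 bytes) and decimal Gbps. The function name aggregate_gbps is hypothetical. Because the reported time is a P95 value that includes decompression, the result is an estimate derived from the table, not a measured figure.

```python
def aggregate_gbps(layer_mb: int, layers: int, nodes: int, seconds: float) -> float:
    """Average cluster-wide delivery rate in Gbps (10^9 bits per second)."""
    image_bits = layer_mb * 2**20 * layers * 8  # Image size in bits.
    return image_bits * nodes / seconds / 1e9


# 512 MB * 8 layers pulled by 1,000 nodes in 116 seconds.
rate = aggregate_gbps(layer_mb=512, layers=8, nodes=1000, seconds=116)
peak_source_gbps = 2  # From the results table.
print(f"aggregate delivery: about {rate:.0f} Gbps, peak source throughput: {peak_source_gbps} Gbps")
```

The gap between the two numbers reflects the share of the traffic that is served by peer nodes instead of the image registry.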