Run image build tasks on ACS instances

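In day-to-day container workflows, you may need to build images inside an ACS cluster. ACS provides a feature-verification image built on the standard capabilities of the community buildah project, which you can use to create new container images and push them to an image repository.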

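Prerequisites

When you build images in an ACS cluster, the network requirements mainly concern pulling and pushing images. Before you start building, make sure that:

  • The Pod has network connectivity to the repository from which the base image is pulled.

  • The Pod has network connectivity to the target repository to which the image is pushed.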
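
The two connectivity requirements above could be checked from a shell before building. The following is a minimal sketch, not an official ACS tool: the function names are hypothetical, it assumes that image references include the registry hostname, and it assumes that curl is available where it runs (it may not be present in the buildah image). It relies on the registry HTTP API convention that a request to /v2/ returns 200 or 401 when the registry is reachable.

```shell
#!/bin/sh
# Extract the registry hostname (everything before the first "/")
# from an image reference. Assumes the reference includes a hostname.
registry_host() {
  printf '%s\n' "${1%%/*}"
}

# Probe the registry's /v2/ endpoint. HTTP 200 and 401 both mean the
# registry answered; anything else suggests a network or DNS problem.
check_registry() {
  host=$(registry_host "$1")
  code=$(curl -s -o /dev/null -m 10 -w '%{http_code}' "https://${host}/v2/") || code=000
  case "$code" in
    200|401) echo "reachable: ${host} (HTTP ${code})" ;;
    *)       echo "NOT reachable: ${host} (HTTP ${code})"; return 1 ;;
  esac
}

# Example calls, using the placeholder addresses from this document:
# check_registry registry.cn-hangzhou.aliyuncs.com/eci_open/nginx:latest
# check_registry registry.cn-hangzhou.aliyuncs.com/yourRepoNS/yourRepoName:testTag
```

A probe like this only confirms that the registry endpoint answers. It does not confirm that the credentials or repository permissions are correct.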

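Basic function verification

The verification Pod uses the buildah base image to build a basic image from a Dockerfile in non-privileged mode and push it to the repository specified in the environment variables. The Pod's standard output log shows the commands executed inside the Pod. If the Pod finishes normally, the environment has the basic capabilities needed to build images with buildah in the cluster, and you can go on to add your own business logic.

The image used in this example, registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/buildah:v1.40.1, is built from the community image quay.io/buildah/stable:latest. To make sure that you are using the latest community version, consider using the latest community release or building from source.

By default, buildah does not reuse the intermediate layers produced during a build. To enable reuse, run buildah build --layers.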
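
As a minimal sketch of the --layers option mentioned above, the build command from the verification Pod could be wrapped as follows. The function name is hypothetical, and IMAGE_NAME is the same environment variable that the verification Pod uses. To the best of our knowledge, buildah also reads the BUILDAH_LAYERS environment variable as an alternative to the flag; treat that as an assumption to confirm against your buildah version.

```shell
#!/bin/sh
# Build the Dockerfile in the current directory with intermediate-layer
# caching enabled. Fails early if IMAGE_NAME is not set.
build_with_layers() {
  buildah build --layers -t "${IMAGE_NAME:?IMAGE_NAME must be set}" .
}

# Example:
# IMAGE_NAME=buildahtestimage build_with_layers
```

Cached layers live in buildah's local storage, so the option is likely to help only when that storage survives between builds, for example when it is placed on an external volume.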

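Step 1: Create the verification Pod

  1. Use the following Dockerfile to build the sample image.

    FROM registry.cn-hangzhou.aliyuncs.com/eci_open/nginx:latest
    RUN echo 'hello acs'
  2. Use the following YAML to create a Pod in the ACS cluster that builds and pushes the image. Modify the environment variables (env) in the YAML to match your environment.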

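    Environment variable   Description
    LOGIN_USERNAME         Username used to log in to the push target repository.
    LOGIN_PASSWORD         Password used to log in to the push target repository.
    TARGET_REPO            Full address of the push target image repository, including the namespace, repository name, and tag. Example: registry.cn-hangzhou.aliyuncs.com/yourRepoNS/yourRepoName:testTag
    TARGET_HOSTNAME        Domain name of the push target repository. Example: registry.cn-hangzhou.aliyuncs.com
    IMAGE_NAME             Local image name that buildah uses during the build (it does not affect the build or the push). Must consist of lowercase English letters. Example: buildahtestimage

    CPU scenario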

    kind: Pod
    apiVersion: v1
    metadata:
      name: buildah-demo-pod
      labels:
        alibabacloud.com/compute-qos: default
        alibabacloud.com/compute-class: general-purpose
    spec:
      restartPolicy: Never
      containers:
      - name: builder
        image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/buildah:v1.40.1
        command:
        - sh
        - -c
        - |
          echo "=== build and push  ==="
          echo "buildah info"
          buildah info
          echo "[step 1] create  Dockerfile... "
          echo -e "FROM registry.cn-hangzhou.aliyuncs.com/eci_open/nginx:latest\nRUN echo 'hello acs'" > Dockerfile
          cat Dockerfile
          echo ""
    
          echo "[step 2] start build (IMAGE_NAME=${IMAGE_NAME})..."
          if buildah build -t "${IMAGE_NAME}" .; then
            echo "√ build success!"
            buildah images
          else
            echo "× build failed"
            exit 1
          fi
          echo ""
    
          echo "[step 3] login repo (HOST=${TARGET_HOSTNAME})..."
          if buildah login -u "${LOGIN_USERNAME}" -p "${LOGIN_PASSWORD}" "${TARGET_HOSTNAME}"; then
            echo "√ login success"
          else
            echo "× login failed"
            exit 1
          fi
          echo ""
    
          echo "[step 4] push repo (TARGET=${TARGET_REPO})..."
          if buildah push "${IMAGE_NAME}" docker://"${TARGET_REPO}"; then
            echo "√ push success"
            echo "the repo is :${TARGET_REPO}"
          else
            echo "× push failed!"
            exit 1
          fi
    
          echo "=== end  ==="
        env:
          - name: LOGIN_USERNAME
            value: "TODO:targetRepoLoginUsername"
          - name: LOGIN_PASSWORD
            value: "TODO:targetRepoLoginPassword"
          - name: TARGET_REPO
            value: "registry.cn-hangzhou.aliyuncs.com/yourRepoNS/yourRepoName:testTag"
          - name: TARGET_HOSTNAME
            value: "registry.cn-hangzhou.aliyuncs.com"
          - name: IMAGE_NAME
            value: "buildahtestimage"    # The image name must use lowercase English letters
        imagePullPolicy: Always
        resources:
          limits:
            cpu: "1"
            memory: "2Gi"
          requests:
            cpu: "1"
            memory: "2Gi"

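    GPU scenario

    For how to set alibabacloud.com/gpu-model-series, see the GPU model series supported by ACS.

    AI container images used in GPU scenarios are usually large, so we recommend changing the storage driver to overlay. This requires /var/lib/containers/storage to be mounted separately on a file system that supports overlay. An emptyDir volume is the recommended way to give /var/lib/containers/storage overlay support.

    kind: Pod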
    apiVersion: v1
    metadata:
      name: buildah-demo-pod
      labels:
        # Set compute-class to the gpu type
        alibabacloud.com/compute-class: gpu
        alibabacloud.com/gpu-model-series: <GPU model series supported by ACS, for example GU8TF>
        alibabacloud.com/compute-qos: default
    spec:
      restartPolicy: Never
      volumes:
      - emptyDir: {}
        name: buildah
      containers:
      - name: builder
        image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/buildah:v1.40.1
        command:
        - sh
        - -c
        - |
          echo "=== build and push  ==="
          echo "buildah info"
          buildah info
          echo "[step 1] create  Dockerfile... "
          echo -e "FROM registry.cn-hangzhou.aliyuncs.com/eci_open/nginx:latest\nRUN echo 'hello acs'" > Dockerfile
          cat Dockerfile
          echo ""
    
          echo "[step 2] start build (IMAGE_NAME=${IMAGE_NAME})..."
          if buildah build -t "${IMAGE_NAME}" .; then
            echo "√ build success!"
            buildah images
          else
            echo "× build failed"
            exit 1
          fi
          echo ""
    
          echo "[step 3] login repo (HOST=${TARGET_HOSTNAME})..."
          if buildah login -u "${LOGIN_USERNAME}" -p "${LOGIN_PASSWORD}" "${TARGET_HOSTNAME}"; then
            echo "√ login success"
          else
            echo "× login failed"
            exit 1
          fi
          echo ""
    
          echo "[step 4] push repo (TARGET=${TARGET_REPO})..."
          if buildah push "${IMAGE_NAME}" docker://"${TARGET_REPO}"; then
            echo "√ push success"
            echo "the repo is :${TARGET_REPO}"
          else
            echo "× push failed!"
            exit 1
          fi
    
          echo "=== end  ==="
        env:
          - name: LOGIN_USERNAME
            value: "TODO:targetRepoLoginUsername"
          - name: LOGIN_PASSWORD
            value: "TODO:targetRepoLoginPassword"
          - name: TARGET_REPO
            value: "registry.cn-hangzhou.aliyuncs.com/yourRepoNS/yourRepoName:testTag"
          - name: TARGET_HOSTNAME
            value: "registry.cn-hangzhou.aliyuncs.com"
          - name: IMAGE_NAME
            value: "buildahtestimage"
        imagePullPolicy: Always
        volumeMounts:
        - mountPath: /var/lib/containers/storage
          name: buildah
        resources:
          limits:
            cpu: "1"
            memory: "2Gi"
            nvidia.com/gpu: 1
          requests:
            cpu: "1"
            memory: "2Gi"
            nvidia.com/gpu: 1
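
The Pod specs above place the registry password directly in the env section as plain text. As a hedged alternative, the credentials could be stored in a Kubernetes Secret and referenced from the Pod. The sketch below assumes that standard Kubernetes Secrets and secretKeyRef behave as usual in your cluster; the Secret name buildah-registry-cred, the function name, and the output file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical Secret name; adjust to your own naming.
SECRET_NAME="buildah-registry-cred"

# Create the Secret from literal values (run once, outside the Pod).
# $1 = registry username, $2 = registry password
create_registry_secret() {
  kubectl create secret generic "${SECRET_NAME}" \
    --from-literal=username="$1" \
    --from-literal=password="$2"
}

# Write the env fragment that would replace the plain-text
# LOGIN_USERNAME and LOGIN_PASSWORD entries in the Pod spec.
cat > login-env-fragment.yaml <<EOF
env:
  - name: LOGIN_USERNAME
    valueFrom:
      secretKeyRef:
        name: ${SECRET_NAME}
        key: username
  - name: LOGIN_PASSWORD
    valueFrom:
      secretKeyRef:
        name: ${SECRET_NAME}
        key: password
EOF
```

The remaining variables (TARGET_REPO, TARGET_HOSTNAME, IMAGE_NAME) are not sensitive and can stay as plain values.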

Step 2: Verify the build result

  1. View the container build log. If push success appears, the image was built and pushed successfully.

    kubectl logs buildah-demo-pod

    Expected output:

    === build and push  ===
    [step 1] create  Dockerfile... 
    FROM registry.cn-hangzhou.aliyuncs.com/eci_open/nginx:latest
    RUN echo 'hello acs'
    ...
    [step 4] push repo (TARGET=registry.cn-hangzhou.aliyuncs.com/yourRepoNS/yourRepoName:testTag)...
    Getting image source signatures
    ...
    Writing manifest to image destination
    √ push success
    the repo is :registry.cn-hangzhou.aliyuncs.com/yourRepoNS/yourRepoName:testTag
    === end  ===
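
Besides reading the log by eye, the result could be checked from a script. The following is a minimal sketch with hypothetical function names. It assumes that the Pod phase becomes Succeeded when the build script exits with status 0, and that the buildah info output printed at the start of the log contains a line of the form "GraphDriverName": "overlay", which is how we understand buildah reports its storage driver.

```shell
#!/bin/sh
POD=buildah-demo-pod

# Print the Pod phase, for example Succeeded or Failed.
pod_phase() {
  kubectl get pod "$1" -o jsonpath='{.status.phase}'
}

# Read buildah info output on stdin and print the storage driver name.
# Assumes a JSON line such as:  "GraphDriverName": "overlay",
graph_driver() {
  sed -n 's/.*"GraphDriverName": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Example:
# pod_phase "$POD"
# kubectl logs "$POD" | graph_driver
```

In the GPU scenario, the second command offers a quick way to see whether the overlay storage driver is actually in use.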

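Accelerate image builds with external storage

If you build container images from a Dockerfile and want to reuse the base image content rather than download the base image again on every build, consider using external storage as a backend cache for buildah. Mount that storage when the build container starts and use it to speed up the build.

Important
  • The following example creates a CPU Pod to verify build acceleration. For GPU Pods, only the L20 card type supports cloud disks. Other card types do not currently support them, so we recommend shared storage such as NAS.

  • When you use a cloud disk as backend storage, note that cloud disks do not support multi-attach.

Step 1: Prepare storage

You can use any method to populate the external storage medium with the base image (base layers) that you want to cache and share. As an example, this step caches the alinux3 image from the ACR Artifact Center (alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/alinux3:230602.1) to a cloud disk that is provisioned as a dynamic cloud disk volume.

  1. Use the following YAML to prepare cloud disk storage in the ACS cluster. The example uses the Immediate binding mode for verification.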

    allowVolumeExpansion: true
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: alicloud-disk-essd
    parameters:
      fstype: ext4
      performanceLevel: PL1
      type: cloud_essd
    provisioner: diskplugin.csi.alibabacloud.com
    reclaimPolicy: Delete
    volumeBindingMode: Immediate
    
    ---
    
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: disk-pvc
    spec:
      accessModes:
      - ReadWriteOncePod
      volumeMode: Filesystem
      resources:
        requests:
          storage: 20Gi
      storageClassName: alicloud-disk-essd
  2. Run the following command to check the PVC status.

    kubectl get pvc disk-pvc

    Expected output:

    # Output like the following indicates that the cloud disk is ready
    NAME       STATUS   VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS         VOLUMEATTRIBUTESCLASS   AGE
    disk-pvc   Bound    d-bp13vgtpxxphort*****   20Gi       RWOP           alicloud-disk-essd   <unset>                 40s

Step 2: Create the Pod

  1. Use the following Dockerfile to build the sample image.

    FROM alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/alinux3:230602.1
    
    CMD ["tail","-f","/dev/null"]
  2. Use the following YAML to create a Pod that pulls the base image of the sample image into the mounted cloud disk.

    kind: Pod 
    apiVersion: v1
    metadata:
      name: filler-pod
      labels:
        alibabacloud.com/compute-qos: default
        alibabacloud.com/compute-class: general-purpose
    spec:
      containers:
      - name: builder
        image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/buildah:v1.40.1
        command: ["buildah", "pull", "alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/alinux3:230602.1"]
        imagePullPolicy: Always
        resources:
          limits:
            cpu: "1"
            memory: "1Gi"
          requests:
            cpu: "1" 
            memory: "1Gi" 
        volumeMounts:
          - name: disk 
            mountPath: "/var/lib/containers/storage"
      restartPolicy: Never
      volumes:
      - name: disk 
        persistentVolumeClaim:
          claimName: disk-pvc
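  3. Run the following command to check the cache.

    kubectl logs filler-pod

    Expected output:

    # Output like the following indicates that the image data has been cached to the cloud disk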
    Trying to pull alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/alinux3:230602.1...
    Getting image source signatures
    Copying blob sha256:b078a9c0307ee7852fadced3640fe9851edfc9e0d9d4d8a0ae30542d51846d1b
    Copying blob sha256:006534dfe5bbb72ff9a9a84fa20e675c49dbec8e4c7e2b02b977d1bdb5af3fd4
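
Because the PVC uses ReadWriteOncePod and cloud disks do not support multi-attach, it is probably safest to make sure that filler-pod has finished, and to delete it, before creating another Pod that mounts the same PVC. The following polling helper is a minimal sketch under that assumption; the function name and the 5-second interval are hypothetical choices.

```shell
#!/bin/sh
# Poll until the Pod reaches a terminal phase, then print that phase.
# $1 = Pod name, $2 = maximum number of polls (default 60)
wait_for_pod() {
  name=$1
  tries=${2:-60}
  while [ "$tries" -gt 0 ]; do
    phase=$(kubectl get pod "$name" -o jsonpath='{.status.phase}')
    case "$phase" in
      Succeeded) echo "$phase"; return 0 ;;
      Failed)    echo "$phase"; return 1 ;;
    esac
    tries=$((tries - 1))
    sleep 5
  done
  echo "timeout"
  return 1
}

# Example: release the disk once the pull has completed.
# wait_for_pod filler-pod && kubectl delete pod filler-pod
```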

Step 3: Verify the cache

  1. Use the following YAML to create a new Pod.

    kind: Pod 
    apiVersion: v1
    metadata:
      name: builder-pod
      labels:
        alibabacloud.com/compute-qos: default
        alibabacloud.com/compute-class: general-purpose
    spec:
      containers:
      - name: builder
        image: registry-cn-hangzhou.ack.aliyuncs.com/ack-demo/buildah:v1.40.1
        command: ["buildah", "bud", "-t", "registry.cn-hangzhou.aliyuncs.com/acsimage/buildah:resultImage", "."]
        imagePullPolicy: Always
        resources:
          limits:
            cpu: "1"
            memory: "1Gi"
          requests:
            cpu: "1" 
            memory: "1Gi" 
        volumeMounts:
          - name: disk 
            mountPath: "/var/lib/containers/storage"
      restartPolicy: Never
      volumes:
      - name: disk 
        persistentVolumeClaim:
          claimName: disk-pvc
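  2. Run the following command to verify the cache effect.

    kubectl logs builder-pod

    Expected output:

    # Output like the following indicates that the image cache on the cloud disk was reused and the base image was not pulled again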
    STEP 1/2: FROM alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/alinux3:230602.1
    STEP 2/2: CMD ["tail","-f","/dev/null"]
    COMMIT registry.cn-hangzhou.aliyuncs.com/acsimage/buildah:resultImage
    Getting image source signatures
    Copying blob sha256:933412459c628636b34d7952531f5f6e62f9bc190529d1a15872df9e635b90c0
    ...
    Writing manifest to image destination
    --> c0fb82a0eb9d
    Successfully tagged registry.cn-hangzhou.aliyuncs.com/acsimage/buildah:resultImage
    c0fb82a0eb9d8be7c543ccc097d929b5f47ffd916a848d13fbdbff88f976c6b2
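
The builder Pod runs buildah bud against the current directory (.), so a Dockerfile is expected to be present there when the command starts. One possible way to provide it, assuming the same inline approach that the verification Pod uses, is to write the file from a shell step before the build. This is a sketch only; how the Dockerfile reaches the container in your setup may differ (for example, it may come from a mounted volume).

```shell
#!/bin/sh
# Write the sample Dockerfile from step 2 into the current directory,
# which is the build context passed to "buildah bud ... ."
cat > Dockerfile <<'EOF'
FROM alibaba-cloud-linux-3-registry.cn-hangzhou.cr.aliyuncs.com/alinux3/alinux3:230602.1

CMD ["tail","-f","/dev/null"]
EOF

# The build itself would follow in the same shell, for example:
# buildah bud -t registry.cn-hangzhou.aliyuncs.com/acsimage/buildah:resultImage .
```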