Preparations

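To submit training jobs quickly, prepare the required resources before you create a job, and configure the images, datasets, and code sets that the job may use. PAI supports datasets stored on NAS file systems, CPFS, or Object Storage Service (OSS), as well as Git code sets. This topic describes the preparations to make before you submit a training job.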

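Prerequisites

If you use OSS as the storage system, make sure that the service-linked role has been granted the OSS access permissions that your business requires. Otherwise, I/O errors may occur when data is accessed after OSS is mounted. For information about how to grant OSS access permissions to the service-linked role, see Cloud service dependencies and authorization: DLC.

Limits

OSS is a distributed object store, not a true file system. When you use OSS as the storage system, some file system features are therefore unavailable. For example, after OSS is mounted, you cannot append to or overwrite files that already exist.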
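
Because existing files on a mounted OSS path cannot be appended to or overwritten, one possible workaround is to write every output as a new, uniquely named file. The following is a minimal sketch of that pattern, not a PAI API. It assumes that creating new files on the mounted path is supported. The function name write_new_file, the naming scheme, and the demo directory are hypothetical.

```python
import os
import tempfile
import time
import uuid


def write_new_file(directory: str, prefix: str, data: bytes) -> str:
    """Write data to a brand-new file and return its path.

    Mode "xb" (exclusive creation) raises FileExistsError if the path
    already exists, so an existing file is never appended to or overwritten.
    """
    os.makedirs(directory, exist_ok=True)
    # A timestamp plus a random suffix keeps names unique across workers.
    name = f"{prefix}-{int(time.time())}-{uuid.uuid4().hex[:8]}.bin"
    path = os.path.join(directory, name)
    with open(path, "xb") as f:
        f.write(data)
    return path


if __name__ == "__main__":
    # Demo against a local temporary directory. In a training job this
    # would be the mount path of the OSS dataset (hypothetical).
    demo_dir = tempfile.mkdtemp()
    print(write_new_file(demo_dir, "checkpoint", b"step-1"))
```

Each call produces a separate file, so a training script can emit one checkpoint or log shard per call instead of reopening the same file in append mode.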

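Step 1: Prepare resources

Before you submit a training job, prepare the computing resources for the AI training. Choose one of the following resource types:

  • Prepare a public resource group

    After you complete DLC authorization, the public resource group of general computing resources is ready for use. You do not need to add a resource group manually. For more information, see Cloud service dependencies and authorization: DLC. When you submit a training job on the Create Job page of a workspace, you can select the public resource group.

  • Prepare general computing resources

    You can create a dedicated resource group in advance and purchase the general computing resources that you need. Allocate the computing resources of the dedicated resource group by creating resource quotas. After you bind a resource quota to a workspace, you can submit training jobs in that workspace with the quota. For more information, see General computing resource quotas.

  • Prepare Lingjun intelligent computing resources

    For high-performance AI training, prepare the Lingjun intelligent computing resources that the job requires and associate them with the workspace before you submit the job. For more information, see Lingjun intelligent computing resource quotas.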

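Step 2: Prepare an image

Before you submit a training job, prepare the image that provides the training environment. Choose one of the following image types:

  • Community standard images: If your requirements match a general-purpose development environment, you can use a public community standard image directly, without extra configuration.

  • PAI platform images: PAI provides official images based on different frameworks, optimized for and integrated with Alibaba Cloud services. These images suit training jobs that run on Alibaba Cloud and offer better compatibility and performance.

  • Custom images: If your training job needs a special environment or special dependencies, you can create a custom image to meet those requirements.

The following public images are available when you submit a distributed training job: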

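| Type | Framework | Image |
| --- | --- | --- |
| Community image | TensorFlow | tensorflow-training:2.3-cpu-py36-ubuntu18.04 |
| Community image | TensorFlow | tensorflow-training:2.3-gpu-py36-cu101-ubuntu18.04 |
| Community image | TensorFlow | tensorflow-training:1.15-cpu-py36-ubuntu18.04 |
| Community image | TensorFlow | tensorflow-training:1.15-gpu-py36-cu100-ubuntu18.04 |
| Community image | PyTorch | pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04 |
| Community image | PyTorch | pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04 |
| PAI platform image | TensorFlow | tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04 |
| PAI platform image | TensorFlow | tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04 |
| PAI platform image | TensorFlow | tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04 |
| PAI platform image | TensorFlow | tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04 |
| PAI platform image | TensorFlow | tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04 |
| PAI platform image | TensorFlow | tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04 |
| PAI platform image | TensorFlow | tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04 |
| PAI platform image | TensorFlow | tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04 |
| PAI platform image | PyTorch | pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04 |
| PAI platform image | PyTorch | pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04 |
| PAI platform image | PyTorch | pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04 |
| PAI platform image | PyTorch | pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04 |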

Community images

Image list

Standard images provided by the community support different resource types. The image files are listed below.

registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-cpu-py36-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-gpu-py36-cu101-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-cpu-py36-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-gpu-py36-cu100-ubuntu18.04

Replace ${region} with a specific region. Valid values:

  • cn-hangzhou

  • cn-shanghai

  • cn-qingdao

  • cn-beijing

  • cn-zhangjiakou

  • cn-huhehaote

  • cn-shenzhen

  • cn-chengdu

  • cn-hongkong

  • ap-southeast-1
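
The substitution described above can be sketched in Python. This is an illustrative helper, not part of any PAI SDK: the names REGIONS and image_url are hypothetical. The vpc flag mirrors the registry-vpc URLs that this document lists for cn-hangzhou. That the same host pattern applies in the other regions is an assumption.

```python
# Regions listed in this document as valid values for ${region}.
REGIONS = (
    "cn-hangzhou", "cn-shanghai", "cn-qingdao", "cn-beijing",
    "cn-zhangjiakou", "cn-huhehaote", "cn-shenzhen", "cn-chengdu",
    "cn-hongkong", "ap-southeast-1",
)


def image_url(region: str, image: str, vpc: bool = False) -> str:
    """Build a full image URL by substituting the region into the template.

    `image` is the "name:tag" part, for example
    "pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04".
    """
    if region not in REGIONS:
        raise ValueError(f"unsupported region: {region}")
    # Assumption: "registry-vpc" is used as shown for cn-hangzhou.
    host = "registry-vpc" if vpc else "registry"
    return f"{host}.{region}.aliyuncs.com/pai-dlc/{image}"


if __name__ == "__main__":
    print(image_url("cn-shanghai",
                    "tensorflow-training:2.3.0-cpu-py36-ubuntu18.04"))
```

Rejecting unknown regions early is a deliberate choice: a typo in the region would otherwise surface only later, as an image pull failure.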

For example, when ${region} is cn-hangzhou, the community images are as shown in the following table.

| ${region} | Framework | CPU/GPU | Python version | Image URL |
| --- | --- | --- | --- | --- |
| cn-hangzhou | TensorFlow 2.3 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 2.3 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 2.3 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 2.3 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-gpu-py36-cu100-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-gpu-py36-cu100-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.7 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.7 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04 |
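
The table above pairs each community image with a framework, a device type, and a Python version, and the image tag encodes the same fields. The following is a minimal sketch that recovers them from a tag. The naming pattern is inferred from the community image names in this document and is an assumption, not a documented contract. The names ImageTag and parse_image are hypothetical, and PAI platform tags (for example, those that contain "PAI" or "mkl") are not handled.

```python
import re
from typing import NamedTuple, Optional


class ImageTag(NamedTuple):
    framework: str        # "tensorflow" or "pytorch"
    version: str          # framework version as written in the tag
    device: str           # "cpu" or "gpu"
    python: str           # "3.6" for py36
    cuda: Optional[str]   # "10.1" for cu101, None for CPU images
    os: str               # for example "ubuntu18.04"


# Assumed pattern: <framework>-training:<version>-<device>-py<NN>[-cu<NNN>]-<os>
_TAG_RE = re.compile(
    r"^(?P<framework>[a-z]+)-training:"
    r"(?P<version>\d+(?:\.\d+)*)-"
    r"(?P<device>cpu|gpu)-"
    r"py(?P<pymajor>\d)(?P<pyminor>\d+)"
    r"(?:-cu(?P<cuda>\d+))?-"
    r"(?P<os>ubuntu\d+\.\d+)$"
)


def parse_image(ref: str) -> ImageTag:
    """Parse a community image name, with or without the registry prefix."""
    name = ref.rsplit("/", 1)[-1]  # drop "registry.../pai-dlc/" if present
    m = _TAG_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognized image name: {ref}")
    cuda = m.group("cuda")
    return ImageTag(
        framework=m.group("framework"),
        version=m.group("version"),
        device=m.group("device"),
        python=f"{m.group('pymajor')}.{m.group('pyminor')}",
        # The last digit is the minor version: cu101 is 10.1, cu110 is 11.0.
        cuda=f"{cuda[:-1]}.{cuda[-1]}" if cuda else None,
        os=m.group("os"),
    )


if __name__ == "__main__":
    print(parse_image("pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04"))
```

A parser like this can help a job submission script check that a GPU image is paired with GPU resources before the job is created.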

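Image version details

The operating system, Python version, and third-party libraries of each community image (a standard image provided by the community) are as follows: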

  • tensorflow-training:2.3-cpu-py36-ubuntu18.04

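    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.6.9

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version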

      absl-py 0.11.0

      asn1crypto 0.24.0

      astunparse 1.6.3

      cachetools 4.2.0

      certifi 2020.12.5

      cryptography 2.1.4

      gast 0.3.3

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      google-pasta 0.2.0

      grpcio 1.34.0

      h5py 2.10.0

      idna 2.6

      importlib-metadata 3.3.0

      Keras-Preprocessing 1.1.2

      keyring 10.6.0

      keyrings.alt 3.0

      Markdown 3.3.3

      numpy 1.18.5

      oauthlib 3.1.0

      opt-einsum 3.3.0

      pip 20.2.4

      protobuf 3.14.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycrypto 2.6.1

      pygobject 3.26.1

      pyxdg 0.25

      requests 2.25.1

      requests-oauthlib 1.3.0

      rsa 4.6

      SecretStorage 2.3.1

      setuptools 51.1.1

      six 1.15.0

      tensorboard 2.4.0

      tensorboard-plugin-wit 1.7.0

      tensorflow 2.3.2

      tensorflow-estimator 2.3.0

      termcolor 1.1.0

      typing-extensions 3.7.4.3

      urllib3 1.26.2

      werkzeug 1.0.1

      wheel 0.30.0

      wrapt 1.12.1

      zipp 3.4.0

  • tensorflow-training:2.3-gpu-py36-cu101-ubuntu18.04

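    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.6.9

    • CUDA version: 10.1

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version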

      absl-py 0.11.0

      asn1crypto 0.24.0

      astunparse 1.6.3

      cachetools 4.2.0

      certifi 2020.12.5

      cryptography 2.1.4

      grpcio 1.34.0

      gast 0.3.3

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      google-pasta 0.2.0

      h5py 2.10.0

      idna 2.6

      importlib-metadata 3.3.0

      Keras-Preprocessing 1.1.2

      keyrings.alt 3.0

      keyring 10.6.0

      Markdown 3.3.3

      numpy 1.18.5

      oauthlib 3.1.0

      opt-einsum 3.3.0

      python-apt 1.6.5+ubuntu0.5

      pip 20.2.4

      protobuf 3.14.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycrypto 2.6.1

      pygobject 3.26.1

      pyxdg 0.25

      requests 2.25.1

      requests-oauthlib 1.3.0

      rsa 4.6

      SecretStorage 2.3.1

      setuptools 51.1.1

      six 1.15.0

      tensorboard 2.4.0

      tensorboard-plugin-wit 1.7.0

      tensorflow-gpu 2.3.2

      tensorflow-estimator 2.3.0

      termcolor 1.1.0

      typing-extensions 3.7.4.3

      urllib3 1.26.2

      werkzeug 1.0.1

      wheel 0.30.0

      wrapt 1.12.1

      zipp 3.4.0

  • tensorflow-training:1.15-cpu-py36-ubuntu18.04

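    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.6.9

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version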

      absl-py 0.11.0

      asn1crypto 0.24.0

      astor 0.8.1

      cryptography 2.1.4

      gast 0.2.2

      google-pasta 0.2.0

      grpcio 1.34.0

      h5py 2.10.0

      idna 2.6

      importlib-metadata 3.3.0

      Keras-Preprocessing 1.1.2

      Keras-Applications 1.0.8

      keyring 10.6.0

      keyrings.alt 3.0

      Markdown 3.3.3

      numpy 1.18.5

      opt-einsum 3.3.0

      pip 20.3.3

      protobuf 3.14.0

      pycrypto 2.6.1

      pygobject 3.26.1

      pyxdg 0.25

      SecretStorage 2.3.1

      setuptools 51.1.1

      six 1.11.0

      tensorboard 1.15.0

      tensorflow 1.15.5

      tensorflow-estimator 1.15.1

      termcolor 1.1.0

      typing-extensions 3.7.4.3

      werkzeug 1.0.1

      wheel 0.30.0

      wrapt 1.12.1

      zipp 3.4.0

  • tensorflow-training:1.15-gpu-py36-cu100-ubuntu18.04

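    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.6.9

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version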

      absl-py 0.11.0

      asn1crypto 0.24.0

      astor 0.8.1

      cryptography 2.1.4

      gast 0.2.2

      google-pasta 0.2.0

      grpcio 1.34.0

      h5py 2.10.0

      idna 2.6

      importlib-metadata 3.3.0

      Keras-Preprocessing 1.1.2

      Keras-Applications 1.0.8

      keyring 10.6.0

      keyrings.alt 3.0

      Markdown 3.3.3

      numpy 1.18.5

      opt-einsum 3.3.0

      pip 20.3.3

      protobuf 3.14.0

      pycrypto 2.6.1

      pygobject 3.26.1

      pyxdg 0.25

      SecretStorage 2.3.1

      setuptools 51.1.1

      six 1.11.0

      tensorboard 1.15.0

      tensorflow-gpu 1.15.5

      tensorflow-estimator 1.15.1

      termcolor 1.1.0

      typing-extensions 3.7.4.3

      werkzeug 1.0.1

      wheel 0.30.0

      wrapt 1.12.1

      zipp 3.4.0

      python-apt 1.6.5+ubuntu0.5

  • pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04

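    • Operating system: Ubuntu 18.04.4 LTS

    • Python version: 3.7.7

    • CUDA version: 10.1

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version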

      backcall 0.2.0

      beautifulsoup4 4.9.1

      certifi 2020.6.20

      cffi 1.14.0

      cryptography 2.9.2

      conda 4.8.3

      conda-build 3.18.11

      conda-package-handling 1.7.0

      decorator 4.4.2

      filelock 3.0.12

      glob2 0.7

      ipython-genutils 0.2.0

      idna 2.9

      ipython 7.16.1

      jedi 0.17.1

      Jinja2 2.11.2

      libarchive-c 2.9

      MarkupSafe 1.1.1

      mkl-fft 1.1.0

      mkl-service 2.3.0

      mkl-random 1.1.1

      numpy 1.18.5

      olefile 0.46

      PyYAML 5.3.1

      parso 0.7.0

      pexpect 4.8.0

      pickleshare 0.7.5

      Pillow 7.2.0

      pip 20.0.2

      pkginfo 1.5.0.1

      prompt-toolkit 3.0.5

      psutil 5.7.0

      ptyprocess 0.6.0

      pycosat 0.6.3

      pycparser 2.20

      Pygments 2.6.1

      pyOpenSSL 19.1.0

      PySocks 1.7.1

      pytz 2020.1

      ruamel-yaml 0.15.87

      requests 2.23.0

      soupsieve 2.0.1

      setuptools 46.4.0.post20200518

      six 1.14.0

      traitlets 4.3.3

      torch 1.6.0

      torchvision 0.7.0

      tqdm 4.46.0

      urllib3 1.25.8

      wheel 0.34.2

      wcwidth 0.2.5

  • pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04

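    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.8.5

    • CUDA version: 11.0

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version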

      backcall 0.2.0

      beautifulsoup4 4.9.3

      brotlipy 0.7.0

      certifi 2020.12.5

      cffi 1.14.3

      cryptography 3.2.1

      conda 4.9.2

      conda-build 3.21.4

      conda-package-handling 1.7.2

      dnspython 2.1.0

      decorator 4.4.2

      filelock 3.0.12

      glob2 0.7

      ipython-genutils 0.2.0

      idna 2.10

      ipython 7.19.0

      Jinja2 2.11.2

      jedi 0.17.2

      libarchive-c 2.9

      mkl-service 2.3.0

      MarkupSafe 1.1.1

      mkl-fft 1.2.0

      mkl-random 1.1.1

      numpy 1.19.2

      olefile 0.46

      PyYAML 5.3.1

      parso 0.7.0

      pexpect 4.8.0

      pickleshare 0.7.5

      Pillow 8.1.0

      pip 20.2.4

      pkginfo 1.7.0

      prompt-toolkit 3.0.8

      psutil 5.7.2

      ptyprocess 0.7.0

      pycosat 0.6.3

      pycparser 2.20

      Pygments 2.7.4

      pyOpenSSL 19.1.0

      PySocks 1.7.1

      python-etcd 0.4.5

      pytz 2020.5

      ruamel-yaml 0.15.87

      requests 2.24.0

      soupsieve 2.1

      setuptools 50.3.1.post20201107

      six 1.15.0

      typing-extensions 3.7.4.3

      torch 1.7.1

      torchelastic 0.2.1

      torchvision 0.8.2

      tqdm 4.51.0

      traitlets 5.0.5

      urllib3 1.25.11

      wheel 0.35.1

      wcwidth 0.2.5
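
The library lists above can be compared with what is actually installed in a running container. The following is a minimal sketch for Python 3. It relies on importlib.metadata from the standard library (Python 3.8 and later) and falls back to the importlib_metadata backport on older interpreters. The function names parse_listing and find_mismatches are hypothetical, and the listing text is expected in the "name version" form used in this document.

```python
try:
    from importlib import metadata  # standard library in Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for Python 3.6/3.7


def parse_listing(text):
    """Turn lines such as 'numpy 1.18.5' into a {name: version} dict."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank lines and anything malformed
            expected[parts[0]] = parts[1]
    return expected


def find_mismatches(expected):
    """Return {name: (expected_version, installed_version)} for differences.

    installed_version is None when the package is not installed.
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    listing = """
    numpy 1.18.5
    six 1.15.0
    """
    for name, (want, have) in find_mismatches(parse_listing(listing)).items():
        print(f"{name}: documented {want}, installed {have}")
```

An empty result means that every listed package is installed at the documented version. Any entry that is reported points to drift between the image and this list.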

PAI platform images

Image list

DLC provides a variety of official images. The official public image files are listed below.

registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04

registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04

registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04

registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04

registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu101-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu101-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-cpu-py36-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-gpu-py36-cu101-ubuntu18.04

Replace ${region} with a specific region. Valid values:

  • cn-hangzhou

  • cn-shanghai

  • cn-qingdao

  • cn-beijing

  • cn-zhangjiakou

  • cn-huhehaote

  • cn-shenzhen

  • cn-chengdu

  • cn-hongkong

  • ap-southeast-1

For example, when ${region} is cn-hangzhou, all PAI platform images of DLC are as shown in the following table.

| ${region} | Framework | CPU/GPU | Python version | Image URL |
| --- | --- | --- | --- | --- |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-mkl-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-mkl-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-mkl-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-mkl-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI2011-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI2011-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI2011-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI2011-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.3 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.3 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.3 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.3 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.4 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.4 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.4 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.4 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.5 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.5 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.5 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.5 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI2011-gpu-py37-cu100-ubuntu16.04 |

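Image version details

The operating system, Python version, and third-party libraries of each official image (an image optimized and provided by the PAI team) are as follows: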

  • tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04

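    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 2.7.18 Anaconda

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version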

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.15

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      backports.weakref 1.0.post1

      certifi 2020.6.20

      crcmod 1.7

      Cython 0.29.14

      enum34 1.1.6

      funcsigs 1.0.2

      futures 3.3.0

      gast 0.4.0

      grpcio 1.27.2

      h5py 2.10.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.1.1

      mkl-fft 1.0.15

      mkl-random 1.1.0

      mkl-service 2.3.0

      mock 3.0.5

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.9.1

      paiio 0.1.0

      pip 9.0.1

      protobuf 3.14.0

      pycryptodome 3.9.7

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.1.12.2pai2011

      requests 2.13.0

      setuptools 36.4.0

      six 1.15.0

      tensorboard 1.12.2

      tensorflow 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      Werkzeug 1.0.1

      wheel 0.35.1

  • tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04

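    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 2.7.18 Anaconda

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version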

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.15

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      backports.weakref 1.0.post1

      certifi 2020.6.20

      crcmod 1.7

      Cython 0.29.14

      enum34 1.1.6

      funcsigs 1.0.2

      futures 3.3.0

      gast 0.4.0

      grpcio 1.27.2

      h5py 2.10.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.1.1

      mkl-fft 1.0.15

      mkl-random 1.1.0

      mkl-service 2.3.0

      mock 3.0.5

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.9.1

      paiio 0.1.0

      pip 9.0.1

      protobuf 3.14.0

      pycryptodome 3.9.7

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.1.12.2pai2011

      requests 2.13.0

      setuptools 36.4.0

      six 1.15.0

      tensorboard 1.12.2

      tensorflow 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      Werkzeug 1.0.1

      wheel 0.35.1

  • tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04

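    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 2.7.18 Anaconda

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version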

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.15

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      backports.weakref 1.0.post1

      certifi 2020.6.20

      crcmod 1.7

      Cython 0.29.14

      enum34 1.1.6

      funcsigs 1.0.2

      futures 3.3.0

      gast 0.4.0

      grpcio 1.27.2

      h5py 2.10.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.1.1

      mkl-fft 1.0.15

      mkl-random 1.1.0

      mkl-service 2.3.0

      mock 3.0.5

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.9.1

      paiio 0.1.0

      pip 9.0.1

      protobuf 3.14.0

      pycryptodome 3.9.7

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.gpu.1.12.2pai2011

      requests 2.13.0

      setuptools 36.4.0

      six 1.15.0

      tensorboard 1.12.2

      tensorflow-gpu 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      Werkzeug 1.0.1

      wheel 0.35.1

      subprocess32 3.5.4

      tao-wrapper 0.1.1

      whale 0.0.2

  • tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04

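    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.6.12 Anaconda

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version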

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.29

      aliyun-python-sdk-core-v3 2.13.11

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      cached-property 1.5.2

      certifi 2020.12.5

      crcmod 1.7

      Cython 0.29.21

      gast 0.4.0

      grpcio 1.31.0

      h5py 3.1.0

      importlib-metadata 3.4.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.12.1

      paiio 0.1.0

      pip 20.2.4

      protobuf 3.14.0

      pycryptodome 3.9.9

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.1.12.2pai2011

      requests 2.13.0

      setuptools 50.3.1.post20201107

      six 1.15.0

      tensorboard 1.12.2

      tensorflow 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      typing-extensions 3.7.4.3

      Werkzeug 1.0.1

      wheel 0.35.1

      zipp 3.4.0

  • tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04

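    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.6.12 Anaconda

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version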

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.29

      aliyun-python-sdk-core-v3 2.13.11

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      cached-property 1.5.2

      certifi 2020.12.5

      crcmod 1.7

      Cython 0.29.21

      gast 0.4.0

      grpcio 1.31.0

      h5py 3.1.0

      importlib-metadata 3.4.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.12.1

      paiio 0.1.0

      pip 20.2.4

      protobuf 3.14.0

      pycryptodome 3.9.9

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.1.12.2pai2011

      requests 2.13.0

      setuptools 50.3.1.post20201107

      six 1.15.0

      tensorboard 1.12.2

      tensorflow 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      typing-extensions 3.7.4.3

      Werkzeug 1.0.1

      wheel 0.35.1

      zipp 3.4.0

  • tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04

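    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.6.12 Anaconda

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version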

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.29

      aliyun-python-sdk-core-v3 2.13.11

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      cached-property 1.5.2

      certifi 2020.12.5

      crcmod 1.7

      Cython 0.29.21

      gast 0.4.0

      grpcio 1.31.0

      h5py 3.1.0

      importlib-metadata 3.4.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.12.1

      paiio 0.1.0

      pip 20.2.4

      protobuf 3.14.0

      pycryptodome 3.9.9

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.gpu.1.12.2pai2011

      requests 2.13.0

      setuptools 50.3.1.post20201107

      six 1.15.0

      tensorboard 1.12.2

      tensorflow-gpu 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      typing-extensions 3.7.4.3

      Werkzeug 1.0.1

      wheel 0.35.1

      zipp 3.4.0

      subprocess32 3.5.4

      tao-wrapper 0.1.1

      whale 0.0.2

  • tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04

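    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 2.7.18 Anaconda

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version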

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.15

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      backports.weakref 1.0.post1

      certifi 2020.6.20

      crcmod 1.7

      Cython 0.29.14

      enum34 1.1.6

      funcsigs 1.0.2

      functools32 3.2.3.post2

      futures 3.3.0

      gast 0.2.2

      google-pasta 0.2.0

      opt-einsum 2.3.2

      tensorflow-estimator 1.15.1

      grpcio 1.27.2

      h5py 2.10.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.1.1

      mkl-fft 1.0.15

      mkl-random 1.1.0

      mkl-service 2.3.0

      mock 3.0.5

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.9.1

      paiio 0.1.0

      pip 9.0.1

      protobuf 3.14.0

      pycryptodome 3.9.7

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.gpu.1.15.0

      requests 2.13.0

      setuptools 44.1.1

      six 1.15.0

      tensorboard 1.15.0

      tensorflow-gpu 1.15.0

      termcolor 1.1.0

      toposort 1.5

      Werkzeug 1.0.1

      wheel 0.35.1

      subprocess32 3.5.4

      tao-wrapper 0.1.1

      whale 0.0.2

      wrapt 1.12.1

  • tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04

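    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.6.12 Anaconda

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version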

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.29

      aliyun-python-sdk-core-v3 2.13.11

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      cached-property 1.5.2

      certifi 2020.12.5

      crcmod 1.7

      Cython 0.29.21

      gast 0.2.2

      grpcio 1.31.0

      h5py 3.1.0

      importlib-metadata 3.4.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.12.1

      paiio 0.1.0

      pip 20.2.4

      protobuf 3.14.0

      pycryptodome 3.9.9

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.gpu.1.15.0

      requests 2.13.0

      setuptools 50.3.1.post20201107

      six 1.15.0

      tensorboard 1.15.0

      tensorflow-gpu 1.15.0

      termcolor 1.1.0

      toposort 1.5

      typing-extensions 3.7.4.3

      Werkzeug 1.0.1

      wheel 0.35.1

      zipp 3.4.0

      subprocess32 3.5.4

      tao-wrapper 0.1.1

      whale 0.0.2

      google-pasta 0.2.0

      opt-einsum 3.3.0

      tensorflow-estimator 1.15.1

      wrapt 1.12.1

  • pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04

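    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.7.4

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version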

      absl-py 0.11.0

      aiohttp 3.7.3

      apex 0.1

      asn1crypto 1.2.0

      async-timeout 3.0.1

      attrs 20.3.0

      blinker 1.4

      cachetools 4.2.0

      certifi 2020.12.5

      cffi 1.13.0

      cryptography 2.8

      click 7.1.2

      conda 4.9.2

      conda-package-handling 1.6.0

      future 0.18.2

      grpcio 1.31.0

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      importlib-metadata 2.0.0

      idna 2.8

      multidict 4.7.6

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      nvidia-dali 0.15.0

      numpy 1.19.2

      oauthlib 3.1.0

      PySocks 1.7.1

      Pillow 8.1.0

      pip 20.2.4

      protobuf 3.13.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycosat 0.6.3

      pycparser 2.19

      PyJWT 2.0.0

      pyOpenSSL 19.0.0

      ruamel-yaml 0.15.46

      requests 2.22.0

      requests-oauthlib 1.3.0

      rsa 4.7

      six 1.15.0

      sailfish 1.0.1

      setuptools 50.3.1.post20201107

      typing-extensions 3.7.4.3

      tensorboard 2.3.0

      tensorboard-plugin-wit 1.6.0

      torch 1.3.1+ali

      torchsummary 1.5.1

      torchvision 0.4.2

      tqdm 4.36.1

      urllib3 1.24.2

      Werkzeug 1.0.1

      wheel 0.35.1

      yarl 1.5.1

      zipp 3.4.0

  • pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04

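    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.7.4

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the libraries and their versions.

      Third-party library and version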

      absl-py 0.11.0

      aiohttp 3.7.3

      apex 0.1

      asn1crypto 1.2.0

      async-timeout 3.0.1

      attrs 20.3.0

      blinker 1.4

      cachetools 4.2.0

      certifi 2020.12.5

      cffi 1.13.0

      cryptography 2.8

      click 7.1.2

      conda 4.9.2

      conda-package-handling 1.6.0

      future 0.18.2

      grpcio 1.31.0

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      importlib-metadata 2.0.0

      idna 2.8

      multidict 4.7.6

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      nvidia-dali 0.15.0

      numpy 1.19.2

      oauthlib 3.1.0

      PySocks 1.7.1

      Pillow 8.1.0

      pip 20.2.4

      protobuf 3.13.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycosat 0.6.3

      pycparser 2.19

      PyJWT 2.0.0

      pyOpenSSL 19.0.0

      ruamel-yaml 0.15.46

      requests 2.22.0

      requests-oauthlib 1.3.0

      rsa 4.7

      six 1.15.0

      setuptools 50.3.1.post20201107

      typing-extensions 3.7.4.3

      tensorboard 2.3.0

      tensorboard-plugin-wit 1.6.0

      torch 1.4.0+ali

      torchsummary 1.5.1

      torchvision 0.5.0

      tqdm 4.36.1

      urllib3 1.24.2

      wheel 0.35.1

      Werkzeug 1.0.1

      yarl 1.5.1

      zipp 3.4.0

  • pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04

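    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.7.4

    • CUDA version: 10.0

    • Third-party libraries: The libraries and their versions are listed below.

      Third-party library and version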

      absl-py 0.11.0

      aiohttp 3.7.3

      apex 0.1

      asn1crypto 1.2.0

      async-timeout 3.0.1

      attrs 20.3.0

      blinker 1.4

      cachetools 4.2.0

      certifi 2020.12.5

      cffi 1.13.0

      cryptography 2.8

      click 7.1.2

      conda 4.9.2

      conda-package-handling 1.6.0

      future 0.18.2

      grpcio 1.31.0

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      importlib-metadata 2.0.0

      idna 2.8

      multidict 4.7.6

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      nvidia-dali 0.15.0

      numpy 1.19.2

      oauthlib 3.1.0

      PySocks 1.7.1

      Pillow 8.1.0

      pip 20.2.4

      protobuf 3.13.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycosat 0.6.3

      pycparser 2.19

      PyJWT 2.0.0

      pyOpenSSL 19.0.0

      rsa 4.7

      requests 2.22.0

      requests-oauthlib 1.3.0

      ruamel-yaml 0.15.46

      six 1.15.0

      sailfish 1.0.1

      setuptools 50.3.1.post20201107

      typing-extensions 3.7.4.3

      tensorboard 2.3.0

      tensorboard-plugin-wit 1.6.0

      torch 1.5.1+ali

      torchsummary 1.5.1

      torchvision 0.6.1

      tqdm 4.36.1

      urllib3 1.24.2

      wheel 0.35.1

      Werkzeug 1.0.1

      yarl 1.5.1

      zipp 3.4.0

  • pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04

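    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.7.4

    • CUDA version: 10.0

    • Third-party libraries: The libraries and their versions are listed below.

      Third-party library and version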

      absl-py 0.11.0

      aiohttp 3.7.3

      asn1crypto 1.2.0

      async-timeout 3.0.1

      attrs 20.3.0

      blinker 1.4

      cachetools 4.2.0

      certifi 2020.12.5

      cffi 1.13.0

      cryptography 2.8

      click 7.1.2

      conda 4.9.2

      conda-package-handling 1.6.0

      future 0.18.2

      grpcio 1.31.0

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      importlib-metadata 2.0.0

      idna 2.8

      multidict 4.7.6

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      nvidia-dali 0.15.0

      numpy 1.19.2

      oauthlib 3.1.0

      PySocks 1.7.1

      Pillow 8.1.0

      pip 20.2.4

      protobuf 3.13.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycosat 0.6.3

      pycparser 2.19

      PyJWT 2.0.0

      pyOpenSSL 19.0.0

      ruamel-yaml 0.15.46

      requests 2.22.0

      requests-oauthlib 1.3.0

      rsa 4.7

      six 1.15.0

      setuptools 50.3.1.post20201107

      typing-extensions 3.7.4.3

      tensorboard 2.3.0

      tensorboard-plugin-wit 1.6.0

      torch 1.6.0+ali

      torchsummary 1.5.1

      torchvision 0.7.0

      tqdm 4.36.1

      urllib3 1.24.2

      Werkzeug 1.0.1

      wheel 0.35.1

      yarl 1.5.1

      zipp 3.4.0
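
To confirm that a running container matches one of the package listings above, the listing can be compared with what is installed. The following is a minimal sketch, not an official PAI tool. The helper names parse_listing, find_mismatches, and installed_versions are hypothetical, and the sketch assumes that each package line has the form "<name> <version>" as shown above.

```python
import re

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    # Backport for Python 3.7; the images above list importlib-metadata 2.0.0.
    import importlib_metadata as metadata


def normalize(name):
    """Normalize a package name so that 'ruamel_yaml' and 'ruamel-yaml' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_listing(text):
    """Parse lines of the form '<name> <version>' into {name: version}.

    Lines that do not consist of exactly two tokens, or whose second token
    does not start with a digit (headers, image names), are skipped.
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1][0].isdigit():
            expected[normalize(parts[0])] = parts[1]
    return expected


def find_mismatches(expected, installed):
    """Return {name: (expected_version, installed_version_or_None)} for each difference."""
    mismatches = {}
    for name, version in expected.items():
        actual = installed.get(name)
        if actual != version:
            mismatches[name] = (version, actual)
    return mismatches


def installed_versions():
    """Collect {name: version} for every distribution visible to this interpreter."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            found[normalize(name)] = dist.version
    return found
```

Inside a container, calling find_mismatches(parse_listing(listing_text), installed_versions()) returns an empty dictionary when every listed package is installed at the listed version. Packages installed through conda may report names or versions slightly differently, so treat a mismatch as a prompt to check manually.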

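Custom images

You can use a custom image that you have added to PAI. Before you select it, add the custom image to PAI. For easier management and reuse, we recommend that you add the image as a PAI AI asset on the AI Asset Management > Images page of your workspace, so that multiple training jobs can select it directly. For details, see View and add images.

Important

If you submit a training job that uses Lingjun intelligent computing resources together with a custom image, see RDMA (Lingjun intelligent computing resources) for the related considerations.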

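Step 3: Prepare a dataset

Before you submit a training job, upload the data that the job requires to OSS or NAS, and then create a dataset that the training job can use directly.

Supported dataset types

The following dataset types are supported: Alibaba Cloud Object Storage Service (OSS), Alibaba Cloud File Storage (General-purpose NAS), Alibaba Cloud File Storage (Extreme NAS), Alibaba Cloud File Storage (CPFS), and Alibaba Cloud File Storage (Intelligent Computing CPFS).

Note the following:

  • Datasets of the OSS and CPFS types support dataset acceleration. When you later submit a distributed training job, you can directly use a dataset that has acceleration enabled to improve data read efficiency.

  • When you submit a training job (DLC) that uses Lingjun intelligent computing resources, only OSS datasets currently support acceleration. NAS and Intelligent Computing CPFS datasets do not support acceleration yet.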
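
The two acceleration rules above can be written as a small lookup. This is only an illustrative sketch: the function name supports_acceleration and the type labels "OSS", "CPFS", "NAS", and "INTELLIGENT_CPFS" are hypothetical strings chosen for this example, not identifiers from the PAI console or SDK.

```python
# Dataset types that the text says can enable acceleration in general.
ACCELERABLE = {"OSS", "CPFS"}

# With Lingjun intelligent computing resources, the text says only OSS
# datasets can currently be accelerated.
ACCELERABLE_ON_LINGJUN = {"OSS"}


def supports_acceleration(dataset_type, uses_lingjun=False):
    """Return True if the described rules allow acceleration for this dataset type."""
    allowed = ACCELERABLE_ON_LINGJUN if uses_lingjun else ACCELERABLE
    return dataset_type in allowed
```

For example, supports_acceleration("CPFS") is True, while supports_acceleration("INTELLIGENT_CPFS", uses_lingjun=True) is False.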

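Create a dataset

For the entry point and other parameter settings, see Create and manage datasets. Note the following when you prepare a dataset:

  • When you create a dataset for training jobs, only datasets of the From Alibaba Cloud type are supported, and the property must be set to Folder.

  • Unlike NAS, OSS is not a true file system but a distributed object store. When OSS is used as the storage system, some file system features are therefore not supported. For example, after OSS is mounted, you cannot append to or overwrite an existing file.

  • If the dataset type is Alibaba Cloud File Storage (CPFS), you must configure a virtual private cloud (VPC) when you submit the training job, and select the same VPC as the one used by CPFS. Otherwise, the submitted DLC training job runs abnormally and its status is shown as Dequeued.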
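
Because a mounted OSS path does not support appending to or overwriting an existing file, one common workaround is to write every output, such as a checkpoint or a log segment, to a new file name. The following is a minimal sketch of that pattern. The helper names new_output_path and write_new_file, the file prefix, and the suffix are hypothetical. The sketch also assumes that exclusive-create mode ("xb") is honored on the mounted path, which the text above does not confirm.

```python
import os
import time
import uuid


def new_output_path(directory, prefix="checkpoint", suffix=".pt"):
    """Build a file path that is very unlikely to exist already.

    The name combines a timestamp with a short random token, so repeated
    calls produce different names even within the same second.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    token = uuid.uuid4().hex[:8]
    return os.path.join(directory, f"{prefix}-{stamp}-{token}{suffix}")


def write_new_file(directory, data, prefix="checkpoint", suffix=".pt"):
    """Write bytes to a brand-new file and return its path.

    Mode "xb" creates the file and raises FileExistsError if the path
    already exists, so this function never appends to or overwrites
    an existing file.
    """
    path = new_output_path(directory, prefix, suffix)
    with open(path, "xb") as handle:
        handle.write(data)
    return path
```

A training script would call write_new_file(output_dir, serialized_bytes) at each save point and keep track of the returned path, instead of reopening one file in append mode.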

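Enable dataset acceleration

You can enable dataset acceleration for a dataset. When you create a DSW instance or a training job, you can then directly use the accelerated dataset to improve data read efficiency. For details, see Use the dataset accelerator on PAI.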

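Step 4: Prepare a code set

Before you submit a training job, add the code that the job may need as a code set. For easier management and reuse, we recommend that you add the code as a PAI AI asset on the AI Asset Management > Code Configuration page of your workspace, so that multiple training jobs can select it directly. For details, see Code configuration.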

Related documents

After you complete the preparations, you can create a training job. For details, see Create a training job.
