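Gets the detailed configuration and runtime information of a job.

Debugging

You can run this API directly in OpenAPI Explorer, which saves you from calculating signatures. After a successful call, OpenAPI Explorer can automatically generate SDK sample code.

Request syntax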

GET /api/v1/jobs/JobId HTTP/1.1
Content-Type:application/json

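Request parameters

Table 1. Request path parameters
Parameter Type Required Example Description
JobId String dlc*******

The job ID. You can call ListJobs to get the IDs of jobs that match the filter conditions. The ID of a newly created job is also available in the return value of CreateJob.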
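The path parameter can be substituted into the request line as in the following minimal Python sketch. The helper name `build_get_job_request` and the job ID `dlc-example-id` are hypothetical, the host `pai-dlc.aliyuncs.com` is taken from the sample request in this topic, and HTTPS is an assumption. The sketch deliberately omits the request signature that a real call needs (OpenAPI Explorer or an SDK handles signing), so it only builds the request object and does not send it.

```python
from urllib.parse import quote
from urllib.request import Request


def build_get_job_request(job_id: str, host: str = "pai-dlc.aliyuncs.com") -> Request:
    """Build an unsigned GET request for /api/v1/jobs/{JobId}.

    Hypothetical helper: signing headers are NOT added here.
    """
    if not job_id:
        raise ValueError("JobId is a path parameter and must not be empty")
    # Percent-encode the ID so it is safe to place in the URL path.
    path = "/api/v1/jobs/" + quote(job_id, safe="")
    return Request(
        url=f"https://{host}{path}",
        method="GET",
        headers={"Content-Type": "application/json"},
    )


req = build_get_job_request("dlc-example-id")
print(req.get_method(), req.full_url)
```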

Response body syntax

HTTP/1.1 200 OK
Content-Type:application/json

{
  "JobId" : "String",
  "JobType" : "String",
  "DisplayName" : "String",
  "UserId" : "String",
  "Status" : "String",
  "WorkspaceId" : "String",
  "WorkspaceName" : "String",
  "ResourceId" : "String",
  "ResourceLevel" : "String",
  "ReasonCode" : "String",
  "ReasonMessage" : "String",
  "JobSpecs" : [ {
    "Type" : "String",
    "Image" : "String",
    "PodCount" : Long,
    "EcsSpec" : "String",
    "ExtraPodSpec" : {
      "SideCarContainers" : [ {
        "Name" : "String",
        "Image" : "String",
        "Command" : [ "String" ],
        "Args" : [ "String" ],
        "WorkingDir" : "String",
        "Env" : [ {
          "Name" : "String",
          "Value" : "String"
        } ],
        "Resources" : { }
      } ],
      "InitContainers" : [ {
        "Name" : "String",
        "Image" : "String",
        "Command" : [ "String" ],
        "Args" : [ "String" ],
        "WorkingDir" : "String",
        "Env" : [ {
          "Name" : "String",
          "Value" : "String"
        } ],
        "Resources" : { }
      } ],
      "SharedVolumeMountPaths" : [ "String" ]
    },
    "ResourceConfig" : {
      "CPU" : "String",
      "GPU" : "String",
      "Memory" : "String",
      "SharedMemory" : "String",
      "GPUType" : "String"
    },
    "UseSpotInstance" : Boolean
  } ],
  "UserCommand" : "String",
  "DataSources" : [ {
    "DataSourceId" : "String",
    "MountPath" : "String"
  } ],
  "CodeSource" : {
    "CodeSourceId" : "String",
    "Branch" : "String",
    "Commit" : "String",
    "MountPath" : "String"
  },
  "ThirdpartyLibs" : [ "String" ],
  "ThirdpartyLibDir" : "String",
  "GmtCreateTime" : "String",
  "GmtSubmittedTime" : "String",
  "GmtRunningTime" : "String",
  "GmtSuccessedTime" : "String",
  "GmtStoppedTime" : "String",
  "GmtFailedTime" : "String",
  "GmtFinishTime" : "String",
  "Duration" : Long,
  "Pods" : [ {
    "Type" : "String",
    "PodId" : "String",
    "PodUid" : "String",
    "Status" : "String",
    "Ip" : "String",
    "GmtCreateTime" : "String",
    "GmtStartTime" : "String",
    "GmtFinishTime" : "String",
    "HistoryPods" : [ {
      "Type" : "String",
      "PodId" : "String",
      "PodUid" : "String",
      "Status" : "String",
      "Ip" : "String",
      "GmtCreateTime" : "String",
      "GmtStartTime" : "String",
      "GmtFinishTime" : "String"
    } ]
  } ],
  "RequestId" : "String",
  "Settings" : {
    "BusinessUserId" : "String",
    "Caller" : "String",
    "PipelineId" : "String",
    "EnableTideResource" : Boolean,
    "EnableErrorMonitoringInAIMaster" : Boolean,
    "ErrorMonitoringArgs" : "String",
    "EnableRDMA" : Boolean
  },
  "ClusterId" : "String",
  "Priority" : Integer
}
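To illustrate how the nested structure above can be read, the following sketch walks `JobSpecs` in an already-decoded response and collects the per-node resource settings. The function name `summarize_job_specs` and the sample values are illustrative assumptions. Missing keys are tolerated because this topic does not state which fields are always present.

```python
from typing import Any, Dict, List


def summarize_job_specs(job: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one flat summary per entry in JobSpecs (hypothetical helper)."""
    summaries = []
    for spec in job.get("JobSpecs") or []:
        resources = spec.get("ResourceConfig") or {}
        extra = spec.get("ExtraPodSpec") or {}
        summaries.append({
            "Type": spec.get("Type"),
            "PodCount": spec.get("PodCount", 0),
            "CPU": resources.get("CPU"),
            "GPU": resources.get("GPU"),
            "Memory": resources.get("Memory"),
            # Count containers rather than copying their full definitions.
            "SideCars": len(extra.get("SideCarContainers") or []),
            "InitContainers": len(extra.get("InitContainers") or []),
        })
    return summaries


sample = {
    "JobSpecs": [{
        "Type": "Worker",
        "PodCount": 1,
        "ExtraPodSpec": {"InitContainers": [{"Name": "data-init"}]},
        "ResourceConfig": {"CPU": "10", "GPU": "3", "Memory": "10Gi"},
    }]
}
print(summarize_job_specs(sample))
```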

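Response parameters

Table 2. Response body parameters
Parameter Type Example Description
JobId String dlc*******

The job ID.

JobType String TFJob

The job type, as specified by JobType in the CreateJob API.

DisplayName String tf-mnist-test

The job name.

UserId String 12*********

The Alibaba Cloud UID of the user who submitted the job.

Status String Stopped

The running status of the job. Valid values: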

  • Created
  • Creating
  • Queuing
  • Dequeued
  • Running
  • Stopping
  • Succeeded
  • Failed
  • Stopped
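WorkspaceId String 268

The ID of the workspace to which the job belongs.

WorkspaceName String dlc-workspace

The name of the workspace to which the job belongs.

ResourceId String r******

The ID of the resource group in which the job runs.

ResourceLevel String L0

The resource level used when the job runs.

ReasonCode String JobStoppedByUser

The status detail code, which classifies the sub-state under the current status (Status).

ReasonMessage String Job is stopped by user.

The detailed description of the status.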

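JobSpecs Array of JobSpec

The node configurations of the job. See JobSpecs in the CreateJob API.

UserCommand String python /root/code/mnist.py

The startup command for each node.

DataSources Array of DataSources

The list of data sources.

DataSourceId String d*******

The data source ID.

MountPath String /mnt/data/

The local mount path. Optional. Empty by default, which means that the mount path defined in the data source is used.

CodeSource Object

The code source.

CodeSourceId String code******

The code source ID.

Branch String master

The code branch.

Commit String 44da109b59f8596152987eaa8f3b2487xxxxxx

The code commit ID.

MountPath String /mnt/data

The local mount path.

ThirdpartyLibs Array of String numpy==1.16.1

The third-party Python libraries.

ThirdpartyLibDir String /root/code/

The folder that contains the third-party library file (requirements.txt).

Envs Map

The environment variable configuration.

String ENABLE_DEBUG_MODE

The key and value of an environment variable.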

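GmtCreateTime String 2021-01-12T14:35:01Z

The time when the job was created (UTC).

GmtSubmittedTime String 2021-01-12T14:36:01Z

The time when the job was submitted to the cluster.

GmtRunningTime String 2021-01-12T14:36:21Z

The time when the job started running.

GmtSuccessedTime String 2021-01-12T15:36:08Z

The time when the job finished successfully.

GmtStoppedTime String 2021-01-12T15:36:08Z

The time when the job was stopped.

GmtFailedTime String 2021-01-12T15:36:08Z

The time when the job failed.

GmtFinishTime String 2021-01-12T15:36:08Z

The time when the job ended (UTC).

Duration Long 3602

The running duration of the job, in seconds.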

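Pods Array of Pods

All nodes in the job run.

Type String Worker

The node type, which corresponds to one of the JobSpec entries in JobSpecs of CreateJob.

PodId String Worker

The node ID, which can be passed to the GetPodLogs and GetPodEvents APIs to get the detailed logs and events of the node.

PodUid String fe846462-af2c-4521-bd6f-96787a57591d

The pod UID.

Status String Running

The node status. Valid values: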

  • Pending
  • Running
  • Succeeded
  • Failed
  • Unknown
Ip String 10.0.1.2

The network IP address of the node.

GmtCreateTime String 2021-01-12T14:36:01Z

The time when the pod was created (UTC).

GmtStartTime String 2021-01-12T14:36:01Z

The time when the node started (UTC).

GmtFinishTime String 2021-01-12T15:36:05Z

The time when the node finished (UTC).

HistoryPods Array of HistoryPods

The historical pods.

Type String Worker

The pod type.

PodId String Worker

The pod ID.

PodUid String fe846462-af2c-4521-bd6f-96787a57591d

The pod UID.

Status String Failed

The pod status.

Ip String 10.0.1.3

The pod IP address.

GmtCreateTime String 2021-01-12T14:36:01Z

The time when the pod was created (UTC).

GmtStartTime String 2021-01-12T14:36:01Z

The time when the pod started (UTC).

GmtFinishTime String 2021-01-12T14:36:01Z

The time when the pod finished (UTC).

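RequestId String 473469C7-AA6F-4DC5-B3DB-xxxxxxxx

The request ID, used for diagnostics and troubleshooting.

Settings JobSettings

Additional parameter settings of the job.

ClusterId String a*****

The cluster ID.

ElasticSpec JobElasticSpec

The parameters of the elastic job.

EnabledDebugger Boolean false

Whether the debugger job is enabled.

Priority Integer 1

The priority of the job.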
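A caller typically polls this API until `Status` reaches an end state. The sketch below assumes that `Succeeded`, `Failed`, and `Stopped` are the terminal values among those listed for `Status`; this topic does not say so explicitly. `fetch_job` is a hypothetical callable that performs the signed GetJob call and returns the decoded response body, and `wait_for_job` and `describe_outcome` are likewise illustrative names.

```python
import time
from typing import Any, Callable, Dict

# Assumption: these Status values mean the job will not change state again.
TERMINAL_STATUSES = frozenset({"Succeeded", "Failed", "Stopped"})


def wait_for_job(
    fetch_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_seconds: float = 10.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_job(job_id) until Status is terminal or max_polls is reached."""
    for attempt in range(max_polls):
        job = fetch_job(job_id)
        if job.get("Status") in TERMINAL_STATUSES:
            return job
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"job {job_id} not finished after {max_polls} polls")


def describe_outcome(job: Dict[str, Any]) -> str:
    """Combine Status with ReasonCode and ReasonMessage when they are present."""
    text = job.get("Status", "Unknown")
    if job.get("ReasonCode"):
        text += f" ({job['ReasonCode']}: {job.get('ReasonMessage', '')})"
    return text


# Stand-in for a real client: replays a fixed sequence of responses.
_responses = iter([
    {"Status": "Queuing"},
    {"Status": "Running"},
    {"Status": "Stopped", "ReasonCode": "JobStoppedByUser",
     "ReasonMessage": "Job is stopped by user."},
])
final = wait_for_job(lambda _id: next(_responses), "dlc-example-id", sleep=lambda _s: None)
print(describe_outcome(final))
```

Passing `sleep` as a parameter keeps the loop testable without real delays; a production caller would leave the default and pick an interval that suits its rate limits.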

Get the details of a job

Gets the configuration and current running status of a job.

GET /api/v1/jobs/dlc******* HTTP/1.1
Host:pai-dlc.aliyuncs.com
Content-Type:application/json

Sample success responses

XML format

XML responses are not supported.

JSON format

HTTP/1.1 200 OK
Content-Type:application/json

{
  "JobId" : "dlc*******",
  "JobType" : "TFJob",
  "DisplayName" : "tf-mnist-test",
  "UserId" : "12*********",
  "Status" : "Stopped",
  "WorkspaceId" : "268",
  "WorkspaceName" : "dlc-workspace",
  "ResourceId" : "r******",
  "ResourceLevel" : "L0",
  "ReasonCode" : "JobStoppedByUser",
  "ReasonMessage" : "Job is stopped by user.",
  "JobSpecs" : [ {
    "Type" : "Worker",
    "Image" : "registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04",
    "PodCount" : 1,
    "EcsSpec" : "ecs.c6.large",
    "ExtraPodSpec" : {
      "SideCarContainers" : [ {
        "Name" : "data-init",
        "Image" : "registry.cn-hangzhou.aliyuncs.com/pai-dlc/curl:v1.0.0",
        "Command" : [ "curl www.aliyun.com" ],
        "Args" : [ ],
        "WorkingDir" : "/root",
        "Env" : [ {
          "Name" : "ENABLE_DEBUG",
          "Value" : "true"
        } ],
        "Resources" : { }
      } ],
      "InitContainers" : [ {
        "Name" : "data-init",
        "Image" : "registry.cn-hangzhou.aliyuncs.com/pai-dlc/curl:v1.0.0",
        "Command" : [ "curl www.aliyun.com" ],
        "Args" : [ ],
        "WorkingDir" : "/root",
        "Env" : [ {
          "Name" : "ENABLE_DEBUG",
          "Value" : "true"
        } ],
        "Resources" : { }
      } ],
      "SharedVolumeMountPaths" : [ "/root/share/" ]
    },
    "ResourceConfig" : {
      "CPU" : "10",
      "GPU" : "3",
      "Memory" : "10Gi",
      "SharedMemory" : "5Gi",
      "GPUType" : "Tesla-V100-16G"
    },
    "UseSpotInstance" : false
  } ],
  "UserCommand" : "python /root/code/mnist.py",
  "DataSources" : [ {
    "DataSourceId" : "d*******",
    "MountPath" : "/mnt/data/"
  } ],
  "CodeSource" : {
    "CodeSourceId" : "code******",
    "Branch" : "master",
    "Commit" : "44da109b59f8596152987eaa8f3b2487xxxxxx",
    "MountPath" : "/mnt/data"
  },
  "ThirdpartyLibs" : [ "numpy==1.16.1" ],
  "ThirdpartyLibDir" : "/root/code/",
  "GmtCreateTime" : "2021-01-12T14:35:01Z",
  "GmtSubmittedTime" : "2021-01-12T14:36:01Z",
  "GmtRunningTime" : "2021-01-12T14:36:21Z",
  "GmtSuccessedTime" : "2021-01-12T15:36:08Z",
  "GmtFinishTime" : "2021-01-12T15:36:08Z",
  "Duration" : 3602,
  "Pods" : [ {
    "Type" : "Worker",
    "PodId" : "dlc-20210126170216-mt*****-worker-0",
    "PodUid" : "fe846462-af2c-4521-bd6f-96787a57591d",
    "Status" : "Running",
    "Ip" : "10.0.1.2",
    "GmtCreateTime" : "2021-01-12T14:36:01Z",
    "GmtStartTime" : "2021-01-12T14:36:05Z",
    "GmtFinishTime" : "2021-01-12T15:36:05Z",
    "HistoryPods" : [ { } ]
  } ],
  "RequestId" : "473469C7-AA6F-4DC5-B3DB-xxxxxxxx",
  "Settings" : {
    "BusinessUserId" : "166924",
    "Caller" : "SilkFlow",
    "PipelineId" : "pid-123456",
    "EnableTideResource" : true,
    "EnableErrorMonitoringInAIMaster" : false,
    "ErrorMonitoringArgs" : "--enable-log-hang-detection true",
    "EnableRDMA" : true
  },
  "ClusterId" : "a*****",
  "ElasticSpec" : {
    "EnableElasticTraining" : true,
    "MinParallelism" : 1,
    "MaxParallelism" : 8
  },
  "EnabledDebugger" : false,
  "Priority" : 1
}
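The timestamp and pod fields in the sample response can be processed as in the following sketch. It assumes every `Gmt*` value uses the `YYYY-MM-DDTHH:MM:SSZ` form shown in the examples. The helper names `parse_utc` and `pods_by_status` are illustrative, and the input dictionary is a trimmed copy of the sample above.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Any, Dict, List


def parse_utc(value: str) -> datetime:
    """Parse a timestamp such as 2021-01-12T14:35:01Z into an aware datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def pods_by_status(job: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group PodId values by pod Status."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for pod in job.get("Pods") or []:
        grouped[pod.get("Status", "Unknown")].append(pod.get("PodId", ""))
    return dict(grouped)


job = {
    "GmtRunningTime": "2021-01-12T14:36:21Z",
    "GmtFinishTime": "2021-01-12T15:36:08Z",
    "Pods": [{
        "PodId": "dlc-20210126170216-mt*****-worker-0",
        "Status": "Running",
    }],
}

# Seconds between the job starting to run and the job ending.
elapsed = parse_utc(job["GmtFinishTime"]) - parse_utc(job["GmtRunningTime"])
print(int(elapsed.total_seconds()))
print(pods_by_status(job))
```

The interval computed here need not equal `Duration`, because this topic does not specify which timestamps `Duration` is measured between.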

Error codes

Visit the Error Center for more error codes.
