ALIYUN::PAI::Dataset

The ALIYUN::PAI::Dataset type is used to create a dataset.

Syntax

{
  "Type": "ALIYUN::PAI::Dataset",
  "Properties": {
    "Options": String,
    "Description": String,
    "Accessibility": String,
    "DatasetName": String,
    "SourceType": String,
    "SourceId": String,
    "DataSourceType": String,
    "WorkspaceId": String,
    "DataType": String,
    "Uri": String,
    "Property": String
  }
}

Properties

Property name

Type

Required

Update allowed

Description

Constraints

Options

String

The extended field.

When DLC uses the dataset, you can configure the mountPath field to specify the default mount path of the dataset. Example:

{ "mountPath": "/mnt/data/" }

Description

String

The description.

Data used for labeling.

Accessibility

String

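The visibility of the dataset in the workspace.

Valid values:

  • PRIVATE (default): visible only to the owner and the administrators of the workspace.

  • PUBLIC: visible to all users in the workspace.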

DatasetName

String

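The name of the dataset.

The naming rules are as follows:

  • Start with a lowercase letter, uppercase letter, digit, or Chinese character.

  • Can contain underscores (_) or hyphens (-).

  • Must be 1 to 127 characters in length.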

SourceType

String

The type of the dataset source.

Valid values:

  • USER (default): the user.

  • ITAG: the ITAG labeling platform.

  • PAI_PUBLIC_DATASET: a PAI public dataset.

SourceId

String

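The ID of the dataset source.

Valid values:

  • If SourceType is USER, you can specify a custom SourceId.

  • If SourceType is ITAG, that is, the dataset is generated from the labeling results of the ITAG module, SourceId is the ID of the ITAG task.

  • If SourceType is PAI_PUBLIC_DATASET, that is, the dataset is created from a PAI public dataset, SourceId is empty by default.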

DataSourceType

String

The data source type.

Valid values:

  • NAS: Alibaba Cloud File Storage (NAS).

  • OSS: Alibaba Cloud Object Storage Service (OSS).

WorkspaceId

String

The ID of the workspace where the dataset is located.

DataType

String

The type of the dataset.

Valid values:

  • COMMON (default): common.

  • PIC: image.

  • TEXT: text.

  • VIDEO: video.

  • AUDIO: audio.

Uri

String

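The URI configuration.

Valid values:

  • If the data source type is OSS: oss://bucket.endpoint/object

  • If the data source type is NAS:

    • General-purpose NAS: nas://<nasfisid>.region/subpath/to/dir/

    • CPFS 1.0:

      nas://<cpfs-fsid>.region/subpath/to/dir/

    • CPFS 2.0:

      nas://<cpfs-fsid>.region/<protocolserviceid>/

    Note

    CPFS 1.0 and CPFS 2.0 are distinguished by the format of the fsid. A CPFS 1.0 fsid has the format CPFS-<8 ASCII characters>, and a CPFS 2.0 fsid has the format CPFS-<16 ASCII characters>.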

Property

String

The property of the dataset.

Valid values:

  • FILE: a file.

  • DIRECTORY: a folder.
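
The constraints above can be checked before a template is submitted. The following Python sketch is illustrative only and is not part of ROS or PAI: the helper names are hypothetical, "Chinese character" is assumed to mean the basic CJK Unified Ideographs block, the name is assumed to contain nothing other than the listed character classes, and the fsid prefix is matched case-insensitively because this page writes it both as CPFS- and cpfs-.

```python
import json
import re
from typing import Optional

# Assumption: "Chinese" means U+4E00..U+9FFF, and no characters other than
# letters, digits, Chinese characters, "_" and "-" are allowed in a name.
_NAME_RE = re.compile(r"[A-Za-z0-9\u4e00-\u9fff][A-Za-z0-9\u4e00-\u9fff_-]{0,126}")


def is_valid_dataset_name(name: str) -> bool:
    """Check a DatasetName against the documented naming rules (1 to 127 characters)."""
    return _NAME_RE.fullmatch(name) is not None


def cpfs_version(fsid: str) -> Optional[str]:
    """Guess the CPFS version from the fsid: 8 ASCII characters after the
    prefix means CPFS 1.0, 16 means CPFS 2.0. Returns None otherwise."""
    prefix, sep, suffix = fsid.partition("-")
    if prefix.lower() != "cpfs" or not sep or not suffix.isascii():
        return None
    return {8: "1.0", 16: "2.0"}.get(len(suffix))


def build_options(mount_path: str) -> str:
    """Serialize the Options property, which is a JSON string, with a mountPath field."""
    return json.dumps({"mountPath": mount_path})
```

For example, `build_options("/mnt/data/")` yields a JSON string equivalent to the Options example shown earlier, and `cpfs_version` returns None for any fsid that does not start with the cpfs prefix.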

Return values

Fn::GetAtt

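  • Options: The extended field.

  • Description: The description.

  • Accessibility: The visibility of the dataset in the workspace.

  • SourceId: The ID of the dataset source.

  • CreateTime: The creation time.

  • SourceType: The type of the dataset source.

  • WorkspaceId: The ID of the workspace where the dataset is located.

  • Uri: The URI configuration.

  • GmtModifiedTime: The update time.

  • DatasetId: The dataset ID.

  • OwnerId: The ID of the primary account.

  • DatasetName: The name of the dataset.

  • UserId: The user ID.

  • DataSourceType: The data source type.

  • DataType: The type of the dataset.

  • Property: The property of the dataset.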

Examples

YAML format

ROSTemplateFormatVersion: '2015-09-01'
Parameters:
  DataSourceType:
    AllowedValues:
    - OSS
    - NAS
    Description: 'The data source type. The following values are supported:

      - OSS: Alibaba Cloud Object Storage Service (OSS).

      - NAS: Alibaba Cloud File Storage (NAS).'
    Type: String
  DatasetName:
    Description: 'The name of the dataset. The naming rules are as follows:

      - Start with a lowercase letter, uppercase letter, digit, or Chinese character.

      - Can contain underscores (_) or hyphens (-).

      - Must be 1 to 127 characters in length.'
    Type: String
  Property:
    AllowedValues:
    - FILE
    - DIRECTORY
    Description: 'The property of the dataset. The following values are supported:

      - FILE: a file.

      - DIRECTORY: a folder.'
    Type: String
  Uri:
    Description: 'The URI of the dataset. Examples:

      - If the data source type is OSS: ''oss://bucket.endpoint/object''

      - If the data source type is NAS:

      General-purpose NAS: ''nas://<nasfisid>.region/subpath/to/dir/'';

      CPFS 1.0: ''nas://<cpfs-fsid>.region/subpath/to/dir/'';

      CPFS 2.0: ''nas://<cpfs-fsid>.region/<protocolserviceid>/''.

      CPFS 1.0 and CPFS 2.0 are distinguished by the format of the fsid: CPFS 1.0 uses cpfs-<8 ASCII characters>; CPFS 2.0 uses cpfs-<16 ASCII characters>.'
    Type: String
  WorkspaceId:
    Description: 'The ID of the workspace where the dataset is located. For details
      about how to obtain the workspace ID, see ListWorkspaces.

      If this parameter is not configured, the default workspace is used. If the default
      workspace does not exist, an error is reported.'
    Type: String
Resources:
  ExtensionResource:
    Properties:
      DataSourceType:
        Ref: DataSourceType
      DatasetName:
        Ref: DatasetName
      Property:
        Ref: Property
      Uri:
        Ref: Uri
      WorkspaceId:
        Ref: WorkspaceId
    Type: ALIYUN::PAI::Dataset
Outputs:
  Accessibility:
    Description: Workspace visibility.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - Accessibility
  CreateTime:
    Description: The creation time of the resource.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - CreateTime
  DataSourceType:
    Description: The data source type.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - DataSourceType
  DataType:
    Description: The dataset type. The default value is COMMON.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - DataType
  DatasetId:
    Description: The ID of the dataset.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - DatasetId
  DatasetName:
    Description: The name of the dataset.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - DatasetName
  Description:
    Description: Custom descriptions of datasets to distinguish between different
      datasets.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - Description
  GmtModifiedTime:
    Description: Update time.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - GmtModifiedTime
  Options:
    Description: The extended field, which is a JSON string.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - Options
  OwnerId:
    Description: The ID of the primary account.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - OwnerId
  Property:
    Description: The properties of the dataset.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - Property
  SourceId:
    Description: The data source ID.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - SourceId
  SourceType:
    Description: The data source type. The default value is USER.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - SourceType
  Uri:
    Description: The URI configuration of the dataset.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - Uri
  UserId:
    Description: The ID of the user to which the dataset belongs.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - UserId
  WorkspaceId:
    Description: The ID of the workspace where the dataset is located. For details
      about how to obtain the workspace ID, see ListWorkspaces.
    Value:
      Fn::GetAtt:
      - ExtensionResource
      - WorkspaceId

JSON format

{
  "ROSTemplateFormatVersion": "2015-09-01",
  "Parameters": {
    "DataSourceType": {
      "AllowedValues": [
        "OSS",
        "NAS"
      ],
      "Description": "The data source type. The following values are supported:\n- OSS: Alibaba Cloud Object Storage Service (OSS).\n- NAS: Alibaba Cloud File Storage (NAS).",
      "Type": "String"
    },
    "DatasetName": {
      "Description": "The name of the dataset. The naming rules are as follows:\n- Start with a lowercase letter, uppercase letter, digit, or Chinese character.\n- Can contain underscores (_) or hyphens (-).\n- Must be 1 to 127 characters in length.",
      "Type": "String"
    },
    "Property": {
      "AllowedValues": [
        "FILE",
        "DIRECTORY"
      ],
      "Description": "The property of the dataset. The following values are supported:\n- FILE: a file.\n- DIRECTORY: a folder.",
      "Type": "String"
    },
    "Uri": {
      "Description": "The URI of the dataset. Examples:\n- If the data source type is OSS: 'oss://bucket.endpoint/object'\n- If the data source type is NAS:\nGeneral-purpose NAS: 'nas://<nasfisid>.region/subpath/to/dir/';\nCPFS 1.0: 'nas://<cpfs-fsid>.region/subpath/to/dir/';\nCPFS 2.0: 'nas://<cpfs-fsid>.region/<protocolserviceid>/'.\nCPFS 1.0 and CPFS 2.0 are distinguished by the format of the fsid: CPFS 1.0 uses cpfs-<8 ASCII characters>; CPFS 2.0 uses cpfs-<16 ASCII characters>.",
      "Type": "String"
    },
    "WorkspaceId": {
      "Description": "The ID of the workspace where the dataset is located. For details about how to obtain the workspace ID, see ListWorkspaces.\nIf this parameter is not configured, the default workspace is used. If the default workspace does not exist, an error is reported.",
      "Type": "String"
    }
  },
  "Resources": {
    "ExtensionResource": {
      "Properties": {
        "DataSourceType": {
          "Ref": "DataSourceType"
        },
        "DatasetName": {
          "Ref": "DatasetName"
        },
        "Property": {
          "Ref": "Property"
        },
        "Uri": {
          "Ref": "Uri"
        },
        "WorkspaceId": {
          "Ref": "WorkspaceId"
        }
      },
      "Type": "ALIYUN::PAI::Dataset"
    }
  },
  "Outputs": {
    "Accessibility": {
      "Description": "Workspace visibility.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "Accessibility"
        ]
      }
    },
    "CreateTime": {
      "Description": "The creation time of the resource.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "CreateTime"
        ]
      }
    },
    "DataSourceType": {
      "Description": "The data source type.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "DataSourceType"
        ]
      }
    },
    "DataType": {
      "Description": "The dataset type. The default value is COMMON.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "DataType"
        ]
      }
    },
    "DatasetId": {
      "Description": "The ID of the dataset.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "DatasetId"
        ]
      }
    },
    "DatasetName": {
      "Description": "The name of the dataset.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "DatasetName"
        ]
      }
    },
    "Description": {
      "Description": "Custom descriptions of datasets to distinguish between different datasets.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "Description"
        ]
      }
    },
    "GmtModifiedTime": {
      "Description": "Update time.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "GmtModifiedTime"
        ]
      }
    },
    "Options": {
      "Description": "The extended field, which is a JSON string.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "Options"
        ]
      }
    },
    "OwnerId": {
      "Description": "The ID of the primary account.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "OwnerId"
        ]
      }
    },
    "Property": {
      "Description": "The properties of the dataset.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "Property"
        ]
      }
    },
    "SourceId": {
      "Description": "The data source ID.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "SourceId"
        ]
      }
    },
    "SourceType": {
      "Description": "The data source type. The default value is USER.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "SourceType"
        ]
      }
    },
    "Uri": {
      "Description": "The URI configuration of the dataset.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "Uri"
        ]
      }
    },
    "UserId": {
      "Description": "The ID of the user to which the dataset belongs.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "UserId"
        ]
      }
    },
    "WorkspaceId": {
      "Description": "The ID of the workspace where the dataset is located. For details about how to obtain the workspace ID, see ListWorkspaces.",
      "Value": {
        "Fn::GetAtt": [
          "ExtensionResource",
          "WorkspaceId"
        ]
      }
    }
  }
}
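
When a template like the one above is edited by hand, it is easy to mistype an attribute name in an output. The following Python sketch shows one possible local check. It is not an ROS feature: the function name is hypothetical, the attribute set is copied from the Fn::GetAtt list on this page, and the sample template, including its Size output, is made up for illustration.

```python
import json

# Attribute names listed under Fn::GetAtt on this page.
DOCUMENTED_ATTRIBUTES = {
    "Options", "Description", "Accessibility", "SourceId", "CreateTime",
    "SourceType", "WorkspaceId", "Uri", "GmtModifiedTime", "DatasetId",
    "OwnerId", "DatasetName", "UserId", "DataSourceType", "DataType",
    "Property",
}


def undocumented_outputs(template: dict, resource_name: str) -> list:
    """Return the names of outputs that call Fn::GetAtt on resource_name
    with an attribute that is not in DOCUMENTED_ATTRIBUTES."""
    bad = []
    for name, output in template.get("Outputs", {}).items():
        value = output.get("Value")
        if not isinstance(value, dict):
            continue
        get_att = value.get("Fn::GetAtt")
        if (isinstance(get_att, list) and len(get_att) == 2
                and get_att[0] == resource_name
                and get_att[1] not in DOCUMENTED_ATTRIBUTES):
            bad.append(name)
    return bad


# Trimmed, hypothetical template: "Size" is deliberately not a documented attribute.
SAMPLE = json.loads("""
{
  "Outputs": {
    "DatasetId": {"Value": {"Fn::GetAtt": ["ExtensionResource", "DatasetId"]}},
    "Size": {"Value": {"Fn::GetAtt": ["ExtensionResource", "Size"]}}
  }
}
""")

print(undocumented_outputs(SAMPLE, "ExtensionResource"))  # ['Size']
```

The check only inspects the list form of Fn::GetAtt used in the examples on this page. Outputs that reference other resources or use other functions are ignored.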