Use the Dataset Accelerator in PAI

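You can enable the Dataset Accelerator feature when you create a dataset in Platform for AI (PAI), or later for an existing dataset. You can then mount the accelerated dataset when you create a Data Science Workshop (DSW) instance or submit a Deep Learning Containers (DLC) job to improve data read efficiency.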

Prerequisites

You have created a dataset accelerator instance. For more information, see Create and manage a dataset accelerator instance.

Enable dataset acceleration for a new dataset

  1. On the Datasets page, create a dataset and configure the following key parameters. For more information, see Create and manage datasets.

    • Create Dataset: Select From Alibaba Cloud.

    • Enable Dataset Acceleration: Select Enable Dataset Acceleration and configure its parameters. After you select this option, you must also configure Read/Write Properties. Based on the selected data storage type, select a Dataset Accelerator and configure the dataset acceleration slot parameters, such as the slot name, maximum capacity, and accelerated mount target. For the accelerated mount target, you can choose Select Mount Target, Create New Mount Target, or Use Internal Mount Target. The default mount path for an accelerated dataset is /mnt/datasetacc. For more information, see Create and manage dataset acceleration slots.

  2. Click Submit.

    The created dataset appears in the Datasets list, showing details such as its data source, storage type, properties, visibility scope, and last modified time. For datasets with acceleration enabled, their acceleration status is also displayed.

Enable dataset acceleration for an existing dataset

  1. On the Datasets page, click the name of a dataset to open its Dataset Details page. For more information, see Create and manage datasets.

  2. On the Dataset Details page, click Dataset Acceleration in the upper-right corner, select a Dataset Accelerator, and configure the dataset acceleration slot parameters. For configuration details, see Create and manage dataset acceleration slots.

  3. Click Submit to enable acceleration for the dataset.

Use an accelerated dataset

You can use an accelerated dataset when you create a DSW instance or submit a DLC job.

  • DSW instance: When you create the instance, go to the Storage Configurations section. In the Shared Dataset area, select an accelerated dataset of type OSS, NAS, or CPFS from the drop-down list and specify a mount path, such as /mnt/data/. For more information, see Create and manage DSW instances.

  • DLC job: When you submit the job, go to the Dataset Configuration section. In the Environment Information area, set the configuration type to By dataset, click the edit icon next to the dataset input box to open the dataset selection dialog box, switch to the Accelerated Dataset tab, and select the target dataset to mount it. For more information, see Create a training job.
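
After the DSW instance or DLC job starts, it can be useful to confirm that the dataset is visible at the mount path and to get a rough sense of read speed. The following is a minimal sketch that uses only the Python standard library. It is not part of any PAI SDK: the function name summarize_mount and its parameters are hypothetical, and /mnt/data/ is the example mount path mentioned above, so replace it with the path you actually configured.

```python
import os
import time
from pathlib import Path


def summarize_mount(mount_path: str, sample_limit: int = 100,
                    chunk_size: int = 1024 * 1024) -> dict:
    """Check that mount_path is a directory and time reading up to sample_limit files.

    Hypothetical helper for illustration only; it is not a PAI API.
    """
    root = Path(mount_path)
    if not root.is_dir():
        # The dataset is not mounted here, or the path is wrong.
        return {"exists": False, "files_read": 0, "bytes_read": 0, "seconds": 0.0}

    files_read = 0
    bytes_read = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if files_read >= sample_limit:
                break
            # Read in chunks so large files do not have to fit in memory.
            with open(os.path.join(dirpath, name), "rb") as fh:
                while chunk := fh.read(chunk_size):
                    bytes_read += len(chunk)
            files_read += 1
        if files_read >= sample_limit:
            break
    seconds = time.perf_counter() - start

    return {
        "exists": True,
        "files_read": files_read,
        "bytes_read": bytes_read,
        "seconds": seconds,
    }


if __name__ == "__main__":
    # Assumption: the dataset was mounted at the example path /mnt/data/.
    print(summarize_mount("/mnt/data/"))
```

Running the same script against the same files with and without acceleration gives only a coarse comparison, because caching and file sizes strongly affect the timing.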