Lifecycle management

Configure lifecycle policies to automatically transition infrequently accessed data to the Infrequent Access storage class (IA) and reduce storage costs. You can access IA data directly or retrieve it to Standard storage in bulk, which gives you automatic hot-cold data tiering.

Note

The lifecycle management feature is currently in invitational preview. To use this feature, submit a ticket to request access.

How it works

CPFS lifecycle management automatically tiers data between hot and cold storage based on file access time:

  • Automated tiering: After you configure a policy, the system periodically scans the file system and automatically transitions files not accessed for a specified number of days from Standard storage to the Infrequent Access storage class (IA).

  • Transparent access: Data in the Infrequent Access storage class (IA) is directly accessible and supports all standard POSIX operations without any changes to your file access methods.

  • Flexible retrieval: Depending on the policy configuration, the system either automatically transitions a file back to Standard storage on first access or keeps it in the Infrequent Access storage class (IA) for direct reads.
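The three behaviors above can be modeled in a few lines. The following is a minimal sketch for illustration, not the CPFS implementation: the names FileState, scan, and read are hypothetical, and treating "at least N days" as the threshold comparison is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STANDARD = "Standard"
IA = "IA"


@dataclass
class FileState:
    storage_class: str
    last_access: datetime  # last read or modification of file data


def scan(file: FileState, now: datetime, ia_after_days: int) -> None:
    """Periodic scan: move a Standard file to IA once it has been inactive
    for at least ia_after_days (the >= comparison is an assumption)."""
    inactive_for = now - file.last_access
    if file.storage_class == STANDARD and inactive_for >= timedelta(days=ia_after_days):
        file.storage_class = IA


def read(file: FileState, now: datetime, transition_on_access: bool) -> None:
    """A read resets the inactivity timer. The file returns to Standard
    only if the policy has Transition to Standard Storage checked."""
    file.last_access = now
    if file.storage_class == IA and transition_on_access:
        file.storage_class = STANDARD
```

In this model, a file read under a policy with the option unchecked stays in IA, which matches the "direct reads" case described above.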

Configure a lifecycle policy

Step 1: Create a policy

You can create multiple policies for a single file system and apply them to different directories.

  1. Log on to the CPFS console and go to the details page of the target file system.

  2. In the left-side navigation pane, choose Lifecycle. On the Lifecycle Policies tab, click Create Lifecycle Policy.

  3. Configure the policy parameters:

    | Parameter | Description |
    | --- | --- |
    | Policy Name | A custom name for the policy. For example, AI-training-data-archive-policy. |
    | Apply To | / Entire file system: Applies the policy to all files. Enter Directory Paths: Applies the policy only to the specified directories, such as /training/completed/. |
    | Convert to IA Storage Class | Specify the number of days (from 1 to 365) after which the system automatically transitions an inactive file to the Infrequent Access storage class (IA). The inactivity period is the number of days since the file was last read or modified. Metadata operations, such as ls and stat, do not reset this timer. |
    | Transition to Standard Storage | Checked: When a file is accessed, the system automatically transitions it back to Standard storage. This is suitable for data that is expected to be accessed frequently again. Unchecked: When a file is accessed, the system reads it directly from the Infrequent Access storage class (IA), and the file remains in that storage class. |

  4. Click OK.
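If you keep policy settings in version control before entering them in the console, a small pre-check can catch out-of-range values. The sketch below is illustrative only: LifecyclePolicy and validate are hypothetical names, not a CPFS API. The 1 to 365 day range and the limit of 10 directories per policy come from this document. The absolute-path rule is documented for data retrieval tasks, and applying it to policies is an assumption.

```python
from dataclasses import dataclass, field
from typing import List

MAX_DIRECTORIES_PER_POLICY = 10  # from "Quotas and limits"


@dataclass
class LifecyclePolicy:
    name: str
    ia_after_days: int            # "Convert to IA Storage Class"
    transition_to_standard: bool  # "Transition to Standard Storage"
    paths: List[str] = field(default_factory=lambda: ["/"])  # "/" = entire file system


def validate(policy: LifecyclePolicy) -> List[str]:
    """Return a list of problems; an empty list means the settings look valid."""
    errors = []
    if not policy.name:
        errors.append("policy name is empty")
    if not 1 <= policy.ia_after_days <= 365:
        errors.append("Convert to IA Storage Class must be between 1 and 365 days")
    if not policy.paths:
        errors.append("at least one path is required")
    if len(policy.paths) > MAX_DIRECTORIES_PER_POLICY:
        errors.append(f"a policy can apply to at most {MAX_DIRECTORIES_PER_POLICY} directories")
    for path in policy.paths:
        # Assumption: policy paths must be absolute, as retrieval task paths are.
        if not path.startswith("/"):
            errors.append(f"path is not absolute: {path}")
    return errors
```

For example, validate(LifecyclePolicy("AI-training-data-archive-policy", 30, True, ["/training/completed/"])) returns an empty list.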

Step 2: Verify the policy

The policy takes effect during the next scan cycle. The system scans the file system every 24 hours, with the first scan scheduled based on the policy creation time.

  1. In the policy list, confirm that the status is enabled.

  2. After 24 hours, go to the Basic Information section on the file system details page and check whether IA Storage Capacity has started to increase.

  3. On the Performance Monitoring page, view the storage conversion success rate metric.
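To get a rough idea of how much data a policy might move before the first scan runs, you can estimate inactivity from a client. The sketch below walks a directory and sums files whose access and modification times are both older than the threshold. It is an approximation under stated assumptions: the function name and any mount path you pass are hypothetical, and client-visible atime depends on mount options (for example, relatime or noatime) and may differ from the access time the service tracks internally. Because it only calls stat-style metadata operations, it should not reset the inactivity timer described in this document.

```python
import os
import stat
import time
from typing import Iterator, Optional, Tuple


def inactive_files(root: str, days: int,
                   now: Optional[float] = None) -> Iterator[Tuple[str, int]]:
    """Yield (path, size_in_bytes) for regular files under root whose last
    read (atime) and last modification (mtime) are both at least `days` old."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, and other non-regular files
            if max(st.st_atime, st.st_mtime) <= cutoff:
                yield path, st.st_size
```

Summing the sizes, for example sum(size for _, size in inactive_files(mount_path, 30)), gives an estimate to compare against the IA Storage Capacity value in the console after the scan.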

Configuration examples

The following table provides recommended configurations for different use cases.

| Use case | Scope | Convert to IA Storage Class | Transition to Standard Storage | Description |
| --- | --- | --- | --- | --- |
| AI training data archiving | /training/completed/ | 30 days | Checked | For completed training datasets that might be used to reproduce experiments. |
| Autonomous driving data archiving | /archive/sensors/ | 60 days | Unchecked | For historical sensor data that requires long-term retention and is rarely accessed. |
| Log audit archiving | /logs/ | 7 days | Unchecked | For system logs that require long-term retention and are accessed only occasionally. |
| Cold data archiving for the entire file system | / Entire file system | 90 days | Unchecked | For long-term archiving of the entire file system. |

Access and retrieve IA data

You can use two methods to access data in the Infrequent Access storage class (IA):

Method 1: Direct access

You can read or write files in the Infrequent Access storage class (IA) directly from the mount path, just like local files. This method is suitable for viewing or operating on a small number of files. All standard file operations (read, write, modify, and delete) are supported. Access performance depends on your lifecycle policy configuration:

  • If you checked Transition to Standard Storage: The first access to a file has slightly higher latency while the system asynchronously transitions it to Standard storage. After the transition, the file delivers full Standard storage performance.

  • If you leave Transition to Standard Storage unchecked: Each read is served directly from the Infrequent Access storage class (IA), and the file remains in IA. Performance is slightly lower than Standard storage, but you continue to benefit from IA cost savings.

Method 2: Bulk retrieval

To access a large number of IA files at once, such as reading thousands of files for an AI training job, create a data retrieval task. This transitions the files to Standard storage in bulk before the job starts, so the job does not accumulate first-access latency on every file.

  1. On the file system details page, go to the Lifecycle > Data Retrieval Tasks tab.

  2. Click Create Data Retrieval Task and configure the following parameters:

    • Task Name: Enter a name for the task. For example, Training-data-warm-up-2024Q1.

    • Scope: Select / Entire file system or Enter Directory Paths. The path must be an absolute path that starts with a forward slash (/).

  3. Click OK to create the task.

  4. In the task list, monitor the retrieval progress:

    • Pending: The task is created and is waiting to run.

    • Running: The system is transitioning files from the Infrequent Access storage class (IA) to Standard storage. You can view the progress as a percentage.

    • Completed: All specified files have been successfully transitioned to Standard storage. You can now start your training job.

    • Partially Failed: Some files failed to transition. You can view the details and retry the failed files.

    • Failed: The task failed. You can click Retry.
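The task states above map naturally onto a small wait loop in a job launcher. This document describes only the console workflow, so the sketch below is hedged accordingly: get_status is a hypothetical callable that you would supply (for example, a manual check or an internal wrapper), and the default polling interval and timeout are arbitrary. Only the five status names come from the task list above.

```python
import time
from enum import Enum
from typing import Callable


class TaskStatus(Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    PARTIALLY_FAILED = "Partially Failed"
    FAILED = "Failed"


# States in which the task is still waiting or in progress.
IN_PROGRESS = {TaskStatus.PENDING, TaskStatus.RUNNING}


def next_action(status: TaskStatus) -> str:
    """Map each documented status to the follow-up step it calls for."""
    return {
        TaskStatus.PENDING: "wait",
        TaskStatus.RUNNING: "wait",
        TaskStatus.COMPLETED: "start job",
        TaskStatus.PARTIALLY_FAILED: "retry failed files",
        TaskStatus.FAILED: "retry task",
    }[status]


def wait_for_task(get_status: Callable[[], TaskStatus],
                  poll_seconds: float = 60.0,
                  timeout_seconds: float = 6 * 3600.0,
                  sleep: Callable[[float], None] = time.sleep) -> TaskStatus:
    """Poll the hypothetical get_status until the task leaves Pending/Running."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status not in IN_PROGRESS:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("retrieval task still in progress after timeout")
        sleep(poll_seconds)
```

A launcher could call wait_for_task and start training only when next_action returns "start job".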

View storage usage

On the file system details page, you can view the following information:

  • Basic Information: The used capacity of both Standard storage and the Infrequent Access storage class (IA).

  • Monitoring: Trends for capacity, IOPS, throughput, latency, and the storage conversion success rate.

Quotas and limits

| Item | Limit |
| --- | --- |
| Number of lifecycle policies | You can create up to 20 policies per Alibaba Cloud account in a single region. |
| Number of directories per policy | A policy can apply to a maximum of 10 directories. |
| Number of data retrieval tasks | You can run up to 20 concurrent tasks per Alibaba Cloud account in a single region. |
| File system type | This feature is supported only for CPFS for Lingjun file systems. |

Billing

The Infrequent Access storage class (IA) supports only the pay-as-you-go billing method. There is no minimum storage duration, and deleting files incurs no extra fees.

| Billing item | Charged | Description |
| --- | --- | --- |
| IA storage space | Yes | You are charged for the actual amount of storage you use in the Infrequent Access storage class (IA). Usage is measured hourly and settled monthly. |
| Transition from Standard to IA storage | No | Transitioning data from Standard storage to the Infrequent Access storage class (IA) incurs no data transfer fees. |
| Transition from IA to Standard storage | No | Transitioning data from the Infrequent Access storage class (IA) to Standard storage incurs no data transfer fees. |
| API requests | No | API requests to access files in the Infrequent Access storage class (IA) incur no charge. |
| Data retrieval tasks | No | No extra fees are charged for bulk data retrieval tasks. |
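Because IA storage is the only charged item and usage is measured hourly, a monthly estimate reduces to summing hourly usage and multiplying by a unit price. The sketch below illustrates that arithmetic only. This document does not state the unit price or the exact metering unit, so the GiB-hour unit and the price argument are placeholders that you must replace with values from the official pricing page.

```python
from decimal import Decimal
from typing import Iterable


def monthly_ia_storage_charge(hourly_usage_gib: Iterable[Decimal],
                              price_per_gib_hour: Decimal) -> Decimal:
    """Estimate one month's IA storage charge.

    hourly_usage_gib: one IA usage sample (in GiB) per metered hour.
    price_per_gib_hour: placeholder unit price; not taken from this document.
    """
    gib_hours = sum(hourly_usage_gib, Decimal(0))
    return gib_hours * price_per_gib_hour
```

Decimal is used instead of float so that summing many small hourly amounts does not introduce rounding drift.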

FAQ

Why are my files not transitioned to IA?

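Lifecycle policies are not applied in real time. The system scans the file system periodically, and the first scan typically starts within 24 hours after you create the policy. A file is transitioned only if, at scan time, it has not been read or modified for the number of days set in Convert to IA Storage Class.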

Is there latency when accessing IA files?

  • If you selected the Transition to Standard Storage option in your policy, the first access to a file incurs slight latency while the system asynchronously transitions the storage class.

  • If you did not select the option, data is read directly from the Infrequent Access storage class (IA), and the performance is slightly lower than that of Standard storage.

How do I identify IA files?

Individual file-level listings for the Infrequent Access storage class (IA) are not available. You can view the total IA storage usage in the CPFS console.

What happens when I modify an IA file?

When you modify a file in the Infrequent Access storage class (IA), the system automatically retrieves the file to Standard storage before applying the modification. After you modify the file, its inactivity timer is reset.

Can multiple policies apply to one directory?

Yes. If a file matches the rules of multiple policies, the system applies the rules based on the following priority: Convert to IA Storage Class > Transition to Standard Storage.