Data migration


This topic outlines usage notes, limitations, and steps for data migration.

Notes

Keep the following in mind when using Data Online Migration:

  • When creating a source data address, specify an absolute path for the Directory To Be Migrated. This path must begin and end with a forward slash (/) and cannot contain environment variables or special characters.

  • When creating a source data address, ensure the specified Directory To Be Migrated exists and is valid.

  • Online migration consumes resources at both the source and destination addresses, potentially affecting your business operations. For mission-critical services, consider setting a rate limit or running the migration task during off-peak hours to minimize the impact.

  • The service checks files at the source and destination addresses before the migration starts. However, if a file with the same name exists in both locations and the migration task is configured to overwrite files, the service directly overwrites the destination file. If the two files have different content, you must rename one of them or create a backup to prevent data loss.

  • Online migration preserves the last modified time of source files. If a lifecycle rule is configured for the destination bucket, the rule may delete a migrated file or transition it to the storage class specified in the rule as soon as the file's preserved last modified time meets the rule's criteria.
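Because the last modified time is preserved, a file that was migrated today can already look old to a lifecycle rule. The following is a minimal sketch of that check, assuming a rule of the form "act on objects last modified more than N days ago". The function name and the 365-day threshold are hypothetical and are not part of the service.

```python
from datetime import datetime, timedelta, timezone


def affected_by_lifecycle(last_modified, rule_days, now=None):
    """Return True if the preserved last modified time already satisfies
    a hypothetical 'older than rule_days days' lifecycle rule."""
    now = now or datetime.now(timezone.utc)
    return now - last_modified >= timedelta(days=rule_days)


# Fixed reference time so the example is reproducible.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
old_file = now - timedelta(days=400)    # last modified 400 days ago
recent_file = now - timedelta(days=10)  # last modified 10 days ago

print(affected_by_lifecycle(old_file, 365, now))     # True
print(affected_by_lifecycle(recent_file, 365, now))  # False
```

Running a check like this against the source files before the migration can show which objects a destination rule might act on right after they arrive.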

Migration limitations

  • The migration excludes the following data types in the source data address: empty directories, symbolic links (files or directories), character device files, block device files, socket files, and pipe (FIFO) files.

  • The migration converts hard links in the source data address to regular files, without preserving the link relationship.

  • The migration does not migrate parent directory attributes.

  • The migration does not migrate special file permissions, such as SUID, SGID, and SBIT (the sticky bit).

  • The following attribute limitations apply when migrating data from a local file system to OSS:

    • Supported attributes: The migration maps ModifyTime to X-Oss-Meta-Mtime, Permissions to X-Oss-Meta-Perms, and Uid:Gid to X-Oss-Meta-Owner.

      Note
      • Permissions: Includes nine permission bits for read, write, and execute access.

      • Uid:Gid: Represents the user ID and group ID, separated by a colon (:).

    • Unsupported attributes: Examples include AccessTime, ChangeTime, Attr, and Acl.

      Note

      The migration behavior for unlisted attributes is not guaranteed, and you must verify them after the migration completes.
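The supported attribute mapping can be pictured with a short sketch that reads a local file and builds the three metadata entries named above. The value encodings used here (integer seconds for the time, a three-digit octal string for the permissions) are assumptions for illustration only; the text does not specify how the service encodes them. The sketch assumes a POSIX system.

```python
import os
import stat
import tempfile


def oss_user_metadata(path):
    """Build the three user-metadata entries from a local file.

    The value formats are assumed for illustration, not taken from the
    service documentation.
    """
    st = os.lstat(path)
    return {
        "X-Oss-Meta-Mtime": str(int(st.st_mtime)),
        # Keep only the nine rwx bits; SUID, SGID, and sticky bits are dropped.
        "X-Oss-Meta-Perms": format(stat.S_IMODE(st.st_mode) & 0o777, "03o"),
        # User ID and group ID separated by a colon.
        "X-Oss-Meta-Owner": f"{st.st_uid}:{st.st_gid}",
    }


# Create a throwaway file with known permissions and modification time.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    sample_path = tmp.name
os.chmod(sample_path, 0o640)
os.utime(sample_path, (1700000000, 1700000000))
meta = oss_user_metadata(sample_path)
os.remove(sample_path)
print(meta)
```

Comparing a dictionary like this with the user metadata of the migrated object is one way to verify the supported attributes after the task completes.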

Step 1: Select a region

  1. Log in to the Data Online Migration console as the RAM user you created.

  2. In the upper-left corner of the top navigation bar, select the region where your agent is located.

    Important
    • Tunnels, agents, data addresses, and migration tasks created in one region cannot be used in another. Choose your region carefully.

    • Select the region where your agent is located. If that region is not available, select the closest region to create the migration task.

    • For cross-border migration, we recommend that you enable transfer acceleration to increase migration speed. If you enable transfer acceleration for a bucket, transfer acceleration fees apply. For more information, see Access OSS using transfer acceleration.

Step 2: Create a tunnel

  1. In the left navigation pane, go to Data Online Migration > Channel Management and click Create Tunnel.

  2. In the Create Tunnel dialog box, configure the following parameters and click OK.

    Parameter

    Required

    Description

    Name

    Yes

    The name of the tunnel.

    • The name cannot be empty and can be up to 100 characters in length.

    • The name can contain letters, digits, hyphens (-), and underscores (_).

    Maximum Bandwidth

    Yes

    The maximum bandwidth that the tunnel can use.

    • If you do not configure this parameter, the default value 0 is used, which indicates that the bandwidth for the tunnel is not limited.

    • If you configure this parameter, enter a value based on the note in the console.

    Important

    The bandwidth that is available for the tunnel depends on the actual bandwidth of the network connection.

    Requests/s

    Yes

    The maximum number of requests per second over the tunnel.

    • If you do not configure this parameter, the default value 0 is used, which indicates that the number of requests per second over the tunnel is not limited.

    • If you configure this parameter, enter a value based on the note in the console.

    Warning

    We recommend that you evaluate the capabilities of the data source's storage system before you configure this parameter. If you set this parameter to an excessively large value, your business may be affected. Enter a value based on the note in the console.

Note

To learn more about tunnels, see Tunnel Management.

Step 3: Create an agent

Important
  • If LocalFS is a local file system, you can deploy only one agent.

  • If LocalFS is a remote file system such as Network Attached Storage (NAS), you can deploy multiple agents. Mount the NAS to directories with the same name.

  1. In the left-side navigation pane, choose Data Online Migration > Agent Management, and click New Agent.

  2. In the New Agent dialog box, configure the following parameters and click OK.

    Parameter

    Required

    Description

    Name

    Yes

    The name of the agent.

    • The name must be 3 to 63 characters long.

    • The name can contain lowercase letters, digits, hyphens (-), and underscores (_). The name is case-sensitive.

    • The name must be UTF-8 encoded and cannot start with a hyphen (-) or an underscore (_).

    Network Type

    Yes

    The agent's network type. Valid values:

    • VPC (recommended): The agent connects to the Data Online Migration service over a VPC. The agent's host machine must be able to access the internal endpoint of Data Online Migration in the corresponding region. For example, if you use the migration service in the China (Beijing) region, the agent machine must be able to access the internal endpoint {TunnelId}.cn-beijing.mgw-tc-internal.aliyuncs.com. We recommend that you deploy the agent on an ECS instance in the same region as the Data Online Migration console.

    • Internet: The agent connects to the Data Online Migration service over the Internet. The agent's host machine must be able to access the public endpoint of Data Online Migration in the corresponding region. For example, if you use the migration service in the China (Beijing) region, the agent machine must be able to access the public endpoint {TunnelId}.cn-beijing.mgw-tc.aliyuncs.com.

    Note
    • TunnelId is the ID of the tunnel.

    • Use the ping command to test network connectivity between the agent and the migration service.

    Deployment method

    Yes

    The agent's deployment method. Only independent process mode is currently supported.

    Tunnel

    Yes

    The tunnel associated with the agent. Each agent must be associated with exactly one tunnel, and its bandwidth is limited by the tunnel's total bandwidth.

    For example, a tunnel named tunnel-1 has a maximum bandwidth of 10 Gbit/s. If tunnel-1 is associated with three agents, agent-1, agent-2, and agent-3, the sum of their bandwidth cannot exceed 10 Gbit/s. If agent-1 is allocated 3 Gbit/s of bandwidth, agent-2 and agent-3 have only 7 Gbit/s of bandwidth available to share. Plan and allocate your bandwidth carefully in advance.

  3. Generate the agent deployment script. For more information, see Generate an agent deployment script.
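The endpoint that an agent host must reach can be assembled from the tunnel ID, the region, and the network type. The sketch below follows the two China (Beijing) examples given for the Network Type parameter. Applying the same pattern to other region IDs is an assumption, and the tunnel ID shown is a hypothetical placeholder.

```python
def migration_endpoint(tunnel_id, region_id, network_type):
    """Build the host name an agent must reach.

    network_type is "vpc" (internal endpoint) or "internet" (public endpoint).
    """
    suffixes = {"vpc": "mgw-tc-internal", "internet": "mgw-tc"}
    if network_type not in suffixes:
        raise ValueError(f"unknown network type: {network_type}")
    return f"{tunnel_id}.{region_id}.{suffixes[network_type]}.aliyuncs.com"


# "my-tunnel-id" is a placeholder; use the ID shown for your tunnel.
print(migration_endpoint("my-tunnel-id", "cn-beijing", "vpc"))
print(migration_endpoint("my-tunnel-id", "cn-beijing", "internet"))
```

The resulting host name is what you would pass to the ping command when you test connectivity from the agent host.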

Note

For more information about agents, see Agent Management.
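The bandwidth arithmetic in the tunnel-1 example can be checked with a small planning helper. This sketch only models the sums. The function name is hypothetical, Gbit/s is assumed as the unit, and a tunnel limit of 0 is treated as unlimited, as described for the Maximum Bandwidth parameter.

```python
def remaining_bandwidth(tunnel_max_gbps, allocated_gbps):
    """Return the bandwidth still available to other agents on the tunnel.

    allocated_gbps maps agent names to their planned bandwidth in Gbit/s.
    """
    if tunnel_max_gbps == 0:
        # 0 means the tunnel bandwidth is not limited.
        return float("inf")
    used = sum(allocated_gbps.values())
    if used > tunnel_max_gbps:
        raise ValueError("planned agent bandwidth exceeds the tunnel limit")
    return tunnel_max_gbps - used


# tunnel-1 allows 10 Gbit/s and agent-1 is planned at 3 Gbit/s.
print(remaining_bandwidth(10, {"agent-1": 3}))  # 7, shared by agent-2 and agent-3
```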

Step 4: Create a source address

Important
  • If the LocalFS source is a local file system on a single machine, you can deploy only one agent.

  • If the LocalFS source is a NAS file system mounted on multiple machines, ensure that the NAS is mounted to the same directory path on each machine. When you create the data address, enter this mount directory for the Directory To Be Migrated parameter.

  1. In the left-side navigation pane, choose Data Online Migration > Address management, and then click Create address.

  2. In the Create address panel, configure the following parameters, and then click OK.

    Parameter

    Required

    Description

    Name

    Yes

    Enter a name for the source. The name must meet the following requirements:

    • The name must be 3 to 63 characters in length.

    • The name is case-sensitive and can contain only lowercase letters, digits, hyphens (-), and underscores (_).

    • The name cannot start with a hyphen (-) or an underscore (_).

    Type

    Yes

    Select LocalFS.

    Directory To Be Migrated

    Yes

    Specify the path of the directory to be migrated. The path must be absolute, start and end with a forward slash (/), and contain no environment variables or special characters.

    For example, if the source prefix is /example/src/ and the destination prefix is example/dest/, a source file such as example.jpg is migrated to example/dest/example.jpg.

    Important
    • If multiple agents are associated with this data address, ensure each agent can access this directory. Otherwise, some data may fail to migrate.

    • If the LocalFS source is a NAS file system mounted as a local directory on multiple machines, ensure that the mount directory has the same name on each machine. When you create the data address, specify this local directory name as the directory to be migrated.

    Tunnel

    Yes

    Select the tunnel that you want to use.

    Important
    • This parameter is required only when you migrate data from self-managed storage to the cloud, or when you migrate data over a dedicated connection or VPN.

    • An agent is required when the destination is a local file system (LocalFS) or when migrating over a dedicated connection for services like Finance Cloud or Apsara Stack.

    Agent

    Yes

    Select one or more agents.

    Important
    • This parameter is required only when you migrate data from self-managed storage to the cloud, or when you migrate data over a dedicated connection or VPN.

    • You can select up to 200 agents for a specified tunnel.
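The rules for Directory To Be Migrated and the prefix example above can be sketched as two small helpers. Both function names are hypothetical. The text does not list which characters count as special, so the check below covers only the slash rules and the `$` sign as a marker of shell-style environment variables.

```python
def validate_source_dir(path):
    """Return a list of problems found in a Directory To Be Migrated value."""
    problems = []
    if not path.startswith("/"):
        problems.append("path must be absolute and start with /")
    if not path.endswith("/"):
        problems.append("path must end with /")
    if "$" in path:
        # Assumption: "$" signals an environment variable such as $HOME.
        problems.append("path must not contain environment variables")
    return problems


def destination_key(src_prefix, dest_prefix, src_path):
    """Map a source file path to its destination object name."""
    if not src_path.startswith(src_prefix):
        raise ValueError("file is outside the source prefix")
    return dest_prefix + src_path[len(src_prefix):]


print(validate_source_dir("/example/src/"))  # []
print(validate_source_dir("example/src"))    # two problems reported
print(destination_key("/example/src/", "example/dest/", "/example/src/example.jpg"))
# example/dest/example.jpg
```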

Step 5: Create a destination address

  1. In the navigation pane on the left, choose Data Online Migration > Address Management, and then click Create Address.

  2. In the Create Address panel, configure the following parameters, and then click OK.

    Parameter

    Required

    Description

    Name

    Yes

    Enter a name for the destination address. The name must meet the following requirements:

    • The name must be 3 to 63 characters in length.

    • The name is case-sensitive and can contain only lowercase letters, digits, hyphens (-), and underscores (_).

    • The name cannot start with a hyphen (-) or an underscore (_).

    Type

    Yes

    Select OSS.

    Region

    No

    Select the destination region. For example, China (Hangzhou).

    Authorize role

    Yes

    Bucket

    Yes

    Enter the name of the destination OSS bucket in the current account.

    Agent

    No

    Select one or more agents.

    Important
    • This parameter is required only when you migrate data from self-managed storage to the cloud, or when you migrate data over a dedicated connection or VPN.

    • You can select up to 200 agents for a specified tunnel.
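The naming rules stated for data addresses (3 to 63 characters, only lowercase letters, digits, hyphens, and underscores, and no leading hyphen or underscore) can be expressed as one regular expression. This is a local pre-check sketch with a hypothetical function name, not the validation that the console performs.

```python
import re

# First character: lowercase letter or digit.
# Remaining 2 to 62 characters: lowercase letters, digits, "-" or "_".
NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9_-]{2,62}")


def is_valid_name(name):
    """Return True if the name satisfies the stated naming rules."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("oss-dest_01"))  # True
print(is_valid_name("-oss-dest"))    # False: starts with a hyphen
print(is_valid_name("ab"))           # False: shorter than 3 characters
print(is_valid_name("OssDest"))      # False: uppercase letters
```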

Step 6: Create a migration task

  1. In the navigation pane on the left, choose Data Online Migration > Migration Tasks, and then click Create Task.

  2. On the Select Address page, configure the following parameters, and then click Next.

    Parameter

    Required

    Description

    Name

    Yes

    Enter a name for the migration task. The name must meet the following requirements:

    • The name must be 3 to 63 characters in length.

    • The name is case-sensitive and can contain only lowercase letters, digits, hyphens (-), and underscores (_).

    • The name cannot start with a hyphen (-) or an underscore (_).

    Source Address

    Yes

    Select a previously created source address.

    Destination Address

    Yes

    Select a previously created destination address.

  3. On the Task Configurations page, configure the following parameters.

    Parameter

    Required

    Description

    Basic configurations

    Migration Bandwidth

    No

    Select the migration bandwidth.

    • Default: Uses the maximum available bandwidth. The actual migration speed depends on the file size and the number of files.

    • Specify an upper limit: Specify a bandwidth cap as prompted on the console.

    Important
    • The actual migration bandwidth is affected by factors such as the data source, network conditions, destination-side throttling, and file sizes. The bandwidth may not reach the specified upper limit.

    • Evaluate your data source, destination, business workloads, and network bandwidth to select a reasonable value. Improper throttling may affect your business operations.

    Files Migrated Per Second

    No

    Select the number of files to migrate per second.

    • Default: The default number of files migrated per second.

    • Specify an upper limit: Specify an upper limit as prompted on the console.

    Important
    • The actual migration rate is affected by factors such as the data source, network conditions, destination-side throttling, and file sizes. The rate may not reach the specified upper limit.

    • Evaluate your data source, destination, business workloads, and network bandwidth to select a reasonable value. Improper throttling may affect your business operations.

    Overwrite Mode

    No

    Select how to handle files with the same name at the destination.

    • Do not overwrite: Skips migrating the file.

    • Overwrite All: The source file overwrites the destination file.

    • Overwrite based on the last modification time:

      • The destination file is overwritten if the source file's last modified time is later.

      • If the last modified times are the same, the destination file is overwritten if its Size or Content-Type differs.

    Warning
    • The Overwrite based on the last modification time policy does not guarantee that an older file will not overwrite a newer one.

    • If you select Overwrite based on the last modification time, ensure your source data can return metadata such as last modified time, Size, and Content-Type. Otherwise, the overwrite policy may not work as expected and can lead to unintended migration results.

    • If you select Do not overwrite or Overwrite based on the last modification time, the service requests object metadata from both the source and destination to perform the comparison. This incurs request fees on both the source and destination.
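The three overwrite modes can be summarized as a decision function. The mode strings and the `FileMeta` type below are hypothetical names chosen for this sketch. The case where the source file is older than the destination file is not spelled out in the text, so the sketch assumes that the file is skipped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileMeta:
    mtime: int          # last modified time, in seconds
    size: int           # size in bytes
    content_type: str


def should_write(mode: str, src: FileMeta, dest: Optional[FileMeta]) -> bool:
    """Decide whether the source file is written to the destination."""
    if dest is None:
        return True     # no file with the same name exists at the destination
    if mode == "do_not_overwrite":
        return False
    if mode == "overwrite_all":
        return True
    if mode == "by_last_modified":
        if src.mtime > dest.mtime:
            return True
        if src.mtime == dest.mtime:
            return src.size != dest.size or src.content_type != dest.content_type
        return False    # assumption: an older source file is skipped
    raise ValueError(f"unknown mode: {mode}")


src = FileMeta(mtime=200, size=10, content_type="image/jpeg")
same_time = FileMeta(mtime=200, size=12, content_type="image/jpeg")
print(should_write("by_last_modified", src, same_time))  # True: sizes differ
print(should_write("do_not_overwrite", src, same_time))  # False
```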

    Auditing method

    Migration Report

    Yes

    Specifies how to deliver the migration report.

    • Do not push (Default): The migration report is not pushed to the destination bucket.

    • Push: Pushes the migration report to the destination bucket. For the detailed path, see Next steps.

    Important
    • Pushing the migration report consumes storage space in the destination bucket.

    • Migration report delivery may be delayed.

    • Each task execution record has a unique ID. The migration report is pushed only once per record. Be cautious when deleting it.

    Migration Logs

    Yes

    Specifies how to deliver the migration log.

    • Do not push (Default): The migration log is not pushed.

    • Push: Pushes the migration log to Log Service. You can view the migration log in Log Service.

    • Push only file error logs: Pushes only logs for file migration errors to Log Service. You can view these error logs in Log Service.

    If you select Push or Push only file error logs, Online Migration Service creates a project in Log Service named aliyun-oss-import-log-{Alibaba Cloud account ID}-{current region}. For example: aliyun-oss-import-log-137918634953****-cn-hangzhou.

    Important

    Ensure that you complete the following actions before selecting Push or Push only file error logs. Otherwise, the migration task may fail.

    • You have activated Log Service.

    • You have granted the required permissions on the Authorize page.

    Authorize

    No

    This option appears only when Migration Logs is set to Push or Push only file error logs.

    Click Authorize to go to the Cloud Resource Access Authorization page. The system creates a role named AliyunOSSImportSlsAuditRole and grants permissions to the role. Click Agree to Authorization to complete the authorization.

    Filter

    File Name

    No

    A filter for filenames.

    You can use Include and Exclude filtering rules based on the RE2 library regular expression syntax (only a subset of expressions is supported). Examples:

    • .*\.jpg$ matches all files ending with .jpg.

    • ^file.* matches all files in the root directory whose names start with file.

      If the source address is configured with a prefix, for example data/to/oss/, you must use ^data/to/oss/file.* to match all files under that prefix whose names start with file.

    • .*/picture/.* matches any file within a subdirectory named picture at any level.

    Important
    • If you configure include rules, any file that matches at least one rule is migrated.

      For example, consider two files: picture.jpg and picture.png. If you add an include rule .*\.jpg$, only picture.jpg is migrated. If you also add an include rule .*\.png$, both files are migrated.

    • If you configure exclude rules, any file that matches at least one rule is not migrated.

      For example, consider two files: picture.jpg and picture.png. If you add an exclude rule .*\.jpg$, only picture.png is migrated. If you also add an exclude rule .*\.png$, neither file is migrated.

    • Exclude rules take precedence over include rules. A file is not migrated if it matches both an exclude rule and an include rule.

      For example, consider the file file.txt. If you configure an exclude rule .*\.txt$ and an include rule file.*, the file file.txt is not migrated.
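The include and exclude rules above can be reproduced locally to preview which files a filter would select. This sketch uses Python's `re` module rather than RE2 and assumes search semantics (a pattern may match anywhere unless it is anchored). The patterns in the examples behave the same way under both libraries, but other patterns may not.

```python
import re


def is_migrated(name, include=(), exclude=()):
    """Apply the include/exclude rules to a single file name."""
    # Exclude rules take precedence over include rules.
    if any(re.search(pattern, name) for pattern in exclude):
        return False
    # With include rules configured, at least one must match.
    if include:
        return any(re.search(pattern, name) for pattern in include)
    return True


files = ["picture.jpg", "picture.png"]
print([f for f in files if is_migrated(f, include=[r".*\.jpg$"])])
# ['picture.jpg']
print([f for f in files if is_migrated(f, exclude=[r".*\.jpg$"])])
# ['picture.png']
print(is_migrated("file.txt", include=[r"file.*"], exclude=[r".*\.txt$"]))
# False
```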

    File Modification Time

    No

    A filter based on the last modified time of files.

    You can specify a time range. Only files with a last modified time within the specified range are migrated. The rules are as follows:

    • If you specify only a start time (for example, January 1, 2019) and no end time, only files last modified on or after January 1, 2019 are migrated.

    • If you specify only an end time (for example, January 1, 2022) and no start time, only files last modified on or before January 1, 2022 are migrated.

    • If you specify a start time of January 1, 2019 and an end time of January 1, 2022, only files last modified on or after January 1, 2019 and on or before January 1, 2022 are migrated.
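The three time-range cases reduce to one inclusive comparison in which either bound may be absent. The following sketch uses a hypothetical function name and the dates from the examples above.

```python
from datetime import datetime


def in_time_range(last_modified, start=None, end=None):
    """Return True if last_modified falls within the optional, inclusive range."""
    if start is not None and last_modified < start:
        return False
    if end is not None and last_modified > end:
        return False
    return True


start = datetime(2019, 1, 1)
end = datetime(2022, 1, 1)

print(in_time_range(datetime(2020, 6, 15), start, end))  # True
print(in_time_range(datetime(2018, 12, 31), start=start))  # False: before the start
print(in_time_range(datetime(2023, 3, 1), end=end))        # False: after the end
```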

    Migrate special entities

    No

    Specifies whether to migrate special types of entities. Select the checkbox to enable migration, and clear it to disable.

    Directory:

    • Enable: Directories at the source address are added to the migration queue and are included in the task's file count and storage volume statistics. Corresponding empty objects ending with a forward slash (/) are created at the destination, and the attributes of the source directory (if supported) are set as user metadata on the destination object.

    • Disable: Directories at the source address are ignored and are not included in the task's file count and storage volume statistics. No corresponding empty objects are created at the destination.

    Symbolic link:

    • Enable: Symbolic links at the source address are added to the migration queue and are included in the task's file count and storage volume statistics. Corresponding symbolic link objects are created at the destination, and the attributes of the source symbolic link (if supported) are set as user metadata on the symbolic link object. The Target attribute of the symbolic link object depends on the Whether to Convert Target setting.

    • Disable: Symbolic links at the source address are ignored and are not included in the task's file count and storage volume statistics.

    Important

    The service does not migrate the target files or directories that symbolic links point to, unless those targets are also within the migration scope.

    Migration configurations

    Convert target

    No

    Converts the Target attribute of a source symbolic link so that the destination symbolic link points to the correct target object. Select the checkbox to enable conversion, and clear it to disable.

    Important
    • This option is effective only when symbolic link migration is enabled.

    • Regardless of this setting, the migration does not check whether the target object exists, whether its type is valid, or whether you have access permissions for it.

    Enable: The service first resolves the Target attribute into the shortest absolute path (AbsTarget), relative to the symbolic link's directory. It then performs a string replacement, replacing the SrcPrefix (if it matches) in the AbsTarget with the DestPrefix. The resulting value is set as the Target attribute of the destination symbolic link object.

    Note

    Example: Assume the migration task is configured with SrcPrefix="/mnt/nas1/" and DestPrefix="cloud_base/". A symbolic link /mnt/nas1/links/a.lnk exists at the source. Consider the following Target attributes:

    • If the Target is "../data/./a.txt", the resolved shortest absolute path is "/mnt/nas1/data/a.txt". The final replaced Target value becomes "cloud_base/data/a.txt".

    • If the Target is "/mnt/nas1/verbose/../data/./a.txt", the resolved shortest absolute path is "/mnt/nas1/data/a.txt". The final replaced Target value becomes "cloud_base/data/a.txt".

    • If the Target is "/root/outer/../data/./a.txt", the resolved shortest absolute path is "/root/data/a.txt". No prefix match is found, so the final Target value remains "/root/data/a.txt".

    Disable: No conversion is performed. The original Target attribute value of the source symbolic link is set on the destination symbolic link object.
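The conversion performed when this option is enabled can be sketched with `posixpath` from the Python standard library: resolve the Target against the link's directory, normalize it to the shortest absolute path, then swap the source prefix for the destination prefix. The function name is hypothetical, and the sketch mirrors the three examples above rather than the service's actual implementation.

```python
import posixpath


def convert_target(link_path, target, src_prefix, dest_prefix):
    """Return the Target value for the destination symbolic link object."""
    link_dir = posixpath.dirname(link_path)
    # join() keeps an absolute target as is; normpath() removes "." and "..".
    abs_target = posixpath.normpath(posixpath.join(link_dir, target))
    if abs_target.startswith(src_prefix):
        return dest_prefix + abs_target[len(src_prefix):]
    return abs_target  # no prefix match: the absolute path is kept


link = "/mnt/nas1/links/a.lnk"
for target in ("../data/./a.txt",
               "/mnt/nas1/verbose/../data/./a.txt",
               "/root/outer/../data/./a.txt"):
    print(convert_target(link, target, "/mnt/nas1/", "cloud_base/"))
# cloud_base/data/a.txt
# cloud_base/data/a.txt
# /root/data/a.txt
```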

    Preserve last modified time

    Yes

    Specifies whether to preserve the last modified time of the source file.

    • Preserve (Default): The last modified time of the source file is set on the destination object.

    • Do not preserve: The last modified time is not set.

    Destination storage class

    No

    Specifies whether to set a storage class for the destination objects.

    • Specify: Migrated objects are set to the specified storage class. The following storage classes are available:

      • Standard

      • Infrequent Access

      • Archive

      • Cold Archive

      • Deep Cold Archive

    • Not specified (default): No storage class is set. Migrated objects use the default storage class of the destination.

    Important
    • This option is displayed only if your account is added to the allowlist for this feature.

    • This option is currently supported only for tasks where the destination is OSS.

    Task scheduling

    Execution time

    No

    Important
    1. If a task is still running when its next execution is scheduled to start, the current run continues to completion, the scheduled run is skipped, and the task runs again at the following scheduled time.

    2. Concurrent migration task limit: Up to 10 in Chinese mainland and China (Hong Kong) regions, and up to 5 in other regions.

    Specify when to run the migration task.

    • Immediately: Runs the task immediately.

    • At the Specified Time: Sets a daily time window for the task to run. By default, the task starts at the specified start time and pauses at the specified stop time.

    • Periodic Scheduling: Runs the task based on a specified frequency and number of executions.

      • Execution Frequency: Supported frequencies are Hourly, Daily, Weekly, Specific days of the week, and Custom. For details, see Execution frequency.

      • Number of Executions: Specifies the number of times the task runs. If not set, the task runs once by default. For the maximum number of executions, refer to the prompt on the console.

    Important

    You can manually start and pause the task at any time, regardless of the scheduled execution time.

  4. Read the Online Migration Service Agreement, select the checkbox for I have understood and confirmed the compliance commitment statement, and I acknowledge my obligation and responsibility to verify the consistency of migrated data after the migration task is completed, and then click Next.

  5. Review the configuration information. If it is correct, click OK and wait for the migration task to run.
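The skip behavior described for Execution time can be illustrated with the Hourly frequency: a run starts at the first full hour after the previous run has finished, or after the task is created for the first run. The sketch below reproduces the times used in the Hourly example in the Execution frequency section. The function name is hypothetical, and the treatment of a run that ends exactly on the hour is an assumption.

```python
from datetime import datetime, timedelta


def next_hourly_start(moment):
    """Return the first full hour strictly after the given moment."""
    top_of_hour = moment.replace(minute=0, second=0, microsecond=0)
    return top_of_hour + timedelta(hours=1)


created = datetime(2024, 1, 1, 8, 5)
print(next_hourly_start(created))                         # 2024-01-01 09:00:00

# A run that ends at 9:40 is followed by a run at 10:00.
print(next_hourly_start(datetime(2024, 1, 1, 9, 40)))     # 2024-01-01 10:00:00

# A run that overruns to 12:30 skips the missed slots and resumes at 13:00.
print(next_hourly_start(datetime(2024, 1, 1, 12, 30)))    # 2024-01-01 13:00:00
```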

Execution frequency

Execution frequency

Description

Example

Hourly

Run the task once every hour. You can use this option with the maximum number of runs.

The current time is 8:05. The frequency is set to hourly with a maximum of 3 runs. The first run starts at the next hour, 9:00.

  • If a run finishes before the next hour, the second run starts at 10:00. This pattern continues until the specified number of runs is complete.

  • If a run has not finished by the next hour and ends at 12:30, the second run starts at the next hour, 13:00. This pattern continues until the specified number of runs is complete.

Daily

Run the task once a day. You must specify an hour (0-23) for the task to start. You can use this option with the maximum number of runs.

The current time is 8:05. The task is scheduled to run daily at 10:00, with a maximum of 5 runs. The first run starts at 10:00 today.

  • If a run finishes before 10:00 the next day, the second run starts at 10:00 the next day. This pattern continues until the specified number of runs is complete.

  • If a run has not finished by 10:00 the next day and ends at 12:05 the next day, the second run starts at 10:00 on the third day. This pattern continues until the specified number of runs is complete.

Weekly

Run the task once a week. You must specify a day of the week and an hour (0-23) for the task to start. You can use this option with the maximum number of runs.

The current time is Monday, 8:05. The task is scheduled to run every Monday at 10:00, with a maximum of 10 runs. The first run starts at 10:00 today.

  • If a run finishes before 10:00 next Monday, the second run starts at 10:00 next Monday. This pattern continues until the specified number of runs is complete.

  • If a run has not finished by 10:00 next Monday and ends at 12:05 next Monday, the second run starts at 10:00 on the following Monday. This pattern continues until the specified number of runs is complete.

Specific days of the week

Run the task on selected days of the week. You must specify the days and an hour (0-23) for the task to start.

The current time is Wednesday, 8:05. The task is scheduled to run on Mondays, Wednesdays, and Fridays at 10:00. The first run starts at 10:00 today.

  • If a run finishes before 10:00 on Friday, the second run starts at 10:00 on Friday. This pattern continues until the specified number of runs is complete.

  • If a run has not finished by 10:00 on Friday and ends at 12:05 next Monday, the second run starts at 10:00 next Wednesday. This pattern continues until the specified number of runs is complete.

Custom

Use a cron expression to define a custom schedule for the task start time.

Note

A cron expression consists of six space-separated fields that define the execution schedule: second, minute, hour, day of the month, month, and day of the week. The minimum interval is 1 hour.

The following cron expression examples are for reference only. For more options, use a cron expression generator.

  • 0 0 * * * *: Runs the task at the beginning of every hour (0 minutes, 0 seconds).

  • 0 30 0/3 * * ?: Runs the task every 3 hours at 30 minutes past the hour (for example, at 0:30, 3:30, 6:30, 9:30, 12:30, 15:30, 18:30, and 21:30).

  • 0 0 12 * * MON-FRI: Runs the task at 12:00 PM every weekday from Monday to Friday.

  • 0 0 12 1-15 * SAT,SUN: Runs the task at 12:00 PM on weekends (Saturday and Sunday) that fall between the 1st and 15th of the month.

  • 0 30 8 1,15 * *: Runs the task at 8:30 AM on the 1st and 15th of each month.
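To read one of these expressions, it can help to label the six fields by position. The sketch below only splits and names the fields in the order given in the note above. It does not validate field values or enforce the 1-hour minimum interval, and the function name is hypothetical.

```python
CRON_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def label_cron_fields(expression):
    """Split a six-field cron expression and name each field by position."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError("expected six space-separated fields")
    return dict(zip(CRON_FIELDS, parts))


print(label_cron_fields("0 30 8 1,15 * *"))
# {'second': '0', 'minute': '30', 'hour': '8',
#  'day_of_month': '1,15', 'month': '*', 'day_of_week': '*'}
```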

Step 7: Verify data

The Migration Service only transfers data and does not guarantee its consistency or integrity. After a migration task is complete, validate all migrated data to ensure consistency between the source and destination.

Warning

After the migration task is complete, you must verify the migrated data at the destination. You are solely responsible for any data loss and all associated consequences if you delete the source data before confirming the integrity of the destination data.