Data migration


This topic describes the notes, limits, and procedure for data migration.

Usage notes

Note the following when migrating data with Data Online Migration:

  • When creating a source and a destination data address, the Directory To Be Migrated must be an absolute path that starts and ends with a forward slash (/) and contains no environment variables or special characters.

  • When creating a source and a destination data address, ensure the Directory To Be Migrated path exists and is valid.

  • A migration task consumes resources at both the source and destination. To reduce the impact on your workloads, consider configuring throttling for the migration task or running it during off-peak hours.

  • Before a migration task starts, Data Online Migration checks for files at the source and destination. If a file with the same name exists at both the source and destination, and the task is configured to overwrite, the destination file will be overwritten. To prevent data loss when the file contents differ, you must either rename one of the files or back up the destination file.
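The path rules above can be checked before you enter a value in the console. The following is a minimal sketch, not part of the service: the function name is hypothetical, and because this topic does not define "special characters", the allowed character set is an assumption.

```python
import re

# Assumption: "special characters" is not defined in this topic, so this sketch
# allows only letters, digits, ".", "_", "-", and "/" in the path.
_ALLOWED = re.compile(r"^/(?:[A-Za-z0-9._-]+/)*$")


def check_migration_dir(path):
    """Return a list of problems found in a Directory To Be Migrated value."""
    problems = []
    if not path.startswith("/"):
        problems.append("must be an absolute path starting with /")
    if not path.endswith("/"):
        problems.append("must end with /")
    if "$" in path or "%" in path or path.startswith("~"):
        problems.append("must not contain environment variables")
    # Run the character check only when the basic shape is already correct.
    if not problems and not _ALLOWED.match(path):
        problems.append("contains characters outside the assumed safe set")
    return problems


print(check_migration_dir("/mnt/nas1/"))   # no problems reported
print(check_migration_dir("/mnt/$HOME/"))  # reports an environment variable
```

This check covers only the format of the path. Whether the directory exists and is accessible must still be confirmed on the machine that hosts the agent.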

Limitations

  • Character device files, block device files, socket files, and named pipes (FIFO files) in the source data address are not migrated.

  • Hard links in the source data address are migrated as regular files.

  • When directory migration is enabled, directories in the source data address are always migrated to the destination data address, regardless of the migration task's overwrite mode.

  • Special permission bits such as SUID, SGID, and the sticky bit (SBIT) are not preserved.

  • File attributes are handled as follows during data migration between local file systems (LocalFS):

    • Supported attributes: ModifyTime, Permissions, Uid, and Gid.

      Note
      • Permissions: Includes the nine basic read, write, and execute permissions.

      • Uid is the user ID, and Gid is the group ID.

    • Unsupported attributes include: AccessTime, ChangeTime, Attr, and ACL.

      Note

      The behavior of unlisted attributes is not guaranteed. Verify the final attributes after the migration is complete.
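To verify the supported attributes after a migration, you can compare them for a source file and its migrated copy. The following is a minimal sketch that assumes both paths are readable from the same POSIX machine. The function name is hypothetical, and comparing ModifyTime at one-second granularity is an assumption.

```python
import os
import stat


def compare_supported_attrs(src_path, dst_path):
    """Return {attribute: (source_value, destination_value)} for each
    supported attribute that differs between the two files."""
    s = os.lstat(src_path)
    d = os.lstat(dst_path)
    checks = {
        # Assumption: compare modification times at one-second granularity.
        "ModifyTime": (int(s.st_mtime), int(d.st_mtime)),
        # Only the nine rwx bits; SUID, SGID, and the sticky bit are masked out.
        "Permissions": (stat.S_IMODE(s.st_mode) & 0o777,
                        stat.S_IMODE(d.st_mode) & 0o777),
        "Uid": (s.st_uid, d.st_uid),
        "Gid": (s.st_gid, d.st_gid),
    }
    return {name: pair for name, pair in checks.items() if pair[0] != pair[1]}
```

An empty result means that the four supported attributes match. AccessTime, ChangeTime, Attr, and ACL are deliberately not compared because they are listed as unsupported.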

Step 1: Select a region

  1. Log in to the Data Migration console as the RAM user you created.

  2. In the upper-left corner of the top navigation bar, select the region where the agent is located.

    Important
    • Tunnels, agents, data addresses, and migration tasks are region-specific. Choose your region carefully.

    • We recommend selecting the region where your agent is located. If that region is unavailable, select the geographically closest one.

Step 2: Create a tunnel

  1. In the left-side navigation pane, choose Data Online Migration > Channel Management, and then click Create Tunnel.

  2. In the Create Tunnel dialog box, configure the following parameters and click OK.

    Parameter

    Required

    Description

    Name

    Yes

    The name of the tunnel.

    • The name cannot be empty and can be up to 100 characters in length.

    • The name can contain letters, digits, hyphens (-), and underscores (_).

    Maximum Bandwidth

    Yes

    The maximum bandwidth that the tunnel can use.

    • If you do not configure this parameter, the default value 0 is used, which indicates that the bandwidth for the tunnel is not limited.

    • If you configure this parameter, enter a value based on the note in the console.

    Important

    The bandwidth that is available for the tunnel depends on the actual bandwidth of the network connection.

    Requests/s

    Yes

    The maximum number of requests per second over the tunnel.

    • If you do not configure this parameter, the default value 0 is used, which indicates that the number of requests per second over the tunnel is not limited.

    • If you configure this parameter, enter a value based on the note in the console.

    Warning

    Evaluate the capabilities of the source storage system before you configure this parameter. A value that is too high can overload the source and affect your workloads. Enter a value based on the note in the console.

Note

For more information about tunnels, see Tunnel Management.

Step 3: Create an agent

  1. In the left-side navigation pane, choose Data Online Migration > Agent Management. On the Agent Management page, click New Agent.

  2. In the New Agent dialog box, configure the following parameters and click OK.

    Parameter

    Required

    Description

    Name

    Yes

    The agent's name.

    • The name must be 3 to 63 characters long.

    • The name can contain lowercase letters, digits, hyphens (-), and underscores (_). The name is case-sensitive.

    • The name must be UTF-8 encoded and cannot start with a hyphen (-) or an underscore (_).

    Network Type

    Yes

    The network connection type for the agent. The following options are available:

    • VPC (Recommended): The agent connects to the Data Online Migration service over a VPC. This method requires the machine that hosts the agent to access the internal endpoint of Data Online Migration in the corresponding region. For example, if you use Data Online Migration in the China (Beijing) region, the agent machine must be able to access the internal endpoint {TunnelId}.cn-beijing.mgw-tc-internal.aliyuncs.com. We recommend using an ECS instance in the same region as the Data Online Migration console to deploy the agent.

    • Public network: The agent connects to the Data Online Migration service over the public network. This method requires the machine that hosts the agent to access the public endpoint of Data Online Migration in the corresponding region. For example, if you use Data Online Migration in the China (Beijing) region, the agent machine must be able to access the public endpoint {TunnelId}.cn-beijing.mgw-tc.aliyuncs.com.

    Note
    • {TunnelId} is a placeholder for the tunnel ID.

    • You can use the ping command to test the network connectivity between the agent and the Data Online Migration service.

    Deployment Method

    Yes

    The agent's deployment method. Currently, only standalone process deployment is supported.

    Tunnel

    Yes

    The tunnel to associate with the agent. An agent can be associated with only one tunnel. The bandwidth of the agent is limited by the total bandwidth of the tunnel.

    For example, a tunnel named tunnel-1 is configured with a maximum bandwidth of 10 Gbit/s. tunnel-1 is associated with three agents: agent-1, agent-2, and agent-3. The combined bandwidth of these three agents cannot exceed 10 Gbit/s. If agent-1 is allocated 3 Gbit/s of bandwidth, only 7 Gbit/s of bandwidth remains available for agent-2 and agent-3. Plan and allocate your bandwidth carefully.

  3. Generate the deployment script for the agent. For details, see Generate the deployment script for the agent.

Note

For more information about agents, see Agent Management.
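The endpoint that the agent machine must reach depends on the tunnel ID, the region, and the network type. The following is a minimal sketch that builds the host name from the China (Beijing) examples above. It assumes that other regions follow the same pattern, which this topic does not confirm. The function names and the tunnel ID are hypothetical.

```python
import socket


def agent_endpoint(tunnel_id, region_id, network_type):
    """Build the endpoint host name for network_type "vpc" or "public"."""
    # Assumption: every region uses the pattern shown for cn-beijing.
    suffix = {"vpc": "mgw-tc-internal", "public": "mgw-tc"}[network_type]
    return f"{tunnel_id}.{region_id}.{suffix}.aliyuncs.com"


def can_resolve(host):
    """Return True if the host name resolves in DNS from this machine."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


# "tunnel-id-example" is a placeholder, not a real tunnel ID.
print(agent_endpoint("tunnel-id-example", "cn-beijing", "vpc"))
```

A successful DNS lookup is a weaker check than the ping test mentioned above. It shows only that the name resolves, not that the endpoint is reachable.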

Step 4: Create a source data address

  1. In the left-side navigation pane, choose Data Online Migration > Address Management and click Create Address.

  2. In the Create Address panel, configure the following parameters and click OK.

    Parameter

    Required

    Description

    Name

    Yes

    Enter a source name. Requirements:

    • 3 to 63 characters in length.

    • Case-sensitive. Allows only lowercase letters, digits, hyphens (-), and underscores (_).

    • Cannot start with a hyphen (-) or underscore (_).

    Type

    Yes

    Select LocalFS.

    Directory To Be Migrated

    Yes

    Enter the path of the directory to migrate. The path must be absolute, start and end with a forward slash (/), and not contain environment variables or special characters.

    For example, if the source prefix is /example/src/, which contains the file example.jpg, and the destination prefix is /example/dest/, the full path of the migrated file example.jpg is /example/dest/example.jpg.

    Important

    If a data address uses multiple agents, ensure each agent can access the directory. Otherwise, some data may not be migrated.

    Tunnel

    Yes

    Select the tunnel to use.

    Important
    • Required only for self-managed-to-cloud migration, or migration over a dedicated connection or VPN.

    • An agent is required when the destination is a local file system (LocalFS) or when migrating over a dedicated connection for services like Finance Cloud or Apsara Stack.

    Agent

    Yes

    Select one or more agents.

    Important
    • Required only for self-managed-to-cloud migration, or migration over a dedicated connection or VPN.

    • You can select up to 200 agents for a specified tunnel.
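The naming rules for data addresses (the same rules are listed for agent and task names) can be expressed as a single regular expression. The following is a minimal sketch. The function name is hypothetical, and the console remains the authority on which names are accepted.

```python
import re

# 3 to 63 characters; lowercase letters, digits, "-", and "_";
# the first character must be a lowercase letter or a digit.
_NAME = re.compile(r"[a-z0-9][a-z0-9_-]{2,62}")


def is_valid_name(name):
    """Return True if the name satisfies the documented naming rules."""
    return _NAME.fullmatch(name) is not None


print(is_valid_name("src-localfs-01"))  # True
print(is_valid_name("-src"))            # False: starts with a hyphen
print(is_valid_name("Src_01"))          # False: uppercase letter
```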

Step 5: Create a destination address

  1. In the navigation pane on the left, choose Data Online Migration > Address Management, and then click Create Address.

  2. In the Create Address panel, configure the following parameters and click OK.

    Parameter

    Required

    Description

    Name

    Yes

    Enter a destination name. Requirements:

    • 3 to 63 characters in length.

    • Case-sensitive. Allows only lowercase letters, digits, hyphens (-), and underscores (_).

    • Cannot start with a hyphen (-) or underscore (_).

    Type

    Yes

    Select LocalFS.

    Directory To Be Migrated

    Yes

    You can specify a data path prefix to migrate source data to a specific directory.

    You must use an absolute path. The path must start and end with a forward slash (/). Environment variables and special characters are not supported.

    For example, if the source data prefix is /example/src/, which contains the file example.jpg, and the destination data prefix is /example/dest/, the full path of the migrated file example.jpg is /example/dest/example.jpg.

    Important

    If the data address is associated with multiple agents, you must ensure that each agent can access the directory. Otherwise, some files may fail to migrate.

    Tunnel

    Yes

    Select the tunnel to use.

    Important
    • Required only for self-managed-to-cloud migration, or migration over a dedicated connection or VPN.

    • An agent is required when the destination is a local file system (LocalFS) or when migrating over a dedicated connection for services like Finance Cloud or Apsara Stack.

    Agent

    Yes

    Select one or more agents.

    Important
    • Required only for self-managed-to-cloud migration, or migration over a dedicated connection or VPN.

    • You can select up to 200 agents for a specified tunnel.
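The relationship between the source prefix, the destination prefix, and the final file path can be sketched as a simple prefix replacement. The function name is hypothetical, and the assumption that subdirectories keep their relative layout under the destination prefix is not stated explicitly in this topic.

```python
def destination_path(src_file, src_prefix, dest_prefix):
    """Map a source file path to its expected destination path."""
    if not src_file.startswith(src_prefix):
        raise ValueError(f"{src_file} is not under {src_prefix}")
    # Keep the part of the path below the source prefix and attach it
    # to the destination prefix.
    return dest_prefix + src_file[len(src_prefix):]


print(destination_path("/example/src/example.jpg",
                       "/example/src/", "/example/dest/"))
# /example/dest/example.jpg
```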

Step 6: Create a migration task

  1. In the navigation pane on the left, choose Data Online Migration > Migration Tasks, and then click Create Task.

  2. On the Select Address page, configure the following parameters, and then click Next.

    Parameter

    Required

    Description

    Name

    Yes

    Enter a task name. Requirements:

    • 3 to 63 characters in length.

    • Case-sensitive. Allows only lowercase letters, digits, hyphens (-), and underscores (_).

    • Cannot start with a hyphen (-) or underscore (_).

    Source Address

    Yes

    Select a source address created earlier.

    Destination Address

    Yes

    Select a destination address created earlier.

  3. On the Task Configurations page, configure the following parameters.

    Parameter

    Required

    Description

    Basic settings

    Migration Bandwidth

    No

    Bandwidth limit for migration.

    • Default: Uses the maximum available bandwidth. The actual migration speed depends on the file size and the number of files.

    • Specify an upper limit: Specify a bandwidth cap as prompted on the console.

    Important
    • Actual bandwidth depends on data source, file sizes, network conditions, and destination throttling. The limit may not be reached.

    • Evaluate your data source, destination, workloads, and network capacity before setting this value. Improper throttling may affect your business.

    Files Migrated Per Second

    No

    Files migrated per second.

    • Default: The default number of files migrated per second.

    • Specify an upper limit: Specify an upper limit as prompted on the console.

    Important
    • Actual rate depends on data source, file sizes, network conditions, and destination throttling. The limit may not be reached.

    • Evaluate your data source, destination, workloads, and network capacity before setting this value. Improper throttling may affect your business.

    Overwrite Mode

    No

    Behavior when a file with the same name exists at the destination.

    • Do not overwrite: Skips migrating the file.

    • Overwrite All: The source file overwrites the destination file.

    • Overwrite based on the last modification time:

      • The destination file is overwritten if the source file's last modified time is later.

      • If the last modified times are the same, the destination file is overwritten if its Size or Content-Type differs.

    Warning
    • The Overwrite based on the last modification time policy does not guarantee that an older file never overwrites a newer one.

    • If you select Overwrite based on the last modification time, ensure your source data can return metadata such as last modified time, Size, and Content-Type. Otherwise, the overwrite policy may not work as expected and can lead to unintended migration results.

    • If you select Do not overwrite or Overwrite based on the last modification time, the service requests object metadata from both the source and destination to perform the comparison. This incurs request fees on both the source and destination.
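The three overwrite modes can be summarized as a decision function. The following is a minimal sketch of the rules described above, not the service's implementation. The mode strings, the FileMeta type, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileMeta:
    last_modified: float      # seconds since the epoch
    size: int                 # bytes
    content_type: str = ""


def should_write(mode: str, src: FileMeta, dst: Optional[FileMeta]) -> bool:
    """Return True if the source file is written to the destination."""
    if dst is None:
        # No file with the same name exists at the destination.
        return True
    if mode == "no_overwrite":
        return False
    if mode == "overwrite_all":
        return True
    if mode == "by_last_modified":
        if src.last_modified > dst.last_modified:
            return True
        if src.last_modified == dst.last_modified:
            # Same timestamp: overwrite only if Size or Content-Type differs.
            return (src.size != dst.size
                    or src.content_type != dst.content_type)
        return False
    raise ValueError(f"unknown overwrite mode: {mode}")
```

In this sketch, a source file that is older than the destination file is skipped in by_last_modified mode. As the warning above notes, the service does not guarantee that outcome in every case.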

    Auditing

    Migration Report

    Yes

    Specify how the migration report is delivered.

    • Do not push (Default): The migration report is not pushed to the destination local file system (LocalFS).

    • Push: The migration report is pushed to the destination LocalFS. For the detailed path, see Next steps.

    Important
    • Pushing the migration report consumes storage space at the destination.

    • Migration report delivery may be delayed.

    • Each task execution record has a unique ID. The migration report is pushed only once. Exercise caution when deleting it.

    Migration Logs

    Yes

    Migration log delivery method.

    • Do not push (Default): Not pushed.

    • Push: Pushes full migration logs to Log Service.

    • Push only file error logs: Pushes only error logs to Log Service.

    If you select Push or Push only file error logs, the service creates a Log Service project named aliyun-oss-import-log-{Alibaba Cloud account ID}-{region}. Example: aliyun-oss-import-log-137918634953****-cn-hangzhou.

    Important

    Complete the following before selecting Push or Push only file error logs. Otherwise, the migration task may fail.

    • Log Service is activated.

    • Required permissions are granted on the Authorize page.

    Authorize

    No

    This option appears only when Migration Logs is set to Push or Push only file error logs.

    Click Authorize to go to the Cloud Resource Access Authorization page. The system creates a role named AliyunOSSImportSlsAuditRole and grants permissions to the role. Click Agree to Authorization to complete the authorization.

    Filters

    File Name

    No

    A filter for filenames.

    Use Include and Exclude rules based on the RE2 library regular expression syntax (only a subset is supported). Examples:

    • .*\.jpg$ matches all files ending with .jpg.

    • ^file.* matches all files in the root directory whose names start with file.

      If the source address is configured with a prefix, for example data/to/oss/, you must use ^data/to/oss/file.* to match all files under that prefix whose names start with file.

    • .*/picture/.* matches any file within a subdirectory named picture at any level.

    Important
    • If you configure include rules, any file that matches at least one rule is migrated.

      For example, consider two files: picture.jpg and picture.png. If you add an include rule .*\.jpg$, only picture.jpg is migrated. If you also add an include rule .*\.png$, both files are migrated.

    • If you configure exclude rules, any file that matches at least one rule is not migrated.

      For example, consider two files: picture.jpg and picture.png. If you add an exclude rule .*\.jpg$, only picture.png is migrated. If you also add an exclude rule .*\.png$, neither file is migrated.

    • Exclude rules take precedence over include rules. A file is not migrated if it matches both an exclude rule and an include rule.

      For example, consider the file file.txt. If you configure an exclude rule .*\.txt$ and an include rule file.*, the file file.txt is not migrated.
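The interaction between include and exclude rules can be sketched as follows. The sketch uses Python's re module as a stand-in for RE2, which works for the simple patterns shown here but is not the same engine. The function name is hypothetical. Two further assumptions: a rule matches if the pattern is found anywhere in the path, and all files are included when no include rule is configured.

```python
import re


def is_migrated(path, include=(), exclude=()):
    """Return True if the file passes the include and exclude rules."""
    # Exclude rules take precedence over include rules.
    if any(re.search(pattern, path) for pattern in exclude):
        return False
    if include:
        # At least one include rule must match.
        return any(re.search(pattern, path) for pattern in include)
    # Assumption: with no include rules, every file is included.
    return True


print(is_migrated("picture.jpg", include=[r".*\.jpg$"]))   # True
print(is_migrated("picture.png", include=[r".*\.jpg$"]))   # False
print(is_migrated("file.txt", include=[r"file.*"],
                  exclude=[r".*\.txt$"]))                  # False
```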

    File Modification Time

    No

    Filter by last modified time.

    Specify a time range to migrate only files modified within that range:

    • If you specify only a start time (for example, January 1, 2019) and no end time, only files last modified on or after January 1, 2019 are migrated.

    • If you specify only an end time (for example, January 1, 2022) and no start time, only files last modified on or before January 1, 2022 are migrated.

    • If you specify a start time of January 1, 2019 and an end time of January 1, 2022, only files last modified on or after January 1, 2019 and on or before January 1, 2022 are migrated.
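The time range filter treats both bounds as inclusive, and a missing bound leaves that side open. The following is a minimal sketch. The function name is hypothetical, and comparing exact timestamps rather than calendar dates is an assumption.

```python
from datetime import datetime


def in_time_range(modified, start=None, end=None):
    """Return True if the last modified time falls within the range."""
    if start is not None and modified < start:
        return False
    if end is not None and modified > end:
        return False
    return True


start = datetime(2019, 1, 1)
end = datetime(2022, 1, 1)
print(in_time_range(datetime(2020, 6, 1), start, end))   # True
print(in_time_range(datetime(2018, 12, 31), start))      # False
```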

    Migrate Special Entities

    No

    Specifies whether to migrate special entities such as directories and symlinks.

    Directory:

    • Enabled: All directories scanned at the source address are added to the migration queue. The statistics for the directories are included in the values of the File Count and Storage Volume fields for the migration task. Corresponding directories are created at the destination, and the migratable attributes of the source directories are applied to them.

    • Disabled: All directories scanned at the source address are ignored and not included in the task's File Count or Storage Volume fields. For any scanned source directory:

      • If the directory contains no files to be migrated (it is empty, or all its files are excluded by filters), no corresponding directory is created at the destination.

      • If the directory contains files to be migrated, a corresponding directory is created at the destination with default attributes to serve as a parent directory. However, the migratable attributes of the source directory are not applied.

    Symlink:

    • Enabled: All symlinks at the source address are added to the migration queue and included in the File Count and Storage Volume fields. Corresponding symlinks are created at the destination, and their migratable attributes are applied from the source symlinks. The Target attribute of the destination symlink depends on the Convert Target Path option.

    • Disabled: All symlinks at the source address are ignored and not included in the task's File Count or Storage Volume fields.

    Important

    In any case, the migration service does not migrate the target files or directories that the symlinks reference, unless those targets are also within the scope of the migration.

    Migration configuration

    Convert Target Path

    No

    Converts the Target attribute of source symlinks so that migrated symlinks point to the correct targets.

    Important
    • This option takes effect only when symlink migration is enabled.

    • Regardless of this setting, the service does not validate the symlink target's existence, type, or access permissions.

    Enabled: The format of the symlink's Target attribute is checked.

    • If the Target attribute is a relative path, it is not converted. The original value is used for the destination symlink's Target attribute.

    • If the Target attribute is an absolute path, it is first resolved to the shortest equivalent absolute path (AbsTarget), relative to the symlink's directory. Then, the service replaces the source prefix (SrcPrefix) in AbsTarget with the destination prefix (DestPrefix) if a match is found. The result is used as the Target attribute for the destination symlink.

    Note

    Example: Assume the source prefix (SrcPrefix) is /mnt/nas1/, the destination prefix (DestPrefix) is /mnt/nas2/, and a source symlink exists at /mnt/nas1/links/a.lnk. The behavior changes based on its Target attribute:

    • If the target is ../data/./a.txt, it is not converted. The final Target value remains ../data/./a.txt.

    • If the target is /mnt/nas1/verbose/../data/./a.txt, its shortest absolute path is /mnt/nas1/data/a.txt. After prefix replacement, the final Target value is /mnt/nas2/data/a.txt.

    • If the target is /root/outer/../data/./a.txt, its shortest absolute path is /root/data/a.txt. No prefix match occurs, so the final Target value remains /root/data/a.txt.

    Disabled: The source symlink's Target attribute is used for the destination symlink without conversion.
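The conversion rules in the Note above can be reproduced with a few lines of path handling. The following is a minimal sketch, not the service's implementation. The function name is hypothetical, and "shortest equivalent absolute path" is assumed to mean purely lexical normalization, without following any symlinks.

```python
import posixpath


def convert_target(target, src_prefix, dest_prefix):
    """Return the Target value for the destination symlink when
    Convert Target Path is enabled."""
    if not posixpath.isabs(target):
        # Relative targets are used as-is.
        return target
    # Collapse "." and ".." segments lexically.
    abs_target = posixpath.normpath(target)
    if abs_target.startswith(src_prefix):
        return dest_prefix + abs_target[len(src_prefix):]
    # No prefix match: keep the normalized absolute path.
    return abs_target


src, dest = "/mnt/nas1/", "/mnt/nas2/"
print(convert_target("../data/./a.txt", src, dest))
# ../data/./a.txt
print(convert_target("/mnt/nas1/verbose/../data/./a.txt", src, dest))
# /mnt/nas2/data/a.txt
print(convert_target("/root/outer/../data/./a.txt", src, dest))
# /root/data/a.txt
```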

    Preserve Last Modified Time

    Yes

    Specifies whether to preserve the last modified time of the source file.

    • Preserve (Default): The last modified time of the source file is applied to the destination file.

    • Do not preserve: The last modified time is not applied.

    Task scheduling

    Execution Time

    No

    Important
    1. If a task is still running when its next execution is scheduled, the current run continues to completion, the overlapping scheduled run is skipped, and the task runs again at the next scheduled time.

    2. Concurrent migration task limit: Up to 10 in Chinese mainland and China (Hong Kong) regions, and up to 5 in other regions.

    Specify when to run the task.

    • Immediately: Runs the task immediately.

    • At the Specified Time: Sets a daily time window for the task to run. By default, the task starts at the specified start time and pauses at the specified stop time.

    • Periodic Scheduling: Runs the task based on a specified frequency and number of executions.

      • Execution Frequency: Supported frequencies are Hourly, Daily, Weekly, Specific days of the week, and Custom. For details, see Execution frequency.

      • Number of Executions: Specifies the number of times the task runs. If not set, the task runs once by default. For the maximum number of executions, refer to the prompt on the console.

    Important

    You can manually start or pause the task at any time, regardless of the scheduled execution time.

  4. Read the Online Migration Service Agreement, select the checkbox for I have understood and confirmed the compliance commitment statement, and I acknowledge my obligation and responsibility to verify the consistency of migrated data after the migration task is completed, and then click Next.

  5. Review the configuration information. If it is correct, click OK and wait for the migration task to run.

Execution frequency

Execution frequency

Description

Example

Hourly

Run the task once every hour. You can use this option with the maximum number of runs.

The current time is 8:05. The frequency is set to hourly with a maximum of 3 runs. The first run starts at the next hour, 9:00.

  • If a run finishes before the next hour, the second run starts at 10:00. This pattern continues until the specified number of runs is complete.

  • If a run has not finished by the next hour and ends at 12:30, the second run starts at the next hour, 13:00. This pattern continues until the specified number of runs is complete.

Daily

Run the task once a day. You must specify an hour (0-23) for the task to start. You can use this option with the maximum number of runs.

The current time is 8:05. The task is scheduled to run daily at 10:00, with a maximum of 5 runs. The first run starts at 10:00 today.

  • If a run finishes before 10:00 the next day, the second run starts at 10:00 the next day. This pattern continues until the specified number of runs is complete.

  • If a run has not finished by 10:00 the next day and ends at 12:05 the next day, the second run starts at 10:00 on the third day. This pattern continues until the specified number of runs is complete.

Weekly

Run the task once a week. You must specify a day of the week and an hour (0-23) for the task to start. You can use this option with the maximum number of runs.

The current time is Monday, 8:05. The task is scheduled to run every Monday at 10:00, with a maximum of 10 runs. The first run starts at 10:00 today.

  • If a run finishes before 10:00 next Monday, the second run starts at 10:00 next Monday. This pattern continues until the specified number of runs is complete.

  • If a run has not finished by 10:00 next Monday and ends at 12:05 next Monday, the second run starts at 10:00 on the following Monday. This pattern continues until the specified number of runs is complete.

Specific days of the week

Run the task on selected days of the week. You must specify the days and an hour (0-23) for the task to start.

The current time is Wednesday, 8:05. The task is scheduled to run on Mondays, Wednesdays, and Fridays at 10:00. The first run starts at 10:00 today.

  • If a run finishes before 10:00 on Friday, the second run starts at 10:00 on Friday. This pattern continues until the specified number of runs is complete.

  • If a run has not finished by 10:00 on Friday and ends at 12:05 next Monday, the second run starts at 10:00 next Wednesday. This pattern continues until the specified number of runs is complete.

Custom

Use a cron expression to define a custom schedule for the task start time.

Note

A cron expression consists of six space-separated fields that define the execution schedule: second, minute, hour, day of the month, month, and day of the week. The minimum interval is 1 hour.

The following cron expression examples are for reference only. For more options, use a cron expression generator.

  • 0 0 * * * *: Runs the task at the beginning of every hour (0 minutes, 0 seconds).

  • 0 30 0/3 * * ?: Runs the task every 3 hours at 30 minutes past the hour (for example, at 0:30, 3:30, 6:30, 9:30, 12:30, 15:30, 18:30, and 21:30).

  • 0 0 12 * * MON-FRI: Runs the task at 12:00 PM every weekday from Monday to Friday.

  • 0 0 12 1-15 * SAT,SUN: Runs the task at 12:00 PM on weekends (Saturday and Sunday) that fall between the 1st and 15th of the month.

  • 0 30 8 1,15 * *: Runs the task at 8:30 AM on the 1st and 15th of each month.
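The skip behavior described for the Hourly, Daily, and Weekly frequencies can be simulated for fixed intervals. The following is a minimal sketch with a hypothetical function name. It does not cover Specific days of the week or cron expressions, and it assumes that a run ending exactly on a scheduled time does not block that slot.

```python
import math
from datetime import datetime, timedelta


def run_start_times(first_start, interval, durations):
    """Return the start time of each run, skipping scheduled slots
    that fall while the previous run is still in progress."""
    starts = []
    start = first_start
    for duration in durations:
        starts.append(start)
        # Advance by at least one interval, and by enough whole intervals
        # that the next slot is not earlier than the end of this run.
        steps = max(1, math.ceil(duration / interval))
        start = start + steps * interval
    return starts


# Hourly example from the table: the first run starts at 9:00 and ends at 12:30,
# so the second run starts at 13:00. The calendar date is arbitrary.
first = datetime(2024, 1, 1, 9, 0)
runs = run_start_times(first, timedelta(hours=1),
                       [timedelta(hours=3, minutes=30),
                        timedelta(minutes=20),
                        timedelta(minutes=20)])
print([t.strftime("%H:%M") for t in runs])
# ['09:00', '13:00', '14:00']
```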

Step 7: Verify data

Data Online Migration transfers data but does not guarantee its consistency or integrity. After the migration task completes, verify the migrated data yourself.

Warning

After migration completes, verify destination data integrity. You are solely responsible for any data loss if you delete source data before confirming destination data integrity.
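One way to verify the migrated data is to compare checksums of the regular files in the source and destination directories. The following is a minimal sketch that assumes both directories are mounted on the same machine. The function names are hypothetical, the choice of SHA-256 is an assumption, and the sketch ignores any filters configured for the task.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(src_root, dst_root):
    """Return (relative_path, problem) pairs for regular files that are
    missing at the destination or whose content differs."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            # Skip symlinks and special files such as sockets and FIFOs.
            if os.path.islink(src) or not os.path.isfile(src):
                continue
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.isfile(dst):
                problems.append((rel, "missing"))
            elif file_digest(src) != file_digest(dst):
                problems.append((rel, "content differs"))
    return sorted(problems)
```

An empty list means that every regular file under the source directory has a destination copy with identical content. Delete source data only after such a check passes.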