Migrate data with rsync

This topic describes how to use the rsync tool to migrate data between Apsara File Storage NAS file systems that use the NFS protocol.

Prerequisites

You have an NFS file system that contains data and a mount target in a VPC.

Billing

Migrating data between NAS file systems involves the following costs:

  • You are charged for the ECS instance that is used as a data transfer node based on its configuration. For more information about ECS billing, see Billing method overview.

  • You are charged for the storage usage of both NAS file systems. We recommend that you purchase a resource plan to offset the fees. For more information about NAS billing, see Billing overview.

  • If you use CEN to connect VPCs, you are charged for the transit routers and inter-region connections. For more information about CEN billing, see Billing.

Before you begin

To migrate data between NAS file systems, the ECS instance must be able to access both the source and destination file systems. Therefore, you must ensure that both file systems are accessible from the same VPC.

  1. View the mount target information of the source file system.

    Before you migrate data, record the mount target and VPC information of the source file system. For more information, see Manage mount targets.

    Note

    If your file system has only a classic network mount target, you must create a mount target in a VPC. For more information, see Add a mount target.

  2. Configure the mount target of the destination file system.

    • The source and destination file systems are in the same region

      • If the mount targets of the source and destination file systems are in the same VPC, obtain the destination mount target information and then proceed to the Procedure section to migrate data.

      • If the mount targets of the source and destination file systems are in different VPCs, prepare the mount target of the destination file system by using one of the following methods:

        • Create a new file system in the destination region and zone. A new mount target is automatically created. For more information, see Create a General-purpose NAS file system in the console.

          Note

          When you create a pay-as-you-go General-purpose NAS file system (Capacity, Performance, or Premium) that uses the NFS protocol, select the same VPC and vSwitch as the source mount target. A destination mount target is automatically created. After the new file system is created, you can purchase a resource plan to reduce costs.

        • Create a new mount target on an existing file system. For more information, see Add a mount target.

        • Use CEN to connect the VPCs of the source and destination mount targets. For more information, see Mount a NAS file system across VPCs in the same region by using CEN.

    • The source and destination file systems are in different accounts or regions

      If the mount targets of the source and destination file systems are in different accounts or regions, you must use CEN to connect their VPCs. For more information, see Mount a NAS file system across accounts and regions by using CEN.

Procedure

After you prepare the source and destination mount targets, create an ECS instance, mount both NFS file systems to it, and then use the rsync tool to copy the data.

  1. Mount the source and destination file systems.

    Important

    We recommend that you purchase a new, temporary ECS instance for the migration. If you use an existing ECS instance, the migration process may compete for CPU and network bandwidth resources with your running services.

    Log on to the ECS console, click Create Instance, and then configure the following key parameters.

    • Region: Select the region where the source file system resides.

    • Network and Zone: Select the VPC, zone, and vSwitch of the source file system.

    • Instance Type: A low-cost instance type is usually sufficient.

    • Image: Select CentOS 7.6.

    • Storage: Click the Elastic Ephemeral Disk | Apsara File Storage NAS | Dedicated Block Storage Cluster (Optional) section, and then click Add File System. For more information, see the following configuration details.

      Note
      • If both the source and destination mount targets are in the same VPC, you can configure the NAS mount information on the ECS purchase page. After the ECS instance starts, it automatically mounts both the source and destination NAS file systems.

      • If the source and destination mount targets are not in the same VPC, or are in different regions or accounts, configure only the source file system on the ECS purchase page. After you create the ECS instance, manually mount the destination file system. For more information, see Mount an NFS file system.

    After the ECS instance is created and both NAS file systems are mounted, run the following command to verify the mounts:

    mount | grep nas.aliyuncs.com

    If the mounts succeed, the output resembles the following. In this example, the source file system is mounted to the /mnt/volumeA directory and the destination file system is mounted to the /mnt/volumeB directory.

    [root@xxx ~]# mount | grep nas.aliyuncs.com
    xxx.nas.aliyuncs.com:/ on /mnt/volumeA type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,noresvport,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.198,mountvers=3,mountport=4002,mountproto=tcp,local_lock=all,addr=192.168.0.198,_netdev)
    xxx.nas.aliyuncs.com:/ on /mnt/volumeB type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,noresvport,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.0.213,mountvers=3,mountport=2049,mountproto=tcp,local_lock=all,addr=192.168.0.213,_netdev)

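
The same check can be scripted so that a migration script stops early when a mount is missing. The following is a minimal sketch, not part of any NAS tooling: check_nfs_mount is a hypothetical helper name, and it assumes the mount table uses the /proc/mounts field layout (device, mount point, file system type, options).

```shell
# Sketch: succeed only if the given mount point is listed as an NFS mount.
# check_nfs_mount is a hypothetical helper name.
#   $1: mount point, for example /mnt/volumeA
#   $2: mount table to read (optional, defaults to /proc/mounts)
check_nfs_mount() (
    mount_point=$1
    table=${2:-/proc/mounts}
    # Field 2 is the mount point and field 3 is the file system type (nfs, nfs4).
    awk -v mp="$mount_point" '
        $2 == mp && $3 ~ /^nfs/ { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$table"
)

# Usage (assumes the mount directories used in this topic):
#   check_nfs_mount /mnt/volumeA && check_nfs_mount /mnt/volumeB || echo "mount missing"
```

The second argument exists only so that the function can be tried against a saved copy of a mount table.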
  2. Install migration tools.

    Run the following command to install the migration tools:

    sudo yum install -y rsync tmux
    Note
    • rsync performs the data copy.

    • tmux is a terminal multiplexer that allows you to run the rsync process in a persistent session. This helps you monitor progress and prevents the process from terminating if your SSH connection is interrupted. For more information, see the tmux user guide.

  3. Migrate existing data.

    Run the following commands to synchronize existing data from the source file system to the destination file system:

    tmux
    sudo rsync -avP /mnt/volumeA/ /mnt/volumeB/

    You can also run multiple rsync processes in parallel. The first rsync command recreates only the directory tree on the destination, and the second stage copies the files with the specified number of parallel rsync processes:

    threads=<number of parallel rsync processes>
    src=<source path/>
    dest=<destination path/>
    rsync -av -f"+ */" -f"- *" "$src" "$dest" && (cd "$src" && find . -type f -print0 | xargs -0 -P"$threads" -I% rsync -av % "$dest"/%)

    For example, if the number of parallel processes is 10, the source path is /abc, and the destination path is /mnt1:

    threads=10
    src=/abc/
    dest=/mnt1/
    rsync -av -f"+ */" -f"- *" "$src" "$dest" && (cd "$src" && find . -type f -print0 | xargs -0 -P"$threads" -I% rsync -av % "$dest"/%)
    Note
    • The source path in the rsync command must end with a trailing slash (/). Otherwise, the data paths will not match after synchronization.

    • The tmux command creates a new tmux session. Running rsync in a tmux session helps you monitor progress. If your connection to the ECS instance drops during migration, log on again and run tmux attach to resume the tmux session and continue monitoring.

    • In a test environment, migrating a source file system containing one million 100 KiB files (100 GiB in total) took approximately 320 minutes.
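
Because migration time depends heavily on the number of files, it can help to take an inventory of the source file system before you start. The following is a minimal sketch: tree_inventory is a hypothetical helper name, and it assumes GNU find (for the -printf action), which is what CentOS provides.

```shell
# Sketch: print the number of regular files and their total size in bytes.
# tree_inventory is a hypothetical helper name. Requires GNU find (-printf).
#   $1: directory to scan, for example /mnt/volumeA
tree_inventory() (
    # %s prints each file size in bytes, one per line; awk sums the sizes.
    find "$1" -type f -printf '%s\n' |
        awk '{ bytes += $1; files++ } END { printf "files=%d bytes=%.0f\n", files, bytes }'
)

# Usage:
#   tree_inventory /mnt/volumeA
```

The scan itself walks the whole tree over NFS, so it can take a while on a file system with many small files.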

  4. Migrate incremental data.

    If applications write to the source file system during the initial data migration, you must perform another synchronization to copy the incremental data.

    1. Stop your applications.

      To prevent applications from writing new data, stop all applications on ECS instances and containers that use the source file system before you synchronize the incremental data.

      Important
      • After you stop the applications, do not manually delete any data from the source file system. The next step runs rsync with the --delete option, which would then remove the same files from the destination file system.

      • We recommend that you perform this operation during off-peak hours. You can use the fuser -mv <dir> command to find the process IDs (PIDs) that are accessing the NAS file system.

    2. Synchronize incremental data.

      Run the following rsync command to synchronize incremental data generated after the initial migration started:

      sudo rsync -avP --delete /mnt/volumeA/ /mnt/volumeB/

      The rsync command first scans the source path, so the process may take a long time even if there is little incremental data.

      Warning

      The --delete option deletes files from the destination file system that no longer exist on the source file system. Use this option with caution to avoid accidental data deletion.
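
Before you run the command with --delete, you may want to preview which destination paths have no counterpart on the source. Adding the -n (dry run) option to the rsync command is the most direct preview. The following sketch is an independent cross-check that uses only find, sort, and comm. list_dest_only is a hypothetical helper name, and the sketch does not handle file names that contain newline characters.

```shell
# Sketch: list paths that exist under the destination but not under the source.
# list_dest_only is a hypothetical helper name.
#   $1: source directory, $2: destination directory
list_dest_only() (
    src=$1
    dest=$2
    tmp=$(mktemp -d) || exit 1
    # Build sorted lists of relative paths for both trees.
    (cd "$src" && find . -mindepth 1 | LC_ALL=C sort) > "$tmp/src.list"
    (cd "$dest" && find . -mindepth 1 | LC_ALL=C sort) > "$tmp/dest.list"
    # comm -13 prints only the lines that are unique to the second file.
    LC_ALL=C comm -13 "$tmp/src.list" "$tmp/dest.list"
    rm -rf "$tmp"
)

# Usage:
#   list_dest_only /mnt/volumeA /mnt/volumeB
```

An empty result suggests that --delete would not remove anything. Any listed path is worth reviewing before you proceed.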

  5. Verify the migration.

    After the migration is complete, run the following rsync command to check whether the destination file system is consistent with the source file system:

    sudo rsync -rvn /mnt/volumeA/ /mnt/volumeB/

    If the data is consistent, the output lists no file paths and resembles the following:

    sending incremental file list
    sent 13,570,658 bytes  received 5,008 bytes  17,173.52 bytes/sec
    total size is 100,000,000,000  speedup is 7,366.12 (DRY RUN)
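
By default, rsync decides whether a file differs by comparing its size and modification time, so the dry run above does not read file contents. If you need a content-level check, one option is to compare SHA-256 checksums of both trees. The following is a minimal sketch under stated assumptions: tree_manifest and trees_match are hypothetical helper names, and GNU coreutils and findutils are assumed (sha256sum, sort -z, xargs -0 -r). It reads every byte of both file systems, so expect it to be slow on large data sets.

```shell
# Sketch: compare two directory trees by file content.
# tree_manifest and trees_match are hypothetical helper names.

# Print "checksum  relative/path" for every regular file under $1, in stable order.
tree_manifest() (
    cd "$1" || exit 1
    # NUL-delimited names keep paths that contain spaces intact.
    find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum
)

# Succeed only if both trees contain the same files with the same contents.
trees_match() (
    tmp=$(mktemp -d) || exit 1
    tree_manifest "$1" > "$tmp/a" &&
        tree_manifest "$2" > "$tmp/b" &&
        cmp -s "$tmp/a" "$tmp/b"
    status=$?
    rm -rf "$tmp"
    exit "$status"
)

# Usage:
#   trees_match /mnt/volumeA /mnt/volumeB && echo "contents match"
```

This check covers file contents and relative paths only. It does not compare permissions, ownership, or timestamps.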

Cut over to the new file system

After the data migration is complete, switch your applications to the new file system by unmounting the old file system and mounting the new one on all relevant ECS instances and containers.

  • For applications running on ECS instances:

    1. Run mount | grep nas.aliyuncs.com to view the current NAS mount information. Note the local path <dir> where the file system is mounted.

    2. Run fuser -mv <dir> to find the PIDs of the processes that are accessing the file system, and then stop them with the kill -9 command.

    3. Run umount <dir> to unmount the old file system.

    4. Mount the new file system to the original <dir> path. For more information about mount parameters, see Mount an NFS file system.

    5. Restart the application processes and verify that they can read from and write to the new file system.

    6. Modify the auto-mount configuration in the /etc/fstab file to replace the old mount target with the new one.

  • For applications running in Kubernetes containers:

    1. Modify your existing dynamic volume or persistent volume YAML configuration file to replace the old mount target with the new one.

    2. Use the modified configuration file to create new pods. Verify that the new file system is mounted correctly and is readable and writable.

    3. Terminate all pods that use the old file system.
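
For the /etc/fstab change in the ECS steps above, the replacement can be previewed before you touch the real file. The following is a minimal sketch: swap_mount_target is a hypothetical helper name, the mount target names in the usage comment are placeholders, and the sketch assumes that each NAS entry starts at the beginning of the line in the form <mount target>:<path>.

```shell
# Sketch: print an fstab-style file with one mount target replaced by another.
# swap_mount_target is a hypothetical helper name. The input file is not modified.
#   $1: fstab file, $2: old mount target, $3: new mount target
swap_mount_target() (
    file=$1 old=$2 new=$3
    # Match "<old>:" literally at the start of a line, so dots in the
    # host name are not treated as regular expression wildcards.
    awk -v old="$old:" -v new="$new:" '
        index($0, old) == 1 { $0 = new substr($0, length(old) + 1) }
        { print }
    ' "$file"
)

# Usage (placeholder mount target names):
#   swap_mount_target /etc/fstab old-xxx.nas.aliyuncs.com new-xxx.nas.aliyuncs.com > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
```

Writing the result to a separate file and reviewing the diff keeps the original /etc/fstab untouched until you are satisfied with the change.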

Important

After you cut over to the new file system, keep the data in the old file system for at least one week. Deleting it immediately leaves you without a fallback copy if files were accidentally deleted or a synchronization error went unnoticed.

References