After deploying a MongoDB-compatible PolarDB for PostgreSQL Lightweight Edition cluster, you need to migrate data from your existing MongoDB database. This topic covers two methods—online hot migration and offline restoration—to help you choose the best one for your scenario.
Choose a migration method
PolarDB provides the dsync tool for online migration and the mongorestore tool for offline restoration. The following table compares the two methods to help you choose the best migration plan based on your needs for downtime, data volume, and complexity.
| Comparison item | Dsync online migration | Mongorestore offline restoration |
| --- | --- | --- |
| Migration type | Online hot migration (full + incremental synchronization) | Offline cold migration (point-in-time recovery based on a backup) |
| Business downtime | Minutes. Business writes are interrupted only for a short time during the final cutover. | Hours or longer. The downtime is the total time required for backup and restoration. |
| Data consistency | High. Data can be synchronized up to the last second before the cutover, with no data loss. | Point-in-time consistency. After restoration, data generated after the backup was created is lost. |
| Core advantage | Minimizes business disruption, making it ideal for smooth migration of production systems. | Simple and straightforward, suitable for development and test environments, or scenarios where extended downtime is acceptable. |
| Source database requirements | Requires an active-passive architecture. | No special requirements. The |
| Recommended scenarios | | |
For production systems that require high business continuity, online migration using dsync is the recommended method. For all other scenarios, mongorestore provides a simpler, more direct alternative.
Online migration with dsync
The dsync tool performs both full and incremental data synchronization from a source MongoDB database to a destination PolarDB for PostgreSQL Lightweight Edition cluster. This automated process minimizes business downtime.
Prerequisites
Before you begin, ensure that your environment meets the following requirements:
- Source MongoDB instance
  - The instance must use an active-passive architecture (a replica set).
  - The connection endpoint for data synchronization must be the primary node. To confirm this, connect to the source instance using mongosh and run the rs.status() command. In the output, verify that the stateStr field for the connected node is PRIMARY.
  - Prepare a database account with root or readAnyDatabase permissions to allow dsync to read data.
- Destination PolarDB for PostgreSQL Lightweight Edition cluster
  - A MongoDB-compatible cluster must be deployed. If you have not deployed a cluster, see Installation and Deployment (MongoDB-Compatible).
  - Prepare the MongoDB protocol connection string and a high-privilege account for the destination cluster, such as the admin account created during installation.
- Install the dsync tool
  - To obtain the RPM package for the dsync tool, submit a ticket.
  - After you decompress the package, run the following command with root permissions to install it:
      sudo rpm -ivh t-polardb-pg-dsync-xxx.an8.x86_64.rpm
  - Run the following commands to verify the installation. By default, dsync is installed in the /u01/dsync/ directory.
      cd /u01/dsync/
      ./dsync --version
    If a version number is returned, the installation is successful. For example:
      dsync version 0.15-beta (git commit dd1c8xxxx)
      dsync exited successfully
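The primary-node check described in the prerequisites can be scripted. The following is a minimal sketch, assuming mongosh is on the PATH and the account is allowed to run rs.status(). The function name is_primary is hypothetical and is not part of dsync or mongosh.

```shell
# Sketch: confirm that a MongoDB URI is connected to the replica set primary.
# Assumptions: mongosh is installed; is_primary is a hypothetical helper name.

is_primary() {
  local uri state
  uri="$1"
  # rs.status().members lists every member of the replica set; the entry
  # with self == true describes the node this connection is attached to.
  state="$(mongosh "$uri" --quiet --eval \
    'print(rs.status().members.find(m => m.self).stateStr)')" || return 2
  [ "$state" = "PRIMARY" ]
}

# Usage (not executed here):
#   is_primary "$MDB_SRC" && echo "connected to the primary node"
```

The function returns 0 only when the reported state is exactly PRIMARY, 1 for any other state, and 2 if mongosh itself fails.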
Procedure
Step 1: Optimize destination cluster configuration
To improve the performance of the full data import, adjust the write batch size of the destination cluster before migration.
- Connect to the destination PolarDB for PostgreSQL Lightweight Edition cluster.
    # Replace localhost with the IP address of your database host.
    PGPASSWORD=postgres /u01/polardb_pg/bin/psql -h localhost -p 1523 -U admin -d admin_db
  Note: Check the port parameter in the postgresql.conf file to find the database's PostgreSQL protocol port.
- Run the following SQL commands to modify the parameter and apply the change. Ensure that you run the commands with an account that has superuser permissions, such as the admin account.
    ALTER SYSTEM SET documentdb.maxWriteBatchSize=100000;
    SELECT pg_reload_conf();
- Run the SHOW documentdb.maxWriteBatchSize; command to confirm that the return value is 100000.
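The confirmation in the last step can also be run non-interactively. The following sketch reuses the psql path, port, account, and database from the connection example above. The PSQL, DB_HOST, and DB_PORT variables and the batch_size_is function are hypothetical names introduced for illustration.

```shell
# Sketch: read documentdb.maxWriteBatchSize and compare it with an expected value.
# Assumptions: PSQL, DB_HOST, DB_PORT, and batch_size_is are hypothetical names;
# the password is supplied through PGPASSWORD, as in the connection example.

PSQL="${PSQL:-/u01/polardb_pg/bin/psql}"

batch_size_is() {
  local expected actual
  expected="$1"
  # -A (unaligned) and -t (tuples only) make psql print only the bare value.
  actual="$("$PSQL" -h "${DB_HOST:-localhost}" -p "${DB_PORT:-1523}" \
    -U admin -d admin_db -At -c 'SHOW documentdb.maxWriteBatchSize;')" || return 2
  [ "$actual" = "$expected" ]
}

# Usage (not executed here):
#   PGPASSWORD=<password> batch_size_is 100000 && echo "batch size applied"
```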
Step 2: Start the migration
The dsync tool uses environment variables to configure connection information for the source and destination databases. After starting, it automatically performs full and incremental synchronization.
- Set the environment variables to configure the connection strings (URIs) for the source and destination databases.
    # Set the connection string for the source MongoDB instance.
    export MDB_SRC='mongodb://<src_username>:<src_password>@<src_ip>:<src_port>/<src_db>'
    # Set the connection string for the destination PolarDB instance.
    export FERRETDB_DEST='mongodb://<dest_username>:<dest_password>@<dest_ip>:<dest_port>/<dest_db>'
  Parameter description
  | Parameter | Description |
  | --- | --- |
  | src_username | The source database account, which must have root or readAnyDatabase permissions. |
  | src_password | The password for the source database account. |
  | src_ip | The IP address of the primary node of the source database. |
  | src_port | The port of the source database. |
  | src_db | The name of the source database to migrate. |
  | dest_username | The destination database account. You can use a high-privilege account such as admin. |
  | dest_password | The password for the destination database account. |
  | dest_ip | The IP address of the destination database. |
  | dest_port | The MongoDB protocol port of the destination database. The default is 27030. |
  | dest_db | The name of the target database in the destination cluster. |
  Example
    export MDB_SRC='mongodb://readAnyDatabase:password123@127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.5.0'
    export FERRETDB_DEST='mongodb://test_user:superuser_20250530@127.0.0.1:27030/postgres'
- Start the dsync process. We recommend using the --progress parameter to monitor the progress.
    ./dsync --progress --logfile dsync.log "$MDB_SRC" "$FERRETDB_DEST"
  Progress report fields
  | Field | Description | Example |
  | --- | --- | --- |
  | Namespace | Indicates each collection being migrated. | .ycsb.new_users: Indicates the new_users collection in the ycsb database. |
  | Percent complete | Indicates the completion progress. | 100%: Indicates that the full synchronization is complete. |
  | Tasks completed | Indicates the status of chunked tasks for the full migration. | 5/21: Indicates that 5 of 21 tasks are complete. |
  | Docs synced | Indicates the number of documents synchronized for the collections. | 282132: Indicates that 282,132 documents have been synchronized. |
  | Throughput | Indicates the throughput in documents synchronized per second. | - |
  Note: To prevent an SSH session interruption from stopping the migration, run the dsync command in a terminal multiplexer such as tmux or screen. For example:
    tmux new -s mysession
    export MDB_SRC='mongodb://readAnyDatabase:password123@127.0.0.1:27017/?directConnection=true&serverSelectionTimeoutMS=2000&appName=mongosh+2.5.0'
    export FERRETDB_DEST='mongodb://test_user:superuser_20250530@127.0.0.1:27030/postgres'
    ./dsync --progress --logfile dsync.log "$MDB_SRC" "$FERRETDB_DEST"
  If you created a session by using the tmux new -s mysession command, but the window timed out and closed, you can reattach to it by using the tmux attach -t mysession command.
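MongoDB connection strings require reserved characters in the username and password (for example @, :, and /) to be percent-encoded. Otherwise the URI is parsed incorrectly. The following sketch shows one way to encode a credential before placing it in MDB_SRC or FERRETDB_DEST. The urlencode function is a hypothetical helper, not part of dsync, and it assumes ASCII input.

```shell
# Sketch: percent-encode a credential for use inside a MongoDB URI.
# Assumptions: urlencode is a hypothetical helper name; input is ASCII only.

urlencode() (
  # The function body runs in a subshell, so these variables stay local.
  LC_ALL=C
  s="$1"
  out=""
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character itself
    case "$c" in
      # Unreserved characters are copied through unchanged.
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      # Any other character becomes %XX, where XX is its hexadecimal code.
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s="$rest"
  done
  printf '%s\n' "$out"
)

# Usage with placeholder values (not executed here):
#   export MDB_SRC="mongodb://$(urlencode "$SRC_USER"):$(urlencode "$SRC_PASS")@<src_ip>:<src_port>/<src_db>"
```

For example, the password p@ss:w/rd is encoded as p%40ss%3Aw%2Frd.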
Step 3: Monitor the migration progress
After dsync starts, a real-time progress report appears in the terminal. The migration process has two stages:
- Full Synchronization (InitialSync)
  This stage copies all existing data from the source database. You can view the synchronization progress for each collection.
    Dsync Progress Report : InitialSync
    Time Elapsed: 00:00:28    4/5 Namespaces synced    Docs Synced: 360935
    Namespace                   Percent Complete   Tasks Completed    Docs Synced   Throughput: Docs/s
    .test.books                 100%               1/1                4             0
    .test_db.test_collection    100%               1/1                1             0
    .ycsb.new_users             100%               1/1                3             0
    .ycsb.usertable_dtstest     38%                9/21 (3 active)    360924        2183
    .ycsb.users                 100%               1/1                3             0
    [############################## ] 38.10%    2183.14 docs/sec
    Error Logs
- Incremental Synchronization (Change Stream)
  After the full synchronization is complete, dsync automatically enters the incremental synchronization stage. It continuously captures and applies data changes (inserts, deletes, and updates) from the source database.
    Dsync Progress Report : ChangeStream
    Time Elapsed: 00:04:26    5/5 Namespaces synced
    Processing change stream events
    Change Stream Events- 8614    Deletes Caught- 0    Events to catch up: 1
    [----------------------------------------------------------------------------->>>---] 365.69 events/sec
    Error Logs
Step 4: (Optional) Data validation
Before the cutover, you can perform data validation to ensure that the data in the source and destination databases is identical.
- Stop writes to the source database before performing data validation.
- The validation process scans entire collections and consumes significant resources. As a lightweight alternative, you can first run the db.collection.countDocuments() command to quickly compare the document counts between the two databases.
To perform a full data validation, first stop (Ctrl + C) the running dsync process, and then restart it with the --verify parameter:
# The --verify parameter indicates a full data validation. Note that the validation command must be run separately.
./dsync --verify --progress --logfile dsync.log "$MDB_SRC" "$FERRETDB_DEST"
If the verification results for all collections are OK, this indicates that the data is completely consistent.
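The lightweight count comparison mentioned in this step can be sketched as a small shell helper around mongosh. It assumes mongosh is installed and that MDB_SRC and FERRETDB_DEST are set as in Step 2. The function names count_docs and counts_match are hypothetical. Matching counts do not prove that the document contents are identical.

```shell
# Sketch: compare the document count of one collection on both sides.
# Assumptions: mongosh is installed; MDB_SRC and FERRETDB_DEST are exported;
# count_docs and counts_match are hypothetical helper names.

count_docs() {
  local uri db coll
  uri="$1"; db="$2"; coll="$3"
  # getSiblingDB/getCollection select the namespace explicitly, so the
  # result does not depend on the database named in the URI.
  mongosh "$uri" --quiet --eval \
    "print(db.getSiblingDB('$db').getCollection('$coll').countDocuments({}))"
}

counts_match() {
  local db coll src dst
  db="$1"; coll="$2"
  src="$(count_docs "$MDB_SRC" "$db" "$coll")" || return 2
  dst="$(count_docs "$FERRETDB_DEST" "$db" "$coll")" || return 2
  echo "$db.$coll source=$src destination=$dst"
  [ "$src" = "$dst" ]
}

# Usage (not executed here):
#   counts_match ycsb usertable_dtstest || echo "counts differ"
```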
Step 5: Cutover
When you have confirmed that the data synchronization is complete and validated, you can perform the final cutover.
- Stop application writes to the source MongoDB instance.
- Monitor the dsync incremental synchronization interface. When the Change Stream Events count stops increasing, all changes have been synchronized to the destination database.
- Press Ctrl + C to stop the dsync process.
  Important: Do not shut down the source MongoDB instance until the dsync process has completely exited. Otherwise, the synchronization state may be lost or an error may occur.
- Update your application's configuration to use the connection string for the destination PolarDB for PostgreSQL Lightweight Edition cluster.
- Start your application to complete the migration.
Offline restoration with mongorestore
To restore data from a backup file, use the official MongoDB mongorestore tool. This is an offline operation.
- Have the backup directory created by mongodump ready.
- Run the following command to restore data to the destination PolarDB for PostgreSQL Lightweight Edition cluster.
    mongorestore --uri="mongodb://<user>:<password>@<dest_ip>:<dest_port>/<dest_db>" /path/to/backup_dir/
  Example:
    mongorestore --uri="mongodb://admin:postgres@10.0.0.1:27030/ycsb" /root/ycsb/
  The expected output ends with a summary line similar to the following, where 15 document(s) indicates that 15 documents were imported.
    ...
    2025-08-25T11:05:20.505+0800 15 document(s) restored successfully. 0 document(s) failed to restore.
  Note: Providing the password directly in the --uri parameter poses a security risk. To prevent password exposure, we recommend omitting the password from the command and entering it when prompted.
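The backup directory in the first step is produced by mongodump. The following is a minimal sketch of that step, assuming mongodump is installed and the source URI names the database to export. The backup_database function is a hypothetical helper. With a database in the URI, mongodump writes the dump into a subdirectory named after that database, which is the directory you then pass to mongorestore.

```shell
# Sketch: create the backup directory that mongorestore later consumes.
# Assumptions: mongodump is installed; backup_database is a hypothetical name.

backup_database() {
  local src_uri out_dir
  src_uri="$1"; out_dir="$2"
  mkdir -p "$out_dir" || return 1
  # --out sets the parent directory; the dump itself lands in
  # <out_dir>/<database>/ when the URI names a database.
  mongodump --uri="$src_uri" --out="$out_dir"
}

# Usage with placeholder values (not executed here):
#   backup_database 'mongodb://<user>:<password>@<src_ip>:<src_port>/ycsb' /root
#   # The dump is then expected under /root/ycsb/.
```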