Considerations and limits when the source database is MongoDB

If your source database is MongoDB—such as a self-managed MongoDB instance or ApsaraDB for MongoDB—review these considerations and limits before you configure a data synchronization task. This helps ensure successful task execution.

Overview of synchronization solutions for MongoDB sources

Review the considerations and limits for each synchronization scenario based on the following solutions:

Synchronize from MongoDB (ReplicaSet) to MongoDB (ReplicaSet or sharded cluster)

If your destination database is MongoDB—such as a self-managed MongoDB instance or ApsaraDB for MongoDB—apply the following considerations and limits:

Type

Description

Source database limits

  • Bandwidth requirement: The server where your source database resides must have sufficient outbound bandwidth. Otherwise, synchronization speed may be affected.

  • The collections to synchronize must have a primary key or UNIQUE constraint, and the field values must be unique. Otherwise, duplicate data may appear in the destination database.

  • If you synchronize at the collection level and need to edit objects (such as renaming collections), one synchronization task supports up to 1,000 collections. If you exceed this limit, the task fails with an error after submission. To resolve this, split the collections into batches and configure multiple tasks, or configure a full-database synchronization task.

  • A single document in your source database cannot exceed 16 MB. Otherwise, the task fails.

  • Your source database cannot be Azure Cosmos DB for MongoDB or Amazon DocumentDB (elastic cluster).

  • Your source database must have Oplog enabled and retain Oplog data for at least seven days, or have Change Streams enabled so that DTS can subscribe to data changes from the last seven days. Otherwise, DTS may fail to capture data changes, causing task failure. In extreme cases, this may cause data inconsistency or loss. These issues are not covered by the DTS service-level agreement (SLA).

    Important
    • We recommend using Oplog to capture data changes.

    • Only MongoDB 4.0 and later support Change Streams. Change Streams do not support two-way synchronization.

    • If your source database is Amazon DocumentDB (non-elastic cluster), manually enable Change Streams. When you configure the task, set Migration Method to ChangeStream and set Architecture to Sharded Cluster.

  • If the collections to synchronize contain TTL indexes, data inconsistency or latency may occur.

  • Source database operation limits:

    • During schema synchronization and full data synchronization, do not change the schema of databases or collections (including array-type data updates). Otherwise, the synchronization task may fail or cause data inconsistency between the source and destination databases.

    • If you perform only full data synchronization, do not write new data to the source instance. Otherwise, data inconsistency may occur between the source and destination databases.
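
The seven-day Oplog requirement above can be checked before you configure a task. The following is a minimal sketch, assuming it receives an object shaped like the result of the mongosh helper db.getReplicationInfo(), whose timeDiffHours field reports the time span between the oldest and newest Oplog entries. The function name checkOplogWindow and the sample value are hypothetical. The observed window depends on current write volume, so it is an indication rather than a guaranteed retention period.

```javascript
// Hypothetical helper: not part of DTS or MongoDB.
const REQUIRED_OPLOG_HOURS = 7 * 24; // seven days, per the limit above

// replInfo is assumed to look like the object returned by
// db.getReplicationInfo() in mongosh, which includes timeDiffHours.
function checkOplogWindow(replInfo) {
  const hours = Number(replInfo.timeDiffHours);
  if (!Number.isFinite(hours)) {
    // The field is missing or not numeric, so the window is unknown.
    return { ok: false, hours: null, shortfallHours: null };
  }
  const shortfallHours = Math.max(0, REQUIRED_OPLOG_HOURS - hours);
  return { ok: shortfallHours === 0, hours, shortfallHours };
}

// In mongosh on the source primary, one might call:
//   checkOplogWindow(db.getReplicationInfo())
// Here a sample value stands in for the real result.
const sample = checkOplogWindow({ timeDiffHours: 120.5 });
console.log(sample); // { ok: false, hours: 120.5, shortfallHours: 47.5 }
```

A result with ok set to false suggests that the Oplog should be enlarged, or its retention extended, before the task starts.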

Other limits

  • If the destination instance uses sharded cluster architecture:

    • Clear orphaned documents. Otherwise, synchronization performance may suffer. If documents with conflicting _id values appear during synchronization, data inconsistency or task failure may occur.

    • Before starting the task, add the shard key used by the destination to the data in the source. If you cannot add the shard key to the source, see Synchronize from MongoDB (no shard key) to MongoDB (sharded cluster architecture).

    • After starting the task, INSERT statements for the data to synchronize must include the shard key. UPDATE statements cannot modify the shard key.

  • If the destination instance uses ReplicaSet architecture:

    • When Access Method is set to the Express Connect, VPN Gateway, or Smart Access Gateway option or to the Cloud Enterprise Network (CEN) option, enter the address and port of the primary node for Domain Name or IP and Port Number, or configure a high-availability connection address. For more information about high-availability connection addresses, see Create an instance with a high-availability MongoDB source or destination database.

    • When Access Method is Self-managed Database on ECS, enter the port of the primary node in Port Number.

  • We recommend using the same MongoDB version for the source and destination databases, or a higher version for the destination, to ensure compatibility. Synchronizing from a higher version to a lower version may cause compatibility issues.

  • DTS does not support connecting to MongoDB databases using SRV addresses.

  • DTS does not support synchronizing data from the admin, config, and local databases.

  • If the destination collection has a unique index or its capped property is set to true, concurrent replay is not supported during incremental synchronization. Only single-threaded writes are allowed. This may increase task latency.

  • Transaction information is not preserved. Transactions in the source database become individual records in the destination database.

  • When DTS writes data to the destination collection, if a primary key or unique key conflict occurs, DTS skips the write statement and keeps existing data in the destination collection.

  • If your source MongoDB version is earlier than 3.6 and your destination MongoDB version is 3.6 or later, differences in execution plans may cause inconsistent field ordering in synchronized data. Field-value relationships remain unchanged. If your business uses text-matching queries on nested structures, assess the impact of inconsistent field ordering.

  • Evaluate the performance of both the source and destination databases before synchronization, and run synchronization during off-peak hours. Full data initialization consumes read and write resources on both databases, which increases database load.

  • Full initialization runs INSERT operations concurrently. This creates fragmentation in the destination database collection. After full initialization, the collection space in the destination instance is larger than in the source instance.

  • Do not write data to the destination database except through DTS during synchronization. Otherwise, data inconsistency may occur between the source and destination databases. For example, if you use DMS to perform online DDL operations while other data is written to the destination database, data loss may occur.

  • Because DTS writes data concurrently, the storage space used by the destination is 5% to 10% larger than the source.

  • To get the count of documents in the destination MongoDB, use the syntax db.$table_name.aggregate([{ $count:"myCount"}]).

  • Ensure the destination MongoDB does not contain documents with the same primary key (_id by default) as documents in the source. Otherwise, data loss may occur. If such documents exist, delete the documents with the same _id values from the destination at a time that does not affect your business.

  • If a task fails, DTS support staff will attempt to restore it within eight hours. During restoration, they may restart the task or adjust its parameters.

    Note

    Only DTS task parameters are modified—not database parameters. Parameters that may be adjusted include those listed in Modify instance parameters.

  • If your destination database is a MongoDB sharded cluster, ensure your business behavior meets MongoDB's requirements for sharded collections after switching traffic to this database.

  • If the source database is MongoDB 5.0 or later and the destination database version is earlier than 5.0, you cannot sync capped collections. If you attempt to sync them, the task fails or data inconsistency occurs between the source and destination databases. This is because, starting from MongoDB 5.0, the behavior of capped collections changed to allow operations such as explicit deletion and increasing document size during updates, and earlier versions of the database kernel cannot support these new features.

  • DTS does not support synchronizing time-series collections introduced in MongoDB 5.0 and later.
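
The conflict rule above (DTS skips a write when a primary key or unique key conflict occurs and keeps the existing destination data) explains why documents already in the destination with the same _id lead to data loss. The following is a simplified model of that behavior, not the DTS implementation. The function name applyWithSkipOnConflict and the sample documents are hypothetical, and the sketch assumes primitive _id values.

```javascript
// Simplified model only: DTS internals are not described in this topic.
// Assumes _id values are primitives (numbers or strings), not ObjectId objects.
function applyWithSkipOnConflict(destinationDocs, sourceDocs) {
  const destination = new Map(destinationDocs.map((doc) => [doc._id, doc]));
  const skipped = [];
  for (const doc of sourceDocs) {
    if (destination.has(doc._id)) {
      // Conflict: the existing destination document is kept as-is.
      skipped.push(doc._id);
    } else {
      destination.set(doc._id, doc);
    }
  }
  return { docs: [...destination.values()], skipped };
}

const result = applyWithSkipOnConflict(
  [{ _id: 1, status: "stale" }],
  [{ _id: 1, status: "current" }, { _id: 2, status: "current" }]
);
console.log(result.skipped); // _id values whose source version was not written
console.log(result.docs);    // the document with _id 1 still has status "stale"
```

In this model, the source version of the document with _id 1 never reaches the destination, which is why removing conflicting documents beforehand matters.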

Special cases

If your source database is a self-managed MongoDB:

  • If a primary/secondary switchover occurs in the source database during synchronization, the task fails.

  • DTS calculates latency by comparing the timestamp of the last synchronized document with the current timestamp. If the source database has no updates for a long time, latency information may be inaccurate. If latency appears too high, run an update operation in the source database to refresh latency information.

Note

If you select full-database synchronization, create a heartbeat table. Update or write to this table every second.
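
The latency calculation and the heartbeat recommendation above can be illustrated as follows. This is a minimal sketch under stated assumptions: the function names, the heartbeat collection name, and the document fields are all hypothetical, and the real DTS calculation may differ in detail.

```javascript
// Latency as described above: current time minus the timestamp of the
// last synchronized document, expressed in seconds.
function syncLatencySeconds(lastSyncedAt, now = new Date()) {
  return Math.max(0, (now.getTime() - lastSyncedAt.getTime()) / 1000);
}

// Hypothetical heartbeat document; the field names are assumptions.
// In mongosh, one might write it every second with something like:
//   db.dts_heartbeat.replaceOne({ _id: "dts_heartbeat" }, heartbeatDocument(), { upsert: true })
function heartbeatDocument(now = new Date()) {
  return { _id: "dts_heartbeat", updatedAt: now };
}

const now = new Date("2024-01-01T00:10:00Z");
// Source idle for ten minutes: reported latency grows although nothing is pending.
const idleSource = syncLatencySeconds(new Date("2024-01-01T00:00:00Z"), now);
// With a heartbeat written one second ago, the reported latency stays small.
const beat = heartbeatDocument(new Date("2024-01-01T00:09:59Z"));
const withHeartbeat = syncLatencySeconds(beat.updatedAt, now);
console.log(idleSource, withHeartbeat); // 600 1
```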

Two-way synchronization between MongoDB (sharded cluster) instances

Apply the following considerations and limits:

Type

Description

Source and destination database limits

  • Bandwidth requirement: The server where your source database resides must have sufficient outbound bandwidth. Otherwise, synchronization speed may be affected.

  • The collections to synchronize must have a primary key or UNIQUE constraint, and the field values must be unique. Otherwise, duplicate data may appear in the destination database.

  • The _id field in the collections to synchronize must have unique values. Otherwise, data inconsistency may occur.

  • If you synchronize at the collection level and need to edit objects (such as renaming collections), one synchronization task supports up to 1,000 collections. If you exceed this limit, the task fails with an error after submission. To resolve this, split the collections into batches and configure multiple tasks, or configure a full-database synchronization task.

  • A single document in your source database cannot exceed 16 MB. Otherwise, the task fails.

  • Your source database cannot be Azure Cosmos DB for MongoDB or Amazon DocumentDB (elastic cluster).

  • Your source database must enable Oplog and retain Oplog for at least seven days. Otherwise, DTS may fail to capture data changes, causing task failure. In extreme cases, this may cause data inconsistency or loss. These issues are not covered by the DTS SLA.

    Important
    • We recommend using Oplog to capture data changes.

    • Change Streams do not support two-way synchronization.

  • Do not scale the number of shards in either the source or destination MongoDB sharded cluster while two-way synchronization is running. Otherwise, DTS tasks may fail and data inconsistency may occur.

  • The number of Mongos nodes in your source MongoDB sharded cluster instance cannot exceed 10.

  • If the collections to synchronize contain TTL indexes, data inconsistency or latency may occur.

  • Ensure no orphaned documents exist in the source and destination instances. Otherwise, data inconsistency or task failure may occur. For more information, see orphaned documents and How to clean orphaned documents in MongoDB (sharded cluster architecture).

  • The source and destination databases must be ApsaraDB for MongoDB instances of the same architecture. Two-way synchronization is not supported for self-managed MongoDB instances or MongoDB instances of different architectures.

  • Source database operation limits:

    • During schema synchronization and full data synchronization, do not change the schema of databases or collections (including array-type data updates). Otherwise, the synchronization task may fail or cause data inconsistency between the source and destination databases.

    • If you perform only full data synchronization, do not write new data to the source instance. Otherwise, data inconsistency may occur between the source and destination databases.

    • While the synchronization instance is running, do not run commands that change data distribution on the objects to synchronize in the source database (for example, shardCollection, reshardCollection, unshardCollection, moveCollection, or movePrimary). Otherwise, data inconsistency may occur.

  • If the Balancer in your source database is balancing data, synchronization latency may occur.

  • DTS does not support connecting to MongoDB databases using SRV addresses.

  • If the source database is MongoDB 5.0 or later and the destination database version is earlier than 5.0, you cannot sync capped collections. If you attempt to sync them, the task fails or data inconsistency occurs between the source and destination databases. This is because, starting from MongoDB 5.0, the behavior of capped collections changed to allow operations such as explicit deletion and increasing document size during updates, and earlier versions of the database kernel cannot support these new features.
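
The 1,000-collection limit above suggests splitting the collection list into batches, one batch per task. The following is a minimal sketch of that split. The function name splitIntoTaskBatches and the generated collection names are hypothetical, and creating the tasks themselves is outside its scope.

```javascript
// Limit taken from the text above: up to 1,000 collections per task
// when objects are edited at the collection level.
const MAX_COLLECTIONS_PER_TASK = 1000;

// Splits a list of collection names into consecutive batches.
function splitIntoTaskBatches(collectionNames, batchSize = MAX_COLLECTIONS_PER_TASK) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < collectionNames.length; i += batchSize) {
    batches.push(collectionNames.slice(i, i + batchSize));
  }
  return batches;
}

// Hypothetical example: 2,500 collections need three tasks.
const names = Array.from({ length: 2500 }, (_, i) => `orders_${i}`);
const batches = splitIntoTaskBatches(names);
console.log(batches.map((batch) => batch.length)); // [ 1000, 1000, 500 ]
```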

Other limits

  • Before starting the task, add the shard key used by the destination to the data in the source. After starting the task, INSERT statements for the data to synchronize must include the shard key. UPDATE statements cannot modify the shard key.

  • We recommend using the same MongoDB version for the source and destination databases, or a higher version for the destination, to ensure compatibility. Synchronizing from a higher version to a lower version may cause compatibility issues.

  • If the destination collection has a unique index or its capped property is set to true, concurrent replay is not supported during incremental synchronization. Only single-threaded writes are allowed. This may increase task latency.

  • DTS does not support synchronizing data from the admin, config, and local databases.

  • Transaction information is not preserved. Transactions in the source database become individual records in the destination database.

  • When DTS writes data to the destination collection, if a primary key or unique key conflict occurs, DTS skips the write statement and keeps existing data in the destination collection.

  • If your source MongoDB version is earlier than 3.6 and your destination MongoDB version is 3.6 or later, differences in execution plans may cause inconsistent field ordering in synchronized data. Field-value relationships remain unchanged. If your business uses text-matching queries on nested structures, assess the impact of inconsistent field ordering.

  • Evaluate the performance of both the source and destination databases before synchronization, and run synchronization during off-peak hours. Full data initialization consumes read and write resources on both databases, which increases database load.

  • Full initialization runs INSERT operations concurrently. This creates fragmentation in the destination database collection. After full initialization, the collection space in the destination instance is larger than in the source instance.

  • Do not write data to the destination database except through DTS during synchronization. Otherwise, data inconsistency may occur between the source and destination databases. For example, if you use DMS to perform online DDL operations while other data is written to the destination database, data loss may occur.

  • If a two-way synchronization instance in the China (Chengdu) or China (Shanghai) region includes a full data synchronization task, DTS creates a full data verification task for that instance by default. For this verification task, Full Data Verification is set to Verify based on the number of table rows. If you have already configured a full data verification task, your configuration takes precedence.

  • A two-way synchronization task includes forward and reverse synchronization tasks. When you configure or reset the task, if the destination object of one task matches the synchronization object of the other task:

    • Allow only one task to synchronize full and incremental data. The other task supports only incremental synchronization.

    • Data from the source of the current task synchronizes only to the destination of the current task. It does not serve as source data for the other task.

  • Because DTS writes data concurrently, the storage space used by the destination is 5% to 10% larger than the source.

  • To get the count of documents in the destination MongoDB, use the syntax db.$table_name.aggregate([{ $count:"myCount"}]).

  • Ensure the destination MongoDB does not contain documents with the same primary key (_id by default) as documents in the source. Otherwise, data loss may occur. If such documents exist, delete the documents with the same _id values from the destination at a time that does not affect your business.

  • Disable the Balancer in your source MongoDB database during full synchronization. Keep it disabled until all subtasks reach the incremental phase. Otherwise, data inconsistency may occur. For more information about managing the Balancer, see Manage the MongoDB Balancer.

  • If you do not need the schema synchronization feature provided by DTS (for example, because data partitioning is already configured at the destination), do not select Schema Synchronization for Synchronization Types in the Configure Objects step. Otherwise, sharding conflicts can lead to data inconsistency or task failure.

  • If you switch traffic to the destination MongoDB database, ensure your business behavior meets MongoDB's requirements for sharded collections.

  • DTS does not support synchronizing time-series collections introduced in MongoDB 5.0 and later.

  • If a task fails, DTS support staff will attempt to restore it within eight hours. During restoration, they may restart the task or adjust its parameters.

    Note

    Only DTS task parameters are modified—not database parameters. Parameters that may be adjusted include those listed in Modify instance parameters.
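
The shard key rules in the list above (INSERT statements must include the shard key, and UPDATE statements cannot modify it) can be checked on the application side before writes are sent. The following is a minimal sketch, assuming top-level shard key fields and updates written with update operators such as $set. Both function names are hypothetical, and replacement-style updates are not handled.

```javascript
// Returns true when the document carries a value for every shard key field.
// Assumes shard key fields are top-level fields of the document.
function insertIncludesShardKey(doc, shardKeyFields) {
  return shardKeyFields.every((field) => doc[field] !== undefined);
}

// Returns true when an operator-style update ({ $set: {...}, $inc: {...} })
// touches a shard key field, either directly or through a dotted path.
function updateModifiesShardKey(update, shardKeyFields) {
  const keys = new Set(shardKeyFields);
  for (const [operator, fields] of Object.entries(update)) {
    if (!operator.startsWith("$") || fields === null || typeof fields !== "object") {
      continue; // replacement-style updates are outside this sketch
    }
    for (const path of Object.keys(fields)) {
      if (keys.has(path.split(".")[0])) {
        return true;
      }
    }
  }
  return false;
}

const shardKey = ["region"]; // hypothetical shard key
console.log(insertIncludesShardKey({ _id: 1, region: "eu", total: 5 }, shardKey)); // true
console.log(insertIncludesShardKey({ _id: 2, total: 5 }, shardKey));               // false
console.log(updateModifiesShardKey({ $set: { total: 9 } }, shardKey));             // false
console.log(updateModifiesShardKey({ $set: { region: "us" } }, shardKey));         // true
```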

One-way synchronization between MongoDB (sharded cluster) instances

If your destination database is MongoDB—such as a self-managed MongoDB instance or ApsaraDB for MongoDB—apply the following considerations and limits:

Type

Description

Source and destination database limits

  • Bandwidth requirement: The server where your source database resides must have sufficient outbound bandwidth. Otherwise, synchronization speed may be affected.

  • The collections to synchronize must have a primary key or UNIQUE constraint, and the field values must be unique. Otherwise, duplicate data may appear in the destination database.

  • The _id field in the collections to synchronize must have unique values. Otherwise, data inconsistency may occur.

  • If you synchronize at the collection level and need to edit objects (such as renaming collections), one synchronization task supports up to 1,000 collections. If you exceed this limit, the task fails with an error after submission. To resolve this, split the collections into batches and configure multiple tasks, or configure a full-database synchronization task.

  • A single document in your source database cannot exceed 16 MB. Otherwise, the task fails.

  • Your source database cannot be Azure Cosmos DB for MongoDB or Amazon DocumentDB (elastic cluster).

  • Your source database must have Oplog enabled and retain Oplog data for at least seven days, or have Change Streams enabled so that DTS can subscribe to data changes from the last seven days. Otherwise, DTS may fail to capture data changes, causing task failure. In extreme cases, this may cause data inconsistency or loss. These issues are not covered by the DTS service-level agreement (SLA).

    Important
    • We recommend using Oplog to capture data changes.

    • Only MongoDB 4.0 and later support Change Streams. Change Streams do not support two-way synchronization.

    • If your source database is Amazon DocumentDB (non-elastic cluster), manually enable Change Streams. When you configure the task, set Migration Method to ChangeStream and set Architecture to Sharded Cluster.

  • If DTS uses Oplog for incremental synchronization, do not scale the number of shards in the source MongoDB sharded cluster. Otherwise, DTS tasks may fail and data inconsistency may occur.

  • If your source instance is a self-managed MongoDB sharded cluster:

    • Access Method supports only the Express Connect, VPN Gateway, or Smart Access Gateway option and the Cloud Enterprise Network (CEN) option.

    • If your MongoDB version is 8.0 or later and Migration Method is Oplog, ensure the Shard account used by the synchronization task has the directShardOperations permission. Add this permission using the command db.adminCommand({ grantRolesToUser: "username", roles: [{ role: "directShardOperations", db: "admin"}]}).

      Note

      Replace username in the command with the Shard account used by the synchronization task.

    • If Migration Method is Oplog and the task includes full data synchronization, ensure the Mongos account for your source MongoDB sharded cluster has permission to run the db.runCommand({"balancerStatus":1}) command. DTS checks whether the Balancer is disabled using this command during precheck.

  • The number of Mongos nodes in your source MongoDB sharded cluster instance cannot exceed 10.

  • If the collections to synchronize contain TTL indexes, data inconsistency or latency may occur.

  • Ensure no orphaned documents exist in the source and destination instances. Otherwise, data inconsistency or task failure may occur. For more information, see orphaned documents and How to clean orphaned documents in MongoDB (sharded cluster architecture).

  • Source database operation limits:

    • During schema synchronization and full data synchronization, do not change the schema of databases or collections (including array-type data updates). Otherwise, the synchronization task may fail or cause data inconsistency between the source and destination databases.

    • If you perform only full data synchronization, do not write new data to the source instance. Otherwise, data inconsistency may occur between the source and destination databases.

    • While the synchronization instance is running, do not run commands that change data distribution on the objects to synchronize in the source database (for example, shardCollection, reshardCollection, unshardCollection, moveCollection, or movePrimary). Otherwise, data inconsistency may occur.

  • If the Balancer in your source database is balancing data, synchronization latency may occur.

  • DTS does not support connecting to MongoDB databases using SRV addresses.

  • If the source database is MongoDB 5.0 or later and the destination database version is earlier than 5.0, you cannot sync capped collections. If you attempt to sync them, the task fails or data inconsistency occurs between the source and destination databases. This is because, starting from MongoDB 5.0, the behavior of capped collections changed to allow operations such as explicit deletion and increasing document size during updates, and earlier versions of the database kernel cannot support these new features.
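
The balancerStatus command mentioned above returns a document that includes the mode, inBalancerRound, and ok fields. The following is a minimal sketch of how that result might be interpreted before a full data synchronization starts. The function name describeBalancer and its messages are hypothetical, and the criteria that DTS applies during precheck are not documented here, so treating an in-progress round as "not disabled" is an assumption.

```javascript
// Hypothetical helper; the actual DTS precheck logic may differ.
// status is assumed to look like the result of running balancerStatus
// through a Mongos node, for example db.adminCommand({ balancerStatus: 1 }).
function describeBalancer(status) {
  if (!status || status.ok !== 1) {
    return { disabled: false, reason: "balancerStatus did not succeed" };
  }
  if (status.mode !== "off") {
    return { disabled: false, reason: `balancer mode is "${status.mode}"` };
  }
  if (status.inBalancerRound) {
    // Assumption: a round that is still running means data may still move.
    return { disabled: false, reason: "a balancing round is still in progress" };
  }
  return { disabled: true, reason: "balancer is off and idle" };
}

// Sample values stand in for real command output.
console.log(describeBalancer({ mode: "full", inBalancerRound: false, numBalancerRounds: 42, ok: 1 }));
console.log(describeBalancer({ mode: "off", inBalancerRound: false, numBalancerRounds: 42, ok: 1 }));
```

If the first case applies, the Balancer can be stopped in mongosh with sh.stopBalancer() before the task is started.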

Other limits

  • Before starting the task, add the shard key used by the destination to the data in the source. After starting the task, INSERT statements for the data to synchronize must include the shard key. UPDATE statements cannot modify the shard key.

  • We recommend using the same MongoDB version for the source and destination databases, or a higher version for the destination, to ensure compatibility. Synchronizing from a higher version to a lower version may cause compatibility issues.

  • DTS does not support synchronizing data from the admin, config, and local databases.

  • Transaction information is not preserved. Transactions in the source database become individual records in the destination database.

  • When DTS writes data to the destination collection, if a primary key or unique key conflict occurs, DTS skips the write statement and keeps existing data in the destination collection.

  • If your source MongoDB version is earlier than 3.6 and your destination MongoDB version is 3.6 or later, differences in execution plans may cause inconsistent field ordering in synchronized data. Field-value relationships remain unchanged. If your business uses text-matching queries on nested structures, assess the impact of inconsistent field ordering.

  • Evaluate the performance of both the source and destination databases before synchronization, and run synchronization during off-peak hours. Full data initialization consumes read and write resources on both databases, which increases database load.

  • Full initialization runs INSERT operations concurrently. This creates fragmentation in the destination database collection. After full initialization, the collection space in the destination instance is larger than in the source instance.

  • If the destination collection has a unique index or its capped property is set to true, concurrent replay is not supported during incremental synchronization. Only single-threaded writes are allowed. This may increase task latency.

  • Because DTS writes data concurrently, the storage space used by the destination is 5% to 10% larger than the source.

  • To get the count of documents in the destination MongoDB, use the syntax db.$table_name.aggregate([{ $count:"myCount"}]).

  • Ensure the destination MongoDB does not contain documents with the same primary key (_id by default) as documents in the source. Otherwise, data loss may occur. If such documents exist, delete the documents with the same _id values from the destination at a time that does not affect your business.

  • Disable the Balancer in your source MongoDB database during full synchronization. Keep it disabled until all subtasks reach the incremental phase. Otherwise, data inconsistency may occur. For more information about managing the Balancer, see Manage the MongoDB Balancer.

  • If you do not need the schema synchronization feature provided by DTS (for example, because data partitioning is already configured at the destination), do not select Schema Synchronization for Synchronization Types in the Configure Objects step. Otherwise, sharding conflicts can lead to data inconsistency or task failure.

  • If you switch traffic to the destination MongoDB database, ensure your business behavior meets MongoDB's requirements for sharded collections.

  • DTS does not support synchronizing time-series collections introduced in MongoDB 5.0 and later.

  • If a task fails, DTS support staff will attempt to restore it within eight hours. During restoration, they may restart the task or adjust its parameters.

    Note

    Only DTS task parameters are modified—not database parameters. Parameters that may be adjusted include those listed in Modify instance parameters.

Two-way synchronization between MongoDB (ReplicaSet) instances

Apply the following considerations and limits:

Type

Description

Source and destination database limits

  • Bandwidth requirement: The server where your source database resides must have sufficient outbound bandwidth. Otherwise, synchronization speed may be affected.

  • The collections to synchronize must have a primary key or UNIQUE constraint, and the field values must be unique. Otherwise, duplicate data may appear in the destination database.

  • If you synchronize at the collection level and need to edit objects (such as renaming collections), one synchronization task supports up to 1,000 collections. If you exceed this limit, the task fails with an error after submission. To resolve this, split the collections into batches and configure multiple tasks, or configure a full-database synchronization task.

  • A single document in your source database cannot exceed 16 MB. Otherwise, the task fails.

  • Your source database cannot be Azure Cosmos DB for MongoDB or Amazon DocumentDB (elastic cluster).

  • Your source database must enable Oplog and retain Oplog for at least seven days. Otherwise, DTS may fail to capture data changes, causing task failure. In extreme cases, this may cause data inconsistency or loss. These issues are not covered by the DTS SLA.

    Important
    • We recommend using Oplog to capture data changes.

    • Change Streams do not support two-way synchronization.

  • If the collections to synchronize contain TTL indexes, data inconsistency or latency may occur.

  • The source and destination databases must be ApsaraDB for MongoDB instances of the same architecture. Two-way synchronization is not supported for self-managed MongoDB instances or MongoDB instances of different architectures.

  • Source database operation limits:

    • During schema synchronization and full data synchronization, do not change the schema of databases or collections (including array-type data updates). Otherwise, the synchronization task may fail or cause data inconsistency between the source and destination databases.

    • If you perform only full data synchronization, do not write new data to the source instance. Otherwise, data inconsistency may occur between the source and destination databases.

  • DTS does not support connecting to MongoDB databases using SRV addresses.

  • If the source database is MongoDB 5.0 or later and the destination database version is earlier than 5.0, you cannot sync capped collections. If you attempt to sync them, the task fails or data inconsistency occurs between the source and destination databases. This is because, starting from MongoDB 5.0, the behavior of capped collections changed to allow operations such as explicit deletion and increasing document size during updates, and earlier versions of the database kernel cannot support these new features.

Other limits

  • We recommend using the same MongoDB version for the source and destination databases, or a higher version for the destination, to ensure compatibility. Synchronizing from a higher version to a lower version may cause compatibility issues.

  • If a two-way synchronization instance in the China (Chengdu) or China (Shanghai) region includes a full data synchronization task, DTS creates a full data verification task for that instance by default. For this verification task, Full Data Verification is set to Verify based on the number of table rows. If you have already configured a full data verification task, your configuration takes precedence.

  • The source and destination ApsaraDB for MongoDB instances must use the same architecture.

  • DTS does not support synchronizing data from the admin, config, and local databases.

  • Transaction information is not preserved. Transactions in the source database become individual records in the destination database.

  • When DTS writes data to the destination collection, if a primary key or unique key conflict occurs, DTS skips the write statement and keeps existing data in the destination collection.

  • If your source MongoDB version is earlier than 3.6 and your destination MongoDB version is 3.6 or later, differences in execution plans may cause inconsistent field ordering in synchronized data. Field-value relationships remain unchanged. If your business uses text-matching queries on nested structures, assess the impact of inconsistent field ordering.

  • Evaluate the performance of both the source and destination databases before synchronization, and run synchronization during off-peak hours. Full data initialization consumes read and write resources on both databases, which increases database load.

  • Full initialization runs INSERT operations concurrently. This creates fragmentation in the destination database collection. After full initialization, the collection space in the destination instance is larger than in the source instance.

  • Do not write data to the destination database except through DTS during synchronization. Otherwise, data inconsistency may occur between the source and destination databases. For example, if you use DMS to perform online DDL operations while other data is written to the destination database, data loss may occur.

  • If the destination collection has a unique index or its capped property is set to true, concurrent replay is not supported during incremental synchronization. Only single-threaded writes are allowed. This may increase task latency.

  • A two-way synchronization task includes forward and reverse synchronization tasks. When you configure or reset the task, if the destination object of one task matches the synchronization object of the other task:

    • Allow only one task to synchronize full and incremental data. The other task supports only incremental synchronization.

    • Data from the source of the current task synchronizes only to the destination of the current task. It does not serve as source data for the other task.

  • Because DTS writes data concurrently, the storage space used by the destination is 5% to 10% larger than the source.

  • To get the count of documents in the destination MongoDB, use the syntax db.$table_name.aggregate([{ $count:"myCount"}]).

  • Ensure the destination MongoDB does not contain documents with the same primary key (_id by default) as documents in the source. Otherwise, data loss may occur. If such documents exist, delete the documents with the same _id values from the destination at a time that does not affect your business.

  • DTS does not support synchronizing time-series collections introduced in MongoDB 5.0 and later.

  • If a task fails, DTS support staff will attempt to restore it within eight hours. During restoration, they may restart the task or adjust its parameters.

    Note

    Only DTS task parameters are modified—not database parameters. Parameters that may be adjusted include those listed in Modify instance parameters.
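
The version-related limits above (a destination older than the source, capped collections when moving from 5.0 or later to an earlier version, and field ordering when moving from a version earlier than 3.6 to 3.6 or later) can be collected into one pre-flight check. The following is a minimal sketch. The function names and warning texts are hypothetical, and the sketch assumes plain dotted numeric version strings such as "4.4" or "6.0.12".

```javascript
// Compares two dotted numeric version strings.
// Returns -1, 0, or 1 when a is lower than, equal to, or higher than b.
function compareVersions(a, b) {
  const pa = a.split(".").map((part) => parseInt(part, 10));
  const pb = b.split(".").map((part) => parseInt(part, 10));
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// Collects the version-related warnings described in the limits above.
function versionWarnings(source, destination) {
  const warnings = [];
  if (compareVersions(source, destination) > 0) {
    warnings.push("destination is older than source: compatibility issues may occur");
  }
  if (compareVersions(source, "5.0") >= 0 && compareVersions(destination, "5.0") < 0) {
    warnings.push("capped collections cannot be synchronized");
  }
  if (compareVersions(source, "3.6") < 0 && compareVersions(destination, "3.6") >= 0) {
    warnings.push("field ordering may differ in synchronized data");
  }
  return warnings;
}

console.log(versionWarnings("6.0", "4.4")); // two warnings: older destination, capped collections
console.log(versionWarnings("3.4", "4.2")); // one warning: field ordering
console.log(versionWarnings("5.0", "5.0")); // []
```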