Solution selection
ES offers two solutions for off-site disaster recovery:
- OSS snapshot backup and restore: Backs up index data to Alibaba Cloud Object Storage Service (OSS). The first snapshot is a full backup; subsequent snapshots are incremental. Data can then be restored to another ES instance through a cross-cluster OSS repository. For details, see "Back up and restore data by using a cross-cluster OSS repository".
- Cross-cluster replication (CCR): Replicates writable indexes from a leader cluster to one or more follower clusters asynchronously and incrementally, with near real-time synchronization. Suited to disaster recovery with strict recovery point objective (RPO) and recovery time objective (RTO) requirements. For details, see "Replicate data across clusters by using CCR".
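To make the two options more concrete, the following minimal sketch builds the REST requests that each approach typically starts from, based on the open-source Elasticsearch create-snapshot and CCR follow APIs. It only constructs the method, path, and body and does not contact a cluster. The repository name `oss_backup`, the remote cluster alias `leader_cluster`, and the index names are hypothetical placeholders, and the sketch assumes the OSS repository is already registered and the remote cluster connection is already configured.

```python
import json


def create_snapshot_request(repository, snapshot, indices):
    """Build a create-snapshot request against an already registered repository."""
    path = f"/_snapshot/{repository}/{snapshot}"
    body = {
        "indices": ",".join(indices),   # comma-separated index names
        "include_global_state": False,  # back up index data only
    }
    return "PUT", path, body


def ccr_follow_request(follower_index, remote_cluster, leader_index):
    """Build a CCR follow request, which is sent to the follower cluster."""
    path = f"/{follower_index}/_ccr/follow"
    body = {"remote_cluster": remote_cluster, "leader_index": leader_index}
    return "PUT", path, body


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    requests_to_send = [
        create_snapshot_request("oss_backup", "snapshot_001", ["orders"]),
        ccr_follow_request("orders_copy", "leader_cluster", "orders"),
    ]
    for method, path, body in requests_to_send:
        print(method, path)
        print(json.dumps(body, indent=2))
```

The snapshot request has to be repeated on a schedule, whereas the follow request is issued once and replication then continues on its own.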
Solution comparison
| Solution | Scenarios | RPO | RTO | Major limitations |
| --- | --- | --- | --- | --- |
| OSS snapshot | Periodic backup and restore of large-scale data (GB to PB). | Hours to days (depends on snapshot interval). | Several hours (depends on data volume and shard recovery). | No continuous sync. Downtime may be required during restoration. |
| CCR | Off-site disaster recovery, read/write splitting, and proximity-based access. | Near-zero (seconds). | Seconds to minutes. | Follower indexes are read-only. Requires identical mappings and shard counts. |
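The RPO figures in the comparison can be checked with simple arithmetic. The sketch below uses a simplified model, stated here as an assumption: a snapshot captures data as of the moment it starts, so in the worst case a failure just before the next snapshot completes loses one full interval plus the snapshot duration. For CCR, the loss is approximated by the replication lag. The 6-hour interval, 30-minute duration, 10-second lag, and 5-minute target are illustrative values, not measurements.

```python
from datetime import timedelta


def worst_case_snapshot_rpo(interval, snapshot_duration):
    """Worst-case data loss window for periodic snapshots (simplified model).

    The last usable snapshot started one interval before the snapshot that
    was still running when the failure happened.
    """
    return interval + snapshot_duration


def meets_rpo(worst_case_loss, target_rpo):
    """Return True if the worst-case loss window fits within the target RPO."""
    return worst_case_loss <= target_rpo


if __name__ == "__main__":
    target = timedelta(minutes=5)  # illustrative RPO requirement

    snapshot_loss = worst_case_snapshot_rpo(
        interval=timedelta(hours=6), snapshot_duration=timedelta(minutes=30)
    )
    ccr_loss = timedelta(seconds=10)  # assumed replication lag

    print("snapshot:", snapshot_loss, "meets target:", meets_rpo(snapshot_loss, target))
    print("ccr:", ccr_loss, "meets target:", meets_rpo(ccr_loss, target))
```

Under these assumed numbers the snapshot schedule misses a 5-minute target by a wide margin, while a CCR lag of a few seconds fits within it.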
When off-site disaster recovery requires a low RPO and near-real-time availability, CCR is the better choice:
- CCR synchronizes data within seconds, which minimizes data loss.
- If the primary cluster fails, traffic can be switched to the follower cluster immediately, without waiting for a snapshot restore.
- CCR has a higher initial cost, but it can be more cost-effective in the long term because it reduces business losses caused by data unavailability.
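Because follower indexes are read-only, serving writes from the follower cluster after a failover usually means converting each follower index into a regular index first. The following sketch lists the request sequence for that conversion as described for the open-source Elasticsearch CCR API (pause, close, unfollow, reopen). It only builds the requests; the index name is a hypothetical placeholder, and whether the managed service offers a different workflow is not covered here.

```python
def promote_follower_requests(follower_index):
    """Return the ordered requests that turn a follower index into a regular index."""
    return [
        ("POST", f"/{follower_index}/_ccr/pause_follow"),  # stop pulling changes from the leader
        ("POST", f"/{follower_index}/_close"),             # unfollow requires a closed index
        ("POST", f"/{follower_index}/_ccr/unfollow"),      # remove the follower metadata
        ("POST", f"/{follower_index}/_open"),              # reopen as a writable index
    ]


if __name__ == "__main__":
    # "orders_copy" is a hypothetical follower index name.
    for method, path in promote_follower_requests("orders_copy"):
        print(method, path)
```

Read-only traffic can be redirected before these steps run; the sequence matters only once the follower cluster has to accept writes.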