The Reindex API copies documents from a source index to a destination index. You can copy all documents or only those that match a specific query. This process can occur within the same cluster or across different clusters. This topic shows you how to use the Reindex API to migrate data from one cluster to another.
Limitations
Both clusters must be in the same region and availability zone.
Management and deployment mode: You can migrate data from a v2 cluster to a v3 cluster, between two v2 clusters, or between two v3 clusters.
Clusters have two management and deployment modes: Cloud-native New Management (v3) and Basic Management (v2). You can view the management and deployment mode of your cluster on its information page in the console.
Cluster version: Data migration is supported between clusters of the same major version, for example, from a lower to a higher minor version. Migrating data across major versions, such as from 7.7.1 to 8.15.1, is not recommended.
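As a rough pre-flight check for the version limitation, the following sketch compares the major versions of two clusters. The helper names are hypothetical, and the code assumes version strings in the `major.minor.patch` form shown above; the version `7.10.0` is an illustrative value only.

```python
def major_version(version: str) -> int:
    """Return the major component of a 'major.minor.patch' version string."""
    return int(version.strip().split(".")[0])


def same_major_version(source_version: str, dest_version: str) -> bool:
    """True when both clusters share a major version (hypothetical helper)."""
    return major_version(source_version) == major_version(dest_version)


if __name__ == "__main__":
    # Same major version: migration is supported.
    print(same_major_version("7.7.1", "7.10.0"))
    # Different major versions: migration is not recommended.
    print(same_major_version("7.7.1", "8.15.1"))
```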
Prerequisites
In this example, we will migrate data from ES_2 to ES_1 using the Reindex API. Before you begin, complete the following preparations.
Prepare test data
In ES_2, create an index and insert test data:

    PUT /product_info
    {
      "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 1
      },
      "mappings": {
        "properties": {
          "productName": {
            "type": "text",
            "analyzer": "ik_smart"
          },
          "annual_rate": {
            "type": "keyword"
          },
          "describe": {
            "type": "text",
            "analyzer": "ik_smart"
          }
        }
      }
    }

This command creates an index named product_info that contains the productName, annual_rate, and describe fields. If successful, the request returns the following result.

    {
      "acknowledged" : true,
      "shards_acknowledged" : true,
      "index" : "product_info"
    }

Insert six test documents:

    POST /product_info/_bulk
    {"index":{}}
    {"productName":"Financial Product A","annual_rate":"3.2200%","describe":"A 180-day fixed-term product with a minimum investment of 20,000. Stable returns with optional message notifications."}
    {"index":{}}
    {"productName":"Financial Product B","annual_rate":"3.1100%","describe":"A 90-day regular investment product with a minimum investment of 10,000. Daily profit notifications are sent."}
    {"index":{}}
    {"productName":"Financial Product C","annual_rate":"3.3500%","describe":"A 270-day regular investment product with a minimum investment of 40,000. Daily profit notifications are sent."}
    {"index":{}}
    {"productName":"Financial Product D","annual_rate":"3.1200%","describe":"A 90-day regular investment product with a minimum investment of 12,000. Daily profit notifications are sent."}
    {"index":{}}
    {"productName":"Financial Product E","annual_rate":"3.0100%","describe":"A recommended 30-day regular investment product with a minimum investment of 8,000. Daily profit notifications are sent."}
    {"index":{}}
    {"productName":"Financial Product F","annual_rate":"2.7500%","describe":"A popular 3-day short-term product with no fees and a minimum investment of 500. Profit notifications are sent via SMS."}

In ES_1, create an index to store the migrated data from ES_2:

    PUT dest
    {
      "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 1
      }
    }

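The `_bulk` request used to insert the test documents is newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end of the body. The sketch below shows one way to build such a body outside Kibana. The function name `build_bulk_body` is hypothetical, and sending the body (for example with the `application/x-ndjson` content type) is left out.

```python
import json


def build_bulk_body(documents):
    """Build a newline-delimited _bulk body that indexes each document
    with an auto-generated ID (hypothetical helper)."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {}}))            # action line
        lines.append(json.dumps(doc, ensure_ascii=False))  # source line
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [
        {"productName": "Financial Product A", "annual_rate": "3.2200%"},
        {"productName": "Financial Product B", "annual_rate": "3.1100%"},
    ]
    print(build_bulk_body(sample), end="")
```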
Private connection via NLB and PrivateLink
To enhance cluster security, clusters in the same VPC or different VPCs are network-isolated. You must use NLB and PrivateLink to establish a private connection (VPC connection) between the clusters.
As the following figure shows, the two ES clusters are deployed in the same VPC. An endpoint service is created in the user's VPC. Then, a private connection is configured in ES_1 to obtain an endpoint. Finally, the endpoint is associated with the endpoint service to establish a private connection between the two clusters.
An endpoint service is a service that other VPCs can connect to privately by creating an endpoint. You must manually create the related service resources.
An endpoint is associated with an endpoint service and provides a private network connection to access external services. When you configure a private connection for an Alibaba Cloud ES instance, an endpoint is automatically created in the network environment where the ES cluster resides.
For detailed configuration steps, see Establish a private connection between Alibaba Cloud ES clusters using NLB and PrivateLink. You must complete Step 1, Step 2, and Step 3.
On the Security configuration page of the ES_1 instance, in the Cluster network settings section, click Modify to the right of Configure instance private connection. In the Configure instance private connection panel, you can view the endpoint ID, endpoint service ID, and connection status. When the endpoint connection status is Connected, the ES_1 and ES_2 clusters can communicate through their private IP addresses.
Configure Reindex API whitelist
To ensure secure data migration between clusters, you must add the private connection address and port of the ES_2 cluster to the Reindex API whitelist of ES_1.
On the Security configuration page of ES_1, click Edit next to Configure Private Connection. In the Configure Private Connection side panel, click the target Endpoint ID.
To add a new connection, click + Add Private Connection at the bottom of the Configure instance private connection side panel.
In the VPC console, on the Endpoint Connections tab, click the icon next to the endpoint ID to view its corresponding domain name.
Important: You must remove the availability zone identifier from the domain name before adding it to the Reindex API whitelist.
For example, if the full domain name is "ep-bp1****************-cn-hangzhou-i.epsrv-bp1****************.cn-hangzhou.privatelink.aliyuncs.com", remove the availability zone identifier "-cn-hangzhou-i" to get the final domain name: "ep-bp1****************.epsrv-bp1****************.cn-hangzhou.privatelink.aliyuncs.com".
In the YML file for ES_1, configure the Reindex API whitelist. The whitelist entry must be the endpoint's domain name and port.

    reindex:
      remote:
        whitelist: >-
          ep-bp1****************.epsrv-bp1****************.cn-hangzhou.privatelink.aliyuncs.com:9200

On the ES cluster configuration page, click Modify configuration to the right of YML configuration. In the Other configure YAML editor in the panel, add the preceding whitelist configuration.
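The whitelist entry can also be derived from the full endpoint domain name programmatically. The following is a minimal sketch, assuming the domain name always follows the layout in the example above (endpoint ID plus zone identifier, then endpoint service ID, then region). The function names and the `ep-bp1example` ID are hypothetical placeholders.

```python
import re

REINDEX_PORT = 9200  # the port is fixed at 9200, as described above


def strip_zone_identifier(domain: str) -> str:
    """Remove the '-<region>-<zone>' suffix from the first label of an
    endpoint domain name (assumed layout: ep-ID-REGION-ZONE.epsrv-ID.REGION...)."""
    labels = domain.split(".")
    if len(labels) < 3:
        raise ValueError(f"unexpected endpoint domain name: {domain!r}")
    region = labels[2]  # for example "cn-hangzhou"
    zone_suffix = re.compile("-" + re.escape(region) + "-[a-z0-9]+$")
    labels[0] = zone_suffix.sub("", labels[0])
    return ".".join(labels)


def whitelist_entry(domain: str, port: int = REINDEX_PORT) -> str:
    """Return the 'domain:port' string to add to the Reindex API whitelist."""
    return f"{strip_zone_identifier(domain)}:{port}"


if __name__ == "__main__":
    full = ("ep-bp1example-cn-hangzhou-i.epsrv-bp1example"
            ".cn-hangzhou.privatelink.aliyuncs.com")
    print(whitelist_entry(full))
```

A domain name that has already had its zone identifier removed passes through unchanged, so the helper can be applied more than once without harm.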
Call the Reindex API
Log on to the Kibana console for ES_1.
In Dev Tools > Console, call the Reindex API to migrate the data.

    POST _reindex
    {
      "source": {
        "remote": {
          "host": "http://ep-bp1****************.epsrv-bp1****************.cn-hangzhou.privatelink.aliyuncs.com:9200",
          "username": "elastic",
          "password": "xxx-xxxx123!"
        },
        "index": "product_info",
        "query": {
          "match": {
            "productName": "Financial Product"
          }
        }
      },
      "dest": {
        "index": "dest"
      }
    }

| Category | Parameter | Description |
| --- | --- | --- |
| source | remote | The remote cluster. In this example, ES_2. |
| source | host | The access address of the ES_2 cluster. It includes the protocol, the domain name, and the port. Protocol: You can find this on the Basic Information page of the cluster. Domain name: The private connection address of the ES_2 cluster. This must be the same domain name configured in the Reindex whitelist. Port: Fixed at 9200. |
| source | username | The default username for the cluster is elastic. |
| source | password | The password for the specified user. The password was set when you created the cluster. If you have forgotten it, you can reset the password. |
| source | index | The source index in the remote cluster. |
| source | query | A query that specifies which documents to migrate. In this example, documents where the productName field contains "Financial Product" are migrated from the ES_2 cluster's index to the ES_1 cluster. |
| dest | index | The destination index in the target cluster for the migrated data. |

Important: For security, use the HTTPS protocol to prevent the password from being transmitted in plain text when connecting to the cluster. To enable the HTTPS protocol, see HTTPS protocol.
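The parameters described above can be assembled into a request body in code as well. The sketch below builds the same structure as the `_reindex` request in this example and checks that the host contains a protocol, a domain name, and a port. The function name `build_reindex_body` and the `ep-example` host are hypothetical, and sending the request is left out.

```python
import json
from urllib.parse import urlsplit


def build_reindex_body(host, username, password, source_index, dest_index, query=None):
    """Assemble a remote reindex request body (hypothetical helper)."""
    parts = urlsplit(host)
    # The remote host is expected in the form protocol://domain:port.
    if parts.scheme not in ("http", "https") or not parts.hostname or parts.port is None:
        raise ValueError("host must look like http(s)://domain:port")
    source = {
        "remote": {"host": host, "username": username, "password": password},
        "index": source_index,
    }
    if query is not None:
        source["query"] = query  # migrate only the matching documents
    return {"source": source, "dest": {"index": dest_index}}


if __name__ == "__main__":
    body = build_reindex_body(
        host="https://ep-example.epsrv-example.cn-hangzhou.privatelink.aliyuncs.com:9200",
        username="elastic",
        password="<your-password>",
        source_index="product_info",
        dest_index="dest",
        query={"match": {"productName": "Financial Product"}},
    )
    print(json.dumps(body, indent=2))
```

Omitting `query` produces a body that copies every document in the source index.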
If successful, the request returns the following result:

    {
      "took": 211,
      "timed_out": false,
      "total": 6,
      "updated": 6,
      "created": 0,
      "deleted": 0,
      "batches": 1,
      "version_conflicts": 0,
      "noops": 0,
      "retries": {
        "bulk": 0,
        "search": 0
      },
      "throttled_millis": 0,
      "requests_per_second": -1,
      "throttled_until_millis": 0,
      "failures": []
    }

Call the _search API to view the migration result.

    GET dest/_search

Expected result:

    {
      "took": 6,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 6,
          "relation": "eq"
        },
        "max_score": 1,
        "hits": [
          {
            "_index": "dest",
            "_id": "n9kyqpcBCRuDZhswJCpH",
            "_score": 1,
            "_source": {
              "productName": "Financial Product D",
              "annual_rate": "3.1200%",
              "describe": "A 90-day regular investment product with a minimum investment of 12,000. Daily profit notifications are sent."
            }
          },
          {
            "_index": "dest",
            "_id": "nNkyqpcBCRuDZhswJCpG",
            "_score": 1,
            "_source": {
              "productName": "Financial Product A",
              "annual_rate": "3.2200%",
              "describe": "A 180-day fixed-term product with a minimum investment of 20,000. Stable returns with optional message notifications."
            }
          },
          {
            "_index": "dest",
            "_id": "ndkyqpcBCRuDZhswJCpG",
            "_score": 1,
            "_source": {
              "productName": "Financial Product B",
              "annual_rate": "3.1100%",
              "describe": "A 90-day regular investment product with a minimum investment of 10,000. Daily profit notifications are sent."
            }
          },
          {
            "_index": "dest",
            "_id": "ntkyqpcBCRuDZhswJCpH",
            "_score": 1,
            "_source": {
              "productName": "Financial Product C",
              "annual_rate": "3.3500%",
              "describe": "A 270-day regular investment product with a minimum investment of 40,000. Daily profit notifications are sent."
            }
          },
          {
            "_index": "dest",
            "_id": "oNkyqpcBCRuDZhswJCpH",
            "_score": 1,
            "_source": {
              "productName": "Financial Product E",
              "annual_rate": "3.0100%",
              "describe": "A recommended 30-day regular investment product with a minimum investment of 8,000. Daily profit notifications are sent."
            }
          },
          {
            "_index": "dest",
            "_id": "odkyqpcBCRuDZhswJCpH",
            "_score": 1,
            "_source": {
              "productName": "Financial Product F",
              "annual_rate": "2.7500%",
              "describe": "A popular 3-day short-term product with no fees and a minimum investment of 500. Profit notifications are sent via SMS."
            }
          }
        ]
      }
    }
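The two responses above can be compared to confirm the migration. The following sketch treats the migration as verified when the reindex response reports no timeout, failures, or version conflicts, and its `total` equals the exact hit count of the search on the destination index. The function name `verify_migration` is hypothetical, and the check assumes the destination index holds only the migrated documents.

```python
def verify_migration(reindex_response: dict, search_response: dict) -> bool:
    """Return True when the reindex call reported no problems and the
    destination index holds the same number of documents (hypothetical helper)."""
    if reindex_response.get("timed_out") or reindex_response.get("failures"):
        return False
    if reindex_response.get("version_conflicts", 0) != 0:
        return False
    total = search_response["hits"]["total"]
    # Only trust the count when it is exact ("eq"), not a lower bound.
    if total.get("relation") != "eq":
        return False
    return reindex_response.get("total", 0) == total["value"]


if __name__ == "__main__":
    reindex_response = {"timed_out": False, "total": 6, "version_conflicts": 0, "failures": []}
    search_response = {"hits": {"total": {"value": 6, "relation": "eq"}}}
    print(verify_migration(reindex_response, search_response))
```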