For read-heavy workloads, Tair (Redis OSS-compatible) allows you to dynamically enable or disable read/write splitting. This feature offers a highly available and performant solution for centralized hot data access and high-concurrency reads. In a read/write splitting instance, a proxy component developed by the Alibaba Cloud Tair team automatically identifies and routes read and write requests and handles failovers. This simplifies integration, as you do not need to modify your application code to manage request routing or failovers.
Read/write splitting in standard architecture
A read/write splitting instance that uses the standard architecture consists of a master node, one or more read replicas, proxy servers, and a high availability system. The following figures show the architectures.
Figure 1. Cloud-native
Figure 2. Classic (discontinued)
Component | Cloud-native | Classic (discontinued) |
Master node | Handles write requests and shares the read workload with read replicas. | |
Read replicas | Handle read requests. | Handle read requests. |
Replica node | Any read replica can serve as a replica node. If the master node fails, the high availability system promotes the read replica with the most complete data to be the new master node. After the switchover, a new read replica is immediately added to the instance. Because a dedicated replica node is not required, cloud-native read/write splitting instances offer the same performance at a lower cost. | A cold standby node for data backup. It does not serve traffic. If the master node fails, requests are failed over to this node. |
Proxy server | When a client connects, the proxy server automatically identifies the request type and distributes traffic to the data nodes by weight: write requests are forwarded to the master node, and read requests are forwarded to the master node and read replicas. All nodes have equal weights, and the weights are not customizable. | |
High availability system |
| |
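The routing behavior described for the proxy server can be illustrated with a small model. This is only a sketch under stated assumptions, not the Tair proxy's implementation: the `WRITE_COMMANDS` set is a hypothetical and incomplete sample, and `pick_node` is an illustrative name.

```python
import random

# Hypothetical, incomplete sample of write commands (illustration only).
WRITE_COMMANDS = {"SET", "DEL", "INCR", "HSET", "LPUSH", "EXPIRE"}


def pick_node(command, master, read_replicas, rng=random):
    """Return the node that would serve `command` in this simplified model."""
    if command.upper() in WRITE_COMMANDS:
        # Writes always go to the master node.
        return master
    # Reads are shared by the master node and all read replicas.
    # Equal weights mean a uniform random choice over every node.
    return rng.choice([master] + list(read_replicas))


if __name__ == "__main__":
    rng = random.Random(0)
    print(pick_node("SET", "master", ["replica-1", "replica-2"], rng))
    print(pick_node("GET", "master", ["replica-1", "replica-2"], rng))
```

Because the weights are equal and not customizable, the model needs no per-node configuration: adding a read replica simply enlarges the pool that reads are drawn from.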
Features
Dynamic and easy to use
You can enable read/write splitting for an instance that uses the standard architecture. The proxy server intelligently identifies and forwards read and write requests from clients. After you enable this feature, you can use any Redis-compatible client to connect to the read/write splitting instance to improve read performance without modifying your application. Instances with read/write splitting enabled are compatible with Redis protocol commands, but some command restrictions apply due to the proxy. For more information, see Command restrictions for read/write splitting instances.
High availability
A proprietary high availability system from Alibaba Cloud automatically monitors the health of all data nodes to ensure instance availability. If a master node becomes unavailable, the system automatically selects a new master node and rebuilds the replication topology. If a read replica fails, the high availability system automatically detects the failure, launches a new node to complete data synchronization, and takes the failed node offline.
The proxy servers monitor the service status of each read replica in real time. If a read replica becomes unavailable, the proxy automatically reduces its service weight. If a read replica fails more than a specified number of consecutive times, the proxy suspends service to the unavailable node. It continues to monitor the node and resumes its service after the node recovers.
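The weight reduction and suspension behavior can be modeled as a small state machine. The following is a sketch, not the proxy's actual logic: the threshold of 3, the base weight of 100, and the halving policy are all assumptions chosen for illustration, because the source does not specify them.

```python
class ReplicaHealth:
    """Toy model of how a proxy might track one read replica."""

    def __init__(self, failure_threshold=3, base_weight=100):
        # Both defaults are assumed values, not documented ones.
        self.failure_threshold = failure_threshold
        self.base_weight = base_weight
        self.consecutive_failures = 0

    def record_probe(self, ok):
        """Record the result of one health probe."""
        if ok:
            # A successful probe means the node has recovered.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    @property
    def suspended(self):
        # Suspended once failures exceed the threshold.
        return self.consecutive_failures > self.failure_threshold

    @property
    def weight(self):
        if self.suspended:
            return 0
        # Assumed policy: halve the weight for each consecutive failure.
        return self.base_weight // (2 ** self.consecutive_failures)
```

In this model a replica keeps receiving a shrinking share of reads while it is flaky, drops to zero once it crosses the threshold, and returns to full weight after the first successful probe.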
High performance
You can scale out read replicas to linearly increase the overall performance of a read/write splitting instance. Source-code-level optimizations to the Redis replication process maximize system stability during linear replication and fully utilize the physical resources of each read replica.
Use cases
This feature is ideal for scenarios with high read queries per second (QPS). If your application is read-heavy, an instance that uses the standard architecture may not meet your QPS requirements. In this case, you can deploy multiple read replicas to overcome the performance bottleneck of a single node. After you enable read/write splitting, the read QPS of an instance can increase by up to nine times.
Due to the asynchronous replication mechanism of Redis, data replication latency can occur during periods of high write volumes. If you use this architecture, your application must be able to tolerate a certain degree of data staleness.
Read/write splitting in cluster architecture
In a cluster architecture, you can enable read/write splitting only for cloud-native instances that run in proxy mode. The following figure shows an example architecture.
Component descriptions
Component | Description |
Proxy server | After a client connects to a proxy server, the proxy automatically identifies client requests and forwards them to the appropriate data shards and their corresponding read/write nodes. For example, write requests are forwarded to the master node, and read requests are load-balanced across the master node and read replicas. |
Data shard | Each data shard consists of one master node and up to four read replicas. |
High availability service |
|
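To make "forwards them to the appropriate data shards" concrete, the sketch below maps a key to a shard. It assumes that the instance follows the Redis OSS cluster key-slot scheme (CRC16 of the key, modulo 16384, with hash tag support) and that slots are divided into contiguous, near-equal ranges per shard. Neither assumption is confirmed by this document, and `shard_for_slot` is a hypothetical helper.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in Redis OSS cluster


def key_slot(key: bytes) -> int:
    """Hash slot of a key under the Redis OSS cluster scheme."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty hash tag such as {user1} is hashed alone.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is CRC16-CCITT (XMODEM).
    return binascii.crc_hqx(key, 0) % SLOT_COUNT


def shard_for_slot(slot: int, shard_count: int) -> int:
    """Assumed layout: contiguous, near-equal slot ranges per shard."""
    return slot * shard_count // SLOT_COUNT


if __name__ == "__main__":
    slot = key_slot(b"{user1}:profile")
    print(slot, shard_for_slot(slot, shard_count=4))
```

After the shard is chosen, the read or write decision within that shard follows the same rule as in the standard architecture: writes go to the shard's master node, and reads are balanced across its master node and read replicas.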
If the instance is deployed in a single availability zone, all nodes are located in the primary zone, and the instance provides only an endpoint for the primary zone.
If the instance is deployed in a dual-zone configuration, separate endpoints are provided for the primary and secondary zones. Both endpoints support read and write operations. Read requests from the primary zone are routed to the master node or read replicas within that zone. Read requests from the secondary zone are routed only to the read replicas in that zone to ensure proximity-based access. All write requests are routed to the master node in the primary zone. If all read replicas in the secondary zone become unavailable, the system routes read requests from that zone to the master node to ensure business continuity.
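The dual-zone routing rules above can be summarized in a few lines. This is a simplified model for illustration only: the `Node` type, its field values, and the function names are hypothetical, and the real proxy also balances load among the candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    zone: str            # "primary" or "secondary"
    role: str            # "master" or "replica"
    healthy: bool = True


def write_target(nodes):
    """All writes go to the master node in the primary zone."""
    return next(n for n in nodes if n.role == "master")


def read_candidates(nodes, endpoint_zone):
    """Nodes that may serve a read arriving at the given zone's endpoint."""
    master = write_target(nodes)
    local_replicas = [
        n for n in nodes
        if n.role == "replica" and n.zone == endpoint_zone and n.healthy
    ]
    if endpoint_zone == master.zone:
        # Primary zone: the master node and local read replicas.
        return [master] + local_replicas
    # Secondary zone: local read replicas only; fall back to the
    # master node when none of them is available.
    return local_replicas or [master]
```

For example, with one master node and one read replica in the primary zone and one read replica in the secondary zone, a read that arrives at the secondary endpoint is served by the secondary replica, and by the master node only if that replica is unhealthy.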
Recommendations and usage notes
If a read replica fails, its requests are forwarded to the remaining nodes. If all read replicas become unavailable, all read requests are forwarded to the master node, which increases the load on the master node and can lengthen its response time. Therefore, we recommend that you use multiple read replicas for read-heavy workloads.
If a read replica fails, the high availability system suspends service to the failed node and launches a new read replica. This process involves resource allocation, instance creation, data synchronization, and service loading. The time required depends on the workload and data volume. Tair (Redis OSS-compatible) does not guarantee a recovery time objective for read replicas.
A dual-zone read/write splitting deployment requires the primary zone to have at least one master node and one read replica. Before you enable read/write splitting for a dual-zone standard architecture instance that has one master node in the primary zone and one replica node in the secondary zone, you must add a replica node to the primary zone to ensure it contains two nodes. Then, you can enable read/write splitting.
Some scenarios, such as a high-availability failover of the master node, trigger a full data synchronization on a read replica. During a full synchronization, the read replica is unavailable and returns the -LOADING Redis is loading the dataset in memory\r\n message.
Some read commands have special forwarding rules in the read/write splitting architecture. For example, the SCAN command is forwarded to the master node for execution, while the proxy distributes the HSCAN, SSCAN, and ZSCAN commands evenly across the master node and read replicas based on a slot modulo calculation. For the complete set of forwarding rules, see Proxy routing rules.
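If a client does receive the -LOADING error during a full synchronization, one common way to handle it is to retry the read with a short backoff. The helper below is a library-agnostic sketch: how the error surfaces depends on your client library, so the default `looks_like_loading` check (a plain string match) and the retry parameters are assumptions to adapt.

```python
import time


def looks_like_loading(exc):
    """Assumed check: the error text starts with LOADING or -LOADING."""
    return str(exc).lstrip("-").startswith("LOADING")


def call_with_loading_retry(operation, attempts=5, base_delay=0.05,
                            is_loading=looks_like_loading, sleep=time.sleep):
    """Run `operation`, retrying with exponential backoff on LOADING errors."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:
            # Re-raise errors that are not LOADING, or the final failure.
            if not is_loading(exc) or attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

A call such as `call_with_loading_retry(lambda: client.get("key"))` would wrap a single read, where `client` stands for whatever Redis-compatible client object your application uses. Pass a custom `is_loading` predicate if your library raises a dedicated exception type for this condition.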
Prerequisites
Before you begin, make sure that:
The instance is deployed in cloud-native mode
The instance is a Redis Open-Source Edition or Tair (Enterprise Edition) DRAM-optimized or persistent memory-optimized instance
The instance has at least 1 GB of memory
The instance is a high availability instance
Procedure
If you have not created an instance, you can enable read/write splitting when you create an instance.
If you have an existing cloud-native instance, you can directly enable read/write splitting.
FAQ
Q: Does enabling read/write splitting for a standard architecture instance increase its overall bandwidth?
A: Yes. After you enable read/write splitting, the theoretical total bandwidth of the instance is the bandwidth of the instance specification multiplied by the total number of nodes (for example, 96 MB/s × 3 nodes = 288 MB/s). The added proxy nodes forward most read requests to the read replicas, which reduces the bandwidth pressure on the master node. However, the actual bandwidth depends on factors such as request patterns and client behavior, so measure it with stress tests.
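The arithmetic in this answer can be written as a one-line estimate. The function name and parameters are illustrative, and the result is only the theoretical upper bound described above.

```python
def theoretical_total_bandwidth(per_node_mb_s, read_replicas, master_nodes=1):
    """Theoretical total bandwidth in MB/s: per-node bandwidth x node count."""
    return per_node_mb_s * (master_nodes + read_replicas)


if __name__ == "__main__":
    # One master node plus two read replicas at 96 MB/s each.
    print(theoretical_total_bandwidth(96, read_replicas=2))  # 288
```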
Q: After I enable read/write splitting for a standard architecture instance, can I change its architecture to the cluster architecture?
A: Yes. You must first disable read/write splitting and then change the instance architecture.
Q: How do I check whether read/write splitting is enabled?
A: You can go to the Node Management page of the instance to check whether the read/write splitting option is enabled.
Q: Why are read requests not routed to my read replicas?
A: In a dual-zone read/write splitting architecture, the primary and secondary zones have separate endpoints. Read requests are routed only to the master node or read replicas within the same zone. If you use only the endpoint for the primary zone, no read requests are routed to the read replicas in the secondary zone. To enable proximity-based access and load balancing, you must explicitly differentiate the endpoints for the primary and secondary zones in your application code and route requests for the secondary zone to its specific endpoint.
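One way to differentiate the endpoints in application code is to select the endpoint from the zone in which the application instance runs. The sketch below uses placeholder host names and zone IDs; replace them with the endpoints and zones shown for your instance.

```python
# Placeholder endpoints: substitute the real primary-zone and
# secondary-zone endpoints of your instance.
ENDPOINTS = {
    "primary": ("primary-endpoint.example.invalid", 6379),
    "secondary": ("secondary-endpoint.example.invalid", 6379),
}


def endpoint_for(app_zone, instance_primary_zone):
    """Pick the endpoint that is local to the application's zone."""
    if app_zone == instance_primary_zone:
        return ENDPOINTS["primary"]
    return ENDPOINTS["secondary"]


if __name__ == "__main__":
    # Hypothetical zone IDs, for illustration only.
    host, port = endpoint_for("zone-b", instance_primary_zone="zone-a")
    print(host, port)
```

An application deployed in the secondary zone then connects to the secondary endpoint, so its reads reach the read replicas in that zone, while its writes are still routed to the master node in the primary zone.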