The 2019 Double 11 sales promotion marked a key milestone in Ant Group's cloud-native architecture transformation, with Service Mesh as a vital component. A primary goal was to decouple the business layer from the infrastructure layer, and Service Mesh provided a practical solution. This topic describes the core aspects of Ant Group's Service Mesh implementation.
This topic covers the following aspects:
Building fundamental capabilities
SOFAMosn capabilities
SOFAMosn includes the following capabilities:
The basic capabilities of a network proxy.
Cloud-native capabilities such as Extended Discovery Service (XDS).
SOFAMosn main modules

Business support
As a high-performance and secure network proxy, SOFAMosn supports various business scenarios, such as RPC, MSG, and GATEWAY.

I/O model
SOFAMosn supports two I/O models:
Classic Golang model: In Ant Group's implementation scenarios, the number of connections is not a bottleneck, typically ranging from thousands to tens of thousands. Therefore, Ant Group chose the classic Golang model: goroutine-per-connection.

Model defects: The number of coroutines is proportional to the number of connections. In scenarios with many connections, the large number of coroutines leads to the following overhead:
Stack memory overhead
Read buffer overhead
Runtime scheduling overhead
RawEpoll model: This model uses the Reactor pattern, which is an I/O multiplexing and non-blocking I/O model. The RawEpoll model is better suited for scenarios where the access layer and gateway have many persistent connections.

Steps:
Establish a connection:
Register a oneshot readable event listener with Epoll. At this point, no coroutine can call conn.read, to avoid conflicts with the runtime Netpoll.
When a readable event arrives, a coroutine is selected from the goroutine pool to process the read event. Because oneshot mode is used, subsequent readable events for this file descriptor (fd) are not triggered.
During request processing, coroutine scheduling is consistent with the classic Netpoll model.
After the request is processed, the coroutine is returned to the coroutine pool, and the fd is re-registered with RawEpoll.
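The RawEpoll steps above can be sketched with the standard `syscall` package on Linux. This is only an illustration under stated assumptions, not SOFAMosn's code: the `poller` type, `arm`, `wait`, and `demo` are hypothetical names, a Unix socket pair stands in for a client connection, and the pool has a single worker.

```go
// Linux only: uses the epoll system calls directly.
package main

import (
	"fmt"
	"syscall"
)

type poller struct{ epfd int }

func newPoller() (*poller, error) {
	epfd, err := syscall.EpollCreate1(0)
	if err != nil {
		return nil, err
	}
	return &poller{epfd: epfd}, nil
}

// arm registers fd for exactly one readable notification (EPOLLONESHOT).
// op is EPOLL_CTL_ADD for a new connection and EPOLL_CTL_MOD to re-arm it.
func (p *poller) arm(fd, op int) error {
	ev := syscall.EpollEvent{Events: syscall.EPOLLIN | syscall.EPOLLONESHOT, Fd: int32(fd)}
	return syscall.EpollCtl(p.epfd, op, fd, &ev)
}

// wait returns the fds reported readable within msec milliseconds.
func (p *poller) wait(msec int) ([]int, error) {
	events := make([]syscall.EpollEvent, 16)
	for {
		n, err := syscall.EpollWait(p.epfd, events, msec)
		if err == syscall.EINTR {
			continue // interrupted by a signal; retry
		}
		if err != nil {
			return nil, err
		}
		fds := make([]int, 0, n)
		for _, ev := range events[:n] {
			fds = append(fds, int(ev.Fd))
		}
		return fds, nil
	}
}

// demo reports how many fds were ready (0) after the first request,
// (1) while the fd is still disarmed, and (2) after a worker re-armed it.
func demo() ([3]int, error) {
	var seen [3]int
	p, err := newPoller()
	if err != nil {
		return seen, err
	}
	defer syscall.Close(p.epfd)
	pair, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return seen, err
	}
	conn, peer := pair[0], pair[1]
	defer syscall.Close(conn)
	defer syscall.Close(peer)

	tasks, done := make(chan int), make(chan string)
	go func() { // hypothetical pool with one worker
		buf := make([]byte, 4096)
		for fd := range tasks {
			n, _ := syscall.Read(fd, buf) // process the read event
			if n < 0 {
				n = 0
			}
			p.arm(fd, syscall.EPOLL_CTL_MOD) // re-register before going idle
			done <- string(buf[:n])
		}
	}()
	defer close(tasks)

	if err := p.arm(conn, syscall.EPOLL_CTL_ADD); err != nil {
		return seen, err
	}
	syscall.Write(peer, []byte("req-1"))
	for i, timeout := range []int{1000, 50} { // second wait: fd is disarmed
		ready, err := p.wait(timeout)
		if err != nil {
			return seen, err
		}
		seen[i] = len(ready)
	}
	tasks <- conn // hand the readable fd to a pooled worker
	<-done
	syscall.Write(peer, []byte("req-2"))
	ready, err := p.wait(1000)
	seen[2] = len(ready)
	return seen, err
}

func main() {
	fmt.Println(demo())
}
```

The second `wait` call shows the oneshot behavior: although unread data is still pending, the fd is not reported again until a worker re-arms it with `EPOLL_CTL_MOD`.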
Coroutine model

One TCP connection corresponds to one Read coroutine, which performs packet reception and protocol parsing.
One request corresponds to one Worker coroutine, which executes business processing, Proxy, and Write logic.
In the conventional model, a TCP connection has two coroutines: Read and Write. The Ant Group team eliminated the separate Write coroutine and assigned its tasks to the worker pool's coroutines. This change reduces scheduling latency and memory usage.
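A minimal sketch of this coroutine model follows, assuming a newline-delimited toy protocol over `net.Pipe`. The names `request`, `startWorkers`, `readLoop`, and `roundTrip` are hypothetical, and upper-casing the body stands in for business processing and proxying. The point it illustrates is that pooled workers write responses themselves, so no per-connection Write goroutine exists.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"sort"
	"strings"
	"sync"
)

// request carries one parsed request plus what a worker needs to reply.
type request struct {
	conn net.Conn
	mu   *sync.Mutex // serializes writes from different workers
	body string
}

// startWorkers launches n pooled workers. Each worker processes a request
// and writes the response directly to the connection.
func startWorkers(n int, reqs <-chan request) {
	for i := 0; i < n; i++ {
		go func() {
			for r := range reqs {
				resp := strings.ToUpper(r.body) + "\n" // stand-in for business logic
				r.mu.Lock()
				r.conn.Write([]byte(resp))
				r.mu.Unlock()
			}
		}()
	}
}

// readLoop is the single Read goroutine of a connection: it receives
// bytes, splits them into requests, and hands each one to the pool.
func readLoop(conn net.Conn, reqs chan<- request) {
	var mu sync.Mutex
	sc := bufio.NewScanner(conn)
	for sc.Scan() {
		reqs <- request{conn: conn, mu: &mu, body: sc.Text()}
	}
}

// roundTrip sends the bodies over an in-memory connection and returns the
// sorted responses (workers may finish in any order).
func roundTrip(bodies []string) []string {
	client, server := net.Pipe()
	reqs := make(chan request)
	defer close(reqs)
	startWorkers(4, reqs)
	go readLoop(server, reqs)

	go func() {
		for _, b := range bodies {
			client.Write([]byte(b + "\n"))
		}
	}()
	var out []string
	sc := bufio.NewScanner(client)
	for len(out) < len(bodies) && sc.Scan() {
		out = append(out, sc.Text())
	}
	client.Close()
	server.Close()
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(roundTrip([]string{"pay", "auth", "refund"}))
}
```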
Capability extension
Capability extension includes the following aspects:
Protocol extension: SOFAMosn provides a protocol plugin mechanism using a unified encoding and decoding engine and core encoder and decoder interfaces. The following protocols are supported:
SOFARPC
HTTP1.x/HTTP2.0
Dubbo
NetworkFilter extension: SOFAMosn implements a Network Filter extension mechanism by providing a Network Filter registration mechanism and unified packet read/write filter interfaces. This extension supports the following features:
TCP proxy
Fault injection
StreamFilter extension: SOFAMosn implements a Stream Filter extension mechanism by providing a stream filter registration mechanism and unified stream send/receive filter interfaces. This extension supports the following features:
Traffic mirroring
RBAC authentication
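The registration-plus-unified-interface pattern described for the extension mechanisms can be sketched as follows. This is not SOFAMosn's actual API: `StreamFilter`, `RegisterStreamFilter`, `RunChain`, and the two toy filters are hypothetical names chosen for illustration, and the filters operate on a plain header map.

```go
package main

import "fmt"

// Status tells the chain whether to keep going after a filter runs.
type Status int

const (
	Continue Status = iota
	Stop
)

// StreamFilter is a hypothetical unified receive-side filter interface.
type StreamFilter interface {
	OnReceive(headers map[string]string) Status
}

// filterFactories is the registry that extensions add themselves to.
var filterFactories = map[string]func() StreamFilter{}

func RegisterStreamFilter(name string, factory func() StreamFilter) {
	filterFactories[name] = factory
}

// rbacFilter is a toy role check: only listed roles may pass.
type rbacFilter struct{ allowed map[string]bool }

func (f rbacFilter) OnReceive(h map[string]string) Status {
	if f.allowed[h["role"]] {
		return Continue
	}
	return Stop
}

// mirrored collects copies of requests, standing in for a mirror cluster.
var mirrored []map[string]string

type mirrorFilter struct{}

func (mirrorFilter) OnReceive(h map[string]string) Status {
	cp := make(map[string]string, len(h))
	for k, v := range h {
		cp[k] = v
	}
	mirrored = append(mirrored, cp)
	return Continue
}

func init() {
	RegisterStreamFilter("rbac", func() StreamFilter {
		return rbacFilter{allowed: map[string]bool{"admin": true}}
	})
	RegisterStreamFilter("mirror", func() StreamFilter { return mirrorFilter{} })
}

// RunChain builds the configured filters by name and runs them in order,
// stopping at the first filter that rejects the request.
func RunChain(names []string, headers map[string]string) (Status, error) {
	for _, name := range names {
		factory, ok := filterFactories[name]
		if !ok {
			return Stop, fmt.Errorf("stream filter %q is not registered", name)
		}
		if factory().OnReceive(headers) == Stop {
			return Stop, nil
		}
	}
	return Continue, nil
}

func main() {
	chain := []string{"rbac", "mirror"}
	fmt.Println(RunChain(chain, map[string]string{"role": "admin"}))
	fmt.Println(RunChain(chain, map[string]string{"role": "guest"}))
}
```

Because filters are looked up by name, a new capability can be added by registering another factory without touching the code that runs the chain.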
TLS secure link
For a financial technology company, fund security is paramount, and link encryption is the most fundamental capability for ensuring it. The Ant Group team conducted extensive research and testing on TLS secure links. The test results show the following:
Native Go TLS is heavily optimized with assembly and achieves 80% of the performance of Nginx (OpenSSL).
The Boring version of Go uses CGO to call BoringSSL. Because of CGO's performance issues, this version offers no performance advantage.
Therefore, the Ant Group team ultimately chose native Go TLS. The team believes the Go Runtime team will introduce more optimizations, and the team also has its own optimization plans.

Go is not highly optimized for RSA. Go-boring (CGO) is twice as fast as Go.
p256 has assembly optimizations in Go, and its ECDSA performance is better than Go-boring.
For AES-GCM symmetric encryption, Go is 20 times faster than Go-boring.
Corresponding assembly optimizations also exist for HASH algorithms such as SHA and MD.
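Given these findings, one way to stay on the primitives that perform well in native Go TLS is to use a P-256 ECDSA certificate and restrict TLS 1.2 to AES-GCM suites. The sketch below is an assumption about how such a configuration could look, not SOFAMosn's actual settings. The host name `sidecar.example` and the helper names are hypothetical, and `MaxVersion` is pinned to TLS 1.2 only so that the `CipherSuites` list applies (Go does not allow configuring TLS 1.3 suites).

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// selfSignedP256 creates a throwaway P-256 ECDSA certificate for host and
// a pool that trusts it.
func selfSignedP256(host string) (tls.Certificate, *x509.CertPool, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: host},
		DNSNames:              []string{host},
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().Add(time.Hour),
		KeyUsage:              x509.KeyUsageDigitalSignature | x509.KeyUsageCertSign,
		ExtKeyUsage:           []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
		IsCA:                  true,
		BasicConstraintsValid: true,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	leaf, err := x509.ParseCertificate(der)
	if err != nil {
		return tls.Certificate{}, nil, err
	}
	pool := x509.NewCertPool()
	pool.AddCert(leaf)
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, pool, nil
}

// negotiate runs a handshake over an in-memory pipe and returns the name
// of the cipher suite that was selected.
func negotiate() (string, error) {
	const host = "sidecar.example" // hypothetical name
	cert, roots, err := selfSignedP256(host)
	if err != nil {
		return "", err
	}
	serverCfg := &tls.Config{
		Certificates: []tls.Certificate{cert},
		MinVersion:   tls.VersionTLS12,
		MaxVersion:   tls.VersionTLS12,
		CipherSuites: []uint16{ // ECDSA signatures with AES-GCM only
			tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
		},
	}
	clientCfg := &tls.Config{RootCAs: roots, ServerName: host}

	c, s := net.Pipe()
	defer c.Close()
	defer s.Close()
	server, client := tls.Server(s, serverCfg), tls.Client(c, clientCfg)
	errc := make(chan error, 1)
	go func() { errc <- server.Handshake() }()
	if err := client.Handshake(); err != nil {
		return "", err
	}
	if err := <-errc; err != nil {
		return "", err
	}
	return tls.CipherSuiteName(client.ConnectionState().CipherSuite), nil
}

func main() {
	fmt.Println(negotiate())
}
```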
To meet the security compliance requirements of financial scenarios, the Ant Group team also developed support for Chinese cryptographic algorithms, which are not available in Go Runtime. Compared to the international standard AES-GCM, there is currently a performance gap of about 50%. The Ant Group team is planning subsequent optimizations.

Smooth upgrade capability
To make SOFAMosn releases transparent to applications, the Ant Group team developed a smooth upgrade solution. This solution is similar to Nginx's binary hot upgrade capability. The main difference is that connections to the old SOFAMosn process are not dropped. Instead, they are migrated to the new process, including the underlying socket file descriptors (FDs) and the associated application data. This ensures that business operations are not affected and the upgrade process is transparent to the application during the entire binary release. In addition to protocols such as SOFARPC, Dubbo, and messaging, this solution also supports the migration of TLS-encrypted links.
The smooth upgrade capability includes the following aspects:
Container upgrade: The main process is as follows.

First, inject a new SOFAMosn instance.
Check for an old SOFAMosn instance using a Unix Socket on a shared volume.
If an old SOFAMosn instance exists, its connections are migrated, and then the old instance exits.
SOFAMosn connection migration: The core of connection migration is the migration of kernel Sockets and application data. Connections are maintained without interruption and are transparent to the user.

SOFAMosn metric migration: The Ant Group team uses shared memory to share metric data between the old and new processes. This ensures that the metric data remains accurate during the migration.
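The kernel-level part of connection migration, handing an open socket to another process over a Unix socket, is commonly done with `SCM_RIGHTS` ancillary messages. The sketch below assumes that mechanism and is not taken from SOFAMosn: `sendFD`, `recvFD`, `unixPair`, and `migrate` are hypothetical names, an `os.Pipe` stands in for a client connection, the `stream-id=7` payload stands in for application data, and both "processes" live in one program for demonstration.

```go
// Unix only: relies on SCM_RIGHTS ancillary messages.
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"syscall"
)

// sendFD passes an open descriptor plus a small piece of application
// state to the peer of a Unix domain socket.
func sendFD(via *net.UnixConn, f *os.File, state []byte) error {
	rights := syscall.UnixRights(int(f.Fd()))
	_, _, err := via.WriteMsgUnix(state, rights, nil)
	return err
}

// recvFD receives one descriptor and the state sent along with it.
func recvFD(via *net.UnixConn) (*os.File, []byte, error) {
	buf := make([]byte, 4096)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one descriptor
	n, oobn, _, _, err := via.ReadMsgUnix(buf, oob)
	if err != nil {
		return nil, nil, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return nil, nil, err
	}
	if len(msgs) != 1 {
		return nil, nil, errors.New("expected one control message")
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return nil, nil, err
	}
	if len(fds) != 1 {
		return nil, nil, errors.New("expected one descriptor")
	}
	return os.NewFile(uintptr(fds[0]), "migrated"), buf[:n], nil
}

// unixPair returns two connected Unix sockets, standing in for the socket
// on the shared volume that links the old and new instances.
func unixPair() (*net.UnixConn, *net.UnixConn, error) {
	fds, err := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	if err != nil {
		return nil, nil, err
	}
	var conns [2]*net.UnixConn
	for i, fd := range fds {
		f := os.NewFile(uintptr(fd), "pair")
		c, err := net.FileConn(f) // FileConn duplicates the descriptor
		f.Close()
		if err != nil {
			return nil, nil, err
		}
		conns[i] = c.(*net.UnixConn)
	}
	return conns[0], conns[1], nil
}

// migrate moves a "connection" with unread bytes from the old side to the
// new side and returns the state and the bytes read after the move.
func migrate() (string, string, error) {
	oldSide, newSide, err := unixPair()
	if err != nil {
		return "", "", err
	}
	defer oldSide.Close()
	defer newSide.Close()

	r, w, err := os.Pipe() // stand-in for a client connection
	if err != nil {
		return "", "", err
	}
	defer w.Close()
	w.Write([]byte("in-flight bytes"))

	if err := sendFD(oldSide, r, []byte("stream-id=7")); err != nil {
		return "", "", err
	}
	r.Close() // the old owner drops its copy; the kernel object lives on

	f, state, err := recvFD(newSide)
	if err != nil {
		return "", "", err
	}
	defer f.Close()
	buf := make([]byte, 64)
	n, err := f.Read(buf)
	return string(state), string(buf[:n]), err
}

func main() {
	fmt.Println(migrate())
}
```

The bytes written before the transfer are still readable afterwards, which is what keeps the peer unaware that the owner of the connection has changed.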

Memory reuse mechanism
The main features of the memory reuse mechanism are as follows:
It is based on sync.Pool.
Fine-grained Slab allocation is used for Slice reuse to improve the reuse rate.
It reuses commonly used structs.
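As an illustration of slab-style slice reuse on top of sync.Pool, the sketch below keeps one pool per power-of-two size class, so a returned buffer is only handed out for requests of a similar size. The class boundaries (64 B to 64 KiB) and the names `SlabPool`, `classFor`, and `classCap` are assumptions made for this example, not SOFAMosn's actual values.

```go
package main

import (
	"fmt"
	"sync"
)

const (
	minShift = 6  // smallest class: 64 B
	maxShift = 16 // largest class: 64 KiB
)

// SlabPool keeps one sync.Pool per size class. The zero value is usable.
type SlabPool struct {
	classes [maxShift - minShift + 1]sync.Pool
}

// classCap returns the slice capacity used by class c.
func classCap(c int) int { return 1 << (minShift + c) }

// classFor returns the smallest class that fits size, or -1 if none does.
func classFor(size int) int {
	for c := 0; c <= maxShift-minShift; c++ {
		if size <= classCap(c) {
			return c
		}
	}
	return -1
}

// Get returns a slice of length size, reusing a pooled buffer if possible.
func (p *SlabPool) Get(size int) []byte {
	c := classFor(size)
	if c < 0 {
		return make([]byte, size) // too large to pool
	}
	if v := p.classes[c].Get(); v != nil {
		return (*v.(*[]byte))[:size]
	}
	return make([]byte, size, classCap(c))
}

// Put returns a buffer to its class. Buffers whose capacity is not an
// exact class size are dropped and left to the garbage collector.
func (p *SlabPool) Put(b []byte) {
	c := classFor(cap(b))
	if c < 0 || cap(b) != classCap(c) {
		return
	}
	b = b[:0]
	p.classes[c].Put(&b) // store a pointer to avoid an extra allocation
}

func main() {
	var pool SlabPool
	buf := pool.Get(100)
	fmt.Println(len(buf), cap(buf))
	pool.Put(buf)
}
```

Finer classes waste less memory per buffer but spread reuse over more pools; that trade-off is what the chosen granularity controls.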

Current status:
The online reuse rate can reach over 90%.
sync.Pool still has some issues. However, with continuous optimization of sync.Pool in the Runtime, such as using lock-free structures to reduce lock contention and adding a victim cache mechanism in Go 1.13, its performance is expected to improve.
XDS (UDPA)
SOFAMosn supports the cloud-native Universal Data Plane API (UDPA) and fully dynamic configuration updates through the Extended Discovery Service (XDS).

Prerequisites
Performance stress testing and optimization
During the preparation phase before the online launch, the Ant Group team conducted extensive stress testing and optimization for the core application `cashiercloudtb` in a grayscale environment. This process laid a solid foundation for the subsequent implementation.
When moving from the offline environment to the grayscale environment, the Ant Group team encountered many large-scale scenarios that were not present offline, such as:
A single instance with tens of thousands of backend nodes and thousands of routing rules. This scenario not only consumes a large amount of memory but also significantly affects the efficiency of routing matching.
Massive, high-frequency service publishing and registration. This poses a great challenge to performance and stability.
The entire stress testing and optimization process took five months. Initially, the overall CPU usage increased by 20% and the response time (RT) increased by 0.8 ms per hop. After optimization, the overall CPU usage increased by only 6%, the RT per hop increased by 0.25 ms, and the peak memory usage was reduced to one-tenth of the original.
| | Overall CPU increase | RT per hop | Peak memory usage |
| Before optimization | 20% | 0.8 ms | 2365 MB |
| After optimization | 6% | 0.25 ms | 253 MB |
Some optimization measures:
During the 6.18 sales promotion, the Ant Group team deployed SOFAMosn for some core link applications, which increased CPU overhead by at most 1.7%. For some applications that were migrated from Java to Go, the CPU overhead was even reduced by approximately 8%. The average latency increase per hop was 0.17 ms. For two co-located systems, the end-to-end latency increased by 5 to 6 ms, representing an overhead of about 7%.
When SOFAMosn was launched in a single data center, its overall performance under end-to-end stress testing improved. For example, during transaction payments, the RT with SOFAMosn was 7.5% lower than without it.
The numerous core optimizations in SOFAMosn, along with business logic optimizations such as sinking the Route Cache, also provided architectural benefits.
Go version selection
Upgrading versions requires a series of tests because new versions are not always the most suitable for a specific scenario. This project initially used Go 1.9.2. After a year of iteration, the Ant Group team began to investigate the latest version at the time, Go 1.12.6. Tests verified several beneficial optimizations in the new version, and the team also modified the default memory reclamation policy to better meet project requirements.
GC optimization to reduce long-tail requests: The new version's self-preemption mechanism breaks up long-running GC marking processes. This results in smoother GC performance and reduces the impact on business latency.
[Figures: GC behavior with Go 1.9.2 and with Go 1.12.6]
Memory reclamation policy: Go 1.12 changed the memory reclamation policy from the default MADV_DONTNEED to MADV_FREE. Although this was intended as a performance optimization, tests showed no significant performance improvement in practice. Instead, it consumed more memory, which interfered with monitoring and problem diagnosis. The Ant Group team reverted to the previous policy using GODEBUG=madvdontneed=1. There are related discussions in the community issues, and this default may change in future versions.
When using the default MADV_FREE policy in Go 1.12, HeapInuse was 43 MB, but HeapIdle was 600 MB and was never released.
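The HeapInuse and HeapIdle figures above are fields of `runtime.MemStats`, so the effect of the policy can be observed from inside a process. The following sketch reads them; `heapReport` and `readHeap` are hypothetical names, and the 64 MB allocation in `main` is only there to create idle heap. Running the same binary with and without `GODEBUG=madvdontneed=1` set in the environment allows the runtime's view to be compared with the resident memory that the operating system reports.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"runtime/debug"
)

// heapReport holds the heap counters discussed above, in MB.
type heapReport struct {
	InuseMB    float64 // spans that currently hold objects
	IdleMB     float64 // spans with no objects (may or may not be returned)
	ReleasedMB float64 // idle memory already returned to the OS
}

// readHeap takes a snapshot of the runtime's heap statistics.
func readHeap() heapReport {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	const mb = 1 << 20
	return heapReport{
		InuseMB:    float64(m.HeapInuse) / mb,
		IdleMB:     float64(m.HeapIdle) / mb,
		ReleasedMB: float64(m.HeapReleased) / mb,
	}
}

func main() {
	fmt.Println("GODEBUG =", os.Getenv("GODEBUG"))

	// Allocate and drop 64 MB so that the heap has idle spans.
	blocks := make([][]byte, 64)
	for i := range blocks {
		blocks[i] = make([]byte, 1<<20)
	}
	blocks = nil
	debug.FreeOSMemory() // force a GC and ask the runtime to return memory

	r := readHeap()
	fmt.Printf("inuse=%.1f MB idle=%.1f MB released=%.1f MB\n",
		r.InuseMB, r.IdleMB, r.ReleasedMB)
}
```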

Go Runtime bug fix
During the initial grayscale validation, SOFAMosn experienced a severe memory leak online, losing 1 GB of memory per day. The final investigation revealed a bug in the Go Runtime's Writev implementation. This bug caused the memory address of a Slice to be referenced by the underlying layer, preventing it from being released by the garbage collector (GC).
The Ant Group team submitted a bug fix to the official Go repository. The fix has been merged into Go 1.13. For more information, see internal/poll: avoid memory leak in Writev.
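For context, application code usually reaches the runtime's writev path through `net.Buffers`: on supported connection types such as `*net.TCPConn`, `WriteTo` sends all slices in one batched write. The sketch below only shows that usage and does not reproduce the leak. `writeFrames` and `joinFrames` are hypothetical helpers, and a `bytes.Buffer` is used as the destination, in which case the standard library falls back to sequential writes.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
)

// writeFrames writes several byte slices as one logical message.
// With a *net.TCPConn as w, the standard library can batch them into a
// single writev call instead of copying them into one buffer first.
// Note that WriteTo consumes the slice it is given.
func writeFrames(w io.Writer, parts ...[]byte) (int64, error) {
	bufs := net.Buffers(parts)
	return bufs.WriteTo(w)
}

// joinFrames runs writeFrames against an in-memory sink so the result
// can be inspected without a network connection.
func joinFrames(parts ...string) (string, int64, error) {
	var sink bytes.Buffer
	raw := make([][]byte, len(parts))
	for i, p := range parts {
		raw[i] = []byte(p)
	}
	n, err := writeFrames(&sink, raw...)
	return sink.String(), n, err
}

func main() {
	fmt.Println(joinFrames("header|", "body"))
}
```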




