In workloads that involve large transactions, commits can take an excessive amount of time to write to the binlog. This can make the instance non-writable for extended periods or even cause it to hang. ApsaraDB RDS for MySQL introduces the binlog cache free flush feature to optimize the commit speed for large transactions. This feature resolves performance fluctuations and prolonged write unavailability, improving overall instance stability.
Features
Because binlog writes are serial, a large transaction is slow to write its binlog events and also blocks other transactions from committing. This feature optimizes the binlog write mechanism for large transaction commits, making commit speed independent of the number of binlog events. This feature resolves the following common issues:
Extremely large transactions block the instance, preventing other transactions from committing.
Large transactions cause high I/O consumption, which can exhaust I/O resources and slow down other operations, such as queries.
Large transactions slow down the execution of small transactions, leading to a surge in active connections and performance fluctuations. In severe cases, this can trigger a cascading failure and make the instance unavailable.
Frequent execution of large transactions causes sustained performance instability for your workload.
A large transaction refers to a transaction that generates a large volume of binlog events. For more information, see Problem analysis.
For more information about how the binlog cache free flush feature works, see How it works.
Prerequisites
The instance must meet one of the following version requirements:
MySQL 8.4
MySQL 8.0 with a minor engine version of 20240731 or later.
MySQL 5.7 with a minor engine version of 20240731 or later.
NoteOn the Basic Information page, check the Configuration Information section for the Upgrade minor engine version button. If the button is available, you can upgrade your instance. If the button is not available, your instance is already on the latest minor engine version. For more information, see Upgrade minor engine version.
Usage
New instances: The binlog cache free flush feature is enabled by default.
Existing instances: You can enable this feature by setting the
loose_binlog_cache_free_flushglobal system variable. The change takes effect immediately without requiring an instance restart.Related parameter:
loose_binlog_cache_free_flush_limit_size: After you enable the binlog cache free flush feature, if the binlog generated by a transaction exceeds the value of this parameter, the binlog cache temporary file is promoted to a binlog file upon commit. Default value: 256 MB. Value range: 20971520 to 18446744073709551615 (unit: bytes).
Problem analysis
When a MySQL instance executes a large transaction, the slow query log may record unusual delays for statements that are expected to complete quickly, such as COMMIT.
This issue is caused by resource contention in the serial binlog write mechanism. A large transaction in one thread takes a long time to write its binlog, blocking other transactions from committing and causing a global, serialized delay.
During a sysbench oltp_write_only test, after running a large UPDATE operation that generates 512 MB of binlog every 5 seconds, we observed the following effects:
Every time the large
UPDATEcommits, the Transactions Per Second (TPS) drops sharply.The performance impact is proportional to the binlog size: the larger the binlog generated by the transaction, the more significant the latency impact on other workloads.
The following figure illustrates the impact of large transactions on a running workload.

The UPDATE-related operations:
CREATE TABLE t_large (a INT, b LONGTEXT) ENGINE = InnoDB;
# Insert a row with 256 MB of text. The resulting UPDATE generates 512 MB of binlog events.
INSERT INTO t_large VALUES (1, repeat('a', 256000000));
UPDATE t_large SET a = a + 1;Optimization effects
Significantly improved TPS stability
Before optimization: The large
UPDATEoperation, which runs every 5 seconds, causes severe TPS fluctuations.After optimization (orange line): The impact of the large
UPDATEon TPS is almost eliminated.

No significant latency fluctuations
As shown by the orange line (after optimization), the impact of
UPDATEoperations of different sizes on latency is no longer significant.Even when the binlog size exceeds 512 MB, the commit latency of other transactions is not significantly affected.

How it works
Binlog cache
The binlog cache is a temporary, session-level space used to stage binlog events. Each session is allocated its own binlog cache, which consists of a memory cache (buffer) and a temporary file. The size of the memory cache is controlled by the binlog_cache_size parameter. When a transaction is large enough to fill the memory cache, its events are written to the temporary file.
During transaction execution, generated events are temporarily stored in the binlog cache. At commit time, the system must read these events sequentially from the binlog cache, update the end_pos (end position) and checksum for each event, and then write them to the binlog file. To ensure the transaction's events are contiguous in the binlog file, a lock protects this entire process. While one transaction holds this lock to write to the binlog file, all other committing transactions are blocked.
Issues with large transaction binlog writes
Writing these events to the binlog file during commit can be time-consuming and severely impact the instance in two main ways:
The process holds a binlog write lock, making the entire instance non-writable for the duration of the write.
The write process consumes significant I/O resources. In environments with limited I/O capacity, this can cause the entire instance to hang.
Binlog cache free flush optimization
AliSQL redesigns the temporary files used by the binlog cache, allowing them to be directly promoted to binlog files. When this feature is enabled, committing a large transaction promotes its temporary file directly into a new binlog file. This process is extremely fast, consumes minimal I/O resources, and completely eliminates the negative impacts of writing binlogs for large transactions.
Changes to the binlog cache
To allow a binlog cache temporary file to be directly promoted into a binlog file, AliSQL modifies how the binlog cache is used in two key ways:
When writing to the binlog cache's temporary file, a space is reserved at the beginning of the file. During the promotion process, this space is used to store the header events of the binlog.
When writing to the binlog cache, the
end_posof each event is calculated based on the size of this reserved space.
Binlog file promotion
When a large transaction is committed, its binlog cache temporary file is promoted directly into a new binlog file. During this process, the binlog's file header and initial header events are written into the reserved space. This reserved space contains four main components:
File Header: A 4-byte marker,
[ 0xFE 'bin'], is written at the beginning of every binlog file to identify it as such.Header Events: Includes the Format description event and Previous gtid event.
Empty event: An existing, ignorable event type is used to fill any remaining reserved space.
Gtid event: The GTID for the large transaction is generated during the commit phase, and its event is also written into the reserved space.
The content following the reserved space consists of the events from the original binlog cache, which typically include Query, Table map, Row, and Xid events.
A binlog file promoted by this feature differs from a regular one in two ways:
It contains an additional Empty event.
The checksum for this file is disabled by default.
The binlog cache free flush feature does not alter the binlog format. Replication and third-party tools that rely on binlog files are not affected.