Lake storage optimization
As data accumulates in an Enterprise Edition instance, small files and expired snapshots degrade query performance and waste storage. Lake storage optimization compacts small files, cleans up orphan files, and manages snapshot lifecycles to maintain an optimal storage layout.
This feature is in the invitation-based testing phase. To apply for access, submit a ticket.
Scope
| Condition | Requirement |
| --- | --- |
| Instance edition | Enterprise Edition. |
| Database type | Only databases and tables in AnalyticDB for MySQL lake storage. |
| Visibility | Only databases visible in Data Catalog can be configured. |
Three-level inheritance
Configuration cascades through Instance > Database > Table. Each level inherits its parent's settings by default but can override them with custom values.
Instance-level settings (global defaults)
├── Database A (inherits instance settings)
│   ├── Table 1 (inherits Database A settings)
│   └── Table 2 (custom: compaction frequency = high)
└── Database B (custom: lake storage optimization disabled)
    └── Table 3 (inherits Database B → disabled)
| Policy | Behavior | Use case |
| --- | --- | --- |
| Inherit | Follows parent-level changes automatically. | Databases and tables that need uniform settings. |
| Custom | Uses independent settings, unaffected by parent-level changes. | Objects with specific performance needs or that require optimization disabled. |
Priority: Table-level > Database-level > Instance-level. After you select Custom for a database or table, that level no longer follows changes made at the parent level, and any children set to Inherit follow the custom level instead.
Turning off the Enable Lake Storage Optimization toggle at the instance level deactivates all database-level and table-level settings and stops optimization across the instance. Configured parameters are preserved and restore automatically when you re-enable optimization.
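The resolution rules above can be sketched in a few lines of Python. This is only an illustrative model, not the service's implementation: the `Node` class, the `effective` function, and the setting keys are hypothetical names, and the sketch assumes that a Custom level carries a complete set of values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One level of the hierarchy (instance, database, or table). Hypothetical model."""
    name: str
    parent: Optional["Node"] = None
    policy: str = "inherit"           # "inherit" or "custom"
    enabled: bool = True              # Status; only consulted on the level that decides
    settings: Optional[dict] = None   # assumed to be a complete set when policy is "custom"


def effective(node: Node, instance_enabled: bool) -> Optional[dict]:
    """Return the settings that apply to `node`, or None if optimization is off."""
    if not instance_enabled:
        # The instance-level toggle overrides everything below it.
        return None
    current = node
    # Walk up until a Custom level or the instance (the root) is reached.
    while current.parent is not None and current.policy == "inherit":
        current = current.parent
    return current.settings if current.enabled else None


# The tree from the diagram above, with the documented default values.
DEFAULTS = {"merge_frequency": "normal", "snapshot_retention_days": 7, "orphan_retention_days": 3}
instance = Node("instance", policy="custom", settings=DEFAULTS)
db_a = Node("Database A", parent=instance)
table_1 = Node("Table 1", parent=db_a)
table_2 = Node("Table 2", parent=db_a, policy="custom",
               settings={**DEFAULTS, "merge_frequency": "high"})
db_b = Node("Database B", parent=instance, policy="custom", enabled=False)
table_3 = Node("Table 3", parent=db_b)

print(effective(table_1, instance_enabled=True))   # instance defaults, via Database A
print(effective(table_2, instance_enabled=True))   # custom values with merge_frequency "high"
print(effective(table_3, instance_enabled=True))   # None: Database B is disabled
print(effective(table_2, instance_enabled=False))  # None: instance toggle is off
```

In this model, turning the instance toggle off does not modify any stored `settings`, which mirrors the statement that configured parameters are preserved and take effect again when optimization is re-enabled.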
Access the settings
On the instance details page, choose in the left-side navigation pane.
| Level | Entry point |
| --- | --- |
| Instance | Click Edit in the Lake Storage Optimization section above the database list. |
| Database | Click the target database and go to the Lake Storage Optimization tab. |
| Table | Click the target table and go to the Lake Storage Optimization tab. |
Parameters
| Parameter | Description | Valid values | Default | Instance | Database | Table |
| --- | --- | --- | --- | --- | --- | --- |
| Enable Lake Storage Optimization | Global toggle for the instance. | On / Off | Off | ✓ | - | - |
| Policy | Inherit parent-level settings or use custom values. | Inherit / Custom | Inherit | - | ✓ | ✓ |
| Status | Enable or disable optimization at this level. | On / Off | - | - | ✓ | ✓ |
| Resource Group | Compute resources for running optimization tasks. | Existing resource groups. | - | ✓ | ✓ | ✓ |
| Small File Merge Frequency | How often small files are compacted. | low / normal / high | normal | ✓ | ✓ | ✓ |
| Snapshot Retention Period | How long historical snapshots are retained. | 1 to 7 days | 7 days | ✓ | ✓ | ✓ |
| Orphan File Retention Period | How long orphan files are retained before cleanup. | 3 / 5 / 7 / 10 / 14 days | 3 days | ✓ | ✓ | ✓ |
When the instance-level toggle is off, only the toggle is visible. Other parameters appear after you turn it on. At the database and table levels, parameters appear after you select Custom and enable the status.
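As a quick way to sanity-check planned values before entering them in the console, the valid ranges and defaults from the parameter table can be encoded as a small validator. This is a sketch only: the dictionary keys and the `validate` function are hypothetical names, and it assumes the snapshot retention period is a whole number of days.

```python
from typing import Dict, List

# Valid values and defaults copied from the parameter table.
# The key names are hypothetical and exist only for this sketch.
VALID_MERGE_FREQUENCIES = ("low", "normal", "high")
VALID_ORPHAN_RETENTION_DAYS = (3, 5, 7, 10, 14)
SNAPSHOT_RETENTION_MIN_DAYS = 1
SNAPSHOT_RETENTION_MAX_DAYS = 7

DEFAULTS = {
    "merge_frequency": "normal",
    "snapshot_retention_days": 7,
    "orphan_retention_days": 3,
}


def validate(custom: Dict[str, object]) -> List[str]:
    """Return a list of problems found in a custom configuration (empty if valid)."""
    settings = {**DEFAULTS, **custom}  # unspecified parameters fall back to the defaults
    problems = []
    if settings["merge_frequency"] not in VALID_MERGE_FREQUENCIES:
        problems.append("merge_frequency must be low, normal, or high")
    days = settings["snapshot_retention_days"]
    in_range = isinstance(days, int) and (
        SNAPSHOT_RETENTION_MIN_DAYS <= days <= SNAPSHOT_RETENTION_MAX_DAYS
    )
    if not in_range:
        problems.append("snapshot_retention_days must be an integer from 1 to 7")
    if settings["orphan_retention_days"] not in VALID_ORPHAN_RETENTION_DAYS:
        problems.append("orphan_retention_days must be one of 3, 5, 7, 10, 14")
    return problems


print(validate({"merge_frequency": "high", "orphan_retention_days": 14}))
print(validate({"snapshot_retention_days": 30, "orphan_retention_days": 4}))
```

The first call returns an empty list; the second reports two problems, one for each out-of-range value.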
Small file compaction frequency
Small file compaction merges small data files into larger ones to reduce file count and improve query efficiency.
| Level | Use case |
| --- | --- |
| low | Cold data tables with low write volume and infrequent changes. |
| normal (default) | General-purpose workloads. |
| high | Hot data tables with high write throughput and frequent queries. |
Higher compaction frequency consumes more compute resources. Assign a dedicated resource group for optimization tasks to avoid I/O contention with query workloads.
Snapshot retention period
Snapshots capture a table's full state at a point in time and support data rollback and time travel queries. Excess snapshots increase storage usage and metadata overhead. The retention period can be set from 1 to 7 days; the default is 7 days.
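To see what a given retention period means in practice, the following sketch lists which snapshots fall outside the window. It assumes a purely age-based cutoff; the service's actual expiry logic is not described here and may differ (for example, it may always keep the latest snapshot). The function name `expired_snapshots` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def expired_snapshots(snapshot_times: List[datetime], retention_days: int,
                      now: datetime) -> List[datetime]:
    """Return the snapshot timestamps that are older than the retention window."""
    if not 1 <= retention_days <= 7:
        raise ValueError("retention_days must be between 1 and 7")
    cutoff = now - timedelta(days=retention_days)
    return [t for t in snapshot_times if t < cutoff]


now = datetime(2024, 6, 10, 12, 0)
snapshots = [datetime(2024, 6, 1), datetime(2024, 6, 5), datetime(2024, 6, 9)]

# With 7 days, only the 1 June snapshot is past the window.
print(expired_snapshots(snapshots, 7, now))
# With 1 day, all three snapshots are more than 24 hours old.
print(expired_snapshots(snapshots, 1, now))
```

A shorter retention period shrinks the range available to time travel queries and rollback, which is the trade-off against storage and metadata overhead.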
Orphan file retention period
Orphan files are data files from writes, updates, or compaction that no valid snapshot references. They are automatically cleaned up after the retention period to reclaim storage.
Reducing the retention period saves storage but may affect long-running queries that are in progress. The minimum is 3 days.
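The definition of an orphan file and the role of the retention period can be illustrated as a set difference plus an age check. This is a simplified sketch under stated assumptions: file age is measured from a hypothetical write timestamp, and `orphans_to_delete` is an illustrative name, not part of the product.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set


def orphans_to_delete(files: Dict[str, datetime], referenced: Set[str],
                      retention_days: int, now: datetime) -> List[str]:
    """Return files that no valid snapshot references and that are older than the retention period."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(
        path for path, written_at in files.items()
        if path not in referenced and written_at < cutoff
    )


now = datetime(2024, 6, 10)
files = {
    "a.parquet": datetime(2024, 6, 1),  # still referenced by a valid snapshot
    "b.parquet": datetime(2024, 6, 2),  # unreferenced and old
    "c.parquet": datetime(2024, 6, 9),  # unreferenced but recent
}
referenced = {"a.parquet"}

print(orphans_to_delete(files, referenced, 3, now))  # ['b.parquet']
```

In this example `c.parquet` is already an orphan but stays on disk until it ages past the retention period, which gives in-progress work that may still touch it time to finish.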
Execution history
On the Lake Storage Optimization tab at the table level, you can view the execution history of compaction tasks. Each record includes the data files read, data files added, bytes read, failed data files, start time, and status.
Recommended configurations
| Scenario | Recommended settings | Reason |
| --- | --- | --- |
| High-frequency analytical tables. | Compaction frequency = high, Orphan file retention = 3 days. | Quickly eliminates small files and sustains query performance. |
| Low-frequency archive tables. | Compaction frequency = low, Orphan file retention = 14 days. | Reduces resource consumption and extends the data recovery window. |
| Mixed-workload databases. | Database-level = inherit, Hot tables = custom with high, Cold tables = custom with low. | Enables per-table optimization tailored to workload patterns. |
| Development and test environments. | Disable lake storage optimization at the instance level. | Conserves compute resources. |