Data Cache improves query performance in storage-compute separation clusters by caching hot data from remote storage to local CN nodes.
How it works
Starting from StarRocks v3.1.7 and v3.2.3, Data Cache is enabled by default in storage-compute separation clusters. It loads data from remote storage to the local cache on demand in megabyte-sized blocks, replacing the File Cache feature from earlier versions.
Starting from v3.4.0, queries on internal tables and data lakes in a storage-compute separation environment share the same Data Cache instance.
Limitations
- This feature applies only to Serverless StarRocks instances with storage-compute separation.
- Supported and enabled by default starting from StarRocks v3.1.7 and v3.2.3.
Configure Data Cache
Configure Data Cache with the following CN configuration items.
Disk cache capacity
The larger of the datacache_disk_size and starlet_star_cache_disk_size_percent parameters determines the disk cache capacity for a storage-compute separation cluster.
| Parameter | Description |
| --- | --- |
| datacache_disk_size | The maximum amount of data that can be cached on a single disk. Set this as a percentage (for example, …). |
| starlet_star_cache_disk_size_percent | The percentage of disk capacity that Data Cache can use. The default value is …. |
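As a rough worked example of the "larger of the two" rule, assume a hypothetical CN node with a single 1000 GB disk, an absolute limit equivalent to 500 GB, and a percentage limit of 60. All three numbers are made up for illustration and are not defaults. The comparison can be sketched in SQL:

```sql
-- Hypothetical values: 500 GB absolute limit, 1000 GB disk, 60 percent limit.
-- The percentage-based limit works out to 1000 * 60 / 100 = 600 GB.
SELECT GREATEST(500, 1000 * 60 / 100) AS effective_cache_gb;
```

Under these assumptions the percentage-based limit (600 GB) is the larger value, so it would determine the disk cache capacity.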
Table-level cache configuration
- Disable caching for a specific table: Set the datacache.enable property of an internal table to false. This prevents the table from using Data Cache.
- Limit the time range for cached data: Use the datacache.partition_duration property to define a retention period for hot data. StarRocks does not cache data outside this time range. Supported time units are YEAR, MONTH, DAY, and HOUR. Examples include 7 DAY and 12 HOUR. If you do not specify this property, StarRocks treats all data as hot data, making it eligible for caching.
  Note: This property applies only when datacache.enable is set to true.
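The following is a minimal sketch of how these two properties might be set when creating tables. The table names, columns, partitioning, and bucketing are hypothetical and chosen only for illustration. Only the datacache.enable and datacache.partition_duration properties come from the description above.

```sql
-- Hypothetical table: cache only the most recent 7 days of partitions.
CREATE TABLE IF NOT EXISTS access_log (
    event_date DATE NOT NULL,
    user_id BIGINT NOT NULL,
    url STRING
)
DUPLICATE KEY(event_date, user_id)
PARTITION BY date_trunc('day', event_date)
DISTRIBUTED BY HASH(user_id)
PROPERTIES (
    "datacache.enable" = "true",
    "datacache.partition_duration" = "7 DAY"
);

-- Hypothetical table: rarely queried data that should not use Data Cache.
CREATE TABLE IF NOT EXISTS archive_log (
    event_date DATE NOT NULL,
    user_id BIGINT NOT NULL,
    url STRING
)
DUPLICATE KEY(event_date, user_id)
DISTRIBUTED BY HASH(user_id)
PROPERTIES (
    "datacache.enable" = "false"
);
```

In this sketch, access_log is partitioned by day so that the 7 DAY window can be matched against partitions, while archive_log bypasses Data Cache entirely.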
Check the Data Cache status
- Run the following SQL statement to view the disk usage limit for Data Cache:
  SELECT * FROM information_schema.be_configs WHERE NAME LIKE '%starlet_star_cache_disk_size_percent%' OR NAME LIKE '%datacache_disk_size%';
- In the Alibaba Cloud console, navigate to Business Insights > Cache Insights to view the cache hit rate for an instance or compute group.
Disable Data Cache
To disable Data Cache, run the following SQL statement:
SET [GLOBAL] skip_local_disk_cache = true;
To disable caching for a specific table only, configure the table-level datacache.enable property instead of disabling Data Cache globally.
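The square brackets in the statement above mark GLOBAL as optional. As a sketch, and assuming skip_local_disk_cache behaves like other session and global variables, the two forms would look like this:

```sql
-- Disable Data Cache for the current session only.
SET skip_local_disk_cache = true;

-- Disable Data Cache globally (assumption: requires sufficient privileges).
SET GLOBAL skip_local_disk_cache = true;

-- Check the value currently in effect for this session.
SHOW VARIABLES LIKE 'skip_local_disk_cache';
```

The session form is useful for comparing one query's performance with and without the cache, without affecting other users.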