Query acceleration

Simple Log Service accelerates PromQL queries through global cache and concurrent computing.

How it works

Global cache and concurrent computing work as follows.

Global cache

The Prometheus query engine does not cache results by default, so every query recomputes all data, which degrades performance over large datasets and long time ranges. Global cache reuses partial results from previous identical queries (same PromQL statement and same step value). Cached results serve the overlapping part of the time range, and only the uncached intervals are recomputed.

Important
  • Global cache aligns the query time range (start/end) to integer multiples of the step parameter, which increases the cache hit ratio. Query results are updated in the cache after each query completes.

  • Incomplete query results and results outside the cached time range are not cached, which preserves data integrity.
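
The alignment and partial-reuse behavior described above can be sketched as follows. This is a minimal illustration, not the actual Simple Log Service implementation: the function names are hypothetical, and the rounding direction (start rounded down, end rounded up) is an assumption, since the text only states that the range is aligned to integer multiples of step.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in Unix seconds


def align_range(start: int, end: int, step: int) -> Interval:
    """Snap a query range to integer multiples of step.

    Assumption: start is rounded down and end is rounded up.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    aligned_start = (start // step) * step
    aligned_end = -(-end // step) * step  # ceiling division
    return aligned_start, aligned_end


def uncached_intervals(query: Interval, cached: List[Interval]) -> List[Interval]:
    """Return the parts of `query` that no cached interval covers."""
    gaps: List[Interval] = []
    cursor, query_end = query
    for c_start, c_end in sorted(cached):
        if c_end <= cursor or c_start >= query_end:
            continue  # no overlap with the part still to be covered
        if c_start > cursor:
            gaps.append((cursor, c_start))  # hole before this cached piece
        cursor = max(cursor, c_end)
        if cursor >= query_end:
            break
    if cursor < query_end:
        gaps.append((cursor, query_end))  # tail after the last cached piece
    return gaps
```

For example, with cached results for (0, 40) and (60, 80), a query over (0, 100) only needs to recompute (40, 60) and (80, 100). Aligned ranges make it more likely that two queries issued a few seconds apart map to the same cache entries.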

Concurrent computing

Standard Prometheus queries run in a single coroutine on one server, which is slow for large numbers of time series, long time ranges, or complex logic. Simple Log Service extends the Prometheus engine to split PromQL queries by time interval or time series and distribute them across multiple servers, improving performance by 2x to 10x.

  • Splitting by time interval

    Example: a 12-hour query is split into 6 subqueries at 2-hour intervals. All 6 subqueries run concurrently and results are merged.

    query:    sum(metric)
    interval: 12h
    step:     2m
  • Splitting by time series

    Example: a metric with 500,000 time series and 10 concurrent tasks runs 10 parallel tasks of 50,000 time series each, then merges results.
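
Both splitting strategies can be sketched in a few lines. The code below is only an illustration under stated assumptions: the function names are hypothetical, local threads stand in for the servers that run subqueries, and a plain sum stands in for the PromQL aggregation. It does not reflect how the engine is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def split_by_time(start: int, end: int, piece_seconds: int) -> List[Tuple[int, int]]:
    """Cut [start, end) into consecutive pieces of at most piece_seconds."""
    pieces = []
    cursor = start
    while cursor < end:
        pieces.append((cursor, min(cursor + piece_seconds, end)))
        cursor += piece_seconds
    return pieces


def split_by_series(series: Sequence[int], task_count: int) -> List[Sequence[int]]:
    """Divide the series into at most task_count shards of near-equal size."""
    if not series:
        return []
    shard_size = -(-len(series) // task_count)  # ceiling division
    return [series[i:i + shard_size] for i in range(0, len(series), shard_size)]


def parallel_sum(values: Sequence[int], task_count: int) -> int:
    """Sum each shard concurrently, then merge the partial sums."""
    shards = split_by_series(values, task_count)
    with ThreadPoolExecutor(max_workers=task_count) as pool:
        partials = list(pool.map(sum, shards))
    return sum(partials)
```

Calling split_by_time(0, 12 * 3600, 2 * 3600) produces the 6 two-hour pieces from the first example, and split_by_series(range(500_000), 10) produces 10 shards of 50,000 series each, as in the second. The merge step is cheap for sum because each shard reduces to a single partial value.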

Important
  • The SLS time series engine automatically identifies PromQL queries eligible for concurrent computing and distributes them across nodes.

  • Concurrent computing benefits queries with long time ranges or large numbers of time series. For small data volumes, it may degrade performance. Use HTTP FormValue mode to configure concurrent computing per query.

  • Aggregations that use without() with no labels (equivalent to by over all labels) gain little from concurrent computing because the intermediate results are barely reduced. With many time series, performance may even degrade. Aggregations that group by multiple labels also show limited improvement.

Parameters

Configure query acceleration parameters in MetricsConfig or HTTP FormValue mode.

| Category | Parameter | Description | MetricsConfig | FormValue | Remarks |
| --- | --- | --- | --- | --- | --- |
| parallel_config (parameters of concurrent computing) | enable | Enables concurrent computing. Disabled by default. | Supported | Supported | Splits a query into subqueries distributed across child compute nodes. Results are aggregated on the primary node. |
| | mode | Concurrent computing mode. Valid values: auto (the system selects concurrency based on recent query results) and static (manually configure time splitting and concurrency). | Supported | Unsupported | auto or static mode. For static mode, consult Simple Log Service technical support. |
| | timePieceInterval | Time-split interval in seconds. Valid values: 3600 to 2592000. Default: 21600 (6 hours). | Supported | Supported | Time-split interval in seconds. In the console, specify a value in whole hours. |
| | timePieceCount | Number of subqueries after time-based splitting. Valid values: 1 to 16. Default: 8. | Supported | Supported | Number of subqueries after time-based splitting. |
| | totalParallelCount | Total concurrent tasks for time-series splitting. Valid values: 2 to 64. Default: 8. | Supported | Supported | Total tasks for time-series splitting. Example: a metric with 5 million time series and totalParallelCount=10 runs 10 tasks of 500,000 time series each. |
| | parallelCountPerHost | Concurrent tasks per server. Valid values: 1 to 8. Default: 2. | Supported | Supported | Number of time-series split tasks assigned to each server. |
| query_cache_config (parameters of global cache) | enable | Enables global cache. Disabled by default. | Supported | Supported | Reuses partial results from previous identical queries. |

MetricsConfig mode

MetricsConfig is a per-Metricstore parameter configurable through the Simple Log Service console or SDKs. Changes take effect after 3 minutes.

Console

On the Metricstore Attribute page, perform the following operations. For more information about how to access the Metricstore Attribute page, see Modify a metricstore.

Concurrent computing modes:

  • In auto mode, Simple Log Service estimates the concurrency level based on data volume from recent identical queries.

  • In static mode, you manually configure concurrency. Consult Simple Log Service technical support before using static mode.

SDK

Use the Simple Log Service SDK for Java to modify MetricsConfig in JSON format. Example:

{
  "parallel_config": {
    "enable": true,
    "mode": "static",
    "parallel_count_per_host": 2,
    "time_piece_count": 8,
    "time_piece_interval": 21600,
    "total_parallel_count": 8
  },
  "query_cache_config": {
    "enable": true
  }
}
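
Before submitting a configuration, it can help to check the values against the ranges listed in the Parameters section. The following is a minimal sketch of such a check, with a hypothetical helper name (build_metrics_config); it only produces the JSON document and does not call any SDK. Whether the numeric fields are honored in auto mode is not stated in this document, so the helper simply always emits them.

```python
import json

# Valid ranges and defaults, copied from the Parameters section.
LIMITS = {
    "time_piece_interval": (3600, 2592000),
    "time_piece_count": (1, 16),
    "total_parallel_count": (2, 64),
    "parallel_count_per_host": (1, 8),
}
DEFAULTS = {
    "time_piece_interval": 21600,
    "time_piece_count": 8,
    "total_parallel_count": 8,
    "parallel_count_per_host": 2,
}


def build_metrics_config(enable_parallel=True, mode="auto", enable_cache=True, **overrides):
    """Return a MetricsConfig JSON string after validating each numeric field."""
    if mode not in ("auto", "static"):
        raise ValueError(f"mode must be 'auto' or 'static', got {mode!r}")
    unknown = set(overrides) - set(LIMITS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    parallel = {"enable": enable_parallel, "mode": mode}
    for field, default in DEFAULTS.items():
        value = overrides.get(field, default)
        low, high = LIMITS[field]
        if not low <= value <= high:
            raise ValueError(f"{field}={value} is outside {low}..{high}")
        parallel[field] = value

    config = {
        "parallel_config": parallel,
        "query_cache_config": {"enable": enable_cache},
    }
    return json.dumps(config, indent=2)
```

Calling build_metrics_config(mode="static") yields the same field values as the example above, while an out-of-range value such as total_parallel_count=1 raises ValueError instead of producing an invalid document.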

The following table maps query acceleration parameters to MetricsConfig JSON fields and FormValue keys.

| Category | Parameter | MetricsConfig | FormValue |
| --- | --- | --- | --- |
| parallel_config | enable | enable | x-sls-parallel-enable |
| | mode | mode | None |
| | timePieceInterval | time_piece_interval | x-sls-parallel-time-piece-interval |
| | timePieceCount | time_piece_count | x-sls-parallel-time-piece-count |
| | totalParallelCount | total_parallel_count | x-sls-parallel-count |
| | parallelCountPerHost | parallel_count_per_host | x-sls-parallel-count-per-host |
| query_cache_config | enable | enable | x-sls-global-cache-enable |
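
The mapping above can be expressed as a lookup table, which is convenient when the same settings need to be applied per request rather than per Metricstore. The sketch below uses a hypothetical helper name (to_form_values). Rendering booleans as lowercase true/false follows the URL examples in this document; rendering numbers as plain decimal strings is an assumption.

```python
# (MetricsConfig section, JSON field) -> FormValue key, copied from the mapping table.
# "mode" is omitted because it has no FormValue equivalent.
FORM_VALUE_KEYS = {
    ("parallel_config", "enable"): "x-sls-parallel-enable",
    ("parallel_config", "time_piece_interval"): "x-sls-parallel-time-piece-interval",
    ("parallel_config", "time_piece_count"): "x-sls-parallel-time-piece-count",
    ("parallel_config", "total_parallel_count"): "x-sls-parallel-count",
    ("parallel_config", "parallel_count_per_host"): "x-sls-parallel-count-per-host",
    ("query_cache_config", "enable"): "x-sls-global-cache-enable",
}


def to_form_values(metrics_config: dict) -> dict:
    """Translate a MetricsConfig dict into per-request FormValue parameters."""
    params = {}
    for section, fields in metrics_config.items():
        for field, value in fields.items():
            key = FORM_VALUE_KEYS.get((section, field))
            if key is None:
                continue  # e.g. "mode", which cannot be set per request
            if isinstance(value, bool):
                params[key] = str(value).lower()  # True -> "true"
            else:
                params[key] = str(value)
    return params
```

Passing the JSON example from the SDK section (parsed into a dict) returns six key-value pairs; the mode field is dropped silently because the table lists no FormValue key for it.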

HTTP FormValue mode

In HTTP FormValue mode, query acceleration settings apply only to the current request. For more information about the Metricstore HTTP APIs, see Time series metric query APIs.

Examples:

Global cache

Add x-sls-global-cache-enable=true to enable global cache.

https://{project}.{sls-endpoint}/prometheus/{project}/{metricstore}/api/v1/query_range?query=sum(up)&start=1690876800&end=1690877800&step=10&x-sls-global-cache-enable=true

Concurrent computing

Add x-sls-parallel-enable=true&x-sls-parallel-count=16 to enable concurrent computing and set the total number of concurrent tasks (totalParallelCount) to 16.

https://{project}.{sls-endpoint}/prometheus/{project}/{metricstore}/api/v1/query_range?query=sum(up)&start=1690876800&end=1690877800&step=10&x-sls-parallel-enable=true&x-sls-parallel-count=16
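
The request URLs above can also be assembled programmatically. The following sketch only builds the URL string; it does not send the request, and authentication is out of scope here. The function name and the project, endpoint, and Metricstore values in the usage note are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_query_range_url(project, endpoint, metricstore, query, start, end, step, extra=None):
    """Build a query_range URL with optional query acceleration parameters.

    `extra` holds FormValue keys such as "x-sls-parallel-enable".
    """
    base = f"https://{project}.{endpoint}/prometheus/{project}/{metricstore}/api/v1/query_range"
    params = {"query": query, "start": start, "end": end, "step": step}
    params.update(extra or {})
    # urlencode percent-encodes reserved characters, so sum(up) becomes sum%28up%29.
    return f"{base}?{urlencode(params)}"
```

For example, build_query_range_url("my-project", "sls.example.com", "my-metricstore", "sum(up)", 1690876800, 1690877800, 10, {"x-sls-parallel-enable": "true", "x-sls-parallel-count": 16}) produces a URL equivalent to the concurrent computing example, with the PromQL expression percent-encoded.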