Configure delayed release for elastic instances

Elastic instances are destroyed shortly after they finish processing a request. If your function runs background tasks after the response is sent — such as uploading logs or syncing data — those tasks are interrupted when the instance is terminated, which can cause data loss.

The delayed release feature keeps an elastic instance alive for a configurable duration after it processes its last request, giving background tasks time to finish. During this period, Function Compute automatically transitions the instance between states based on resource utilization to balance performance and cost.

How it works

After the last request completes, Function Compute holds the instance for the configured delayed release duration and automatically moves it between the following states:

| State | Trigger condition | Effect |
| --- | --- | --- |
| Active | vCPU or GPU utilization (stream processors and decoders) exceeds the system threshold | Background tasks continue running; billed at the active elastic instance rate |
| Idle | Both vCPU and GPU utilization drop below the system threshold | Instance remains available at reduced cost; billed at the idle elastic instance rate |
| Destroyed | No new requests arrive during the delayed release duration | Billing stops |

When an idle instance receives a new request, it wakes up within milliseconds — avoiding cold start latency.

If no new requests arrive during the delayed release duration, the instance is destroyed and billing stops. After destruction:

  • If minimum instances = 0, the next request triggers a cold start.

  • If minimum instances > 0, the minimum instances are already running and serve the next request, so no cold start occurs.
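
The state rules above can be sketched as a small classifier. This is an illustrative model only, not Function Compute's implementation: the threshold value, the function name, and the parameter names are all assumptions, because this topic does not publish the system threshold.

```python
from enum import Enum


class InstanceState(Enum):
    ACTIVE = "active"
    IDLE = "idle"
    DESTROYED = "destroyed"


# Hypothetical value: the real system threshold is not stated in this topic.
ASSUMED_THRESHOLD = 0.05


def classify_state(vcpu_util, gpu_util, minutes_since_last_request,
                   delayed_release_minutes, threshold=ASSUMED_THRESHOLD):
    """Return the state an elastic instance would be in under the rules above."""
    # No new request arrived within the delayed release duration.
    if minutes_since_last_request >= delayed_release_minutes:
        return InstanceState.DESTROYED
    # Either resource above the threshold keeps the instance active.
    if vcpu_util > threshold or gpu_util > threshold:
        return InstanceState.ACTIVE
    # Otherwise the instance is idle. Utilization exactly at the threshold
    # is treated as idle here, which is an assumption.
    return InstanceState.IDLE


print(classify_state(0.40, 0.0, 2, 9))
print(classify_state(0.01, 0.0, 6, 9))
print(classify_state(0.01, 0.0, 9, 9))
```

A new request arriving while the instance is idle would reset `minutes_since_last_request` to zero in this model.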

Constraints

  • Delayed release applies only to elastic instances — those dynamically created in response to requests.

  • This feature does not affect minimum instances. Minimum instances have their own lifecycle management and billing rules.

  • Valid delayed release duration: 5 minutes to 60 minutes (inclusive).
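
If you set the duration from a script or a deployment pipeline, a local check against the documented range can catch bad values before they reach the console or an API call. The sketch below is a hypothetical helper, not part of any Function Compute SDK; only the 5 and 60 minute bounds come from this topic.

```python
MIN_DELAY_MINUTES = 5   # lower bound stated in this topic (inclusive)
MAX_DELAY_MINUTES = 60  # upper bound stated in this topic (inclusive)


def validate_delayed_release(minutes: int) -> int:
    """Return minutes unchanged if it is within the documented range."""
    if not MIN_DELAY_MINUTES <= minutes <= MAX_DELAY_MINUTES:
        raise ValueError(
            f"delayed release duration must be between {MIN_DELAY_MINUTES} "
            f"and {MAX_DELAY_MINUTES} minutes, got {minutes}"
        )
    return minutes


print(validate_delayed_release(9))
```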

Choose a keepalive strategy

| | Delayed release | Session affinity | Delayed release + session affinity |
| --- | --- | --- | --- |
| Scenario | Background services | Services with long-lifecycle sessions | Background services and persistent session connections |
| Instance keepalive duration | 5 minutes to 60 minutes (configured) | Latest session expiration time on a single instance | The greater of: (1) the delayed release duration, or (2) the maximum Session Idle Duration across all sessions on a single instance |
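
The combined rule in the last column reduces to taking a maximum. The following sketch assumes, for simplicity, that every session on the instance was last active at the moment the final request finished; the function and parameter names are illustrative, not product settings.

```python
def keepalive_minutes(delayed_release_minutes, session_idle_minutes=()):
    """Keepalive period when delayed release is enabled.

    session_idle_minutes holds the Session Idle Duration of each session
    on the instance; leave it empty when session affinity is not configured.
    """
    return max([delayed_release_minutes, *session_idle_minutes])


print(keepalive_minutes(9))            # delayed release only
print(keepalive_minutes(9, [15]))      # one session idles longer than the delay
print(keepalive_minutes(30, [15, 20])) # the delay outlasts every session
```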

Configure delayed release for elastic instances

Configure the delayed release feature when you create a function, or follow these steps for an existing function.

Prerequisites

Before you begin, ensure that you have:

Enable delayed release

  1. Log on to the Function Compute console. In the left navigation pane, choose Function Management > Functions.

  2. In the top navigation bar, select a region. On the Functions page, click the name of the target function.

  3. Click the Function Details > Configuration tab, then click Edit next to Advanced Configuration.

  4. In the Advanced Configuration panel, expand Delayed Release for Elastic Instances. Turn on the Delayed Release for Elastic Instances switch, set Delayed Release Time, then click Deploy.

(Optional) Configure session affinity

This topic uses the HeaderField affinity feature as an example.

Verify the result

  1. On the function details page, click the Code tab and click Test Function.

  2. After the function executes, click the Instances tab and check the Lifecycle column to confirm that delayed release is active.

Billing

The billing rules in this section apply only to elastic instances created dynamically in response to requests, and are independent of billing for minimum instances.

Note: Idle instances still incur charges. Billing stops only after the instance is destroyed.

Billing by instance state

| Instance state | Trigger condition | Billing rate |
| --- | --- | --- |
| Active | vCPU or GPU utilization > system threshold | Active elastic instance rate |
| Idle | Both vCPU and GPU utilization < system threshold | Idle elastic instance rate |
| Destroyed | No new requests during the delayed release duration | No charge |

Billing by instance type

  • Minimum instances: These instances are always running, regardless of whether there are requests. They are billed at the idle elastic instance rate when there are no requests, and at the active elastic instance rate when processing requests.

  • Instances with delayed release: After requests are processed, these instances enter an active, idle, or destroyed state based on the rules described in this topic.

Scenario 1: Only delayed release configured

Example: Delayed release duration = 9 minutes.

The instance finishes processing requests at 00:05. Because no new requests arrive, it is destroyed at 00:14 (9 minutes after the last request completed).

| Period | Trigger condition | Billing |
| --- | --- | --- |
| 00:00–00:05 | Processing requests | Active elastic instance rate |
| 00:05–00:09 | vCPU or GPU utilization > threshold; instance stays active | Active elastic instance rate |
| 00:09–00:14 | Both vCPU and GPU utilization < threshold; instance switches to idle | Idle elastic instance rate |
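
To make the Scenario 1 arithmetic concrete, the sketch below totals the minutes billed at each rate and applies per-minute prices. The prices are hypothetical placeholders chosen for illustration; actual Function Compute charges depend on the instance specification and the published price list.

```python
from decimal import Decimal

# (start_minute, end_minute, state) taken from the Scenario 1 table
SEGMENTS = [(0, 5, "active"), (5, 9, "active"), (9, 14, "idle")]

# Hypothetical per-minute prices, not real Function Compute rates
RATES = {"active": Decimal("1.00"), "idle": Decimal("0.20")}


def billed_minutes(segments):
    """Sum the minutes spent in each billing state."""
    totals = {}
    for start, end, state in segments:
        totals[state] = totals.get(state, 0) + (end - start)
    return totals


def total_cost(segments, rates):
    """Multiply the minutes in each state by that state's assumed rate."""
    minutes = billed_minutes(segments)
    return sum((rates[s] * m for s, m in minutes.items()), Decimal("0"))


print(billed_minutes(SEGMENTS))  # 9 active minutes, 5 idle minutes
print(total_cost(SEGMENTS, RATES))
```

With these placeholder prices, the 9 active minutes dominate the total, which is why background work that keeps utilization above the threshold matters more for cost than the idle tail.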

Scenario 2: Delayed release and session affinity both configured

Example:

  • Delayed release duration = 9 minutes

  • HeaderField affinity enabled, Session Idle Duration = 15 minutes

The system uses 15 minutes as the keepalive period, which is the greater of the delayed release duration (9 minutes) and the Session Idle Duration (15 minutes). Assuming the instance finishes processing requests at 00:05, as in Scenario 1, it is destroyed at 00:20 if no new requests arrive.

Billing follows the same three-period pattern as Scenario 1, with the idle phase extended to reflect the 15-minute keepalive.
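
The destruction time in this scenario can be checked with simple clock arithmetic. The sketch assumes the last request completes at 00:05, as in Scenario 1, and that the single session's idle timer starts at that same moment; the function name is illustrative.

```python
from datetime import datetime, timedelta


def destroy_time(last_request_done, delayed_release_min, session_idle_min):
    """Return the HH:MM time at which the instance would be destroyed."""
    done = datetime.strptime(last_request_done, "%H:%M")
    # The keepalive period is the greater of the two configured durations.
    keepalive = max(delayed_release_min, session_idle_min)
    return (done + timedelta(minutes=keepalive)).strftime("%H:%M")


print(destroy_time("00:05", 9, 15))  # 00:20
```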

What's next