Elastic instances are destroyed shortly after they finish processing a request. If your function runs background tasks after the response is sent — such as uploading logs or syncing data — those tasks are interrupted when the instance is terminated, which can cause data loss.
The delayed release feature keeps an elastic instance alive for a configurable duration after it processes its last request, giving background tasks time to finish. During this period, Function Compute automatically transitions the instance between states based on resource utilization to balance performance and cost.
How it works
After the last request completes, Function Compute holds the instance for the configured delayed release duration and automatically moves it between the following states:
| State | Trigger condition | Effect |
|---|---|---|
| Active | vCPU or GPU utilization (stream processors and decoders) exceeds the system threshold | Background tasks continue running; billed at the active elastic instance rate |
| Idle | Both vCPU and GPU utilization drop below the system threshold | Instance remains available at reduced cost; billed at the idle elastic instance rate |
| Destroyed | No new requests arrive during the delayed release duration | Billing stops |
When an idle instance receives a new request, it wakes up within milliseconds — avoiding cold start latency.
If no new requests arrive during the delayed release duration, the instance is destroyed and billing stops. After destruction:
If minimum instances = 0, the next request triggers a cold start.
If minimum instances > 0, cold starts are eliminated.
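The state rules above can be read as a small decision function. The following Python sketch is illustrative only: the threshold value, `classify_instance`, and its parameter names are assumptions made for this example, because this topic does not state the actual system threshold. It also assumes that utilization exactly equal to the threshold counts as idle.

```python
from enum import Enum


class InstanceState(Enum):
    ACTIVE = "active"
    IDLE = "idle"
    DESTROYED = "destroyed"


# Hypothetical value: the real system threshold is internal to Function Compute.
ASSUMED_THRESHOLD = 0.01


def classify_instance(vcpu_util, gpu_util, minutes_since_last_request,
                      delayed_release_minutes, threshold=ASSUMED_THRESHOLD):
    """Return the state an elastic instance would be in under the rules above."""
    # No new request within the delayed release duration: the instance is destroyed.
    if minutes_since_last_request >= delayed_release_minutes:
        return InstanceState.DESTROYED
    # Either resource above the threshold keeps the instance active.
    if vcpu_util > threshold or gpu_util > threshold:
        return InstanceState.ACTIVE
    # Otherwise both resources are at or below the threshold: the instance is idle.
    return InstanceState.IDLE
```

For example, `classify_instance(0.5, 0.0, 2, 9)` returns `InstanceState.ACTIVE`, because vCPU utilization alone is enough to keep the instance active.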
Constraints
Delayed release applies only to elastic instances — those dynamically created in response to requests.
This feature does not affect minimum instances. Minimum instances have their own lifecycle management and billing rules.
Valid delayed release duration: 5 minutes to 60 minutes (inclusive).
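If you generate function configurations from scripts, it can help to check the duration locally before you deploy. The sketch below is one possible client-side check. The function and constant names are hypothetical and are not part of any Function Compute SDK.

```python
# Documented bounds for the delayed release duration, in minutes (inclusive).
MIN_DELAYED_RELEASE_MINUTES = 5
MAX_DELAYED_RELEASE_MINUTES = 60


def validate_delayed_release_minutes(minutes):
    """Return minutes unchanged if it is within the documented range, else raise."""
    if not MIN_DELAYED_RELEASE_MINUTES <= minutes <= MAX_DELAYED_RELEASE_MINUTES:
        raise ValueError(
            f"Delayed release duration must be between "
            f"{MIN_DELAYED_RELEASE_MINUTES} and {MAX_DELAYED_RELEASE_MINUTES} "
            f"minutes (inclusive), got {minutes}"
        )
    return minutes
```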
Choose a keepalive strategy
| | Delayed release | Session affinity | Delayed release + session affinity |
|---|---|---|---|
| Scenario | Background services | Services with long-lifecycle sessions | Background services and persistent session connections |
| Instance keepalive duration | 5 minutes to 60 minutes (configured) | Latest session expiration time on a single instance | The greater of: (1) the delayed release duration, or (2) the maximum Session Idle Duration across all sessions on a single instance |
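The combined rule in the last column reduces to taking a maximum. The following sketch shows that arithmetic for a single instance. `combined_keepalive_minutes` and its parameters are hypothetical names chosen for this example, and all durations are assumed to be in minutes.

```python
def combined_keepalive_minutes(delayed_release_minutes, session_idle_minutes):
    """Keepalive when delayed release and session affinity are both configured.

    session_idle_minutes is an iterable of Session Idle Duration values, one
    per session currently bound to the instance.
    """
    # Longest Session Idle Duration on this instance; 0 if there are no sessions.
    longest_session_idle = max(session_idle_minutes, default=0)
    # The instance is kept for whichever duration is greater.
    return max(delayed_release_minutes, longest_session_idle)
```

With a 9-minute delayed release duration and one session whose Session Idle Duration is 15 minutes, `combined_keepalive_minutes(9, [15])` returns 15.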
Configure delayed release for elastic instances
Configure the delayed release feature when you create a function, or follow these steps for an existing function.
Prerequisites
Before you begin, ensure that you have:
Access to the Function Compute console
An existing function to configure
Enable delayed release
Log on to the Function Compute console. In the left navigation pane, choose Function Management > Functions.
In the top navigation bar, select a region. On the Functions page, click the name of the target function.
Click the Function Details > Configuration tab, then click Edit next to Advanced Configuration.
In the Advanced Configuration panel, expand Delayed Release for Elastic Instances. Turn on the Delayed Release for Elastic Instances switch, set Delayed Release Time, then click Deploy.
(Optional) Configure session affinity
This topic uses the HeaderField affinity feature as an example.
Verify the result
On the function details page, click the Code tab and click Test Function.
After the function executes, click the Instances tab and check the Lifecycle column to confirm that delayed release is active.
Billing
The billing rules in this section apply only to elastic instances created dynamically in response to requests, and are independent of billing for minimum instances.
Note: Idle instances still incur charges. Billing stops only after the instance is destroyed.
Billing by instance state
| Instance state | Trigger condition | Billing rate |
|---|---|---|
| Active | vCPU or GPU utilization > system threshold | Active elastic instance rate |
| Idle | Both vCPU and GPU utilization < system threshold | Idle elastic instance rate |
| Destroyed | No new requests during the delayed release duration | No charge |
Billing by instance type
Minimum instances: These instances are always running, regardless of whether there are requests. They are billed at the idle elastic instance rate when there are no requests, and at the active elastic instance rate when processing requests.
Instances with delayed release: After requests are processed, these instances enter an active, idle, or destroyed state based on the rules described in this topic.
Scenario 1: Only delayed release configured
Example: Delayed release duration = 9 minutes.
The instance finishes processing requests at 00:05. Because no new requests arrive, it is destroyed at 00:14 (9 minutes after the last request completed).
| Period | Trigger condition | Billing |
|---|---|---|
| 00:00–00:05 | Processing requests | Active elastic instance rate |
| 00:05–00:09 | vCPU or GPU utilization > threshold — instance stays active | Active elastic instance rate |
| 00:09–00:14 | Both vCPU and GPU utilization < threshold — instance switches to idle | Idle elastic instance rate |
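To estimate the charge for a timeline like this one, total the minutes spent in each state and multiply by the matching rate. The sketch below encodes the Scenario 1 periods. The two rates are placeholder numbers in arbitrary units, not real prices, and the function names are hypothetical.

```python
# Placeholder per-minute rates in arbitrary units. These are NOT real prices;
# look up actual rates in the Function Compute billing documentation.
ASSUMED_ACTIVE_RATE = 1.0
ASSUMED_IDLE_RATE = 0.2

# (start_minute, end_minute, state) for each period in Scenario 1.
SCENARIO_1 = [
    (0, 5, "active"),   # processing requests
    (5, 9, "active"),   # utilization above the threshold after the last request
    (9, 14, "idle"),    # utilization below the threshold until destruction
]


def billed_minutes(periods):
    """Total the minutes billed in each state."""
    totals = {"active": 0, "idle": 0}
    for start, end, state in periods:
        totals[state] += end - start
    return totals


def estimated_cost(periods, active_rate=ASSUMED_ACTIVE_RATE,
                   idle_rate=ASSUMED_IDLE_RATE):
    """Estimate the charge for a timeline under the assumed rates."""
    totals = billed_minutes(periods)
    return totals["active"] * active_rate + totals["idle"] * idle_rate
```

For Scenario 1, `billed_minutes(SCENARIO_1)` gives 9 active minutes and 5 idle minutes. Nothing is billed after 00:14, when the instance is destroyed.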
Scenario 2: Delayed release and session affinity both configured
Example:
Delayed release duration = 9 minutes
HeaderField affinity enabled, Session Idle Duration = 15 minutes
The system uses 15 minutes as the keepalive period, because it takes the greater of the delayed release duration (9 minutes) and the Session Idle Duration (15 minutes). As in Scenario 1, the instance finishes processing requests at 00:05, so it is destroyed at 00:20 (15 minutes later) if no new requests arrive.
Billing follows the same three-period pattern as Scenario 1, with the idle phase extended to reflect the 15-minute keepalive.