Configure delayed release for elastic instances

Elastic instances are destroyed shortly after they finish processing a request. If your function runs background tasks after the response is sent — such as uploading logs or syncing data — those tasks are interrupted when the instance is terminated, which can cause data loss.

The delayed release feature keeps an elastic instance alive for a configurable duration after it processes its last request, giving background tasks time to finish. During this period, Function Compute automatically transitions the instance between states based on resource utilization to balance performance and cost.

How it works

After the last request completes, Function Compute holds the instance for the configured delayed release duration and automatically moves it between the following states:

State	Trigger condition	Effect
Active	vCPU or GPU utilization (stream processors and decoders) exceeds the system threshold	Background tasks continue running; billed at the active elastic instance rate
Idle	Both vCPU and GPU utilization drop below the system threshold	Instance remains available at reduced cost; billed at the idle elastic instance rate
Destroyed	No new requests arrive during the delayed release duration	Billing stops

When an idle instance receives a new request, it wakes up within milliseconds — avoiding cold start latency.

If no new requests arrive during the delayed release duration, the instance is destroyed and billing stops. After destruction:

If minimum instances = 0, the next request triggers a cold start.
If minimum instances > 0, the minimum instances can serve the next request without a cold start.
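
The state rules above can be summarized as a small decision function. The following Python sketch is illustrative only and is not part of any Function Compute SDK. The function name, the single shared `threshold` value, and the handling of utilization exactly at the threshold are assumptions, because the actual system threshold is internal to Function Compute.

```python
from enum import Enum


class InstanceState(Enum):
    ACTIVE = "active"
    IDLE = "idle"
    DESTROYED = "destroyed"


def classify_instance(vcpu_util, gpu_util, minutes_since_last_request,
                      delayed_release_minutes, threshold):
    """Return the state an elastic instance would be in (illustrative model)."""
    # No new request arrived within the delayed release duration.
    if minutes_since_last_request >= delayed_release_minutes:
        return InstanceState.DESTROYED
    # Either resource above the threshold keeps the instance active.
    if vcpu_util > threshold or gpu_util > threshold:
        return InstanceState.ACTIVE
    # Assumption: utilization at or below the threshold counts as idle.
    return InstanceState.IDLE


# Hypothetical threshold of 0.10 (10% utilization), chosen only for the demo.
print(classify_instance(0.40, 0.00, 2, 9, threshold=0.10))  # InstanceState.ACTIVE
print(classify_instance(0.02, 0.01, 7, 9, threshold=0.10))  # InstanceState.IDLE
print(classify_instance(0.02, 0.01, 9, 9, threshold=0.10))  # InstanceState.DESTROYED
```

The elapsed-time check comes first because destruction depends only on the absence of new requests, not on utilization.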

Constraints

Delayed release applies only to elastic instances — those dynamically created in response to requests.
This feature does not affect minimum instances. Minimum instances have their own lifecycle management and billing rules.
Valid delayed release duration: 5 minutes to 60 minutes (inclusive).
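
If you manage function settings from scripts, it can help to check the duration before you submit it. The following is a minimal sketch of such a check. The helper name and the use of whole minutes are assumptions for illustration; only the 5 to 60 minute range comes from this topic.

```python
MIN_DELAY_MINUTES = 5    # lower bound stated in this topic (inclusive)
MAX_DELAY_MINUTES = 60   # upper bound stated in this topic (inclusive)


def validate_delayed_release_minutes(minutes):
    """Return the value unchanged if it is within the valid range."""
    if not MIN_DELAY_MINUTES <= minutes <= MAX_DELAY_MINUTES:
        raise ValueError(
            f"Delayed release duration must be between {MIN_DELAY_MINUTES} "
            f"and {MAX_DELAY_MINUTES} minutes, got {minutes}"
        )
    return minutes


print(validate_delayed_release_minutes(9))   # 9
print(validate_delayed_release_minutes(60))  # 60
```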

Choose a keepalive strategy

Item	Delayed release	Session affinity	Delayed release + session affinity
Scenario	Background services	Services with long-lifecycle sessions	Background services and persistent session connections
Instance keepalive duration	5 minutes to 60 minutes (configured)	Latest session expiration time on a single instance	The greater of: (1) the delayed release duration, or (2) the maximum Session Idle Duration across all sessions on a single instance
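
The combined rule in the last column can be expressed as a single `max` computation. The sketch below assumes that all durations are given in minutes and that `session_idle_minutes` lists the Session Idle Duration of every session on one instance. The function name is hypothetical.

```python
def combined_keepalive_minutes(delayed_release_minutes, session_idle_minutes):
    """Keepalive when delayed release and session affinity are both enabled."""
    # Longest Session Idle Duration on the instance; 0 if it has no sessions.
    longest_session = max(session_idle_minutes, default=0)
    # The instance is kept for whichever period is longer.
    return max(delayed_release_minutes, longest_session)


print(combined_keepalive_minutes(9, [15]))     # 15
print(combined_keepalive_minutes(9, [3, 5]))   # 9
print(combined_keepalive_minutes(9, []))       # 9
```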

Configure delayed release for elastic instances

Configure the delayed release feature when you create a function, or follow these steps for an existing function.

Prerequisites

Before you begin, ensure that you have:

Access to the Function Compute console
An existing function to configure

Enable delayed release

Log on to the Function Compute console. In the left navigation pane, choose Function Management > Functions.
In the top navigation bar, select a region. On the Functions page, click the name of the target function.
Click the Function Details > Configuration tab, then click Edit next to Advanced Configuration.
In the Advanced Configuration panel, expand Delayed Release for Elastic Instances. Turn on the Delayed Release for Elastic Instances switch, set Delayed Release Time, then click Deploy.

(Optional) Configure session affinity

This topic uses the HeaderField affinity feature as an example.

Verify the result

On the function details page, click the Code tab and click Test Function.
After the function executes, click the Instances tab and check the Lifecycle column to confirm that delayed release is active.

Billing

The billing rules in this section apply only to elastic instances created dynamically in response to requests, and are independent of billing for minimum instances.

Note: Idle instances still incur charges. Billing stops only after the instance is destroyed.

Billing by instance state

Instance state	Trigger condition	Billing rate
Active	vCPU or GPU utilization > system threshold	Active elastic instance rate
Idle	Both vCPU and GPU utilization < system threshold	Idle elastic instance rate
Destroyed	No new requests during the delayed release duration	No charge

Billing by instance type

Minimum instances: These instances are always running, regardless of whether there are requests. They are billed at the idle elastic instance rate when there are no requests, and at the active elastic instance rate when processing requests.
Instances with delayed release: After requests are processed, these instances enter an active, idle, or destroyed state based on the rules described in this topic.

Scenario 1: Only delayed release configured

Example: Delayed release duration = 9 minutes.

The instance finishes processing requests at 00:05. Because no new requests arrive, it is destroyed at 00:14 (9 minutes after the last request completed).

Period	Trigger condition	Billing
00:00–00:05	Processing requests	Active elastic instance rate
00:05–00:09	vCPU or GPU utilization > threshold — instance stays active	Active elastic instance rate
00:09–00:14	Both vCPU and GPU utilization < threshold — instance switches to idle	Idle elastic instance rate
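
The timeline above can be checked with a short calculation. The following sketch measures time in minutes from 00:00 and uses placeholder rates of 2.0 and 0.5 per minute. These rates are not real Function Compute prices, and the function name is hypothetical; substitute the rates from the billing documentation for an actual estimate.

```python
def billed_minutes(request_end, background_end, delayed_release):
    """Split an instance's lifetime into active and idle minutes.

    All arguments are minutes counted from 00:00.
    """
    destroyed_at = request_end + delayed_release
    # The instance is active until background work ends or it is destroyed.
    active = min(background_end, destroyed_at)
    idle = destroyed_at - active
    return {"active": active, "idle": idle, "destroyed_at": destroyed_at}


ACTIVE_RATE = 2.0  # placeholder rate per minute, not a real price
IDLE_RATE = 0.5    # placeholder rate per minute, not a real price

usage = billed_minutes(request_end=5, background_end=9, delayed_release=9)
cost = usage["active"] * ACTIVE_RATE + usage["idle"] * IDLE_RATE

print(usage)  # {'active': 9, 'idle': 5, 'destroyed_at': 14}
print(cost)   # 20.5
```

The result matches the table: 9 active minutes (00:00 to 00:09), 5 idle minutes (00:09 to 00:14), and destruction at 00:14.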

Scenario 2: Delayed release and session affinity both configured

Example:

Delayed release duration = 9 minutes
HeaderField affinity enabled, Session Idle Duration = 15 minutes

The system uses 15 minutes as the keepalive period, which is the greater of the delayed release duration (9 minutes) and the Session Idle Duration (15 minutes). Assuming the last request completes at 00:05, as in Scenario 1, the instance is destroyed at 00:20 if no new requests arrive.

Billing follows the same three-period pattern as Scenario 1, with the idle phase extended to reflect the 15-minute keepalive.
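
The destruction time in this scenario can be reproduced as follows. The sketch assumes that the last request completes at 00:05, as in Scenario 1, and uses an arbitrary calendar date. The function name is hypothetical.

```python
from datetime import datetime, timedelta


def destruction_time(last_request_end, delayed_release_minutes,
                     session_idle_minutes):
    """Return when the instance is destroyed if no new requests arrive."""
    # The keepalive period is the greater of the two configured durations.
    keepalive = max(delayed_release_minutes, session_idle_minutes)
    return last_request_end + timedelta(minutes=keepalive)


# Assumption: last request completes at 00:05; the date itself is arbitrary.
last_request_end = datetime(2024, 1, 1, 0, 5)
destroyed_at = destruction_time(last_request_end, 9, 15)

print(destroyed_at.strftime("%H:%M"))  # 00:20
```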

What's next

Configure launch snapshot and auto scaling rules