Since 00:00 on June 14, 2024, Function Compute has supported idle billing for GPU usage. When a provisioned GPU-accelerated instance is waiting between requests, its GPU resources are frozen and billed at USD 0.000007 per compute unit (CU), about 61% less than the active rate of USD 0.000018/CU. This reduces costs for AI inference workloads with bursty or periodic traffic, where instances spend much of their time idle between invocations.
How it works
Enable idle mode when you configure a provisioned instance policy for your function. Once idle mode is enabled, GPU resources are frozen whenever no requests are being processed. The instance then enters the idle state and is billed at the idle rate.
For setup instructions, see Configure provisioned instances. For pricing details, see Billing overview.
Billing example
The following example shows the cost difference for a Stable Diffusion application running on a 16 GB Tesla GPU card, with provisioned instances reserved for 1 hour. The function receives 1,800 invocations during that hour, each taking 1 second to complete, leaving 1,800 seconds of idle time.
Summary
| Item | Idle mode off | Idle mode on |
|---|---|---|
| Active GPU fee | USD 1.0368 | USD 0.5184 |
| Idle GPU fee | USD 0 | USD 0.2016 |
| Total | USD 1.0368 | USD 0.72 |
| Savings | -- | ~30.6% |
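The savings figure depends only on the share of provisioned time spent idle and on the two unit prices, because the GPU memory size cancels out. The following is a minimal sketch of that relationship, assuming all usage is billed at the single active rate quoted in this document. The helper name `savings_fraction` is hypothetical and not part of any Function Compute SDK.

```python
from decimal import Decimal

# Unit prices quoted in this document (USD per CU).
ACTIVE_PRICE = Decimal("0.000018")
IDLE_PRICE = Decimal("0.000007")


def savings_fraction(active_seconds: int, provisioned_seconds: int) -> Decimal:
    """Fraction of the GPU fee saved by enabling idle mode (hypothetical helper)."""
    if provisioned_seconds <= 0 or not 0 <= active_seconds <= provisioned_seconds:
        raise ValueError("require 0 <= active_seconds <= provisioned_seconds, with provisioned_seconds > 0")
    # Share of the provisioned time that is billed at the idle rate.
    idle_share = Decimal(provisioned_seconds - active_seconds) / Decimal(provisioned_seconds)
    # Each idle second saves the difference between the active and idle rates.
    return idle_share * (ACTIVE_PRICE - IDLE_PRICE) / ACTIVE_PRICE


# The example in this section: 1,800 s active out of 3,600 s provisioned.
percent = round(savings_fraction(1800, 3600) * 100, 1)
print(f"Savings: ~{percent}%")
```

With the example's 50% idle share, this gives roughly 30.6%. An instance that is busy for the whole provisioned duration saves nothing.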
Idle mode off
Without idle mode, GPU-accelerated instances stay fully active for the entire provisioned duration, regardless of whether requests are being processed.
| Item | Calculation | Result |
|---|---|---|
| Active GPU usage | 16 GB x 3,600 s | 57,600 CUs (Tier 0) |
| Fee | USD 0.000018/CU x 57,600 CUs | USD 1.0368 |
Fee = Tier 0 unit price × Usage = USD 0.000018/CU × 57,600 CUs = USD 1.0368
For Tier 0 pricing details, see Billing overview.
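As a quick check of the calculation above, here is a minimal sketch that computes the fee when idle mode is off. It assumes the Tier 0 rate quoted in this document applies to all usage. The function name `fee_without_idle_mode` is illustrative only.

```python
from decimal import Decimal

# Tier 0 active rate quoted in this document (USD per CU).
ACTIVE_PRICE = Decimal("0.000018")


def fee_without_idle_mode(gpu_memory_gb: int, provisioned_seconds: int) -> Decimal:
    """GPU fee when the instance is billed as active for the whole duration."""
    # 1 CU equals 1 GB-second, so usage is memory size times duration.
    usage_cus = gpu_memory_gb * provisioned_seconds
    return ACTIVE_PRICE * usage_cus


# 16 GB GPU provisioned for 1 hour.
fee = fee_without_idle_mode(16, 3600)
print(f"USD {fee.normalize()}")
```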
Idle mode on
With idle mode, billing splits into active usage (while processing requests) and idle usage (while waiting).
Active duration: 1,800 s (1,800 invocations x 1 s/invocation)
Idle duration: 1,800 s (3,600 s provisioned duration - 1,800 s active duration)
| Item | Calculation | Result |
|---|---|---|
| Active GPU usage | 16 GB x 1,800 s | 28,800 CUs (Tier 0) |
| Idle GPU usage | 16 GB x 1,800 s | 28,800 CUs |
| Active GPU fee | USD 0.000018/CU x 28,800 CUs | USD 0.5184 |
| Idle GPU fee | USD 0.000007/CU x 28,800 CUs | USD 0.2016 |
| Total fee | USD 0.5184 + USD 0.2016 | USD 0.72 |
Fee = (Tier 0 unit price × Active GPU usage) + (Idle GPU unit price × Idle GPU usage) = (USD 0.000018/CU × 28,800 CUs) + (USD 0.000007/CU × 28,800 CUs) = USD 0.72. Idle mode is now called shallow hibernation.
For Tier 0 pricing details, see Billing overview.
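The split between active and idle billing can be sketched as follows. This is an illustration under the assumptions of this example (requests are processed one at a time, and all active usage falls in Tier 0), not an official billing implementation. The name `fee_with_idle_mode` is hypothetical.

```python
from decimal import Decimal

# Unit prices quoted in this document (USD per CU).
ACTIVE_PRICE = Decimal("0.000018")
IDLE_PRICE = Decimal("0.000007")


def fee_with_idle_mode(gpu_memory_gb: int, active_seconds: int,
                       provisioned_seconds: int) -> tuple[Decimal, Decimal, Decimal]:
    """Return (active fee, idle fee, total fee) in USD with idle mode on."""
    if not 0 <= active_seconds <= provisioned_seconds:
        raise ValueError("active_seconds must be between 0 and provisioned_seconds")
    # Time not spent processing requests is billed at the idle rate.
    idle_seconds = provisioned_seconds - active_seconds
    # 1 CU equals 1 GB-second.
    active_fee = ACTIVE_PRICE * (gpu_memory_gb * active_seconds)
    idle_fee = IDLE_PRICE * (gpu_memory_gb * idle_seconds)
    return active_fee, idle_fee, active_fee + idle_fee


# 1,800 invocations x 1 s each on a 16 GB GPU provisioned for 3,600 s.
active, idle, total = fee_with_idle_mode(16, 1800 * 1, 3600)
print(f"active={active.normalize()} idle={idle.normalize()} total={total.normalize()}")
```

Using `Decimal` rather than floating-point values keeps the small per-CU prices exact, so the totals match the table to the last digit.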
GPU compute unit mapping
Function Compute provides GPU-accelerated instances in two GPU series. For both series, 1 CU equals 1 GB-second.
| GPU card type | CU | GB-second |
|---|---|---|
| Tesla series | 1 | 1 |
| Ampere series | 1 | 1 |
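The mapping above can be expressed as a small lookup, shown here as a sketch. The dictionary keys and the function name `compute_units` are illustrative choices, not identifiers from Function Compute.

```python
# CUs charged per GB-second for each GPU series, per the table above.
CU_PER_GB_SECOND = {"tesla": 1, "ampere": 1}


def compute_units(series: str, gpu_memory_gb: int, seconds: int) -> int:
    """Convert GPU memory size and duration into compute units (CUs)."""
    try:
        factor = CU_PER_GB_SECOND[series.lower()]
    except KeyError:
        raise ValueError(f"unknown GPU series: {series!r}") from None
    return factor * gpu_memory_gb * seconds


# A 16 GB Tesla series GPU used for one hour.
print(compute_units("Tesla", 16, 3600))
```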