Savings plans and resource plans


Model Studio offers savings plans and resource plans to help you reduce model costs.

Plan selection

Use this guide to select the right plan.

  • AI General-purpose Savings Plan (Recommended): Commit to a monthly spending amount in exchange for tiered discounts of up to 47%. This plan applies to all models provided by Alibaba Cloud and offers maximum flexibility, making it the best choice for most use cases.

  • Savings plans for other models: Purchase a fixed amount upfront to offset call fees for specific model families. This option applies only to select model families (such as the speech model family) and typically offers smaller discounts than the AI General-purpose Savings Plan.

  • Resource plan: Purchase a specific quantity of resources upfront, such as tokens or image generations. This option applies only to a single, specific model, such as qwen-plus, and typically offers a smaller discount than the AI General-purpose Savings Plan.

To maximize cost-effectiveness, start with the AI General-purpose Savings Plan.

AI General-purpose Savings Plan

Key benefits

The AI General-purpose Savings Plan offers discounts on pay-as-you-go model calls. Commit to a monthly spend for a set term (3, 6, 12, or 24 months) to receive tiered discounts while keeping pay-as-you-go flexibility.

  • Comprehensive coverage: A single purchase covers all models provided by Alibaba Cloud.

  • Significant cost optimization: Higher spending commitments and longer terms unlock discounts of up to 47%.

  • Simple management: The plan takes effect immediately or at a scheduled time after purchase. Discounts apply automatically without manual activation or binding. Auto-renewal is available.

Usage

Effective time: Choose either "Immediately after ordering" or "Specified time (by hour)".

Calling methods: The plan applies to calls made through the Model Studio console playground, API calls, and third-party tools such as Cursor and Cline that use pay-as-you-go API credentials.

Commitment cycle: The plan operates on a monthly cycle, from the effective date to the corresponding date of the following month. Unused commitment expires at the end of each cycle and does not roll over. For example, a 3-month Savings Plan with a CNY 1,000 monthly spend commitment provides a new CNY 1,000 quota each month, not a total of CNY 3,000 for the entire period.
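The no-rollover rule can be sketched in a few lines of Python. This is an illustration only, not Model Studio's billing logic: the function name `split_by_cycle` and the sample usage figures are hypothetical, and each usage figure is assumed to be the amount that would be drawn from the commitment in that cycle.

```python
from decimal import Decimal

def split_by_cycle(monthly_commitment, usage_per_cycle):
    """For each monthly cycle, return (covered, uncovered, expired_unused).

    covered        -> usage absorbed by that cycle's commitment
    uncovered      -> usage beyond the commitment (falls to pay-as-you-go)
    expired_unused -> commitment left over, which expires and never rolls over
    """
    results = []
    for usage in usage_per_cycle:
        covered = min(usage, monthly_commitment)
        uncovered = usage - covered
        expired_unused = monthly_commitment - covered
        results.append((covered, uncovered, expired_unused))
    return results

# Hypothetical 3-month plan with a CNY 1,000 monthly commitment.
commitment = Decimal("1000")
usage = [Decimal("600"), Decimal("1400"), Decimal("1000")]
for cycle, row in enumerate(split_by_cycle(commitment, usage), start=1):
    covered, uncovered, expired_unused = row
    print(f"cycle {cycle}: covered={covered} uncovered={uncovered} expired={expired_unused}")
```

In this sketch the CNY 400 left unused in the first cycle does not offset the CNY 400 excess in the second cycle, which is the practical effect of the monthly reset.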

Coverage:

  • Covered usage: Fees for model calls (input and output tokens), model-native tool calling (such as function calling and web extractor), context cache, and batch inference.

  • Not covered: Fees for model fine-tuning and model deployment.

Deduction logic:

  • Deduction order: free quota > resource plan > other model-specific Savings Plans > AI General-purpose Savings Plan > pay-as-you-go.

  • Multiple Savings Plans of the same type: The plan expiring first is applied first. If the expiration dates are the same, the plan purchased first is applied first.

  • Excess usage: If all matching Savings Plans expire or are exhausted, any remaining usage is billed at pay-as-you-go rates.
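The ordering rule for multiple Savings Plans of the same type amounts to a two-key sort. The sketch below assumes a hypothetical `Plan` record with made-up names and dates; it shows only the ordering, not the deduction itself.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Plan:
    name: str        # hypothetical label for illustration
    purchased: date
    expires: date

def application_order(plans):
    """Earliest expiration first; equal expirations fall back to earliest purchase."""
    return sorted(plans, key=lambda p: (p.expires, p.purchased))

plans = [
    Plan("plan-a", purchased=date(2026, 1, 10), expires=date(2026, 7, 10)),
    Plan("plan-b", purchased=date(2026, 2, 1), expires=date(2026, 5, 1)),
    Plan("plan-c", purchased=date(2026, 1, 5), expires=date(2026, 7, 10)),
]
print([p.name for p in application_order(plans)])
```

Here plan-b is applied first because it expires first, and plan-c precedes plan-a because both expire on the same day but plan-c was purchased earlier.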

View bills: See How to query Savings Plan bills.

Purchase guide

Purchase

Purchase an AI General-purpose Savings Plan

Regions

China (Beijing), US (Virginia), Singapore, and Germany (Frankfurt)

Eligible services

Discounts vary by tier.

  • Category A: Qwen (excluding qwen3.6-max-preview), Qwen-open-source, text embedding, multimodal embedding, reranking, domain-specific models, and model-native tool calling (such as function calling and web extractor)

  • Category B: Image generation, speech synthesis, speech recognition and translation, and video generation and editing

  • Category C: qwen3.6-max-preview, DeepSeek, Kimi, GLM, MiniMax, and HappyHorse

    Savings Plans do not apply to models provided directly by third parties. See Do third-party direct-supply models qualify for AI General-purpose Savings Plans?

Monthly spend commitment

The amount you commit to spend each month on pay-as-you-go model services. The minimum is CNY 1,000, and you can raise it in increments of CNY 10. There is no upper limit.
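As a rough sketch, the purchase constraints can be checked before placing an order. The helper names below are hypothetical, and two points are assumptions rather than documented behavior: that "increments of CNY 10" means the amount must be a multiple of 10, and that the total commitment over the term is the monthly amount multiplied by the number of months.

```python
ALLOWED_TERMS_MONTHS = (3, 6, 12, 24)
MIN_MONTHLY_COMMITMENT = 1_000  # CNY
STEP = 10                       # CNY, assumed to mean "multiple of 10"

def validate_commitment(monthly_commitment: int) -> None:
    """Raise ValueError if the amount breaks the minimum or the step rule."""
    if monthly_commitment < MIN_MONTHLY_COMMITMENT:
        raise ValueError("monthly commitment must be at least CNY 1,000")
    if monthly_commitment % STEP != 0:
        raise ValueError("monthly commitment must be a multiple of CNY 10")

def total_commitment(monthly_commitment: int, term_months: int) -> int:
    """Total committed spend over the term (assumption: monthly amount x months)."""
    validate_commitment(monthly_commitment)
    if term_months not in ALLOWED_TERMS_MONTHS:
        raise ValueError("term must be 3, 6, 12, or 24 months")
    return monthly_commitment * term_months

print(total_commitment(1_000, 3))
```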

Commitment cycle

You can choose a commitment cycle of 3, 6, 12, or 24 months.

Offering type

  • All upfront: Pay the full amount for the entire commitment cycle at once to receive the maximum discount.

  • No upfront: No payment is required at purchase. Instead, you are billed each month for your monthly spend commitment. To use the No upfront option, contact your account manager to be added to the allowlist.

Discounts

See Discount information.

Activation time

You can activate the plan immediately after purchase or schedule it to start at a specific hour.

Discount

Discounts vary by model, commitment tier, term, and payment method.

For example, if you choose a 12-month Savings Plan with a monthly commitment of CNY 10,000 and pay all upfront, you get a 20% discount on Category A Qwen text generation model calls. A model call that would normally cost CNY 1 then uses only CNY 0.80 of your commitment.

The commitment ranges in the following tables are inclusive of the lower bound and exclusive of the upper bound. For example, 1,000 - 5,000 means an amount greater than or equal to 1,000 and less than 5,000.

In the tables below, columns A, B, and C refer to Category A, Category B, and Category C. Category C has a single value that applies to the entire term.

Payment option: All upfront

| Monthly spend range (CNY) | A: 3 months | A: 6 months | A: 12 months | A: 24 months | B: 3 months | B: 6 months | B: 12 months | B: 24 months | C: entire term |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1,000 - 5,000 | 12% off | 14% off | 16% off | 18% off | 17% off | 20% off | 23% off | 26% off | No discount |
| 5,000 - 10,000 | 14% off | 16% off | 18% off | 20% off | 20% off | 23% off | 26% off | 29% off | No discount |
| 10,000 - 30,000 | 16% off | 18% off | 20% off | 22% off | 23% off | 26% off | 29% off | 32% off | No discount |
| 30,000 - 50,000 | 18% off | 20% off | 22% off | 24% off | 26% off | 29% off | 32% off | 35% off | No discount |
| 50,000 - 100,000 | 20% off | 22% off | 24% off | 26% off | 29% off | 32% off | 35% off | 38% off | No discount |
| 100,000 - 300,000 | 22% off | 24% off | 26% off | 28% off | 32% off | 35% off | 38% off | 41% off | No discount |
| 300,000 - 1,000,000 | 24% off | 26% off | 28% off | 30% off | 35% off | 38% off | 41% off | 44% off | No discount |
| 1,000,000 and above | 26% off | 28% off | 30% off | 32% off | 38% off | 41% off | 44% off | 47% off | No discount |

Payment option: No upfront

Contact your account manager to enable this.

| Monthly spend range (CNY) | A: 3 months | A: 6 months | A: 12 months | A: 24 months | B: 3 months | B: 6 months | B: 12 months | B: 24 months | C: entire term |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1,000 - 5,000 | 10% off | 12% off | 14% off | 16% off | 15% off | 18% off | 21% off | 24% off | No discount |
| 5,000 - 10,000 | 12% off | 14% off | 16% off | 18% off | 18% off | 21% off | 24% off | 27% off | No discount |
| 10,000 - 30,000 | 14% off | 16% off | 18% off | 20% off | 21% off | 24% off | 27% off | 30% off | No discount |
| 30,000 - 50,000 | 16% off | 18% off | 20% off | 22% off | 24% off | 27% off | 30% off | 33% off | No discount |
| 50,000 - 100,000 | 18% off | 20% off | 22% off | 24% off | 27% off | 30% off | 33% off | 36% off | No discount |
| 100,000 - 300,000 | 20% off | 22% off | 24% off | 26% off | 30% off | 33% off | 36% off | 39% off | No discount |
| 300,000 - 1,000,000 | 22% off | 24% off | 26% off | 28% off | 33% off | 36% off | 39% off | 42% off | No discount |
| 1,000,000 and above | 24% off | 26% off | 28% off | 30% off | 36% off | 39% off | 42% off | 45% off | No discount |
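The range rule (lower bound inclusive, upper bound exclusive) and the commitment arithmetic from the CNY 1 example can be sketched as a small lookup. The percentages below are the Category A, all-upfront, 12-month values listed above; the function names are hypothetical and the code is an illustration, not the billing system's implementation.

```python
from bisect import bisect_right
from decimal import Decimal

# Lower bounds of the monthly spend ranges (CNY) and the matching
# Category A, all-upfront, 12-month discount percentages.
LOWER_BOUNDS = [1_000, 5_000, 10_000, 30_000, 50_000, 100_000, 300_000, 1_000_000]
DISCOUNT_PCT = [16, 18, 20, 22, 24, 26, 28, 30]

def discount_pct(monthly_commitment: int) -> int:
    """Find the range whose lower bound is <= the commitment (half-open ranges)."""
    if monthly_commitment < LOWER_BOUNDS[0]:
        raise ValueError("below the CNY 1,000 minimum commitment")
    return DISCOUNT_PCT[bisect_right(LOWER_BOUNDS, monthly_commitment) - 1]

def commitment_used(list_price: Decimal, pct: int) -> Decimal:
    """Amount drawn from the commitment for a call with the given list price."""
    return list_price * (100 - pct) / 100

pct = discount_pct(10_000)  # exactly 10,000 belongs to the 10,000 - 30,000 range
print(pct, commitment_used(Decimal("1"), pct))
```

Because `bisect_right` places an exact boundary value in the higher range, a commitment of exactly CNY 10,000 gets the 10,000 - 30,000 rate, while CNY 9,990 still gets the 5,000 - 10,000 rate.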

Lifecycle management

You can manage your savings plans on the Savings Plan Overview page.

Savings plan renewal

Sign in to the Expenses and Costs console. In the left-side navigation pane, choose Billing > My Subscriptions to view and manage the subscription status, effective date, and auto-renewal settings of your savings plan.

Query discounts

For the AI General-purpose Savings Plan, discounts vary by model, tier, commitment period, and payment method. Go to the Savings Plan Price Discount Details page and filter by the following criteria:

  • Applicable product: Select the corresponding product name from the table below.

  • Deductible billed item: Select the corresponding billed item from the table below.

  • Savings plan type: Select AI General-purpose Savings Plan/Model Studio AI General-purpose Savings Plan.

  • Subscription duration and payment method: Select the corresponding options to view the pay-as-you-go discount.

| Applicable product | Deductible billed item |
| --- | --- |
| Model Studio foundation model inference | Text: Text generation token usage |
| | Image: Image generation count, multi-specification image generation count, and image detection count |
| | Video: Video generation duration |
| | Voice: Speech synthesis character count, speech recognition duration, CosyVoice speech synthesis character count, and voice clone and voice design model count |
| | Vector: Multimodal vector model usage and text vector model usage |
| | Batch call: Batch model usage, BatchChat model usage, BatchChat token usage, and BatchChat video generation duration |
| | Tool call: Usage by call count |
| | The corresponding global usage for each billed item above. When querying discounts on call fees in the China (Beijing) region, select a non-global billed item. When querying discounts in other regions, select the corresponding global billed item. |
| Model Studio foundation model - Industry-specific model | Text generation token usage |
| Alibaba Cloud Model Studio - Vector ranking model | Multimodal vector model usage |
| Model Studio foundation model - Qwen-Audio model | Speech synthesis character count and speech recognition duration |
| Model Studio foundation model - Model Studio audio model | Speech synthesis character count and speech recognition duration |

Query bills

Go to the Expenses and Costs console. In the left-side navigation pane, choose Bills > Bill Details. Set Product to Alibaba Cloud Model Studio and Product Name to AI General-purpose Savings Plan. By default, bill details for the current month are displayed. See Query bills of savings plans.

Savings plans for other models

Compared to the AI General-purpose Savings Plan, these savings plans are better suited for lower usage volumes or workloads concentrated on a specific model family.

Usage notes

Effective date: Takes effect immediately after purchase.

Validity period: The validity period depends on the purchased plan. After a plan expires, any unused credit is forfeited and non-refundable.

Deduction scope: This plan covers fees for model calls (input and output tokens). It does not cover fees for tool calls, context cache, batch inference, model fine-tuning, or model deployment.

Deduction logic:

  • Deduction order: Free quota > Resource plan > Savings plans for other models > AI General-purpose Savings Plan > pay-as-you-go.

  • Multiple plans of the same type: The plan that expires first is applied first. If multiple plans share the same expiration date, the one purchased first is applied first.

  • Excess usage: If all applicable savings plans expire or their quotas are exhausted, any excess usage is automatically billed at pay-as-you-go rates.

View bills: See How to query savings plan bills.

Supported savings plans

Large language models

Purchase method

Purchase a savings plan for large language model inference

Tiers

Available tiers: CNY 20, CNY 100, CNY 1,000, CNY 5,000, CNY 10,000, CNY 20,000, CNY 50,000, CNY 100,000, CNY 200,000, CNY 300,000, and CNY 500,000.

Discount

These tiers carry no discount. Deductions are made at the standard model call price.

Validity period

  • The CNY 20 tier is valid for 1 month.

  • The CNY 100 tier is valid for 3 months.

  • The CNY 1,000 tier is valid for 6 months.

  • The CNY 5,000, CNY 10,000, CNY 20,000, CNY 50,000, CNY 100,000, CNY 200,000, CNY 300,000, and CNY 500,000 tiers are valid for 1 year.

Applicable region

China (Beijing)

Applicable models

This plan covers token-based text generation models on Model Studio. Applicable models:

  • General large language models:

    • Commercial editions: Qwen-Max, Qwen-Plus, Qwen-Flash, Qwen-Turbo, Qwen-7B-Chat, and Qwen-Long

    • Open source editions: Qwen-72B-Chat, Qwen-14B-Chat, Qwen-7B-Chat, Qwen-1.8B-Chat-Int4, Qwen-VL-Chat, CodeQwen1.5-7B-Chat, and Qwen-1.8B-Chat

    • Third-party models: DeepSeek, GLM, Kimi, and MiniMax

  • Multimodal models:

    • Commercial editions: Qwen-VL-Max, Qwen-VL-Max-Realtime, Qwen-Audio-Chat, Qwen-VL-Plus, and Qwen-OCR

    • Open source editions: Qwen-VL, Qwen-VL-V1.1, Qwen-VL-Chat, and Qwen-Audio-Chat

  • Domain-specific models: CodeQwen, Qwen-Translate, Qwen-Data-Mining, and Qwen-Deep-Research

Note

Embedding and reranking models are not covered by this plan. To deduct fees for these models, see AI General-purpose Savings Plan or Embedding and reranking models savings plan.

Qwen voice models

Purchase method

Purchase a Qwen voice model savings plan

Details

Model Studio offers five tiers:

  • CNY 20: 2% discount

  • CNY 100: 4% discount

  • CNY 500: 10% discount

  • CNY 1,000: 15% discount

  • CNY 5,000: 20% discount

Example: With the CNY 1,000 tier, you receive a 15% discount. If a task costs CNY 1, only CNY 0.85 is deducted from your savings plan.
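The tier arithmetic can be sketched as follows. The tier-to-discount mapping is copied from the list above; the function names are hypothetical, and `list_price_capacity` rests on the assumption that the tier's discount applies uniformly to every covered call.

```python
from decimal import Decimal, ROUND_HALF_UP

# Qwen voice model savings plan: tier (CNY) -> discount percentage.
VOICE_TIERS = {20: 2, 100: 4, 500: 10, 1000: 15, 5000: 20}

def deducted(list_price: Decimal, tier: int) -> Decimal:
    """Amount deducted from the plan for a task with the given list price."""
    return list_price * (100 - VOICE_TIERS[tier]) / 100

def list_price_capacity(tier: int) -> Decimal:
    """List-price usage the whole plan could absorb (assumes a uniform discount)."""
    value = Decimal(tier) * 100 / (100 - VOICE_TIERS[tier])
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(deducted(Decimal("1"), 1000))
print(list_price_capacity(1000))
```

Under that assumption, a CNY 1,000 plan at 15% off would absorb roughly CNY 1,176.47 of usage measured at list price.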

ASR models are billed per second, and TTS models are billed per character. You can view model call prices in the Model Studio console.

Validity period

  • The CNY 20 and CNY 100 tiers are valid for 6 months.

  • For the CNY 500, CNY 1,000, and CNY 5,000 tiers, you can choose a validity period of 6 or 12 months.

Applicable models

Applicable models vary by region:

  • China (Beijing):

    • Real-time speech synthesis (CosyVoice): cosyvoice-v1, cosyvoice-v2, cosyvoice-v3-flash, and cosyvoice-v3-plus

    • Real-time speech synthesis (Qwen-TTS-Realtime): qwen-tts-realtime, qwen-tts-realtime-2025-07-15, qwen-tts-realtime-latest, qwen3-tts-flash-realtime, and qwen3-tts-flash-realtime-2025-09-18

    • Speech synthesis (Qwen-TTS): qwen-tts, qwen-tts-2025-04-10, qwen-tts-2025-05-22, qwen-tts-latest, qwen3-tts-flash, and qwen3-tts-flash-2025-09-18

    • Real-time speech recognition (Paraformer): paraformer-realtime-v1, paraformer-realtime-v2, paraformer-realtime-8k-v1, and paraformer-realtime-8k-v2

    • Real-time speech recognition (Fun-ASR): fun-asr-realtime, fun-asr-realtime-2025-09-15, and fun-asr-realtime-2025-11-07

    • Real-time speech recognition (Qwen-ASR-Realtime): qwen3-asr-flash-realtime and qwen3-asr-flash-realtime-2025-10-27

    • Audio file recognition (Paraformer): paraformer-v1, paraformer-v2, paraformer-8k-v1, paraformer-8k-v2, and paraformer-mtl-v1

    • Audio file recognition (Fun-ASR): fun-asr, fun-asr-2025-08-25, fun-asr-2025-11-07, fun-asr-mtl, and fun-asr-mtl-2025-08-25

    • Audio file recognition (Qwen-ASR): qwen3-asr-flash, qwen3-asr-flash-2025-09-08, qwen3-asr-flash-filetrans, and qwen3-asr-flash-filetrans-2025-11-17

  • Singapore:

    • Real-time speech synthesis (Qwen-TTS-Realtime): qwen3-tts-flash-realtime and qwen3-tts-flash-realtime-2025-09-18

    • Speech synthesis (Qwen-TTS): qwen3-tts-flash and qwen3-tts-flash-2025-09-18

    • Real-time speech recognition (Qwen-ASR-Realtime): qwen3-asr-flash-realtime and qwen3-asr-flash-realtime-2025-10-27

    • Audio file recognition (Fun-ASR): fun-asr, fun-asr-2025-08-25, and fun-asr-2025-11-07

    • Audio file recognition (Qwen-ASR): qwen3-asr-flash, qwen3-asr-flash-2025-09-08, qwen3-asr-flash-filetrans, and qwen3-asr-flash-filetrans-2025-11-17

You can view a complete list of models in the Model Studio console.

Embedding and reranking models

Purchase method

Purchase a savings plan for embedding and reranking models

Details

Model Studio offers five tiers:

  • CNY 100: No discount

  • CNY 500: 10% discount

  • CNY 2,000: 20% discount

  • CNY 5,000: 25% discount

  • CNY 10,000: 30% discount

Example: With the CNY 5,000 tier, you receive a 25% discount. If a task costs CNY 1, only CNY 0.75 is deducted from your savings plan.

Validity period

  • The CNY 100 and CNY 500 tiers are valid for 3 months.

  • The CNY 2,000 tier is valid for 6 months.

  • The CNY 5,000 and CNY 10,000 tiers are valid for 12 months.

Applicable region

China (Beijing)

Applicable models

Text embedding: text-embedding-v1, text-embedding-v2, text-embedding-async-v1, text-embedding-async-v2, text-embedding-v3, and text-embedding-v4

Multimodal embedding: multimodal-embedding-v1, tongyi-embedding-vision-flash, tongyi-embedding-vision-plus, and qwen2.5-vl-embedding

Text reranking: gte-rerank-v2 and qwen3-rerank

You can view a complete list of models and their model call prices in the Model Studio console.

Resource plan


You can pre-purchase resources, such as tokens or image generations, to cover real-time inference usage for a specific model after your free quota is exhausted.

Usage

Effective time: A resource plan takes effect immediately upon purchase. No manual activation or binding is required.

Validity period: The validity period depends on the purchased plan. Any remaining tokens in the resource plan automatically expire when the plan's validity period ends.

Deduction logic:

  • Deduction order: free quota > resource plan > savings plans for other models > AI General-purpose Savings Plan > pay-as-you-go.

  • Multiple resource plans of the same type: Resource plans are consumed by expiration date, from earliest to latest. If the expiration dates are identical, the plan purchased first is consumed first.

  • Excess usage: If all matching resource plans expire or are exhausted, subsequent usage is automatically billed at pay-as-you-go rates.

Balance monitoring and alerts:

Unsubscription rules:

LLM inference resource plan

| Purchase link | LLM inference resource plan qwen-plus | LLM inference resource plan qwen-max | LLM inference resource plan qwen-turbo |
| --- | --- | --- | --- |
| Region | China (Beijing) | China (Beijing) | China (Beijing) |
| Applicable model | qwen-plus and qwen-plus-latest for real-time inference services (non-thinking mode) | qwen-max for real-time inference services (non-thinking mode) | qwen-turbo for real-time inference services (non-thinking mode) |
| Total input and output tokens | 12 million / 110 million | 18 million / 39 million / 390 million / 1.17 billion / 1.95 billion | 35 million / 350 million / 1.75 billion / 3.5 billion |
| Price (CNY) | 11.66 / 114.4 | 57.6 / 125 / 1,250 / 3,750 / 6,250 | 11.45 / 114.45 / 572.25 / 1,144.5 |
| Validity period | Valid for 3 months, 6 months, or 1 year from the purchase date. | Valid for 1 year from the purchase date. | Valid for 1 year from the purchase date. |

Limits

Qwen image generation resource plan

| Purchase link | Qwen Image Generation Resource Plan qwen-image | Qwen Image Generation Resource Plan qwen-image-plus |
| --- | --- | --- |
| Region | China (Beijing) | China (Beijing) |
| Applicable model | text-to-image: qwen-image; image editing: qwen-image-edit | text-to-image: qwen-image-plus; image editing: qwen-image-edit-plus |
| Plan capacity | 80 / 400 | 100 / 1,000 / 10,000 / 100,000 / 500,000 |
| Price (CNY) | 20 / 100 | Tiered discounts apply: 20 / 196 (2% off) / 1,900 (5% off) / 18,000 (10% off) / 85,000 (15% off) |
| Validity period | Valid for 3 months from the purchase date. | 100 and 1,000 image tiers: 3 months from the purchase date. 10,000 and 100,000 image tiers: 6 months from the purchase date. 500,000 image tier: 12 months from the purchase date. |
| Description | Generating an image with the text-to-image model consumes 1 unit of quota. Editing an image with the image editing model consumes 1.2 units of quota. After the resource plan capacity is exhausted, subsequent usage is billed at the pay-as-you-go rate for the respective model. See model invocation billing. | Generating or editing an image consumes 1 unit of quota. After the resource plan capacity is exhausted, subsequent usage is billed at the pay-as-you-go rate for the respective model. See model invocation billing. |
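The per-call quota rules for the two image plans can be sketched as a small calculator. The unit weights come from the descriptions above; the function name and the sample call counts are hypothetical.

```python
from decimal import Decimal

# Quota units consumed per image, by model.
UNITS_PER_IMAGE = {
    "qwen-image": Decimal("1"),            # text-to-image
    "qwen-image-edit": Decimal("1.2"),     # image editing
    "qwen-image-plus": Decimal("1"),       # text-to-image
    "qwen-image-edit-plus": Decimal("1"),  # image editing
}

def units_consumed(images_by_model: dict) -> Decimal:
    """Total quota units for a mix of generated and edited images."""
    return sum(
        (UNITS_PER_IMAGE[model] * count for model, count in images_by_model.items()),
        Decimal("0"),
    )

# Hypothetical usage against the 80-unit qwen-image plan.
used = units_consumed({"qwen-image": 50, "qwen-image-edit": 20})
remaining = Decimal("80") - used
print(used, remaining)
```

In this example, 50 generations and 20 edits consume 74 units, leaving 6 units on the 80-unit plan.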

FAQ

Canceling savings and resource plans

  • Savings plan: Starting at 10:00:00 (UTC+8) on April 3, 2026, you can cancel savings plans that meet either of the following conditions in the Resource Unsubscription console:

    • All-upfront savings plans that have not yet taken effect.

    • All-upfront savings plans that have taken effect but have not been used for any deductions.

    If a savings plan has been used for deductions, you cannot cancel it. For details, see the announcement.

  • Resource plan: You can apply for a refund for the unused portion of your resource plan. The used portion is non-refundable.

Deduction priority for plans

The system deducts charges in the following order of priority: free quota > resource plan > other model-specific savings plans > AI General-purpose Savings Plan > pay-as-you-go. Your free quota is always used first. Once it is exhausted, your resource plan is applied. If the resource plan is insufficient or not applicable, your savings plan is used next. Finally, any remaining charges are billed on a pay-as-you-go basis.
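The priority order behaves like a waterfall: each source absorbs what it can, and the remainder passes to the next. The sketch below makes a strong simplifying assumption, namely that every balance and the charge are expressed in one common unit. In practice, free quota and resource plans are measured in resources such as tokens, and savings plan discounts change how much is deducted. The function name and balances are hypothetical.

```python
from decimal import Decimal

# Sources in deduction priority order; pay-as-you-go takes whatever is left.
PRIORITY = [
    "free quota",
    "resource plan",
    "other model-specific savings plan",
    "AI General-purpose Savings Plan",
]

def allocate(charge: Decimal, balances: dict) -> dict:
    """Split a charge across sources in priority order.

    A source that is missing from `balances` is treated as exhausted
    or not applicable to the call.
    """
    remaining = charge
    allocation = {}
    for source in PRIORITY:
        take = min(remaining, balances.get(source, Decimal("0")))
        if take > 0:
            allocation[source] = take
            remaining -= take
    if remaining > 0:
        allocation["pay-as-you-go"] = remaining
    return allocation

print(allocate(
    Decimal("10"),
    {"free quota": Decimal("3"), "AI General-purpose Savings Plan": Decimal("5")},
))
```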

Savings plan not applying

Common reasons include the following:

  1. Model mismatch: The model you are calling is not covered by the savings plan you purchased. For example, you purchased an LLM Savings Plan but are calling a Wan series model, or an embedding or reranking model. Consider purchasing an AI General-purpose Savings Plan to cover calls across different models.

  2. Unsupported feature usage: The AI General-purpose Savings Plan and other savings plans do not cover fees for model fine-tuning or model deployment. Only the AI General-purpose Savings Plan covers fees for context cache, batch inference, and tool calling; other savings plans do not.

  3. Free quota not exhausted: The system applies benefits in the order of free quota > savings plan. A savings plan applies only to charges incurred after your free quota is exhausted.

AI General-purpose Savings Plan and third-party models

For Category C models, only models provided directly by Alibaba Cloud are eligible for savings plan deductions. Third-party direct-supply models are not eligible. You can identify eligible models by the badge in the upper-right corner of their model card in the Model Studio Model Marketplace.


How do I use a savings plan after purchase?

A savings plan requires no manual activation or binding once it takes effect. When you call models through the Model Studio console playground, API calls, or third-party tool integration, the incurred fees are automatically deducted according to the deduction priority. You can view deduction details on the Savings Plan Overview page.

Resource plan not applying

A resource plan applies to your usage costs only when specific conditions are met. Common reasons it may not apply include:

  1. Model mismatch: The model you are calling is not covered by the resource plan you purchased. For example, you purchased a qwen-max resource plan but are calling the qwen-plus model.

  2. Unsupported feature usage: A resource plan does not cover fees for these features: batch inference (Batch), context cache, model fine-tuning, and model deployment.

  3. Token limit exceeded: For the qwen-plus resource plan, deductions do not apply to the portion of a single request's input that exceeds 128K tokens.

  4. Free quota not exhausted: The system applies benefits in the order of free quota > resource plan. A resource plan applies only to charges incurred after your free quota is exhausted.

Using a pre-purchased resource plan

First, activate the model service in Alibaba Cloud Model Studio. After activation, the service first consumes your free quota. Your resource plan will apply only after this quota is exhausted.

Does the LLM Savings Plan cover embedding and reranking models?

No. The LLM inference savings plan covers only text generation models. Embedding and reranking models are not covered. If your workload involves both LLMs and embedding or reranking models (for example, RAG), consider the AI General-purpose Savings Plan, or purchase a separate Embedding and reranking models savings plan.