The JST Intelligent Application Development Platform uses tiered pricing to bill developers for Model Studio model service calls, based on the number of input and output tokens and on cache hits.

This document explains the billing methods for AI PaaS model services and cloud resources.

1. Model service pricing

1. Billing
Billing is based on the number of tokens consumed. Each model has separate rates for input and output tokens, and charges are calculated per call.
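As a rough illustration of per-call billing, the sketch below multiplies the token counts of one call by the per-1k-token rates. The function name `call_cost` and the token counts are hypothetical; the rates are the qwen-max input and output prices from the table in this section. Rounding rules on the actual bill are not described in this document, so none are applied here.

```python
from decimal import Decimal

def call_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Return the cost in CNY of one call; rates are CNY per 1,000 tokens."""
    per_k = Decimal(1000)
    input_cost = Decimal(input_tokens) / per_k * Decimal(input_rate)
    output_cost = Decimal(output_tokens) / per_k * Decimal(output_rate)
    return input_cost + output_cost

# Hypothetical call: 2,000 input tokens and 500 output tokens on qwen-max
# (input 0.0024, output 0.0096 CNY per 1k tokens, taken from the table).
print(call_cost(2000, 500, "0.0024", "0.0096"))
```

Decimal arithmetic is used instead of floats so that very small per-token prices do not pick up binary rounding noise.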

Model	Pricing tier	Type	Price (CNY/1k tokens)
qwen-flash	0k-128k	input	0.0002
		input (thinking)	0.0002
		input (cache hit)	0.0001
		input (thinking mode cache hit)	0.0001
		output	0.0015
		output (thinking)	0.0015
	128k-256k	input	0.0006
		input (thinking)	0.0006
		input (cache hit)	0.0002
		input (thinking mode cache hit)	0.0002
		output	0.006
		output (thinking)	0.006
	256k-1m	input	0.0012
		input (thinking)	0.0012
		input (cache hit)	0.0003
		input (thinking mode cache hit)	0.0003
		output	0.012
		output (thinking)	0.012
qwen3-30b-a3b-instruct-2507	-	input	0.0008
qwen3-30b-a3b-instruct-2507	-	output	0.003
qwen-plus	0k-128k	input	0.0008
		input (thinking)	0.0008
		input (cache hit)	0.0002
		input (thinking mode cache hit)	0.0002
		output	0.002
		output (thinking)	0.008
	128k-256k	input	0.0024
		input (thinking)	0.0024
		input (cache hit)	0.0005
		input (thinking mode cache hit)	0.0005
		output	0.02
		output (thinking)	0.024
	256k-1m	input	0.0048
		input (thinking)	0.0048
		input (cache hit)	0.001
		input (thinking mode cache hit)	0.001
		output	0.048
		output (thinking)	0.064
qwen3-235b-a22b	-	input	0.002
qwen3-235b-a22b	-	output	0.008
qwen3-32b	-	input	0.002
		input (thinking)	0.002
		output	0.008
		output (thinking)	0.02
qwen3-30b-a3b-thinking-2507	-	input (thinking)	0.0008
qwen3-30b-a3b-thinking-2507	-	output (thinking)	0.0075
qwen-turbo	-	input	0.0003
		input (thinking)	0.0003
		input (cache hit)	0.0001
		input (thinking mode cache hit)	0.0001
		output	0.0006
		output (thinking)	0.003
qwen-max-2025-01-25	-	input	0.0024
qwen-max-2025-01-25	-	output	0.0096
qwen-max	-	input	0.0024
		input (cache hit)	0.0005
		output	0.0096
qwen-vl-plus	-	input	0.0008
		input (cache hit)	0.0002
		output	0.002
qwen-vl-max	-	input	0.0016
		input (cache hit)	0.0004
		output	0.004
qwen2.5-vl-72b-instruct	-	input	0.016
qwen2.5-vl-72b-instruct	-	output	0.048
qwen3-next-80b-a3b-instruct	-	input	0.001
qwen3-next-80b-a3b-instruct	-	output	0.004
qwen3-235b-a22b-instruct-2507	-	input	0.002
qwen3-235b-a22b-instruct-2507	-	output	0.008
qwen3-vl-235b-a22b-instruct	-	input	0.002
qwen3-vl-235b-a22b-instruct	-	output	0.008
qwen3-vl-flash	0k-32k	input	0.0002
		input (cache hit)	0.0001
		output	0.0015
	32k-128k	input	0.0003
		input (cache hit)	0.0001
		output	0.003
	128k-256k	input	0.0006
		input (cache hit)	0.0002
		output	0.006
qwen3-vl-plus	0k-32k	input	0.001
		input (cache hit)	0.0002
		output	0.01
	32k-128k	input	0.0015
		input (cache hit)	0.0003
		output	0.015
	128k-256k	input	0.003
		input (cache hit)	0.0006
		output	0.03
qwen-plus-2025-12-01	0k-128k	input	0.0008
		input (thinking)	0.0008
		output	0.002
		output (thinking)	0.008
	128k-256k	input	0.0024
		input (thinking)	0.0024
		output	0.02
		output (thinking)	0.024
	256k-1m	input	0.0048
		input (thinking)	0.0048
		output	0.048
		output (thinking)	0.064
qwen3-coder-plus	0k-32k	input	0.004
		input (cache hit)	0.0008
		output	0.016
	32k-128k	input	0.006
		input (cache hit)	0.0012
		output	0.024
	128k-256k	input	0.01
		input (cache hit)	0.002
		output	0.04
	256k-1m	input	0.02
		input (cache hit)	0.004
		output	0.08
qwen3-max	0k-32k	input	0.0032
		input (cache hit)	0.0007
		output	0.0128
	32k-128k	input	0.0064
		input (cache hit)	0.0013
		output	0.0256
	128k-256k	input	0.0096
		input (cache hit)	0.002
		output	0.0384
qwen3-vl-plus-2025-12-19	0k-32k	input	0.001
	0k-32k	output	0.01
	32k-128k	input	0.0015
	32k-128k	output	0.015
	128k-256k	input	0.003
	128k-256k	output	0.03
qwen-plus-2025-07-28	0k-128k	input	0.0008
		input (thinking)	0.0008
		output	0.002
		output (thinking)	0.008
	128k-256k	input	0.0024
		input (thinking)	0.0024
		output	0.02
		output (thinking)	0.024
	256k-1m	input	0.0048
		input (thinking)	0.0048
		output	0.048
		output (thinking)	0.064
qwen3-0.6b	-	input	0.0003
		input (thinking)	0.0003
		output	0.0012
		output (thinking)	0.003
qwen3-1.7b	-	input	0.0003
		input (thinking)	0.0003
		output	0.0012
		output (thinking)	0.003
qwen3-4b	-	input	0.0003
		input (thinking)	0.0003
		output	0.0012
		output (thinking)	0.003
qwen3-max-2026-01-23	0k-32k	input	0.0025
	0k-32k	output	0.01
	32k-128k	input	0.004
	32k-128k	output	0.016
	128k-256k	input	0.007
	128k-256k	output	0.028
qwen3.5-plus	0k-128k	input	0.0008
	0k-128k	output	0.0048
	128k-256k	input	0.002
	128k-256k	output	0.012
	256k-1m	input	0.004
	256k-1m	output	0.024
qwen3.5-flash	0k-128k	input	0.0002
	0k-128k	output	0.002
	128k-256k	input	0.0008
	128k-256k	output	0.008
	256k-1m	input	0.0012
	256k-1m	output	0.012
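For tiered models, the table can be read as a lookup followed by the same per-token arithmetic. The sketch below uses the qwen-plus rows and rests on several assumptions that this document does not state: that the tier is chosen by the number of input tokens in the request, that "k" means 1,000 tokens and "1m" means 1,000,000, that the upper bound of each tier is inclusive, and that cache-hit tokens are billed at the cache-hit rate while the remaining input tokens are billed at the normal input rate. The names `QWEN_PLUS_TIERS`, `pick_tier`, and `tiered_cost` are hypothetical.

```python
from decimal import Decimal

# qwen-plus rates copied from the table (CNY per 1,000 tokens).
# Thinking-mode input and cache-hit rates equal the non-thinking ones for
# this model, so only the output rate differs between modes.
# Tier bounds in tokens are an assumption (k = 1,000; 1m = 1,000,000).
QWEN_PLUS_TIERS = [
    (128_000, {"input": "0.0008", "cache_hit": "0.0002",
               "output": "0.002", "output_thinking": "0.008"}),
    (256_000, {"input": "0.0024", "cache_hit": "0.0005",
               "output": "0.02", "output_thinking": "0.024"}),
    (1_000_000, {"input": "0.0048", "cache_hit": "0.001",
                 "output": "0.048", "output_thinking": "0.064"}),
]

def pick_tier(input_tokens, tiers):
    """Assumed rule: the tier is selected by the request's input token count."""
    for upper_bound, rates in tiers:
        if input_tokens <= upper_bound:
            return rates
    raise ValueError("input exceeds the largest pricing tier")

def tiered_cost(input_tokens, cached_tokens, output_tokens,
                thinking=False, tiers=QWEN_PLUS_TIERS):
    """Return the estimated cost in CNY of one call under the assumptions above."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    rates = pick_tier(input_tokens, tiers)
    output_key = "output_thinking" if thinking else "output"
    per_k = Decimal(1000)
    uncached_tokens = input_tokens - cached_tokens
    return (Decimal(uncached_tokens) / per_k * Decimal(rates["input"])
            + Decimal(cached_tokens) / per_k * Decimal(rates["cache_hit"])
            + Decimal(output_tokens) / per_k * Decimal(rates[output_key]))

# Hypothetical call: 100,000 input tokens (40,000 served from cache)
# and 2,000 output tokens, first in normal mode, then in thinking mode.
print(tiered_cost(100_000, 40_000, 2_000))
print(tiered_cost(100_000, 40_000, 2_000, thinking=True))
```

Keeping the rates in a plain list of tiers makes it straightforward to swap in another model's rows from the table; the tier-selection rule should be checked against an actual bill before relying on the estimate.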

2. Cloud resource billing methods

Cloud service	Billing reference
SAE	Billing overview
PAI	Billing for EAS
ApsaraMQ for RabbitMQ	Public access fee
PolarDB	Billing overview
ApsaraDB for ClickHouse	Billing methods
DTS	Billing methods
Alibaba Cloud DNS PrivateZone	Billing overview
Alibaba Cloud Elasticsearch	Billing rules
ApsaraDB for RDS	Billing overview
Tair	Billing methods
SLB	Billing overview
MSE	Billing for Developer and Professional editions
Log Service	Pay-by-feature
ACK	Cluster management fees
OpenSearch	Billing methods and items
NAT Gateway	NAT Gateway billing
DMS	Billable items
CloudMonitor	Billing for OpenTelemetry
ECS	Instance type billing
OSS	Billable items
