Choose the right text generation model for AI agents, chatbots, and document processing.
Recommended models for coding tools
We recommend qwen3.7-plus for balanced performance and cost, with full tool calling and a 1M-token context window for large codebases. For the strongest reasoning, choose qwen3.7-max.
Migrate from closed-source models
Map your current GPT, Claude, or Gemini model to an equivalent Bailian model.
|
Closed-source examples |
Bailian recommendation |
|
|
Highest capability |
GPT-5.5, Claude Opus 4.7, Gemini 3.1 Pro |
|
|
Balanced |
GPT-5.4, Claude Sonnet 4.6, Gemini 3 Pro |
|
|
Lightweight & low-cost |
GPT-5.4-mini, Claude Haiku 4.5, Gemini 3.1 Flash |
|
Use cases
Start with qwen3.7-plus for chatbots, content generation, summarization, and document processing — it balances performance, cost, and built-in tools with a 1M-token context window. To cut costs, switch to qwen3.6-flash, which offers similar capabilities at a lower price. For the strongest reasoning, use qwen3.7-max (1M context, higher cost).
Context window
1 million tokens is roughly 750,000 English words, or 8-10 novels.
-
For long documents or large codebases:
qwen3.7-plus/qwen3.6-flash(1 million tokens). -
For standard tasks: 128k-256k tokens is typically sufficient.
For context window details, visit the Models page.
China (Beijing) | Singapore | US | Frankfurt
Thinking mode
Step-by-step reasoning for multi-step math, code debugging, architecture planning, and legal cross-referencing.
Enable with the enable_thinking parameter, or use reasoning.effort in the Responses API to control thinking depth. All Qwen3+ models support this feature, most of which operate in a hybrid mode togglable per request.
See Deep thinking.
Function calling and built-in tools
Let the model take actions such as querying weather, searching databases, or booking meetings.
-
Function calling (custom tools that the model calls): Supported by all general-purpose models.
-
Built-in tools (web search, code interpreter, web scraping) with no configuration required.
See Tool calling.
Structured output
Forces valid JSON output, useful for extracting structured data like names and addresses from text.
See Structured output.
Batch inference
Process large volumes at lower cost when latency is not critical.
See Batch inference.
Recommended
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
198k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
192k |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
Legacy & snapshot models
For new projects, use the Qwen3.6 or Qwen3.5 series. The following models are legacy and no longer recommended. Visit the Models page to view detailed model parameters, such as context window and billing.
Qwen3.6
|
Model ID |
Context |
Max output |
Thinking budget |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
256k |
64k |
128k |
|
|
|
|
|
|
1M |
64k |
80k |
|
|
|
|
Qwen3.5
|
Model ID |
Context |
Max output |
Thinking budget |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
1M |
64k |
80k |
|
|
|
|
|
|
1M |
64k |
80k |
|
|
|
|
|
|
256k |
64k |
80k |
|
|
|
|
|
|
256k |
64k |
80k |
|
|
|
|
|
|
256k |
64k |
80k |
|
|
|
|
|
|
256k |
64k |
80k |
|
|
|
|
Qwen3
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
Qwen3-Coder
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
Qwen2.5 (open source)
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
|
|
1M |
|
|
|
|
|
Translation
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
16k |
|
|
|
|
|
|
|
16k |
|
|
|
|
|
|
|
16k |
|
|
|
|
|
|
|
16k |
|
|
|
|
|
Qwen-Long
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
10M |
|
|
|
|
|
|
|
10M |
|
|
|
|
|
Role-playing
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
32k |
|
|
|
|
|
|
|
32k |
|
|
|
|
|
|
|
8k |
|
|
|
|
|
Legacy Qwen
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
1M |
|
|
|
|
(mainline version only) |
|
|
128k |
|
|
|
|
(mainline version only) |
|
|
1M |
|
|
|
|
(mainline version only) |
|
|
1M |
|
|
|
|
(mainline version only) |
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
32k |
|
|
|
|
(mainline version only) |
Third-party models
|
Model ID |
Context |
Thinking mode |
Function Calling |
Built-in tools |
Structured output |
Batch calling |
|
|
198k |
|
|
|
|
|
|
|
198k |
|
|
|
|
|
|
|
198k |
|
|
|
|
|
|
|
198k |
|
|
|
|
|
|
|
198k |
|
|
|
|
|
|
|
200k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
256k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|
|
|
128k |
|
|
|
|
|