Create a response - Alibaba Cloud Model Studio

Use the OpenAI-compatible Responses API to call the Qwen model. This topic describes the input and output parameters and provides a call example.

Advantages over the OpenAI Chat Completions API:

Built-in tools: Get better results on complex tasks with built-in tools such as web search, web extraction, a code interpreter, text-to-image search, image-to-image search, and knowledge base search. For more information, see tool calling.
More flexible input: Supports both direct string input and message arrays in the chat format.
Simplified context management: Avoid manually constructing a message history array by passing the previous_response_id from the last response.
Convenient context caching: Add x-dashscope-session-cache: enable (default value: disable) to the request header to enable automatic server-side caching of the conversation context. This reduces inference latency and costs for multi-turn conversations with no code changes required. For details, see session cache.

Compatibility and limitations

This API is compatible with OpenAI to reduce developer migration cost, but differs in its parameters, functionality, and behavior.

Core Principle: Only the parameters explicitly listed in this document are processed. Any OpenAI parameters not mentioned are ignored.

The following key differences will help you adapt quickly:

Unsupported Parameters: This API does not support some OpenAI API parameters, such as the asynchronous execution parameter background. The API currently supports only synchronous calls.
Reasoning Effort Control: Use the reasoning.effort parameter to control the model's reasoning effort. For usage details, see the description of this parameter.

China (Beijing)

SDK call configuration base_url: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses

Singapore

SDK call configuration base_url: https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1/responses

US (Virginia)

SDK call configuration base_url: https://dashscope-us.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://dashscope-us.aliyuncs.com/compatible-mode/v1/responses

Germany (Frankfurt)

SDK call configuration base_url: https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1/responses

Japan (Tokyo)

SDK call configuration base_url: https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request endpoint: POST https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/compatible-mode/v1/responses

Replace {WorkspaceId} with your actual workspace ID.
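The endpoint list above can be captured in a small lookup, which avoids hand-editing URLs when switching regions. This is a minimal sketch, not an official helper: the function name `build_base_url` and the region keys (for example `us-virginia`) are assumptions chosen for illustration; only the host names come from the list above.

```python
# Host templates copied from the endpoint list above.
# The dictionary keys are hypothetical labels, not official region identifiers.
REGION_HOSTS = {
    "cn-beijing": "{WorkspaceId}.cn-beijing.maas.aliyuncs.com",
    "ap-southeast-1": "{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com",
    "us-virginia": "dashscope-us.aliyuncs.com",
    "eu-central-1": "{WorkspaceId}.eu-central-1.maas.aliyuncs.com",
    "ap-northeast-1": "{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com",
}


def build_base_url(region: str, workspace_id: str = "") -> str:
    """Return the SDK base_url for a region, filling in the workspace ID if the host needs one."""
    host = REGION_HOSTS[region]
    if "{WorkspaceId}" in host:
        if not workspace_id:
            raise ValueError(f"region {region!r} requires a workspace ID")
        host = host.replace("{WorkspaceId}", workspace_id)
    return f"https://{host}/compatible-mode/v1"


# "ws-123" is a placeholder workspace ID.
print(build_base_url("cn-beijing", "ws-123"))
print(build_base_url("us-virginia"))
```

The returned string can be passed as `base_url` to the OpenAI client; appending `/responses` gives the HTTP request endpoint.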

Important

Alibaba Cloud Model Studio has released workspace-specific domains for the China (Beijing) and Singapore regions. The new dedicated domains deliver better performance and higher stability for inference requests. We recommend migrating to the new domains:

China (Beijing): from https://dashscope.aliyuncs.com to https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com
Singapore: from https://dashscope-intl.aliyuncs.com to https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com

{WorkspaceId} is your workspace ID, which can be found on the Workspace Details page in the Alibaba Cloud Model Studio console. The existing domains remain fully functional.

Important

The legacy URL path /api/v2/apps/protocols/compatible-mode/v1/responses for the OpenAI-compatible Responses API will be deprecated soon. Please migrate to the new path /compatible-mode/v1/responses as soon as possible.
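If request URLs are stored in configuration, the path migration can be done mechanically. The following is a minimal sketch using only the Python standard library; the function name `migrate_path` is hypothetical, and the two path prefixes are the ones named in the notice above.

```python
from urllib.parse import urlsplit, urlunsplit

# Prefixes taken from the deprecation notice above.
LEGACY_PREFIX = "/api/v2/apps/protocols/compatible-mode/v1"
NEW_PREFIX = "/compatible-mode/v1"


def migrate_path(url: str) -> str:
    """Rewrite a legacy Responses API URL to the new path; leave other URLs unchanged."""
    parts = urlsplit(url)
    if parts.path.startswith(LEGACY_PREFIX):
        new_path = NEW_PREFIX + parts.path[len(LEGACY_PREFIX):]
        parts = parts._replace(path=new_path)
    return urlunsplit(parts)


print(migrate_path("https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses"))
```

Only the path is rewritten here; the host is left untouched, so the domain migration described earlier remains a separate step.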

Request body

Basic call

Python

```python
import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, replace the following line with your Model Studio API Key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.7-plus",
    input="What can you do?"
)

# Get the model's response
print(response.output_text)
```

Node.js

```javascript
import OpenAI from "openai";

const openai = new OpenAI({
    // If the environment variable is not set, replace the following line with your Model Studio API Key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "What can you do?"
    });

    // Get the model's response
    console.log(response.output_text);
}

main();
```

curl

```shell
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "What can you do?"
}'
```

Stream output

Python

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

stream = client.responses.create(
    model="qwen3.7-plus",
    input="Briefly introduce artificial intelligence.",
    stream=True
)

print("Receiving stream output:")
for event in stream:
    if event.type == 'response.output_text.delta':
        print(event.delta, end='', flush=True)
    elif event.type == 'response.completed':
        print("\nStream completed")
        print(f"Total tokens: {event.response.usage.total_tokens}")
```

Node.js

```javascript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Briefly introduce artificial intelligence.",
        stream: true
    });

    console.log("Receiving stream output:");
    for await (const event of stream) {
        if (event.type === 'response.output_text.delta') {
            process.stdout.write(event.delta);
        } else if (event.type === 'response.completed') {
            console.log("\nStream completed");
            console.log(`Total tokens: ${event.response.usage.total_tokens}`);
        }
    }
}

main();
```

curl

```shell
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
--no-buffer \
-d '{
    "model": "qwen3.7-plus",
    "input": "Briefly introduce artificial intelligence.",
    "stream": true
}'
```

Multi-turn conversation

Python

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

# First turn
response1 = client.responses.create(
    model="qwen3.7-plus",
    input="My name is John. Please remember it."
)
print(f"First response: {response1.output_text}")

# Second turn - use previous_response_id to link context. The response ID is valid for 7 days.
response2 = client.responses.create(
    model="qwen3.7-plus",
    input="Do you remember my name?",
    previous_response_id=response1.id
)
print(f"Second response: {response2.output_text}")
```

Node.js

```javascript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    // First turn
    const response1 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "My name is John. Please remember it."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second turn - use previous_response_id to link context. The response ID is valid for 7 days.
    const response2 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Do you remember my name?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);
}

main();
```

Built-in tools

Python

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.7-plus",
    input="Find the Alibaba Cloud official website and extract key information from its homepage.",
    # For best results, we recommend enabling the built-in tools.
    tools=[
        {"type": "web_search"},
        {"type": "code_interpreter"},
        {"type": "web_extractor"}
    ],
)

# Uncomment the following line to view the intermediate output.
# print(response.output)
print(response.output_text)
```

Node.js

```javascript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Find the Alibaba Cloud official website and extract key information from its homepage.",
        tools: [
            { type: "web_search" },
            { type: "code_interpreter" },
            { type: "web_extractor" }
        ]
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("Model is thinking...");
        } else if (item.type === "web_search_call") {
            console.log(`Search query: ${item.action.query}`);
        } else if (item.type === "web_extractor_call") {
            console.log("Extracting web content...");
        } else if (item.type === "message") {
            console.log(`Response content: ${item.content[0].text}`);
        }
    }
}

main();
```

curl

```shell
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Find the Alibaba Cloud official website and extract key information from its homepage.",
    "tools": [
        { "type": "web_search" },
        { "type": "code_interpreter" },
        { "type": "web_extractor" }
    ]
}'
```

Function calling

Python

```python
from openai import OpenAI
import json
import os
import random

# Initialize the client.
client = OpenAI(
    # If the environment variable is not set, replace the following line with your Model Studio API Key: api_key="sk-xxx",
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

# Simulate a user question.
USER_QUESTION = "What's the weather like in Beijing?"

# Define the list of tools.
tools = [
    {
        "type": "function",
        "name": "get_current_weather",
        "description": "Useful for getting the weather in a specific city.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city or district, such as Beijing or Hangzhou.",
                }
            },
            "required": ["location"],
        },
    }
]


# Simulate the weather query tool.
def get_current_weather(arguments):
    weather_conditions = ["sunny", "cloudy", "rainy"]
    random_weather = random.choice(weather_conditions)
    location = arguments["location"]
    return f"Today in {location} it is {random_weather}."


# Wrap the model response function.
def get_response(input_data):
    response = client.responses.create(
        model="qwen3.7-plus",
        input=input_data,
        tools=tools,
    )
    return response


# Maintain the conversation context.
conversation = [{"role": "user", "content": USER_QUESTION}]

response = get_response(conversation)
function_calls = [item for item in response.output if item.type == "function_call"]

# If no tool call is needed, output the content directly.
if not function_calls:
    print(f"Final assistant response: {response.output_text}")
else:
    # Enter the tool-calling loop.
    while function_calls:
        for fc in function_calls:
            func_name = fc.name
            arguments = json.loads(fc.arguments)
            print(f"Calling tool [{func_name}] with arguments: {arguments}")

            # Execute the tool.
            tool_result = get_current_weather(arguments)
            print(f"Tool returned: {tool_result}")

            # Append the tool call and its result to the context as a pair.
            conversation.append(
                {
                    "type": "function_call",
                    "name": fc.name,
                    "arguments": fc.arguments,
                    "call_id": fc.call_id,
                }
            )
            conversation.append(
                {
                    "type": "function_call_output",
                    "call_id": fc.call_id,
                    "output": tool_result,
                }
            )

        # Call the model again with the full context.
        response = get_response(conversation)
        function_calls = [
            item for item in response.output if item.type == "function_call"
        ]

    print(f"Final assistant response: {response.output_text}")
```

Node.js

```javascript
import OpenAI from "openai";

// Initialize the client.
const openai = new OpenAI({
  // If the environment variable is not set, replace the following line with your Model Studio API Key: apiKey: "sk-xxx",
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
});

// Define the list of tools.
const tools = [
  {
    type: "function",
    name: "get_current_weather",
    description: "Useful for getting the weather in a specific city.",
    parameters: {
      type: "object",
      properties: {
        location: {
          type: "string",
          description: "The city or district, such as Beijing or Hangzhou.",
        },
      },
      required: ["location"],
    },
  },
];

// Simulate the weather query tool.
const getCurrentWeather = (args) => {
  const weatherConditions = ["sunny", "cloudy", "rainy"];
  const randomWeather =
    weatherConditions[Math.floor(Math.random() * weatherConditions.length)];
  const location = args.location;
  return `Today in ${location} it is ${randomWeather}.`;
};

// Wrap the model response function.
const getResponse = async (inputData) => {
  const response = await openai.responses.create({
    model: "qwen3.7-plus",
    input: inputData,
    tools: tools,
  });
  return response;
};

const main = async () => {
  const userQuestion = "What's the weather like in Beijing?";

  // Maintain the conversation context.
  const conversation = [{ role: "user", content: userQuestion }];

  let response = await getResponse(conversation);
  let functionCalls = response.output.filter(
    (item) => item.type === "function_call"
  );

  // If no tool call is needed, output the content directly.
  if (functionCalls.length === 0) {
    console.log(`Final assistant response: ${response.output_text}`);
  } else {
    // Enter the tool-calling loop.
    while (functionCalls.length > 0) {
      for (const fc of functionCalls) {
        const funcName = fc.name;
        const args = JSON.parse(fc.arguments);
        console.log(`Calling tool [${funcName}] with arguments:`, args);

        // Execute the tool.
        const toolResult = getCurrentWeather(args);
        console.log(`Tool returned: ${toolResult}`);

        // Append the tool call and its result to the context as a pair.
        conversation.push({
          type: "function_call",
          name: fc.name,
          arguments: fc.arguments,
          call_id: fc.call_id,
        });
        conversation.push({
          type: "function_call_output",
          call_id: fc.call_id,
          output: toolResult,
        });
      }

      // Call the model again with the full context.
      response = await getResponse(conversation);
      functionCalls = response.output.filter(
        (item) => item.type === "function_call"
      );
    }
    console.log(`Final assistant response: ${response.output_text}`);
  }
};

// Start the program.
main().catch(console.error);
```

Document understanding

Python

```python
import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured an environment variable, replace the next line with: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-ocr",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260616/qmycjl/1506.02640v5.pdf",
                },
                {
                    "type": "input_text",
                    "text": "Read all the text in the file.",
                },
            ],
        }
    ],
    extra_body={
        "ocr_options": {}
    },
)

print(response.output_text)
```

Node.js

```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  // If you have not configured an environment variable, replace the next line with: apiKey: "sk-xxx"
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
});

async function main() {
  const response = await client.responses.create({
    model: "qwen3.5-ocr",
    input: [{
      role: "user",
      content: [{
        type: "input_file",
        file_url: "https://example.com/your-document.pdf"
      }]
    }],
    ocr_options: {
      task: "document_parsing"
    }
  });

  // Get the custom task result
  console.log(response.output[0].content[0].ocr_result);
}

main();
```

curl

```shell
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-ocr",
    "input": [
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260616/qmycjl/1506.02640v5.pdf"
                },
                {
                    "type": "input_text",
                    "text": "Read all the text in the file."
                }
            ]
        }
    ],
    "ocr_options": {}
}'
```

Session cache

Python

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
    # Enable the session cache via default_headers.
    default_headers={"x-dashscope-session-cache": "enable"}
)

# Construct a long text over 1024 tokens to trigger cache creation.
# If not, caching is triggered when the cumulative context exceeds 1024 tokens.
long_context = "Artificial intelligence (AI) is a major branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence." * 50

# First turn
response1 = client.responses.create(
    model="qwen3.7-plus",
    input=long_context + "\n\nBased on this context, briefly introduce the Random Forest algorithm in machine learning.",
)
print(f"First response: {response1.output_text}")

# Second turn: Link context using previous_response_id. The cache is handled automatically by the server.
response2 = client.responses.create(
    model="qwen3.7-plus",
    input="What are the main differences between it and GBDT?",
    previous_response_id=response1.id,
)
print(f"Second response: {response2.output_text}")

# Check the cache hit status.
usage = response2.usage
print(f"Input tokens: {usage.input_tokens}")
print(f"Cached tokens: {usage.input_tokens_details.cached_tokens}")
```

Node.js

```javascript
import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
    // Enable the session cache via defaultHeaders.
    defaultHeaders: {"x-dashscope-session-cache": "enable"}
});

// Construct a long text over 1024 tokens to trigger cache creation.
// If not, caching is triggered when the cumulative context exceeds 1024 tokens.
const longContext = "Artificial intelligence (AI) is a major branch of computer science, dedicated to researching and developing theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence.".repeat(50);

async function main() {
    // First turn
    const response1 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: longContext + "\n\nBased on this context, briefly introduce the Random Forest algorithm in machine learning, including its basic principles and applications."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second turn: Link context using previous_response_id. The cache is handled automatically by the server.
    const response2 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "What are the main differences between it and GBDT?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);

    // Check the cache hit status.
    console.log(`Input tokens: ${response2.usage.input_tokens}`);
    console.log(`Cached tokens: ${response2.usage.input_tokens_details.cached_tokens}`);
}

main();
```

curl

```shell
# First turn
# Repeat the long text 50 times to ensure it exceeds 1024 tokens and triggers cache creation.
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Artificial intelligence (AI) is a major branch of computer science..."
}'

# Second turn - use the ID from the previous response as previous_response_id.
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.7-plus",
    "input": "What are the main differences between it and GBDT?",
    "previous_response_id": ""
}'
```
model `string` (required)

The ID of the model to use.

Supported models

China (Beijing)
China Mainland deployment scope: `qwen3.7-max`, `qwen3.7-max-2026-05-20`, `qwen3.7-max-2026-06-08`, `qwen3-max`, `qwen3-max-2026-01-23`, `qwen3.7-plus`, `qwen3.7-plus-2026-05-26`, `qwen3.6-plus`, `qwen3.6-plus-2026-04-02`, `qwen3.5-plus`, `qwen3.5-plus-2026-04-20`, `qwen3.5-plus-2026-02-15`, `qwen3.6-flash`, `qwen3.6-flash-2026-04-16`, `qwen3.5-flash`, `qwen3.5-flash-2026-02-23`, `qwen3.6-35b-a3b`, `qwen3.5-397b-a17b`, `qwen3.5-122b-a10b`, `qwen3.5-27b`, `qwen3.5-35b-a3b`, `qwen-plus`, `qwen-flash`, `qwen3-coder-plus`, `qwen3-coder-flash`, `qwen3.5-ocr`, `qwen-plus-character`, `qwen-flash-character`

Singapore
International deployment scope: `qwen3.7-max`, `qwen3.7-max-2026-05-20`, `qwen3.7-max-2026-06-08`, `qwen3-max`, `qwen3-max-2026-01-23`, `qwen3.7-plus`, `qwen3.7-plus-2026-05-26`, `qwen3.6-plus`, `qwen3.6-plus-2026-04-02`, `qwen3.5-plus`, `qwen3.5-plus-2026-04-20`, `qwen3.5-plus-2026-02-15`, `qwen3.6-flash`, `qwen3.6-flash-2026-04-16`, `qwen3.5-flash`, `qwen3.5-flash-2026-02-23`, `qwen3.6-35b-a3b`, `qwen3.5-397b-a17b`, `qwen3.5-122b-a10b`, `qwen3.5-27b`, `qwen3.5-35b-a3b`, `qwen-plus`, `qwen-flash`, `qwen3-coder-plus`, `qwen3-coder-flash`, `qwen-plus-character`, `qwen-flash-character`

US (Virginia)
Global deployment scope: `qwen3.7-max`, `qwen3.7-max-2026-05-20`, `qwen3.7-max-2026-06-08`, `qwen3.7-plus`, `qwen3.7-plus-2026-05-26`, `qwen3.6-plus`, `qwen3.6-plus-2026-04-02`, `qwen3.5-plus`, `qwen3.5-plus-2026-02-15`, `qwen3.6-flash`, `qwen3.6-flash-2026-04-16`, `qwen3.5-flash`, `qwen3.5-flash-2026-02-23`, `qwen3.6-35b-a3b`, `qwen3.5-397b-a17b`, `qwen3.5-122b-a10b`, `qwen3.5-27b`, `qwen3.5-35b-a3b`

Germany (Frankfurt)
Global deployment scope: `qwen3.7-max`, `qwen3.7-max-2026-05-20`, `qwen3.7-max-2026-06-08`, `qwen3.7-plus`, `qwen3.7-plus-2026-05-26`, `qwen3.5-397b-a17b`, `qwen3.5-122b-a10b`, `qwen3.5-35b-a3b`, `qwen3.5-27b`

Japan (Tokyo)
Japan deployment scope: `qwen3.7-plus`, `qwen3.7-plus-2026-05-26`
Global deployment scope: `qwen3.7-plus`, `qwen3.7-plus-2026-05-26`, `qwen3.7-max`, `qwen3.7-max-2026-05-20`, `qwen3.6-plus`, `qwen3.6-plus-2026-04-02`, `qwen3.6-flash`, `qwen3.6-flash-2026-04-16`
input `string or array` (required)

The input for the model. The following formats are supported:

`string`: Plain text, such as `"Hello"`.
`array`: An array of messages, ordered by conversation turn.

Array item types

EasyInputMessage `object`
An object with a `role` for the message author and `content` for the message payload.

Properties

  role `string` (required)
  The role of the message's author. Valid values: `user`, `assistant`, `system`, `developer`.

  content `string or array` (required)
  The message content. The content is a `string` if the input is plain text, or an `array` if the input is a structured content array.
  When the `role` is `system` or `developer`, the array element type is `input_text`.
  When the `role` is `user`, the array element type is `input_text`, `input_image`, or `input_file`.
  When the `role` is `assistant`, the array element type is `output_text`.
  The Responses API does not currently support video or audio input. To pass these data types, use the Chat Completions API or DashScope API.

  Content array items

    type `string` (required)
    Specifies the content type. Valid values are `input_text`, `input_image` (user role only), `input_file` (user role only, supports PDF and images), and `output_text` (assistant role only).

    text `string`
    The text content. Required when `type` is `input_text` or `output_text`.

    image_url `string`
    The public URL of the image. Required when `type` is `input_image`.

    file_url `string`
    The public URL of the file. Required when `type` is `input_file`. Supports PDF files (up to 50 pages, 100 MB) and image files (up to 20 MB). Currently only supported by `qwen3.5-ocr`.

  type `string` (optional)
  Fixed as `message`.

ResponseOutputMessage `object` (optional)
The model's output message. To continue a conversation, you can pass the `message` object from a previous response's `output` array back into the `input`. Unlike `EasyInputMessage`, this object includes the full output structure, with `id`, `status`, and structured `content`.

Properties

  type `string` (required)
  Fixed as `message`.

  id `string` (required)
  The unique identifier of the output message, from the previous response.

  role `string` (required)
  Fixed as `assistant`.

  status `string` (required)
  The message status. Valid values: `in_progress`, `completed`, `incomplete`.

  content `array` (required)
  An array of content, where elements are `output_text` objects.

  Properties

    type `string` (required)
    Fixed as `output_text`.

    text `string` (required)
    The response text.

    annotations `array` (optional)
    Annotation information.

Function call `object` (optional)
A structured instruction generated when the model decides to call an external tool.

Properties

  type `string` (required)
  Fixed as `function_call`.

  id `string` (optional)
  The unique identifier for the function call, from the previous response.

  name `string` (required)
  The name of the tool function.

  arguments `string` (required)
  The tool call arguments, in JSON string format.

  call_id `string` (required)
  The identifier for the tool call. This must match the `call_id` that is returned by the model.

  status `string` (optional)
  The status. Valid values: `in_progress`, `completed`, `incomplete`.

Function call output `object` (optional)
The output of a tool call. In the message list, this object must immediately follow its corresponding `function_call` message to prevent a request failure.

Properties

  type `string` (required)
  Fixed as `function_call_output`.

  id `string` (optional)
  The unique identifier for the function call output.

  call_id `string` (required)
  The tool call identifier. This must match the `call_id` returned by the model.

  output `string` (required)
  The execution result of the tool function.

  status `string` (optional)
  The status. Valid values: `in_progress`, `completed`, `incomplete`.

Reasoning `object` (optional)
The model's reasoning process. You can pass the `reasoning` item from a previous response's `output` back into the `input` to continue this process in a subsequent turn.

Properties

  type `string` (required)
  Fixed as `reasoning`.

  id `string` (required)
  The unique identifier for the reasoning content, from the previous response.

  summary `array` (required)
  The reasoning summary content.

  Properties

    type `string` (required)
    Fixed as `summary_text`.

    text `string` (required)
    The summary text.

  status `string` (optional)
  The status. Valid values: `in_progress`, `completed`, `incomplete`.

Web Search Call `object` (optional)
A web search call object. You can pass the `web_search_call` item from the previous response's `output` back into the `input` to provide search result context in multi-turn conversations.

Properties

  type `string` (required)
  Always `web_search_call`.

  id `string` (required)
  The unique identifier of the search call, from the previous response.

  status `string` (required)
  The search status. Valid values: `in_progress`, `searching`, `completed`, `failed`.

  action `object` (required)
  The search action details. Only the `search` type is supported.

  Properties

    type `string` (required)
    The search type. Always `search`.

    queries `array` (optional)
    A list of search queries. Each element is a string.

    sources `array` (optional)
    A list of search result sources.

    Properties

      type `string` (required)
      The source type. Always `url`.

      url `string` (required)
      The source URL.
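The role-to-content-type rules for `EasyInputMessage` can be checked on the client before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named `make_message`; the allowed types per role are taken from the description above, and the image URL is a placeholder.

```python
# Allowed content item types per role, as listed in the EasyInputMessage description.
ALLOWED_CONTENT_TYPES = {
    "system": {"input_text"},
    "developer": {"input_text"},
    "user": {"input_text", "input_image", "input_file"},
    "assistant": {"output_text"},
}


def make_message(role: str, parts: list) -> dict:
    """Build an EasyInputMessage dict, rejecting content types the role does not accept."""
    allowed = ALLOWED_CONTENT_TYPES.get(role)
    if allowed is None:
        raise ValueError(f"unknown role: {role!r}")
    for part in parts:
        if part.get("type") not in allowed:
            raise ValueError(f"content type {part.get('type')!r} is not valid for role {role!r}")
    return {"type": "message", "role": role, "content": list(parts)}


# A user message that mixes text and an image (placeholder URL).
message = make_message(
    "user",
    [
        {"type": "input_text", "text": "Describe this image."},
        {"type": "input_image", "image_url": "https://example.com/cat.png"},
    ],
)
print(message["role"], [part["type"] for part in message["content"]])
```

A list of such dictionaries can be passed as the `input` array. The check only covers the `type` field; it does not verify that `text`, `image_url`, or `file_url` is present.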
instructions `string` (optional) A system instruction that is inserted at the beginning of the context. When `previous_response_id` is used, the `instructions` specified in the previous turn are not carried over to the current turn's context, so pass `instructions` again if they should still apply.
previous_response_id `string` (optional) The unique ID of the previous response. A response's `id` is valid for 7 days. Use this parameter to create multi-turn conversations: the server automatically retrieves the input and output of that turn and uses them as context. If you provide both an `input` message array and `previous_response_id`, the new messages in `input` are appended to the historical context. This parameter cannot be used with `conversation`.
conversation `string` (optional) The conversation that the current response belongs to (see the Conversations API). The conversation's history is automatically included as context. The input and output of this request are added to the conversation upon completion. Cannot be used with `previous_response_id`.
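The interaction between `instructions`, `previous_response_id`, and `conversation` can be enforced when the request body is assembled. The following is a minimal sketch, assuming a hypothetical helper named `build_request` that only builds a dictionary and does not call the API; the response ID shown is a placeholder.

```python
from typing import Optional, Union


def build_request(
    model: str,
    input_data: Union[str, list],
    instructions: Optional[str] = None,
    previous_response_id: Optional[str] = None,
    conversation: Optional[str] = None,
) -> dict:
    """Assemble a request body, enforcing that the two context parameters are not combined."""
    if previous_response_id and conversation:
        raise ValueError("previous_response_id cannot be used with conversation")
    body = {"model": model, "input": input_data}
    # instructions from an earlier turn are not carried over, so they are sent on every turn.
    if instructions is not None:
        body["instructions"] = instructions
    if previous_response_id:
        body["previous_response_id"] = previous_response_id
    if conversation:
        body["conversation"] = conversation
    return body


# "resp_123" is a placeholder, not a real response ID.
second_turn = build_request(
    "qwen3.7-plus",
    "Do you remember my name?",
    instructions="Answer in one sentence.",
    previous_response_id="resp_123",
)
print(second_turn)
```

The resulting dictionary could be unpacked into `client.responses.create(**second_turn)` with the OpenAI SDK, as in the multi-turn example earlier in this topic.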
stream `boolean` (optional) Defaults to `false`. Enables stream output. If set to `true`, the model streams the response in real time.
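When `stream` is `true`, the text arrives as a sequence of events rather than a single response object. The following sketch shows one way to reassemble the full text; the function name `collect_text` is hypothetical, and the events are simulated with `SimpleNamespace` objects so the example runs without network access. The event type names and the `usage.total_tokens` field are the ones used in the stream output example in this topic.

```python
from types import SimpleNamespace


def collect_text(events):
    """Concatenate text deltas from a stream and return (text, total_tokens)."""
    chunks = []
    total_tokens = None
    for event in events:
        if event.type == "response.output_text.delta":
            chunks.append(event.delta)
        elif event.type == "response.completed":
            total_tokens = event.response.usage.total_tokens
    return "".join(chunks), total_tokens


# Simulated events standing in for the iterator returned by the SDK.
fake_stream = [
    SimpleNamespace(type="response.output_text.delta", delta="Hello"),
    SimpleNamespace(type="response.output_text.delta", delta=", world"),
    SimpleNamespace(
        type="response.completed",
        response=SimpleNamespace(usage=SimpleNamespace(total_tokens=12)),
    ),
]

text, tokens = collect_text(fake_stream)
print(text, tokens)
```

Event types other than the two handled here are ignored by this sketch.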
store `boolean` (optional) Defaults to `true`. Specifies whether to store the model response generated for this request. `false`: The response is not stored and cannot be referenced in subsequent calls via `previous_response_id`. `true`: The response is stored and can be referenced in subsequent API calls via `previous_response_id`.
tools `array` (optional) An array of tools the model can call when generating a response. Supports both built-in tools and custom `function` tools, which can be used together. For best results, enable the `code_interpreter`, `web_search`, and `web_extractor` tools. Properties Web search Searches the internet for up-to-date information. Related documentation: Web Search Properties type `string` (required) Fixed as `web_search`. Example: `[{"type": "web_search"}]` Web extractor Accesses and extracts content from web pages. It must be used with the `web_search` tool. For `qwen3-max` and `qwen3-max-2026-01-23` models, reasoning mode must also be enabled. Related documentation: Web Extraction Properties type `string` (required) Fixed as `web_extractor`. Example: `[{"type": "web_search"}, {"type": "web_extractor"}]` Code interpreter Executes code in a sandboxed environment to perform tasks like data analysis. For `qwen3-max` and `qwen3-max-2026-01-23` models, reasoning mode must also be enabled. Related documentation: Code Interpreter Properties type `string` (required) Fixed as `code_interpreter`. Example: `[{"type": "code_interpreter"}]` Web search image Searches for images based on a text description. Related documentation: Text-to-Image Search Properties type `string` (required) Fixed as `web_search_image`. Example: `[{"type": "web_search_image"}]` Image search Searches for similar or related images based on an input image. The input must include the image's URL. Related documentation: Image-to-Image Search Properties type `string` (required) Fixed as `image_search`. Example: `[{"type": "image_search"}]` File search Performs knowledge retrieval by searching a specified knowledge base. Related documentation: Knowledge Retrieval Properties type `string` (required) Fixed as `file_search`. vector_store_ids `array` (required) The ID of the knowledge base to search. Currently, only one knowledge base ID can be provided. 
Example: `[{"type": "file_search", "vector_store_ids": ["your_knowledge_base_id"]}]` MCP invocation Calls an external service through the Model Context Protocol (MCP). Related documentation: MCP Properties type `string` (required) Fixed as `mcp`. server_protocol `string` (required) The communication protocol with the MCP service, such as `"sse"`. server_label `string` (required) A label used to identify the MCP service. server_description `string` (optional) A description of the service. It helps the model understand its function and when to use it. server_url `string` (required) The URL of the MCP service endpoint. headers `object` (optional) Request headers, used to carry information such as authentication (e.g., `Authorization`). Example: mcp_tool = { "type": "mcp", "server_protocol": "sse", "server_label": "amap-maps", "server_description": "The AMap MCP Server provides a full suite of geographic information services, covering 15 core APIs. These include custom map generation, navigation, ride-hailing, geocoding, reverse geocoding, IP-based location, weather queries, and planning for cycling, walking, driving, and public transit routes, along with distance measurement and various search functions.", "server_url": "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/api/v1/mcps/amap-maps/sse", "headers": { "Authorization": "Bearer <your-mcp-server-token>" } } Custom tool function Allows the model to call a developer-defined function. When the model determines that a tool needs to be called, the response returns an output item of type `function_call`. Related documentation: Function Calling Properties type `string` (required) Must be set to `function`. name `string` (required) The name of the tool. Can only contain letters, digits, underscores (`_`), and hyphens (`-`), with a maximum length of 64 tokens. description `string` (required) A description of the tool, which helps the model decide when and how to call it. 
parameters `object` (optional) The parameter definition for the tool, which must be a valid JSON Schema object. If `parameters` is empty, the tool takes no arguments (e.g., a time query tool). To improve tool-calling accuracy, we recommend defining `parameters`. Example: `[{ "type": "function", "name": "get_weather", "description": "Get weather information for a specified city", "parameters": { "type": "object", "properties": { "city": { "type": "string", "description": "The name of the city" } }, "required": ["city"] } }]`
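The tool definitions above can be assembled as ordinary dictionaries. The following is a minimal sketch, not part of the API: `make_function_tool`, `NAME_PATTERN`, and `MAX_NAME_LENGTH` are hypothetical names, and the 64-unit length limit is approximated here as a character count.

```python
import re

# Documented naming rule: letters, digits, underscores, and hyphens only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")
MAX_NAME_LENGTH = 64  # assumption: the limit is checked here as a character count


def make_function_tool(name, description, parameters=None):
    """Build one custom `function` tool entry (hypothetical helper)."""
    if not NAME_PATTERN.fullmatch(name) or len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"invalid tool name: {name!r}")
    tool = {"type": "function", "name": name, "description": description}
    if parameters is not None:
        tool["parameters"] = parameters  # should be a valid JSON Schema object
    return tool


# Built-in tools and a custom function tool can be listed together.
tools = [
    {"type": "web_search"},
    {"type": "web_extractor"},  # documented as requiring web_search alongside it
    {"type": "code_interpreter"},
    make_function_tool(
        "get_weather",
        "Get weather information for a specified city",
        {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "The name of the city"}
            },
            "required": ["city"],
        },
    ),
]
```

The resulting `tools` list is what would be passed as the `tools` request parameter.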
tool_choice `string or object` (optional) Defaults to `auto` Controls how the model selects and calls tools. This parameter supports two formats: string mode and object mode. String mode `auto`: The model decides whether to call a tool. `none`: Prevents the model from calling any tool. `required`: Forces the model to call a tool. This is only available when the `tools` list contains exactly one tool. Object mode Restricts the model to a specific set of tools for selection and calling. Properties type `string` (required) The type of tool configuration. Fixed as `allowed_tools`. mode `string` (required) `auto`: The model automatically decides whether to call a tool from the provided list. `required`: Forces the model to call a tool from the provided list. This is only available when the `tools` list contains exactly one tool. tools `array` (required) A list of tool definitions that the model is allowed to call. Example: `[ { "type": "function", "name": "get_weather" } ]`
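As an illustration of object mode, the sketch below builds a `tool_choice` value from the three documented properties (`type`, `mode`, `tools`). The helper name `allowed_tools_choice` is hypothetical, and the sketch does not enforce the single-tool restriction documented for `required`.

```python
def allowed_tools_choice(mode, tool_refs):
    """Build an object-mode `tool_choice` value (hypothetical helper)."""
    # Object mode documents only these two values for `mode`.
    if mode not in ("auto", "required"):
        raise ValueError(f"unsupported mode: {mode!r}")
    return {
        "type": "allowed_tools",  # documented as a fixed value
        "mode": mode,
        "tools": list(tool_refs),
    }


tool_choice = allowed_tools_choice(
    "auto", [{"type": "function", "name": "get_weather"}]
)
```

String mode needs no helper: pass `"auto"`, `"none"`, or `"required"` directly.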
temperature `float` (optional) The sampling temperature, which controls the diversity of the generated text. Higher values make the output more random and diverse, while lower values make it more focused and deterministic. Value range: [0, 2) Both `temperature` and `top_p` control the diversity of the generated text. We recommend using only one of these parameters at a time. For more information, see Overview.
top_p `float` (optional) The probability threshold for top-p sampling, which controls the diversity of the generated text. Higher values make the output more random and diverse, while lower values make it more focused and deterministic. Value range: (0, 1.0] Both `temperature` and `top_p` control the diversity of the generated text. We recommend using only one of these parameters at a time. For more information, see Overview.
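The two value ranges differ at their endpoints: `temperature` accepts 0 but not 2, while `top_p` accepts 1.0 but not 0. A small client-side check, sketched here with a hypothetical helper named `sampling_params`, makes the boundaries explicit before a request is sent.

```python
def sampling_params(temperature=None, top_p=None):
    """Validate sampling settings against the documented ranges (hypothetical helper)."""
    params = {}
    if temperature is not None:
        # Documented range: [0, 2), so 0 is allowed and 2 is not.
        if not (0 <= temperature < 2):
            raise ValueError("temperature must be in [0, 2)")
        params["temperature"] = temperature
    if top_p is not None:
        # Documented range: (0, 1.0], so 0 is not allowed and 1.0 is.
        if not (0 < top_p <= 1.0):
            raise ValueError("top_p must be in (0, 1.0]")
        params["top_p"] = top_p
    return params


params = sampling_params(temperature=0.7)
```

Following the recommendation above, the example sets only one of the two parameters.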
enable_thinking `boolean` (optional) Enables or disables reasoning mode. When enabled, the model performs a reasoning step before it responds. The reasoning process is returned as an output item of type `reasoning`. When enabling reasoning mode, we recommend also enabling built-in tools to achieve the best results on complex tasks. Valid values: `true`: Enables reasoning mode. `false`: Disables reasoning mode. For default values for different models, see Supported models. This parameter is not a standard OpenAI parameter. In the Python SDK, pass it using `extra_body={"enable_thinking": True}`. In the Node.js SDK and curl, use `enable_thinking: true` as a top-level parameter. We recommend using `reasoning.effort` instead, as `enable_thinking` will be deprecated.
reasoning `object` (optional) Controls the model's reasoning effort. The model performs a reasoning step before replying, and the reasoning process is returned through an output item of type `reasoning`. Properties effort `string` (optional): The level of reasoning effort. Defaults to `medium`. `none`: Disables reasoning and provides a direct answer. `minimal`: Minimizes reasoning for the fastest response. `medium` (default): Moderate reasoning, balancing speed and depth. `high`: Deep reasoning for complex, specialized tasks. `reasoning.effort` takes precedence over `enable_thinking`. We recommend using `reasoning.effort`, as `enable_thinking` will be deprecated.
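The sketch below shows one way to prepare keyword arguments for the Python SDK that reflects the guidance above: `reasoning.effort` is used when given, and `enable_thinking` is passed through `extra_body` only as a fallback. The helper `build_request_kwargs` is hypothetical, and the set of effort values is taken from the list in this section.

```python
VALID_EFFORTS = ("none", "minimal", "medium", "high")  # values listed in this section


def build_request_kwargs(model, user_input, effort=None, enable_thinking=None):
    """Assemble keyword arguments for a Responses call (hypothetical helper)."""
    kwargs = {"model": model, "input": user_input}
    if effort is not None:
        if effort not in VALID_EFFORTS:
            raise ValueError(f"unsupported effort: {effort!r}")
        # reasoning.effort takes precedence, so enable_thinking is not sent.
        kwargs["reasoning"] = {"effort": effort}
    elif enable_thinking is not None:
        # Non-standard parameter: the Python SDK needs it inside extra_body.
        kwargs["extra_body"] = {"enable_thinking": enable_thinking}
    return kwargs


kwargs = build_request_kwargs("qwen3-max", "Hello", effort="high", enable_thinking=False)
```

With an OpenAI-compatible Python client configured for the `base_url` of your region, these arguments would typically be passed as `client.responses.create(**kwargs)`.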
ocr_options `object` (optional) OCR built-in task parameters. Only applicable to the `qwen3.5-ocr` model. Use this parameter to call built-in OCR tasks (such as information extraction and text localization). Built-in task results are returned in the `ocr_result` field of the response. This parameter is not a standard OpenAI parameter. In the Python SDK, pass it using `extra_body={"ocr_options": {...}}`. In the Node.js SDK and curl, use `ocr_options` as a top-level parameter.

Response object (non-streaming output)	{ "created_at": 1771165743, "id": "c9f9c06b-032d-4525-a422-ac8ab5eccxxx", "model": "qwen3.7-plus", "object": "response", "output": [ { "content": [ { "annotations": [], "text": "Hello! I am Qwen3.5, the latest Qwen large language model from Alibaba Cloud. I provide strong language understanding, logical reasoning, code generation, and multimodal capabilities to deliver accurate and efficient intelligent services.", "type": "output_text" } ], "id": "msg_544b2907-e88e-40d2-9a83-c30d6d1f9xxx", "role": "assistant", "status": "completed", "type": "message" } ], "parallel_tool_calls": false, "status": "completed", "tool_choice": "auto", "tools": [], "usage": { "input_tokens": 55, "input_tokens_details": { "cached_tokens": 0 }, "output_tokens": 43, "output_tokens_details": { "reasoning_tokens": 0 }, "total_tokens": 98, "x_details": [ { "input_tokens": 55, "output_tokens": 43, "total_tokens": 98, "x_billing_type": "response_api" } ] } }
id `string` The unique identifier (a UUID) of this response. The ID is valid for 7 days and can be passed in the `previous_response_id` parameter to continue a multi-turn conversation.
created_at `integer` The Unix timestamp (in seconds) for this request.
object `string` The object type, which is always `response`.
status `string` The status of the response generation. Valid values: `completed`: Generation completed. `failed`: Generation failed. `in_progress`: Generation in progress. `cancelled`: Generation cancelled. `queued`: Request queued. `incomplete`: Generation incomplete.
model `string` The ID of the model used to generate the response.
output `array` An array of output items generated by the model. The type and order of elements in the array depend on the model's response. Array element properties type `string` The output item type. Valid values: `message`: A message item that contains the model's final response content. `reasoning`: A reasoning item. It is returned when `reasoning.effort` is set to a value other than `none` or when reasoning mode is enabled. The reasoning tokens are counted in `output_tokens_details.reasoning_tokens` and billed as reasoning tokens. `function_call`: A function call item. It is returned when a custom `function` tool is used. Handle the function call and return the result. `web_search_call`: A web search item. It is returned when the `web_search` tool is used. `code_interpreter_call`: A code execution item. It is returned when the `code_interpreter` tool is used. `web_extractor_call`: A web extraction item. It is returned when the `web_extractor` tool is used, which requires the `web_search` tool. `web_search_image_call`: A text-to-image search item. It is returned when the `web_search_image` tool is used and contains a list of the images found. `image_search_call`: An image-to-image search item. It is returned when the `image_search` tool is used and contains a list of similar images. `mcp_call`: An MCP call item. It is returned when the `mcp` tool is used and contains the result of the MCP service call. `file_search_call`: A knowledge base search item. It is returned when the `file_search` tool is used and contains the retrieval queries and results. id `string` The unique identifier of the output item. All types of output items contain this field. role `string` The role of the message, which is always `assistant`. This parameter is present only when `type` is `message`. status `string` The status of the output item. Valid values: `completed` and `in_progress`. 
This parameter is present when the `type` parameter is not set to `reasoning`. name `string` The name of the tool or function. This parameter is present when `type` is `function_call`, `web_search_image_call`, `image_search_call`, or `mcp_call`. For `web_search_image_call` and `image_search_call`, the values are fixed to `"web_search_image"` and `"image_search"`, respectively. For `mcp_call`, the value is the name of the specific function called in the MCP service, such as `amap-maps-maps_geo`. arguments `string` The parameters for the tool call, in a JSON string format. This parameter is present when `type` is `function_call`, `web_search_image_call`, `image_search_call`, or `mcp_call`. Parse the string by using `JSON.parse()` before use. The content of arguments for different tool types is as follows: `web_search_image_call`: `{"queries": ["Search Keyword 1", "Search Keyword 2"]}`, where `queries` is a list of search keywords automatically generated by the model based on user input. `image_search_call`: `{"img_idx": 0, "bbox": [0, 0, 1000, 1000]}`, where `img_idx` is the index of the input image (starting from 0), and `bbox` is the bounding box coordinates [x1, y1, x2, y2] for the search area. The coordinate values range from 0 to 1000. `function_call`: A parameter object generated from the user-defined function parameter schema. `mcp_call`: The parameter object for the called function in the MCP service. call_id `string` The unique ID for the function call. This parameter is included only when `type` is `function_call`. This ID must be included in the function call result to link the request to the response. content `array` The array of message content. This parameter is present only if `type` is set to `message`. Array element properties type `string` The content type. The value is fixed to `output_text`. text `string` The text content generated by the model. annotations `array` The array of text annotations. This is usually an empty array. 
summary `array` An array of reasoning summaries. This field is present only when `type` is `reasoning`. Each element contains the `type` field (value: `summary_text`) and the `text` field (the summary text). action `object` The information about the search action. This parameter is present only when `type` is `web_search_call`. Properties query `string` The search query keywords. type `string` The search type. The value is always `search`. sources `array` A list of search sources. Each element contains the `type` and `url` fields. code `string` The code generated and executed by the model. This exists only when `type` is `code_interpreter_call`. outputs `array` The code execution output array. This is present only when `type` is `code_interpreter_call`. Each element has a `type` field (the value is `logs`) and a `logs` field (the code execution logs). container_id `string` The container identifier for the code interpreter. This parameter is present only when `type` is `code_interpreter_call`. This identifier associates multiple code executions within the same session. goal `string` A description of the information to extract from the webpage. This parameter is available only when `type` is `web_extractor_call`. output `string` The output of the tool call. The output is a string. If `type` is `web_extractor_call`, this is a summary of the content extracted from the web page. If `type` is `web_search_image_call` or `image_search_call`, this is a JSON string that contains an array of image search results. Each element includes the `title`, `url`, and `index` fields. If `type` is `mcp_call`, this is the JSON string result returned by the MCP service. urls `array` The list of URLs for the extracted web pages. This parameter is available only when `type` is `web_extractor_call`. server_label `string` The label for the MCP service. This appears only when `type` is `mcp_call`. It shows which MCP service the call used. 
queries `array` A list of queries for knowledge base retrieval. This parameter exists only when `type` is `file_search_call`. The array contains strings. Each string is a search query generated by the model. results `array` An array of search results from the knowledge base. This parameter is present only when `type` is `file_search_call`. Array element properties file_id `string` The file ID of the matched document. filename `string` The file name of the matched document. score `float` The relevance score of the match. The value ranges from 0 to 1. A larger value indicates higher relevance. text `string` The content snippet from the matched document.
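To show how the output item fields fit together, the following sketch walks the `output` array of a response that has already been decoded into a Python dictionary. It collects the text from `message` items and decodes the `arguments` JSON string of `function_call` items with `json.loads` (the Python counterpart of `JSON.parse()`). The helper `collect_output` and the sample data are hypothetical.

```python
import json


def collect_output(response):
    """Return (final_text, function_calls) from a response dict (hypothetical helper)."""
    texts, calls = [], []
    for item in response.get("output", []):
        item_type = item.get("type")
        if item_type == "message":
            # Message content is an array of parts with type "output_text".
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    texts.append(part.get("text", ""))
        elif item_type == "function_call":
            calls.append({
                "call_id": item["call_id"],  # needed when returning the result
                "name": item["name"],
                "arguments": json.loads(item["arguments"]),  # JSON string -> dict
            })
    return "".join(texts), calls


# Illustrative sample only; IDs and values are made up.
sample = {
    "output": [
        {"type": "reasoning", "id": "msg_1", "summary": []},
        {"type": "function_call", "id": "msg_2", "call_id": "call_1",
         "name": "get_weather", "arguments": "{\"city\": \"Beijing\"}",
         "status": "completed"},
        {"type": "message", "id": "msg_3", "role": "assistant", "status": "completed",
         "content": [{"type": "output_text", "text": "Hello!", "annotations": []}]},
    ]
}

text, function_calls = collect_output(sample)
```

Other item types, such as `web_search_call` or `mcp_call`, are skipped here; a real handler would branch on them in the same way.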
usage `object` Information about the token consumption for this request. Properties input_tokens `integer` The number of tokens in the input. output_tokens `integer` The number of tokens in the model's output. total_tokens `integer` The total number of tokens consumed, which is the sum of `input_tokens` and `output_tokens`. input_tokens_details `object` A detailed breakdown of the input tokens. Properties cached_tokens `integer` The number of tokens that hit the cache. For more information, see context caching. output_tokens_details `object` A detailed breakdown of the output tokens. Properties reasoning_tokens `integer` The number of reasoning tokens. x_details `array` An array of billing details for the request. This provides a more granular breakdown of multimodal tokens than the top-level `usage` field. Properties input_tokens `integer` The number of tokens in the input. output_tokens `integer` The number of tokens in the model's output. total_tokens `integer` The total number of tokens consumed, which is the sum of `input_tokens` and `output_tokens`. x_billing_type `string` The value is fixed to `response_api`. image_tokens `integer` The number of tokens for image input. This field is returned when the input includes an image and is equivalent to `input_tokens_details.image_tokens`. input_tokens_details `object` A granular breakdown of input tokens. This field is returned for multimodal inputs. It currently distinguishes only between `text_tokens` and `image_tokens`. It does not provide a breakdown for video or audio tokens. Properties text_tokens `integer` The number of tokens for text input. image_tokens `integer` The number of tokens for image input. output_tokens_details `object` A granular breakdown of output tokens. This field has an additional `text_tokens` field compared to the top-level `output_tokens_details`. The `text_tokens` field is returned for multimodal inputs. 
Properties reasoning_tokens `integer` The number of tokens for the reasoning process. text_tokens `integer` The number of tokens for text output. This field is returned for multimodal inputs. plugins `object` Statistics for built-in tool calls. This field is returned when a built-in tool such as `web_search` is used. Its content is the same as the top-level `x_tools` field. Properties web_search `object` Statistics for web search calls. Properties count `integer` The number of times web search was called in this response. prompt_tokens_details `object` Cache details for input tokens. This field is returned when session cache is enabled. It may return an empty object if the input includes an image but results in a cache miss. Properties cached_tokens `integer` The number of tokens that hit the cache. cache_creation_input_tokens `integer` The number of tokens used to create a new cache in this request. cache_creation `object` Details about cache creation. Properties ephemeral_5m_input_tokens `integer` The number of tokens used to create a new 5-minute ephemeral cache. cache_type `string` The cache type. The value is fixed to `ephemeral`. x_tools `object` Statistics on tool usage. This contains the number of times each built-in tool is called. Example: `{"web_search": {"count": 1}}`
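The relationship between the usage fields can be checked with a few lines of code. This sketch, using a hypothetical helper named `summarize_usage`, reads the token counts, the optional detail objects, and the `x_tools` counters, and verifies that `total_tokens` equals `input_tokens` plus `output_tokens`. The token numbers are taken from the non-streaming example in this topic; the `x_tools` entry is added for illustration.

```python
def summarize_usage(usage):
    """Flatten a `usage` object into a small summary dict (hypothetical helper)."""
    input_tokens = usage["input_tokens"]
    output_tokens = usage["output_tokens"]
    total_tokens = usage["total_tokens"]
    return {
        "input": input_tokens,
        "output": output_tokens,
        "total": total_tokens,
        # Detail objects may be absent, so fall back to 0.
        "cached": usage.get("input_tokens_details", {}).get("cached_tokens", 0),
        "reasoning": usage.get("output_tokens_details", {}).get("reasoning_tokens", 0),
        # x_tools maps each built-in tool name to {"count": n}.
        "tool_calls": {
            name: stats.get("count", 0)
            for name, stats in usage.get("x_tools", {}).items()
        },
        "consistent": total_tokens == input_tokens + output_tokens,
    }


usage = {
    "input_tokens": 55,
    "input_tokens_details": {"cached_tokens": 0},
    "output_tokens": 43,
    "output_tokens_details": {"reasoning_tokens": 0},
    "total_tokens": 98,
    "x_tools": {"web_search": {"count": 1}},  # illustrative addition
}

summary = summarize_usage(usage)
```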
error `object` An error object is returned when the model fails to generate a response. Otherwise, the value is `null`.
tools `array` Echoes the full content of the `tools` parameter from the request, with the same structure as the `tools` parameter in the request body.
tool_choice `string` Echoes the value of the `tool_choice` parameter in the request. The valid values are `auto`, `none`, and `required`.

Response chunk object (streaming output)	Basic call // response.created: The response is created and queued. {"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","created_at":1769082930,"object":"response","status":"queued",...},"sequence_number":0,"type":"response.created"} // response.in_progress: Processing begins. {"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","status":"in_progress",...},"sequence_number":1,"type":"response.in_progress"} // response.output_item.added: A new output item is added. {"item":{"id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"} // response.content_part.added: A new content part is added. {"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":null},"sequence_number":3,"type":"response.content_part.added"} // response.output_text.delta: Incremental text (can be triggered multiple times). {"content_index":0,"delta":"Artificial Intelligence","item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":4,"type":"response.output_text.delta"} {"content_index":0,"delta":" (AI) refers to the technology","item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":6,"type":"response.output_text.delta"} // response.output_text.done: Text generation for a content part is complete. {"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","logprobs":[],"output_index":0,"sequence_number":53,"text":"Artificial Intelligence (AI) refers to the technology and science that enables computer systems to simulate human intelligent behaviors...","type":"response.output_text.done"} // response.content_part.done: The content part is complete. 
{"content_index":0,"item_id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","output_index":0,"part":{"annotations":[],"text":"...full text...","type":"output_text","logprobs":null},"sequence_number":54,"type":"response.content_part.done"} // response.output_item.done: The output item is complete. {"item":{"id":"msg_bcb45d66-fc34-46a2-bb56-714a51e8exxx","content":[{"annotations":[],"text":"...full text...","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":0,"sequence_number":55,"type":"response.output_item.done"} // response.completed: The response is complete (includes full response and usage). {"response":{"id":"428c90e9-9cd6-90a6-9726-c02b08ebexxx","created_at":1769082930,"model":"qwen3.7-max","object":"response","output":[...],"status":"completed","usage":{"input_tokens":37,"output_tokens":243,"total_tokens":280,...}},"sequence_number":56,"type":"response.completed"} Web extraction id:1 event:response.created :HTTP_STATUS/200 data:{"sequence_number":0,"type":"response.created","response":{"output":[],"parallel_tool_calls":false,"created_at":1769435906,"tool_choice":"auto","model":"","id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","tools":[],"object":"response","status":"queued"}} id:2 event:response.in_progress :HTTP_STATUS/200 data:{"sequence_number":1,"type":"response.in_progress","response":{"output":[],"parallel_tool_calls":false,"created_at":1769435906,"tool_choice":"auto","model":"","id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","tools":[],"object":"response","status":"in_progress"}} id:3 event:response.output_item.added :HTTP_STATUS/200 data:{"sequence_number":2,"item":{"summary":[],"type":"reasoning","id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx"},"output_index":0,"type":"response.output_item.added"} id:4 event:response.reasoning_summary_text.delta :HTTP_STATUS/200 data:{"delta":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. 
Extract key information from the homepage.\n\nI need to first search for the URL of the Alibaba Cloud official website, and then use the web_extractor tool to access the website and extract key information.","sequence_number":3,"output_index":0,"type":"response.reasoning_summary_text.delta","item_id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx","summary_index":0} id:14 event:response.reasoning_summary_text.done :HTTP_STATUS/200 data:{"sequence_number":13,"text":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. Extract key information from the homepage.\n\nI need to first search for the URL of the Alibaba Cloud official website, and then use the web_extractor tool to access the website and extract key information.","output_index":0,"type":"response.reasoning_summary_text.done","item_id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx","summary_index":0} id:15 event:response.output_item.done :HTTP_STATUS/200 data:{"sequence_number":14,"item":{"summary":[{"type":"summary_text","text":"The user wants me to:\n1. Search for the Alibaba Cloud official website.\n2. 
Extract key information from the homepage.\n\nI need to first search for the URL of the Alibaba Cloud official website, and then use the web_extractor tool to access the website and extract key information."}],"type":"reasoning","id":"msg_5bd0c6df-19b8-4a04-bc00-8042a224exxx"},"output_index":1,"type":"response.output_item.done"} id:16 event:response.output_item.added :HTTP_STATUS/200 data:{"sequence_number":15,"item":{"action":{"type":"search","query":"Web search"},"id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx","type":"web_search_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"} id:17 event:response.web_search_call.in_progress :HTTP_STATUS/200 data:{"sequence_number":16,"output_index":1,"type":"response.web_search_call.in_progress","item_id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx"} id:19 event:response.web_search_call.completed :HTTP_STATUS/200 data:{"sequence_number":18,"output_index":1,"type":"response.web_search_call.completed","item_id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx"} id:20 event:response.output_item.done :HTTP_STATUS/200 data:{"sequence_number":19,"item":{"action":{"sources":[{"type":"url","url":"https://cn.aliyun.com/"},{"type":"url","url":"https://www.aliyun.com/"}],"type":"search","query":"Web search"},"id":"msg_a8a686b1-0a57-40e1-bb55-049a89cd4xxx","type":"web_search_call","status":"completed"},"output_index":1,"type":"response.output_item.done"} id:33 event:response.output_item.added :HTTP_STATUS/200 data:{"sequence_number":32,"item":{"urls":["https://cn.aliyun.com/"],"goal":"Extract key information from the Alibaba Cloud homepage, including: company positioning/overview, core products and services, main business segments, key features/solutions, latest news/events, free trial/discount information, and navigation menu structure.","id":"msg_8c2cf651-48a5-460c-aa7a-bea5b09b4xxx","type":"web_extractor_call","status":"in_progress"},"output_index":3,"type":"response.output_item.added"} id:34 
event:response.output_item.done :HTTP_STATUS/200 data:{"sequence_number":33,"item":{"output":"The useful information in https://cn.aliyun.com/ for user goal 'Extract key information from the Alibaba Cloud homepage, including: company positioning/overview, core products and services, main business segments, key features/solutions, latest news/events, free trial/discount information, and navigation menu structure' is as follows: \n\nEvidence in page: \n## Tongyi large model, the first choice for enterprises in the AI era\n\n## A complete product system, building a cloud for enterprise technological innovation\n\nAll cloud products## Making AI accessible through the synergistic development of large models and cloud computing\n\nAll AI solutions\n\nSummary: \nAlibaba Cloud positions itself as a leading enterprise AI solution provider centered around the Tongyi large model...","urls":["https://cn.aliyun.com/"],"goal":"Extract key information from the Alibaba Cloud homepage, including: company positioning/overview, core products and services, main business segments, key features/solutions, latest news/events, free trial/discount information, and navigation menu structure.","id":"msg_8c2cf651-48a5-460c-aa7a-bea5b09b4xxx","type":"web_extractor_call","status":"completed"},"output_index":3,"type":"response.output_item.done"} id:50 event:response.output_item.added :HTTP_STATUS/200 data:{"sequence_number":50,"item":{"content":[{"type":"text","text":""}],"type":"message","id":"msg_final","role":"assistant"},"output_index":5,"type":"response.output_item.added"} id:51 event:response.output_text.delta :HTTP_STATUS/200 data:{"delta":"I have found the Alibaba Cloud website and extracted the key information from its homepage:\n\n","sequence_number":51,"output_index":5,"type":"response.output_text.delta"} id:60 event:response.completed :HTTP_STATUS/200 
data:{"type":"response.completed","response":{"id":"863df8d9-cb29-4239-a54f-3e15a2427xxx","status":"completed","usage":{"input_tokens":45,"output_tokens":320,"total_tokens":365}}} Text-to-image search // 1. response.created: The response is created. id:1 event:response.created data:{"sequence_number":0,"type":"response.created","response":{"output":[],"status":"queued",...}} // 2. response.in_progress: The response is being processed. id:2 event:response.in_progress data:{"sequence_number":1,"type":"response.in_progress","response":{"status":"in_progress",...}} // 3. response.output_item.added: Reasoning starts. id:3 event:response.output_item.added data:{"sequence_number":2,"item":{"summary":[],"type":"reasoning","id":"msg_xxx"},"output_index":0,"type":"response.output_item.added"} // 4. response.reasoning_summary_text.delta: Reasoning summary delta. id:4 event:response.reasoning_summary_text.delta data:{"delta":"The user wants to find a picture of a cat. I need to use the web_search_image tool to search...","sequence_number":3,"output_index":0,"type":"response.reasoning_summary_text.delta","item_id":"msg_xxx","summary_index":0} // 5. response.reasoning_summary_text.done: Reasoning summary is complete. id:10 event:response.reasoning_summary_text.done data:{"sequence_number":9,"text":"The user wants to find a picture of a cat. I need to use the web_search_image tool to search for cat pictures.","output_index":0,"type":"response.reasoning_summary_text.done","item_id":"msg_xxx","summary_index":0} // 6. response.output_item.done: The reasoning item is complete. id:11 event:response.output_item.done data:{"sequence_number":10,"item":{"summary":[{"type":"summary_text","text":"..."}],"type":"reasoning","id":"msg_xxx"},"output_index":0,"type":"response.output_item.done"} // 7. response.output_item.added: Text-to-image search tool call starts (status: in_progress; includes name and arguments). 
id:12 event:response.output_item.added data:{"sequence_number":11,"item":{"name":"web_search_image","arguments":"{\"queries\": [\"cat pictures\", \"cute cat\"]}","id":"msg_xxx","type":"web_search_image_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"} // 8. response.output_item.done: Text-to-image search tool call is complete (includes the full 'output' search results). id:13 event:response.output_item.done data:{"sequence_number":12,"item":{"name":"web_search_image","output":"[{\"title\": \"Cute little cat...\", \"url\": \"https://example.com/cat.jpg\", \"index\": 1}, ...]","arguments":"{\"queries\": [\"cat pictures\", \"cute cat\"]}","id":"msg_xxx","type":"web_search_image_call","status":"completed"},"output_index":1,"type":"response.output_item.done"} // 9-12. Subsequent reasoning and message output events follow, similar to the basic call flow. // response.output_item.added (reasoning) → reasoning_summary_text.delta/done → response.output_item.done (reasoning) // response.output_item.added (message) → response.content_part.added → response.output_text.delta → response.output_text.done → response.content_part.done → response.output_item.done (message) // 13. response.completed: The response is complete. id:118 event:response.completed data:{"sequence_number":117,"type":"response.completed","response":{"output":[...],"status":"completed","usage":{"input_tokens":7895,"output_tokens":318,"total_tokens":8213,"x_tools":{"web_search_image":{"count":1}}}}} Image-to-image search // 1-6. The reasoning phase is the same as in the text-to-image search flow. // 7. response.output_item.added: Image-to-image search tool call starts. // Note: arguments includes img_idx (image index) and bbox (bounding box). 
id:29
event:response.output_item.added
data:{"sequence_number":29,"item":{"name":"image_search","arguments":"{\"img_idx\": 0, \"bbox\": [0, 0, 1000, 1000]}","id":"msg_xxx","type":"image_search_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

// 8. response.output_item.done: Image-to-image search tool call is complete.
id:30
event:response.output_item.done
data:{"sequence_number":30,"item":{"name":"image_search","output":"[{\"title\": \"Ink wash mountain background...\", \"url\": \"https://example.com/landscape.jpg\", \"index\": 1}, ...]","arguments":"{\"img_idx\": 0, \"bbox\": [0, 0, 1000, 1000]}","id":"msg_xxx","type":"image_search_call","status":"completed"},"output_index":1,"type":"response.output_item.done"}

// 9-12. Second round of reasoning + final message output (same as basic call).

// 13. response.completed
id:408
event:response.completed
data:{"sequence_number":407,"type":"response.completed","response":{"output":[...],"status":"completed","usage":{"input_tokens":8371,"output_tokens":417,"total_tokens":8788,"x_tools":{"image_search":{"count":1}}}}}

MCP

// 1-6. Reasoning phase (same as other tools).

// 7. response.mcp_call_arguments.delta: MCP arguments delta (MCP-specific event).
id:27
event:response.mcp_call_arguments.delta
data:{"delta":"{\"city\": \"Beijing\"}","sequence_number":26,"output_index":1,"type":"response.mcp_call_arguments.delta","item_id":"msg_xxx"}

// 8. response.mcp_call_arguments.done: MCP arguments are complete (MCP-specific event).
id:28
event:response.mcp_call_arguments.done
data:{"sequence_number":27,"arguments":"{\"city\": \"Beijing\"}","output_index":1,"type":"response.mcp_call_arguments.done","item_id":"msg_xxx"}

// 9. response.output_item.added: MCP tool call starts (includes name, server_label, arguments).
id:29
event:response.output_item.added
data:{"sequence_number":28,"item":{"name":"amap-maps-maps_weather","server_label":"MCP Server","arguments":"{\"city\": \"Beijing\"}","id":"msg_xxx","type":"mcp_call","status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

// 10. response.mcp_call.completed: MCP call is complete (MCP-specific event).
id:30
event:response.mcp_call.completed
data:{"sequence_number":29,"output_index":1,"type":"response.mcp_call.completed","item_id":"msg_xxx"}

// 11. response.output_item.done: The MCP output item is complete (includes the full 'output').
id:31
event:response.output_item.done
data:{"sequence_number":30,"item":{"output":"{\"city\":\"Beijing\",\"forecasts\":[...]}","name":"amap-maps-maps_weather","server_label":"MCP Server","arguments":"{\"city\": \"Beijing\"}","id":"msg_xxx","type":"mcp_call","status":"completed"},"output_index":1,"type":"response.output_item.done"}

// 12-15. Second round of reasoning + final message output.

// 16. response.completed
id:172
event:response.completed
data:{"sequence_number":171,"type":"response.completed","response":{"output":[...],"status":"completed","usage":{"input_tokens":5019,"output_tokens":539,"total_tokens":5558}}}

Knowledge base search

// 1-6. Reasoning phase (same as other tools).

// 7. response.output_item.added: Knowledge base search starts (includes queries, no results).
id:19
event:response.output_item.added
data:{"sequence_number":18,"item":{"id":"msg_xxx","type":"file_search_call","queries":["Alibaba Cloud Model Studio X1 phone","Alibaba Cloud Model Studio X1 phone","Model Studio X1"],"status":"in_progress"},"output_index":1,"type":"response.output_item.added"}

// 8. response.file_search_call.in_progress: Search is in progress (file_search-specific event).
id:20
event:response.file_search_call.in_progress
data:{"sequence_number":19,"output_index":1,"type":"response.file_search_call.in_progress","item_id":"msg_xxx"}

// 9. response.file_search_call.searching: Searching (file_search-specific event).
id:21
event:response.file_search_call.searching
data:{"sequence_number":20,"output_index":1,"type":"response.file_search_call.searching","item_id":"msg_xxx"}

// 10. response.file_search_call.completed: Search is complete (file_search-specific event).
id:22
event:response.file_search_call.completed
data:{"sequence_number":21,"output_index":1,"type":"response.file_search_call.completed","item_id":"msg_xxx"}

// 11. response.output_item.done: Provides the complete output item, including queries and results.
id:23
event:response.output_item.done
data:{"sequence_number":22,"item":{"id":"msg_xxx","type":"file_search_call","queries":["Alibaba Cloud Model Studio X1 phone","Alibaba Cloud Model Studio X1 phone","Model Studio X1"],"results":[{"score":0.7519,"filename":"Alibaba Cloud Model Studio Series Phone Product Introduction","text":"Alibaba Cloud Model Studio X1 — Enjoy the ultimate visual experience...","file_id":"file_xxx"}],"status":"completed"},"output_index":1,"type":"response.output_item.done"}

// 12-15. Second round of reasoning + final message output.

// 16. response.completed
id:146
event:response.completed
data:{"sequence_number":145,"type":"response.completed","response":{"output":[...],"status":"completed","usage":{"input_tokens":1576,"output_tokens":722,"total_tokens":2298,"x_tools":{"file_search":{"count":1}}}}}

Streaming output returns a series of JSON objects. Each object includes a `type` field to specify the event type and a `sequence_number` field to indicate the event order. The `response.completed` event marks the end of the stream.
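As a rough illustration of consuming such a stream without an SDK, the sketch below turns raw stream lines into event objects and stops at `response.completed`. It assumes standard server-sent events framing (`id:`, `event:`, and `data:` lines, with a blank line closing each event) and single-line JSON in `data:`. The function name `parse_sse` and the sample payloads are hypothetical and hand-written for this example.

```python
import json

def parse_sse(lines):
    """Yield one dict per event from an iterable of raw stream lines.

    Assumption: standard SSE framing, where a blank line ends an event.
    """
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip())
        elif line == "" and data_lines:
            event = json.loads("\n".join(data_lines))
            data_lines = []
            yield event
            if event.get("type") == "response.completed":
                return  # response.completed marks the end of the stream
    if data_lines:
        # Flush a final event that was not followed by a blank line.
        yield json.loads("\n".join(data_lines))

# Hand-written sample lines (not real service output).
sample = [
    "id:1",
    "event:response.created",
    'data:{"sequence_number":0,"type":"response.created"}',
    "",
    "id:2",
    "event:response.output_text.delta",
    'data:{"sequence_number":1,"type":"response.output_text.delta","delta":"Hi"}',
    "",
    "id:3",
    "event:response.completed",
    'data:{"sequence_number":2,"type":"response.completed"}',
    "",
]
events = list(parse_sse(sample))
event_types = [e["type"] for e in events]
```

The `type` field inside each JSON object is used for dispatch here, so the `event:` line is not parsed separately.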
type `string` The event type identifier. Possible values include:
`response.created`: The response is created, with a status of `queued`.
`response.in_progress`: The response starts processing, and the status changes to `in_progress`.
`response.output_item.added`: A new output item (e.g., a message or a `web_extractor_call`) is added to the output array. When `item.type` is `web_extractor_call`, this indicates the start of a web extraction tool call.
`response.content_part.added`: A new content part is added to the `content` array of an output item.
`response.output_text.delta`: An incremental text segment is generated. This event is triggered multiple times, and the `delta` field contains the new text segment.
`response.output_text.done`: Text generation for a content part is complete. The `text` field contains the full text.
`response.content_part.done`: A content part is complete. The `part` object contains the complete content part.
`response.output_item.done`: An output item is complete. The `item` object contains the complete output item. When `item.type` is `web_extractor_call`, this indicates the completion of a web extraction tool call.
`response.reasoning_summary_text.delta`: (In reasoning mode) Provides an incremental update to the reasoning summary. The `delta` field contains the new segment.
`response.reasoning_summary_text.done`: (In reasoning mode) The reasoning summary is complete. The `text` field contains the full summary.
`response.web_search_call.in_progress` / `searching` / `completed`: Status change events for a web search (when you use the web_search tool).
`response.code_interpreter_call.in_progress` / `interpreting` / `completed`: Status change events for code execution (when you use the code_interpreter tool).
`response.mcp_call_arguments.delta` / `response.mcp_call_arguments.done`: These events provide the delta and completion status for MCP call arguments.
`response.mcp_call.completed`: The MCP service call is complete.
`response.file_search_call.in_progress` / `searching` / `completed`: Status change events for a knowledge base search (when you use the file_search tool).
`response.completed`: The response generation is complete. The `response` object contains the full response, including usage. This event marks the end of the stream.
Note: The `web_extractor` tool does not have a dedicated event type identifier. Its tool calls are passed through the general `response.output_item.added` and `response.output_item.done` events and are identified by the `item.type` field with a value of `web_extractor_call`.
Note: When using the `web_search_image` and `image_search` tools, there are no dedicated intermediate state events. Tool calls are communicated through the `response.output_item.added` (call start) and `response.output_item.done` (call complete) events.
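Because some tools are reported only through the generic `response.output_item.added` and `response.output_item.done` events, one way to follow every tool call uniformly is to inspect `item.type` on those two events. The following is a minimal sketch under that assumption. The helper name `tool_call_status` and the set `TOOL_ITEM_TYPES` are hypothetical, and the sample events are trimmed versions of the image-to-image search example in this topic.

```python
# Item types that represent tool calls, as named in this topic.
TOOL_ITEM_TYPES = {
    "web_search_call",
    "web_extractor_call",
    "web_search_image_call",
    "image_search_call",
    "mcp_call",
    "file_search_call",
}

def tool_call_status(events):
    """Map output_index -> (item type, latest status) for tool-call items."""
    calls = {}
    for event in events:
        if event.get("type") not in (
            "response.output_item.added",
            "response.output_item.done",
        ):
            continue  # dedicated status events are ignored in this sketch
        item = event.get("item", {})
        if item.get("type") in TOOL_ITEM_TYPES:
            calls[event.get("output_index")] = (item["type"], item.get("status"))
    return calls

# Trimmed sample events (fields shortened for illustration).
sample_events = [
    {"type": "response.output_item.added", "output_index": 1,
     "item": {"id": "msg_xxx", "type": "image_search_call", "status": "in_progress"}},
    {"type": "response.output_item.done", "output_index": 1,
     "item": {"id": "msg_xxx", "type": "image_search_call", "status": "completed"}},
]
started = tool_call_status(sample_events[:1])
finished = tool_call_status(sample_events)
```

After only the first event, `started` reports the call as `in_progress`; after both events, `finished` reports it as `completed`.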
sequence_number `integer` The event sequence number, starting at 0 and incrementing with each event. Use this number to process events in the correct order.
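If events are buffered or handled out of order on the client side, `sequence_number` can restore the original order and reveal gaps. A small sketch of that idea follows; the function name `order_events` and the sample events are hypothetical.

```python
def order_events(events):
    """Return (events sorted by sequence_number, list of missing numbers)."""
    ordered = sorted(events, key=lambda e: e["sequence_number"])
    numbers = [e["sequence_number"] for e in ordered]
    if not numbers:
        return ordered, []
    expected = set(range(numbers[0], numbers[-1] + 1))
    missing = sorted(expected - set(numbers))
    return ordered, missing

# Hand-written sample: events arrive out of order and number 1 is absent.
received = [
    {"sequence_number": 2, "type": "response.output_text.delta"},
    {"sequence_number": 0, "type": "response.created"},
    {"sequence_number": 3, "type": "response.completed"},
]
ordered, missing = order_events(received)
```

A non-empty `missing` list suggests that at least one event was dropped or has not arrived yet.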
response `object` The response object. Appears in the `response.created`, `response.in_progress`, and `response.completed` events. In the `response.completed` event, it contains the complete response data (including `output` and `usage`), and its structure is identical to the non-streaming Response object.
item `object` An output item object. It appears in the `response.output_item.added` and `response.output_item.done` events. In the `added` event, it is an initial skeleton where the `content` is an empty array. In the `done` event, it is a complete object.
Properties
id `string` A unique identifier for the output item (e.g., `msg_xxx`).
type `string` The type of the output item. Possible values: `message`, `reasoning`, `web_search_call`, `web_extractor_call` (web extraction), `web_search_image_call` (text-to-image search), `image_search_call` (image-to-image search), `mcp_call` (MCP call), `file_search_call` (knowledge base search).
role `string` The message role, which is always `assistant`. Present only when `type` is `message`.
status `string` Generation status. In an `added` event, the status is `in_progress`, and in a `done` event, it is `completed`.
content `array` An array of message content. In the `added` event, the array is empty `[]`. In the `done` event, it contains complete content part objects whose structure is the same as that of the `part` object.
part `object` The content part object. Appears in the `response.content_part.added` and `response.content_part.done` events.
Properties
type `string` The type of the content part, which is always `output_text`.
text `string` Text content. This is an empty string in the `added` event and the complete text in the `done` event.
annotations `array` An array of text annotations. Usually an empty array.
logprobs `object | null` Token log probabilities. This field currently always returns `null`.
delta `string` The incremental text segment. This field appears in the `response.output_text.delta` event and contains the newly added text segment. Concatenate all `delta` values to reconstruct the full text.
text `string` The complete text content. This field appears in the `response.output_text.done` event. You can use it to validate the text reconstructed from the `delta` fragments.
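The reconstruction and validation described for `delta` and `text` can be sketched as follows. The sketch assumes that the text events carry `item_id` and `content_index`, which are used here to keep separate buffers per content part. The function name `rebuild_text` and the sample events are hypothetical.

```python
from collections import defaultdict

def rebuild_text(events):
    """Concatenate delta segments and check them against the done event."""
    buffers = defaultdict(str)
    final = {}
    for event in events:
        key = (event.get("item_id"), event.get("content_index"))
        if event["type"] == "response.output_text.delta":
            buffers[key] += event["delta"]
        elif event["type"] == "response.output_text.done":
            if buffers[key] != event["text"]:
                raise ValueError(f"reconstructed text mismatch for {key}")
            final[key] = event["text"]
    return final

# Hand-written sample events.
text_events = [
    {"type": "response.output_text.delta", "item_id": "msg_xxx",
     "content_index": 0, "delta": "Hel"},
    {"type": "response.output_text.delta", "item_id": "msg_xxx",
     "content_index": 0, "delta": "lo"},
    {"type": "response.output_text.done", "item_id": "msg_xxx",
     "content_index": 0, "text": "Hello"},
]
texts = rebuild_text(text_events)
```

Raising on a mismatch is one possible policy; a client could instead log the difference and prefer the `text` value from the `done` event.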
item_id `string` The unique identifier for the output item. Use this ID to correlate events that belong to the same item.
output_index `integer` The index of the output item in the `output` array.
content_index `integer` The index of the content part in the `content` array.
summary_index `integer` The index of the item in the `summary` array of a reasoning output item. This field appears in the `response.reasoning_summary_text.delta` and `response.reasoning_summary_text.done` events.

FAQ

Q: How do I pass context for a multi-turn conversation?

A: In each follow-up request, pass the `id` from the model's previous successful response as the `previous_response_id` parameter.
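A minimal sketch of this pattern is shown below. The helper `ask` is hypothetical and expects a client that exposes `responses.create`, as the official OpenAI Python SDK client does when it is configured with the `base_url` from this topic. To keep the sketch runnable offline, it uses a stand-in client that only records the request arguments. The model name `your-model-name` is a placeholder, not a real model ID.

```python
from types import SimpleNamespace

def ask(client, model, text, previous_response=None):
    """Send one turn; link it to the previous turn when one is given."""
    kwargs = {"model": model, "input": text}
    if previous_response is not None:
        # Carry the context forward by referencing the last response id.
        kwargs["previous_response_id"] = previous_response.id
    return client.responses.create(**kwargs)

class _FakeResponses:
    """Offline stand-in for client.responses (illustration only)."""
    def __init__(self):
        self.calls = []

    def create(self, **kwargs):
        self.calls.append(kwargs)
        return SimpleNamespace(id=f"resp_{len(self.calls)}")

fake_client = SimpleNamespace(responses=_FakeResponses())
first = ask(fake_client, "your-model-name", "My name is Alice.")
second = ask(fake_client, "your-model-name", "What is my name?",
             previous_response=first)
```

With a real client, only the second request needs `previous_response_id`; the first turn is sent without it.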

Q: Why are some fields in the response example not described in this topic?

A: The official OpenAI SDK may output extra fields defined by the OpenAI protocol. Our service does not support these fields, so they are typically null. Focus only on the fields described in this topic.
