OpenAI-compatible - Responses - Alibaba Cloud Model Studio

Alibaba Cloud Model Studio supports the OpenAI-compatible Responses API. Building on the Chat Completions API, the Responses API streamlines native agent functionality.

Advantages over the OpenAI Chat Completions API:

Built-in tools: Improve results for complex tasks with web search, web scraping, a code interpreter, text-to-image, and image-to-image. For details, see Call built-in tools.
More flexible input: The API supports both direct string inputs and message arrays in the standard chat format.
Simplified context management: Passing the previous_response_id eliminates the need to manually build a complete message history array.

See the OpenAI Responses API reference for parameter details.
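
The two input forms mentioned above can be sketched as follows. This is a minimal illustration, not part of the SDK: build_payload is a hypothetical helper, and the message array uses the role/content shape of the standard chat format.

```python
from typing import Union

# One chat-format message: {"role": ..., "content": ...}
Message = dict[str, str]


def build_payload(model: str, user_input: Union[str, list[Message]]) -> dict:
    """Build a Responses API request body from either accepted input form.

    Hypothetical helper for illustration only.
    """
    if isinstance(user_input, str):
        # Direct string input: sent as-is.
        return {"model": model, "input": user_input}
    # Message array: every entry needs a role and content.
    for message in user_input:
        if "role" not in message or "content" not in message:
            raise ValueError(f"message missing role or content: {message!r}")
    return {"model": model, "input": list(user_input)}


as_string = build_payload("qwen3.7-plus", "What can you do?")
as_messages = build_payload(
    "qwen3.7-plus",
    [{"role": "user", "content": "What can you do?"}],
)
print(as_string)
print(as_messages)
```

Either dictionary can be sent as the JSON body of a POST request or unpacked into client.responses.create(**payload).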

Prerequisites

First, get an API key and set it as an environment variable. If you use the OpenAI SDK, install the SDK.

Important

The legacy path /api/v2/apps/protocols/compatible-mode/v1/responses for the OpenAI-compatible Responses API will be deprecated soon. Please migrate to the new path /compatible-mode/v1/responses as soon as possible.

Important

Alibaba Cloud Model Studio has released workspace-specific domains for the China (Beijing) and Singapore regions. The new dedicated domains deliver better performance and higher stability for inference requests. We recommend migrating to the new domains:

China (Beijing): from https://dashscope.aliyuncs.com to https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com
Singapore: from https://dashscope-intl.aliyuncs.com to https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com

{WorkspaceId} is your workspace ID, which you can find on the Workspace Details page in the Alibaba Cloud Model Studio console. The existing domains remain fully functional.
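
The two migrations described above (legacy path to new path, shared domain to workspace-specific domain) can be applied to a stored URL with a small rewrite step. The sketch below is an assumption-laden illustration: migrate_url is a hypothetical helper, "ws-demo" is a made-up workspace ID, and pairing the legacy path with the dashscope.aliyuncs.com host is only an example.

```python
from urllib.parse import urlsplit, urlunsplit

LEGACY_PATH = "/api/v2/apps/protocols/compatible-mode/v1/responses"
NEW_PATH = "/compatible-mode/v1/responses"

# Shared host -> region segment of the workspace-specific domain.
HOST_TO_REGION = {
    "dashscope.aliyuncs.com": "cn-beijing",
    "dashscope-intl.aliyuncs.com": "ap-southeast-1",
}


def migrate_url(url: str, workspace_id: str) -> str:
    """Rewrite a legacy URL to the new path and workspace-specific domain.

    Hosts and paths that are not listed above are left unchanged.
    """
    parts = urlsplit(url)
    host = parts.netloc
    region = HOST_TO_REGION.get(host)
    if region is not None:
        host = f"{workspace_id}.{region}.maas.aliyuncs.com"
    path = NEW_PATH if parts.path == LEGACY_PATH else parts.path
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


old = "https://dashscope.aliyuncs.com" + LEGACY_PATH
print(migrate_url(old, "ws-demo"))
```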

Supported models

qwen3-max, qwen3-max-2026-01-23, qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3.7-max-preview, qwen3.7-max-2026-05-17, qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.5-plus, qwen3.5-plus-2026-02-15, qwen3.5-plus-2026-04-20, qwen3.6-flash, qwen3.6-flash-2026-04-16, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3.6-35b-a3b, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash, and qwen3-coder-next.

Endpoints

China (Beijing)

base_url for SDK calls: https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1

HTTP request URL: POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses

Replace {WorkspaceId} with your actual workspace ID.

Singapore

base_url for SDK calls: https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request URL: POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/compatible-mode/v1/responses

Replace {WorkspaceId} with your actual workspace ID.

US (Virginia)

base_url for SDK calls: https://dashscope-us.aliyuncs.com/compatible-mode/v1

HTTP request URL: POST https://dashscope-us.aliyuncs.com/compatible-mode/v1/responses

Germany (Frankfurt)

base_url for SDK calls: https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request URL: POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/compatible-mode/v1/responses

Replace {WorkspaceId} with your actual workspace ID.

Japan (Tokyo)

base_url for SDK calls: https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/compatible-mode/v1

HTTP request URL: POST https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/compatible-mode/v1/responses

Replace {WorkspaceId} with your actual workspace ID.
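
The endpoint list above follows one pattern per region, so the base_url can be derived rather than hard-coded. The following is a minimal sketch; base_url_for and responses_url are hypothetical helper names, and "ws-demo" is a made-up workspace ID.

```python
from typing import Optional

# Region name -> region segment of the workspace-specific domain.
WORKSPACE_REGIONS = {
    "China (Beijing)": "cn-beijing",
    "Singapore": "ap-southeast-1",
    "Germany (Frankfurt)": "eu-central-1",
    "Japan (Tokyo)": "ap-northeast-1",
}
# US (Virginia) is listed with a shared domain and no workspace ID.
US_BASE_URL = "https://dashscope-us.aliyuncs.com/compatible-mode/v1"


def base_url_for(region: str, workspace_id: Optional[str] = None) -> str:
    """Return the SDK base_url for a region listed in this document."""
    if region == "US (Virginia)":
        return US_BASE_URL
    if region not in WORKSPACE_REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    if not workspace_id:
        raise ValueError(f"{region} requires a workspace ID")
    segment = WORKSPACE_REGIONS[region]
    return f"https://{workspace_id}.{segment}.maas.aliyuncs.com/compatible-mode/v1"


def responses_url(region: str, workspace_id: Optional[str] = None) -> str:
    """Return the full HTTP request URL for POST requests."""
    return base_url_for(region, workspace_id) + "/responses"


print(base_url_for("Singapore", "ws-demo"))
print(responses_url("US (Virginia)"))
```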

Code examples

Basic call

Send a message and get a response.

Python

import os
from openai import OpenAI

client = OpenAI(
    # If the environment variable is not set, pass your Model Studio API key directly: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # Replace {WorkspaceId} with your workspace ID.
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.7-plus",
    input="What can you do?"
)

# Get model response
# print(response.model_dump_json())
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    // If the environment variable is not set, pass your Model Studio API key directly: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "What can you do?"
    });

    // Get model response
    console.log(response.output_text);
}

main();

Curl

curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "What can you do?"
}'

Example response

The following is the complete API response.

{
    "created_at": 1771225825,
    "id": "0c842a11-c7d1-45da-b7ec-4e668c389xxx",
    "model": "qwen3.7-plus",
    "object": "response",
    "output": [
        {
            "id": "msg_0bdb8ab9-f1de-4db6-82c8-6c1185b91xxx",
            "summary": [
                {
                    "text": "Thinking Process:\n\n1.  **Analyze the Request ...",
                    "type": "summary_text"
                }
            ],
            "type": "reasoning"
        },
        {
            "content": [
                {
                    "annotations": [],
                    "text": "Hello! As an AI assistant, I can help you with a variety of tasks. Here are some of my main capabilities:\n\n1.  **Text Creation and Editing**\n    *   Write emails, articles, reports, stories, or social media copy.\n    *   Polish, rewrite, or summarize existing text.\n\n2.  **Programming and Technical Support**\n    *   Write, debug, or explain code (supporting multiple programming languages like Python, JavaScript, C++, etc.).\n    *   Provide explanations of technical concepts and suggest solutions.\n\n3.  **Q&A and Learning**\n    *   Answer questions across various fields (my knowledge includes information up to 2026).\n    *   Assist with learning new concepts, creating study plans, or solving exercises.\n\n4.  **Language Translation**\n    *   Translate between multiple languages to help you overcome language barriers.\n\n5.  **Data Analysis and Organization**\n    *   Help organize information, extract key points, or perform logical analysis.\n    *   Help format data or generate table structures.\n\n6.  **Creativity and Brainstorming**\n    *   Provide creative inspiration, project plans, or suggestions.\n    *   Chat with you, offering emotional support or life advice.\n\nIs there anything specific I can help you with? Just let me know!",
                    "type": "output_text"
                }
            ],
            "id": "msg_c8bb3db1-d235-44e7-9704-55b584022xxx",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 49,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 1384,
        "output_tokens_details": {
            "reasoning_tokens": 1110
        },
        "total_tokens": 1433,
        "x_details": [
            {
                "input_tokens": 49,
                "output_tokens": 1384,
                "output_tokens_details": {
                    "reasoning_tokens": 1110
                },
                "total_tokens": 1433,
                "x_billing_type": "response_api"
            }
        ]
    }
}
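
To read such a response programmatically, the output array can be split into reasoning and message items, and the usage block can be cross-checked. The sketch below works on a plain dict (for example, a parsed JSON body); summarize_response is a hypothetical helper, and it assumes that output_tokens includes the reasoning tokens, as the sample numbers suggest.

```python
def summarize_response(response: dict) -> dict:
    """Separate reasoning from the final answer and break down token usage."""
    reasoning_parts = []
    answer_parts = []
    for item in response.get("output", []):
        if item.get("type") == "reasoning":
            reasoning_parts.extend(s["text"] for s in item.get("summary", []))
        elif item.get("type") == "message":
            answer_parts.extend(
                part["text"]
                for part in item.get("content", [])
                if part.get("type") == "output_text"
            )
    usage = response.get("usage", {})
    output_tokens = usage.get("output_tokens", 0)
    reasoning_tokens = usage.get("output_tokens_details", {}).get("reasoning_tokens", 0)
    return {
        "reasoning": "\n".join(reasoning_parts),
        "answer": "".join(answer_parts),
        "reasoning_tokens": reasoning_tokens,
        # Assumption: output_tokens already counts the reasoning tokens.
        "answer_tokens": output_tokens - reasoning_tokens,
        "total_is_consistent": usage.get("total_tokens")
        == usage.get("input_tokens", 0) + output_tokens,
    }


# Trimmed copy of the sample response, with shortened text fields.
sample = {
    "output": [
        {"type": "reasoning", "summary": [{"type": "summary_text", "text": "Thinking Process: ..."}]},
        {"type": "message", "content": [{"type": "output_text", "text": "Hello!", "annotations": []}]},
    ],
    "usage": {
        "input_tokens": 49,
        "output_tokens": 1384,
        "output_tokens_details": {"reasoning_tokens": 1110},
        "total_tokens": 1433,
    },
}
print(summarize_response(sample))
```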

Multi-turn conversation

The previous_response_id parameter automatically maintains the conversation context, so you do not need to manually build the message history. Each response id is valid for 7 days.

The previous_response_id must be the top-level id of the previous response (for example, f0dbb153-117f-9bbf-8176-5284b47f3xxx, in UUID format; shown as resp_xxx in some examples), not the message id from within the output array (for example, msg_56c860c4-3ad8-4a96-8553-d2f94c259xxx).

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

# First round
response1 = client.responses.create(
    model="qwen3.7-plus",
    input="My name is Alice, please remember it."
)
print(f"First response: {response1.output_text}")

# Second round - use previous_response_id to link context
# The response id expires in 7 days
response2 = client.responses.create(
    model="qwen3.7-plus",
    input="Do you remember my name?",
    previous_response_id=response1.id
)
print(f"Second response: {response2.output_text}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    // First round
    const response1 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "My name is Alice, please remember it."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second round - use previous_response_id to link context
    // The response id expires in 7 days
    const response2 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Do you remember my name?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);
}

main();

Curl

# First round
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "My name is Alice, please remember it."
}'

# Second round - use the id from the first response as previous_response_id
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Do you remember my name?",
    "previous_response_id": "response_id_from_first_round"
}'

Second-round response example

{
  "id": "f0dbb153-117f-9bbf-8176-5284b47f3xxx",
  "created_at": 1769169951.0,
  "model": "qwen3.7-plus",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "msg_56c860c4-3ad8-4a96-8553-d2f94c259xxx",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Of course. Your name is Alice!",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 73,
    "output_tokens": 10,
    "total_tokens": 83,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}

Note: In the second round, the input_tokens count is 73. This number includes the context from the first round, which is how the model was able to recall the name "Alice".
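
The two rules for previous_response_id (use the top-level id, and use it within 7 days) can be checked before a follow-up request is sent. This is a minimal sketch: next_turn_kwargs is a hypothetical helper, and measuring the 7-day window from created_at is an assumption, since this page does not say when the window starts.

```python
import time
from typing import Optional

SEVEN_DAYS_SECONDS = 7 * 24 * 60 * 60


def next_turn_kwargs(
    model: str,
    user_input: str,
    previous: Optional[dict] = None,
    now: Optional[float] = None,
) -> dict:
    """Build the arguments for the next turn of a conversation.

    previous is the prior response as a dict with "id" and "created_at".
    """
    kwargs = {"model": model, "input": user_input}
    if previous is None:
        return kwargs  # First turn: nothing to link to.
    response_id = previous["id"]
    if response_id.startswith("msg_"):
        raise ValueError("use the top-level response id, not a message id")
    now = time.time() if now is None else now
    # Assumption: the 7-day validity is counted from created_at.
    if now - previous["created_at"] > SEVEN_DAYS_SECONDS:
        raise ValueError("previous response id is older than 7 days")
    kwargs["previous_response_id"] = response_id
    return kwargs


first = {"id": "f0dbb153-117f-9bbf-8176-5284b47f3xxx", "created_at": 1769169951.0}
print(next_turn_kwargs("qwen3.7-plus", "Do you remember my name?", first, now=1769170011.0))
```

The result can be unpacked into the SDK call, for example client.responses.create(**next_turn_kwargs(...)).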

Deep thinking

Use the reasoning parameter to control how deeply the model reasons. When reasoning.effort is set to a value other than none, the model thinks before replying and returns the thinking process in a reasoning output item. The effort parameter supports the following values:

none: Disables thinking and provides a direct answer.
minimal: Minimizes thinking for the fastest response.
low: Performs light thinking, prioritizing a quick response.
medium (default): Performs moderate thinking, balancing speed and depth.
high: Performs deep thinking, focusing on complex and specialized problems.

The thinking_budget parameter cannot be used here to control the maximum thinking length. If both are set, reasoning.effort takes precedence over enable_thinking. Prefer reasoning.effort, because enable_thinking will be deprecated.
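
The effort values listed above can be validated on the client side before a request is sent. The sketch below is illustrative only: build_request is a hypothetical helper, and when effort is omitted it simply leaves the reasoning field out so that the service applies its own default.

```python
from typing import Optional

# Values documented for reasoning.effort.
EFFORT_LEVELS = ("none", "minimal", "low", "medium", "high")


def build_request(model: str, user_input: str, effort: Optional[str] = None) -> dict:
    """Build a request body, adding reasoning.effort only when it is valid."""
    request = {"model": model, "input": user_input}
    if effort is None:
        return request  # Leave the default to the service.
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
    request["reasoning"] = {"effort": effort}
    return request


print(build_request("qwen3.7-plus", "Which is larger, 9.9 or 9.11?", effort="high"))
```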

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.7-plus",
    input="Which is larger, 9.9 or 9.11?",
    reasoning={"effort": "medium"}
)

# Process the output
for item in response.output:
    if item.type == "reasoning":
        print("=== Thinking Process ===")
        for summary in item.summary:
            print(summary.text)
    elif item.type == "message":
        print("\n=== Final Answer ===")
        print(item.content[0].text)

# Check the thinking token count
print(f"\nThinking token count: {response.usage.output_tokens_details.reasoning_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Which is larger, 9.9 or 9.11?",
        reasoning: { effort: "medium" }
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("=== Thinking Process ===");
            for (const summary of item.summary) {
                console.log(summary.text);
            }
        } else if (item.type === "message") {
            console.log("\n=== Final Answer ===");
            console.log(item.content[0].text);
        }
    }

    // Check the thinking token count
    console.log(`\nThinking token count: ${response.usage.output_tokens_details.reasoning_tokens}`);
}

main();

Curl

curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Which is larger, 9.9 or 9.11?",
    "reasoning": {"effort": "medium"}
}'

Example response

{
    "created_at": 1774498317,
    "id": "resp_xxx",
    "model": "qwen3.7-plus",
    "object": "response",
    "output": [
        {
            "id": "msg_xxx",
            "summary": [
                {
                    "text": "Thinking Process:\n\n1.  Analyze the Request:\n    *   Question: \"Which is larger, 9.9 or 9.11?\"\n    *   Context: The user is asking a simple mathematical comparison question.\n    *   Current Date: Thursday, March 26, 2026 (provided in the system prompt).\n    *   Knowledge Cutoff: 2026 (provided in the system prompt).\n\n2.  Evaluate the Numbers:\n    *   Number A: 9.9\n    *   Number B: 9.11\n    *   These are decimal numbers.\n    *   Compare the integer part: Both are 9.\n    *   Compare the tenths place (first decimal digit):\n        *   9.9 has 9 in the tenths place.\n        *   9.11 has 1 in the tenths place.\n    *   Since 9 > 1, 9.9 is greater than 9.11.\n\n3.  Consider Potential Ambiguities:\n    *   Could this be version numbering? (e.g., software versions). In versioning, 9.11 is often \"newer\" or \"higher\" than 9.9. However, mathematically, 9.9 > 9.11.\n    *   Could this be dates? (September 9th vs September 11th). 11th is later.\n    *   Standard interpretation: Without context, decimal numbers are assumed to be mathematical values.\n    *   Common pitfall: Some people mistakenly treat decimals like whole numbers (where 11 > 9), leading them to think 9.11 > 9.9. This is a known cognitive bias or misconception in elementary math.\n    *   Decision: Provide the mathematical answer clearly, but perhaps acknowledge the versioning context if relevant (though usually, for this specific question, it's a math test). Given the simplicity, stick to the mathematical truth first.\n\n4.  Formulate the Answer:\n    *   Direct answer: 9.9 is larger.\n    *   Explanation: Compare place values. 9.9 = 9.90, 9.11 = 9.11. 90 hundredths > 11 hundredths.\n    *   Tone: Helpful, clear.\n\n5.  Drafting the Response (Internal Monologue/Draft):\n    *   Okay, mathematically 9.9 is bigger. 9.9 is 9 and 9 tenths. 9.11 is 9 and 11 hundredths (or 1 tenth and 1 hundredth). 
9 tenths is more than 1 tenth.\n    *   So, 9.9 > 9.11.\n    *   I should answer in English since the question is in English.\n    *   \"9.9 is larger.\"\n    *   Add a brief explanation to avoid confusion. \"Because 9.9 equals 9.90, and 9.90 is greater than 9.11.\"\n\n6.  Refining based on \"Version Number\" possibility:\n    *   Sometimes this is a trick question about software versions. In semver, 9.11 > 9.9.\n    *   However, usually, when asked plainly \"which is larger\", it refers to numerical value.\n    *   I will provide the numerical answer as primary, maybe add a note about versions if it feels necessary. But keeping it simple is better. Most likely, this is testing the decimal comparison logic.\n    *   Let's stick to the mathematical value as the default.\n\n7.  Final Polish:\n    *   Answer: 9.9 is larger.\n    *   Reasoning: When comparing decimals, first look at the integer part, then the tenths place. The tenths digit of 9.9 is 9, and the tenths digit of 9.11 is 1. Since 9 > 1, 9.9 > 9.11.\n\n8.  Output Generation: (Matching the thought process)\n    *   State clearly which is larger.\n    *   Explain why.\n\n    *Self-Correction on Date/Time:* The system prompt mentions the current date is 2026. This doesn't affect the math question, but I should ensure I don't reference anything post-2026 if I were adding context (not needed here).\n\n    *Final Decision:* Just answer the math question directly.\n\n    \"Numerically, 9.9 is larger.\"\n    Explain: 9.9 = 9.90, 9.11 = 9.11. 90 > 11.\n\n    Wait, is there any other interpretation?\n    - Dates? 9/9 vs 9/11. 11th is later.\n    - Versions? 9.11 is newer.\n    - But \"who is big/larger\" usually implies magnitude.\n    - I will answer based on numerical magnitude but briefly mention the version context if it adds value without confusing. Actually, better to be decisive. 
Numerical value is the standard interpretation for \"who is larger\" with numbers.\n\n    Let's go with the numerical answer.\n\n    Plan:\n    1. State 9.9 is larger numerically.\n    2. Explain the decimal comparison.\n    3. (Optional but helpful) Mention that if it's version numbers, 9.11 might be considered \"higher\". But prioritize the math.\n    Actually, for a simple query like this, over-explaining versions might be confusing. I'll stick to the math but acknowledge the common confusion.\n\n    *Revised Plan:*\n    1. Direct answer: 9.9 is larger (mathematically).\n    2. Explanation: Place value.\n    3. Note: Unless it's a version number.\n\n    Let's keep it concise.\n\n    \"In terms of mathematical value, 9.9 is larger.\"\n    \"Because 9.9 equals 9.90, which is greater than 9.11.\"\n    \"If it's a version number, version 9.11 is usually higher than version 9.9.\"\n    This covers both bases.\n\n    Wait, looking at common AI training, this is often a \"trap\" question to see if the AI treats decimals like integers. The correct mathematical answer is crucial.\n\n    Okay, final output structure:\n    1. Mathematical answer.\n    2. Brief explanation.\n    3. Contextual note (versions).\n\n    Let's write it.\n    \"In terms of mathematical value, 9.9 is larger.\"\n    \"When comparing decimals, first compare the integer part (both are 9), then the tenths place: The tenths digit of 9.9 is 9, while the tenths digit of 9.11 is 1. Because 9 > 1, 9.9 > 9.11.\"\n    \"However, if these are software version numbers, version 9.11 is typically newer than version 9.9.\"\n\n    This is balanced.\n\n    Wait, I need to check the instruction again. \"Please remember the current actual time...\". This is just system context. I don't need to mention the date in the answer unless relevant. It's not relevant here.\n\n    Okay, ready to generate.\n\n    Wait, one more check. Is there any cultural context? 
In some contexts, people might say 9.11 is bigger because 11 > 9. This is the misconception I need to correct.\n    So, emphasizing the mathematical rule is key.\n\n    Final check:\n    9.9 = 9 + 9/10 = 9.90\n    9.11 = 9 + 1/10 + 1/100 = 9.11\n    9.90 > 9.11. Correct.\n\n    Response will be in English.\n    \"In terms of numerical value, 9.9 is larger.\"\n    Explain why.\n    Mention version exception.\n    Done.\n",
                    "type": "summary_text"
                }
            ],
            "type": "reasoning"
        },
        {
            "content": [
                {
                    "annotations": [],
                    "text": "In terms of mathematical value, 9.9 is larger.\n\nHere's why:\nWhen comparing decimals, you first look at the integer part, which is 9 for both. Then you look at the tenths place (the first digit after the decimal point):\n*   The tenths digit of 9.9 is 9\n*   The tenths digit of 9.11 is 1\n\nSince 9 is greater than 1, 9.9 > 9.11 (you can think of 9.9 as 9.90 for comparison).\n\nNote: If these were software version numbers, version 9.11 would typically be considered newer (or higher) than version 9.9. However, in a direct numerical comparison, 9.9 is larger.",
                    "type": "output_text"
                }
            ],
            "id": "msg_xxx",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 57,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 2018,
        "output_tokens_details": {
            "reasoning_tokens": 1861
        },
        "total_tokens": 2075,
        "x_details": [
            {
                "input_tokens": 57,
                "output_tokens": 2018,
                "output_tokens_details": {
                    "reasoning_tokens": 1861
                },
                "total_tokens": 2075,
                "x_billing_type": "response_api"
            }
        ]
    }
}

Stream output

Set stream to true to receive content from the model in real time. This is especially useful for long-form text generation.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

stream = client.responses.create(
    model="qwen3.7-plus",
    input="Please briefly introduce artificial intelligence.",
    stream=True
)

print("Receiving stream output:")
for event in stream:
    # print(event.model_dump_json())  # Uncomment to see the raw event response
    if event.type == 'response.output_text.delta':
        # Print the text delta in real time
        print(event.delta, end='', flush=True)
    elif event.type == 'response.completed':
        print("\nStream completed")
        print(f"Total tokens: {event.response.usage.total_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Please briefly introduce artificial intelligence.",
        stream: true
    });

    console.log("Receiving stream output:");
    for await (const event of stream) {
        // console.log(JSON.stringify(event));  // Uncomment to see the raw event response
        if (event.type === 'response.output_text.delta') {
            process.stdout.write(event.delta);
        } else if (event.type === 'response.completed') {
            console.log("\nStream completed");
            console.log(`Total tokens: ${event.response.usage.total_tokens}`);
        }
    }
}

main();

Curl

curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Please briefly introduce artificial intelligence.",
    "stream": true
}'

Example response

{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"queued","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":0,"type":"response.created"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"in_progress","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":1,"type":"response.in_progress"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":null},"sequence_number":3,"type":"response.content_part.added"}
{"content_index":0,"delta":"Artificial","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":4,"type":"response.output_text.delta"}
{"content_index":0,"delta":" intelligence","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":5,"type":"response.output_text.delta"}
{"content_index":0,"delta":" (","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":6,"type":"response.output_text.delta"}
{"content_index":0,"delta":"AI","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":7,"type":"response.output_text.delta"}
... (intermediate events omitted) ...
{"content_index":0,"delta":"fields, and is profoundly changing our","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":38,"type":"response.output_text.delta"}
{"content_index":0,"delta":" lives and ways of working","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":39,"type":"response.output_text.delta"}
{"content_index":0,"delta":".","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":40,"type":"response.output_text.delta"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":41,"text":"Artificial intelligence (AI) is the technology and science of simulating human intelligent behavior by using computer systems. xxxx","type":"response.output_text.done"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"Artificial intelligence (AI) is the technology and science of simulating human intelligent behavior by using computer systems. xxx","type":"output_text","logprobs":null},"sequence_number":42,"type":"response.content_part.done"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"Artificial intelligence (AI) is the technology and science of simulating human intelligent behavior by using computer systems. It aims to enable machines to perform tasks that typically require human intelligence, such as:\n\n- Learning (for example, training models with data)  \n- Reasoning (for example, logical judgment and problem-solving)  \n- Perception (for example, recognizing images, speech, or text)  \n- Understanding language (for example, natural language processing)  \n- Decision-making (for example, making optimal choices in complex environments)\n\nAI can be divided into weak AI (focused on specific tasks, such as voice assistants and recommendation systems) and strong AI (possessing general, human-like intelligence, which has not yet been achieved).\n\nCurrently, AI is widely used in various fields, including healthcare, finance, transportation, education, and entertainment, and is profoundly changing the way we live and work.","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":0,"sequence_number":43,"type":"response.output_item.done"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"qwen3.7-plus","object":"response","output":[{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"Artificial intelligence (AI) is xxxxxx","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"completed","text":null,"top_logprobs":null,"truncation":null,"usage":{"input_tokens":37,"input_tokens_details":{"cached_tokens":0},"output_tokens":166,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":203},"user":null},"sequence_number":44,"type":"response.completed"}
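
The event sequence above can be reduced to the final text by concatenating the delta of every response.output_text.delta event and reading usage from response.completed. The sketch below assumes one JSON event per line, as printed above; collect_stream is a hypothetical helper, and a raw HTTP stream may wrap each event in server-sent-event framing that would need to be stripped first.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> dict:
    """Rebuild the full text and usage from a sequence of JSON event lines."""
    text_parts = []
    usage = None
    last_seq = -1
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # sequence_number should only increase; gaps are tolerated.
        seq = event.get("sequence_number", last_seq + 1)
        if seq <= last_seq:
            raise ValueError(f"out-of-order event: {seq} after {last_seq}")
        last_seq = seq
        if event["type"] == "response.output_text.delta":
            text_parts.append(event["delta"])
        elif event["type"] == "response.completed":
            usage = event["response"].get("usage")
    return {"text": "".join(text_parts), "usage": usage}


events = [
    '{"type":"response.output_text.delta","delta":"Artificial","sequence_number":4}',
    '{"type":"response.output_text.delta","delta":" intelligence","sequence_number":5}',
    '{"type":"response.completed","sequence_number":44,"response":{"usage":{"total_tokens":203}}}',
]
print(collect_stream(events))
```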

Use built-in tools

Enable built-in tools for complex tasks. The web extractor and code interpreter are free for a limited time. See tool calling for supported tools.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.7-plus",
    input="Find the Alibaba Cloud website and extract key information from the homepage",
    # For best results, enable all the built-in tools.
    tools=[
        {"type": "web_search"},
        {"type": "code_interpreter"},
        {"type": "web_extractor"}
    ],
    reasoning={"effort": "medium"}
)

# Uncomment the following line to see the intermediate output.
# print(response.output)
print(response.output_text)
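
The call above prints only the final text. To see which tools ran, the items in the output array can be walked by type. The sketch below operates on plain dicts (for example, the output list of a parsed JSON response); trace_output is a hypothetical helper, and it only relies on the item types and the action.query field that appear in this page's examples.

```python
def trace_output(output: list[dict]) -> list[str]:
    """Return one short line per output item, in order."""
    lines = []
    for item in output:
        kind = item.get("type")
        if kind == "reasoning":
            lines.append("reasoning")
        elif kind == "web_search_call":
            query = item.get("action", {}).get("query", "")
            lines.append(f"web_search: {query}")
        elif kind == "message":
            text = "".join(part.get("text", "") for part in item.get("content", []))
            lines.append(f"message: {text}")
        else:
            # Other item types (for example, web_extractor_call) are listed by name.
            lines.append(str(kind))
    return lines


sample_output = [
    {"type": "reasoning", "summary": [{"type": "summary_text", "text": "..."}]},
    {"type": "web_search_call", "status": "completed",
     "action": {"type": "search", "query": "Alibaba Cloud official website"}},
    {"type": "web_extractor_call"},
    {"type": "message", "content": [{"type": "output_text", "text": "Done."}]},
]
for line in trace_output(sample_output):
    print(line)
```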

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "Find the Alibaba Cloud website and extract key information from the homepage",
        tools: [
            { type: "web_search" },
            { type: "code_interpreter" },
            { type: "web_extractor" }
        ],
        reasoning: { effort: "medium" }
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("Model is thinking...");
        } else if (item.type === "web_search_call") {
            console.log(`Search query: ${item.action.query}`);
        } else if (item.type === "web_extractor_call") {
            console.log("Extracting web content...");
        } else if (item.type === "message") {
            console.log(`Response: ${item.content[0].text}`);
        }
    }
}

main();

Curl

curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Find the Alibaba Cloud website and extract key information from the homepage",
    "tools": [
        {
            "type": "web_search"
        },
        {
            "type": "code_interpreter"
        },
        {
            "type": "web_extractor"
        }
    ],
    "reasoning": {"effort": "medium"}
}'

Example response

{
    "id": "69258b21-5099-9d09-92e8-8492b1955xxx",
    "object": "response",
    "status": "completed",
    "output": [
        {
            "type": "reasoning",
            "summary": [
                {
                    "type": "summary_text",
                    "text": "The user wants to find the Alibaba Cloud website and extract information..."
                }
            ]
        },
        {
            "type": "web_search_call",
            "status": "completed",
            "action": {
                "query": "Alibaba Cloud official website",
                "type": "search",
                "sources": [
                    {
                        "type": "url",
                        "url": "https://cn.aliyun.com/"
                    },
                    {
                        "type": "url",
                        "url": "https://www.alibabacloud.com/zh"
                    }
                ]
            }
        },
        {
            "type": "reasoning",
            "summary": [
                {
                    "type": "summary_text",
                    "text": "The search results show the Alibaba Cloud website URL..."
                }
            ]
        },
        {
            "type": "web_extractor_call",
            "status": "completed",
            "goal": "Extract key information from the Alibaba Cloud homepage",
            "output": "Tongyi Large Language Model, full product portfolio, AI solutions...",
            "urls": [
                "https://cn.aliyun.com/"
            ]
        },
        {
            "type": "message",
            "role": "assistant",
            "status": "completed",
            "content": [
                {
                    "type": "output_text",
                    "text": "Key information from the Alibaba Cloud website: Tongyi Large Language Model, cloud computing services..."
                }
            ]
        }
    ],
    "usage": {
        "input_tokens": 40836,
        "output_tokens": 2106,
        "total_tokens": 42942,
        "output_tokens_details": {
            "reasoning_tokens": 677
        },
        "x_tools": {
            "web_extractor": {
                "count": 1
            },
            "web_search": {
                "count": 1
            }
        }
    }
}
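
The output array in the example response can also be walked in Python, much like the Node.js loop does. The following is a minimal sketch, assuming the response is available as a plain dict (for example, the parsed JSON body returned by the Curl call) with the shape shown in the example response. The helper names summarize_output and tool_call_counts are hypothetical and not part of any SDK.

```python
def summarize_output(response: dict) -> list[str]:
    """Return one human-readable line per item in response["output"]."""
    lines = []
    for item in response.get("output", []):
        kind = item.get("type")
        if kind == "reasoning":
            lines.append("reasoning")
        elif kind == "web_search_call":
            lines.append(f"web_search_call: {item['action']['query']}")
        elif kind == "web_extractor_call":
            lines.append(f"web_extractor_call: {', '.join(item.get('urls', []))}")
        elif kind == "message":
            # A message can hold several content parts; keep only the text parts.
            text = "".join(
                part["text"] for part in item["content"] if part["type"] == "output_text"
            )
            lines.append(f"message: {text}")
        else:
            lines.append(f"other: {kind}")
    return lines


def tool_call_counts(response: dict) -> dict[str, int]:
    """Read per-tool call counts from usage.x_tools, if present."""
    x_tools = response.get("usage", {}).get("x_tools", {})
    return {name: info.get("count", 0) for name, info in x_tools.items()}


# Trimmed copy of the example response above.
example = {
    "output": [
        {"type": "reasoning", "summary": []},
        {
            "type": "web_search_call",
            "status": "completed",
            "action": {"query": "Alibaba Cloud official website", "type": "search"},
        },
        {
            "type": "web_extractor_call",
            "status": "completed",
            "urls": ["https://cn.aliyun.com/"],
        },
        {
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text", "text": "Key information..."}],
        },
    ],
    "usage": {"x_tools": {"web_extractor": {"count": 1}, "web_search": {"count": 1}}},
}

for line in summarize_output(example):
    print(line)
print(tool_call_counts(example))
```

Unknown item types fall through to the "other" branch, so the loop keeps working if additional tool call types appear in the output.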

Session cache

In multi-turn conversations, enable the session cache to let the server cache conversation context automatically. This reduces latency and costs without manual cache management.

Usage: To enable the session cache, add the header x-dashscope-session-cache: enable to the request. To disable it, set the header value to disable. The default value is disable.

Supported models: qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3-max, qwen3.7-plus, qwen3.6-plus, qwen3.5-plus, qwen3.6-flash, qwen3.5-flash, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash

The session cache requires a minimum prompt length of 1024 tokens and expires after 5 minutes. It shares the same constraints as the explicit cache.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
    # Enable session cache via default_headers
    default_headers={"x-dashscope-session-cache": "enable"}
)

# Construct a long text exceeding 1024 tokens to trigger cache creation.
# (If the initial prompt is under 1024 tokens, the server creates the cache once the total context exceeds this threshold.)
long_context = "Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence." * 50

# First request
response1 = client.responses.create(
    model="qwen3.7-plus",
    input=long_context + "\n\nBased on the background knowledge above, briefly introduce the random forest algorithm in machine learning.",
)
print(f"First response: {response1.output_text}")

# Second request: Link the context using previous_response_id. The server manages the cache automatically.
response2 = client.responses.create(
    model="qwen3.7-plus",
    input="What are the main differences between it and GBDT?",
    previous_response_id=response1.id,
)
print(f"Second response: {response2.output_text}")

# Check the cache hit status
usage = response2.usage
print(f"Input tokens: {usage.input_tokens}")
print(f"Cached tokens: {usage.input_tokens_details.cached_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1",
    // Enable session cache via defaultHeaders
    defaultHeaders: {"x-dashscope-session-cache": "enable"}
});

// Construct a long text exceeding 1024 tokens to trigger cache creation.
// (If the initial prompt is under 1024 tokens, the server creates the cache once the total context exceeds this threshold.)
const longContext = "Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence.".repeat(50);

async function main() {
    // First request
    const response1 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: longContext + "\n\nBased on the background knowledge above, briefly introduce the random forest algorithm in machine learning, including its basic principles and use cases."
    });
    console.log(`First response: ${response1.output_text}`);

    // Second request: Link the context using previous_response_id. The server manages the cache automatically.
    const response2 = await openai.responses.create({
        model: "qwen3.7-plus",
        input: "What are the main differences between it and GBDT?",
        previous_response_id: response1.id
    });
    console.log(`Second response: ${response2.output_text}`);

    // Check the cache hit status
    console.log(`Input tokens: ${response2.usage.input_tokens}`);
    console.log(`Cached tokens: ${response2.usage.input_tokens_details.cached_tokens}`);
}

main();

Curl

# First request
# The 'input' text must exceed 1024 tokens to trigger cache creation.
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence. Artificial intelligence is an important branch of computer science that focuses on the research and development of theories, methods, technologies, and application systems that can simulate, extend, and expand human intelligence.\n\nBased on the background knowledge above, briefly introduce the random forest algorithm in machine learning, including its basic principles and use cases."
}'

# Second request - Set 'previous_response_id' to the 'id' from the first response.
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.7-plus",
    "input": "What are the main differences between it and GBDT?",
    "previous_response_id": "id_from_first_response"
}'
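
To judge whether the session cache is paying off, the share of input tokens served from the cache can be computed from the usage object. The following is a minimal sketch that works on a plain dict using the input_tokens and input_tokens_details.cached_tokens fields read in the examples above. The function name cache_hit_ratio and the sample numbers are hypothetical.

```python
def cache_hit_ratio(usage: dict) -> float:
    """Fraction of input tokens served from the cache (0.0 when unknown)."""
    input_tokens = usage.get("input_tokens", 0)
    details = usage.get("input_tokens_details") or {}
    cached = details.get("cached_tokens", 0)
    if input_tokens <= 0:
        return 0.0
    return cached / input_tokens


# Hypothetical usage values for a second-turn request; real numbers will differ.
second_turn_usage = {
    "input_tokens": 2000,
    "input_tokens_details": {"cached_tokens": 1500},
}
print(f"{cache_hit_ratio(second_turn_usage):.0%}")  # prints 75%
```

A ratio of 0 on the second request usually suggests that the cache was not created, for instance because the context stayed under the 1024-token minimum or the cache had already expired.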

Migrate from Chat Completions API to Responses API

The Responses API simplifies the Chat Completions API interface while maintaining compatibility. To migrate, follow these steps.

1. Update the endpoint address

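Change the endpoint path from /v1/chat/completions to /v1/responses. With the OpenAI SDK, this means calling client.responses.create instead of client.chat.completions.create and passing input in place of messages.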

Python

# Chat Completions API
completion = client.chat.completions.create(
    model="qwen3.7-plus",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(completion.choices[0].message.content)

# Responses API - can use the same message format
response = client.responses.create(
    model="qwen3.7-plus",
    input=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.output_text)

# Responses API - or use a more concise format
response = client.responses.create(
    model="qwen3.7-plus",
    input="Hello!"
)
print(response.output_text)

Node.js

// Chat Completions API
const completion = await client.chat.completions.create({
    model: "qwen3.7-plus",
    messages: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello!" }
    ]
});
console.log(completion.choices[0].message.content);

// Responses API - can use the same message format
const response = await client.responses.create({
    model: "qwen3.7-plus",
    input: [
        { role: "system", content: "You are a helpful assistant." },
        { role: "user", content: "Hello!" }
    ]
});
console.log(response.output_text);

// Responses API - or use a more concise format
const response2 = await client.responses.create({
    model: "qwen3.7-plus",
    input: "Hello!"
});
console.log(response2.output_text);

Curl

# Chat Completions API
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
}'

# Responses API - use a more concise format
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Hello!"
}'
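
When many call sites need migrating, the request body change can be captured in one place. The following is a minimal sketch, assuming requests that use only the model and messages parameters, as in the examples above. The function to_responses_payload is a hypothetical helper, and other Chat Completions parameters may not map one-to-one, so the sketch rejects them rather than guessing.

```python
import copy


def to_responses_payload(chat_payload: dict) -> dict:
    """Convert a minimal Chat Completions request body to a Responses API body.

    Only the "model" and "messages" keys are handled; anything else is
    rejected so that unsupported parameters are not silently passed through.
    """
    unknown = set(chat_payload) - {"model", "messages"}
    if unknown:
        raise ValueError(f"unhandled parameters: {sorted(unknown)}")
    return {
        "model": chat_payload["model"],
        # The Responses API accepts the same role/content message list as "input".
        "input": copy.deepcopy(chat_payload["messages"]),
    }


chat_request = {
    "model": "qwen3.7-plus",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}
responses_request = to_responses_payload(chat_request)
print(responses_request)
```

The resulting dict can be sent as the JSON body of a POST to /compatible-mode/v1/responses, or unpacked into client.responses.create(**responses_request).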

2. Update response handling

The Responses API returns a different response structure. Use the output_text shortcut to retrieve the text output, or access detailed information through the output array.

Response comparison

# Chat Completions Response
{
  "id": "chatcmpl-416b0ea5-e362-9fec-97c5-0a60b5d7xxx",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Hello! I'm happy to see you. How can I help you?",
        "refusal": null,
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1769416269,
  "model": "qwen3.7-plus",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 14,
    "prompt_tokens": 22,
    "total_tokens": 36,
    "prompt_tokens_details": {
      "cached_tokens": 0
    }
  }
}

# Responses API Response
{
  "id": "d69c735d-0f5e-4b6c-9c2a-8cab5eb14xxx",
  "created_at": 1769416269.0,
  "model": "qwen3.7-plus",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "msg_3426d3e5-8da7-4dd8-a6a5-7c2cd866xxx",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! Today is Monday, January 26, 2026. How can I help you?",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 34,
    "output_tokens": 25,
    "total_tokens": 59,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}
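
The two response shapes also name their usage counters differently: prompt_tokens and completion_tokens in Chat Completions versus input_tokens and output_tokens in the Responses API. If existing code reads usage from Chat Completions responses, a small adapter can bridge the two during migration. The following is a minimal sketch that works on plain dicts shaped like the two examples above; normalize_usage and its output keys are hypothetical names.

```python
def normalize_usage(usage: dict) -> dict:
    """Map either usage shape onto a common set of keys."""
    if "input_tokens" in usage:
        # Responses API shape
        details = usage.get("input_tokens_details") or {}
        return {
            "input": usage["input_tokens"],
            "output": usage["output_tokens"],
            "total": usage["total_tokens"],
            "cached": details.get("cached_tokens", 0),
        }
    # Chat Completions shape
    details = usage.get("prompt_tokens_details") or {}
    return {
        "input": usage["prompt_tokens"],
        "output": usage["completion_tokens"],
        "total": usage["total_tokens"],
        "cached": details.get("cached_tokens", 0),
    }


# Usage objects copied from the two example responses above.
chat_usage = {
    "completion_tokens": 14,
    "prompt_tokens": 22,
    "total_tokens": 36,
    "prompt_tokens_details": {"cached_tokens": 0},
}
responses_usage = {
    "input_tokens": 34,
    "output_tokens": 25,
    "total_tokens": 59,
    "input_tokens_details": {"cached_tokens": 0},
    "output_tokens_details": {"reasoning_tokens": 0},
}

print(normalize_usage(chat_usage))
print(normalize_usage(responses_usage))
```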

3. Simplify multi-turn conversations

With the Chat Completions API, you must manually manage the message history array. The Responses API simplifies this process by using the previous_response_id parameter to automatically link conversation context. The response id is valid for 7 days.

Python

# Chat Completions - manual message history management
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"}
]
res1 = client.chat.completions.create(
    model="qwen3.7-plus",
    messages=messages
)

# Manually add response to history
messages.append(res1.choices[0].message)
messages.append({"role": "user", "content": "What is its population?"})

res2 = client.chat.completions.create(
    model="qwen3.7-plus",
    messages=messages
)

# Responses API - automatic linking with previous_response_id
res1 = client.responses.create(
    model="qwen3.7-plus",
    input="What is the capital of France?"
)

# Just pass the previous response ID
res2 = client.responses.create(
    model="qwen3.7-plus",
    input="What is its population?",
    previous_response_id=res1.id
)

Node.js

// Chat Completions - manual message history management
let messages = [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "What is the capital of France?" }
];
const res1 = await client.chat.completions.create({
    model: "qwen3.7-plus",
    messages
});

// Manually add response to history
messages = messages.concat([res1.choices[0].message]);
messages.push({ role: "user", content: "What is its population?" });

const res2 = await client.chat.completions.create({
    model: "qwen3.7-plus",
    messages
});

// Responses API - automatic linking with previous_response_id
// (separate names avoid redeclaring the const variables used above)
const resp1 = await client.responses.create({
    model: "qwen3.7-plus",
    input: "What is the capital of France?"
});

// Just pass the previous response ID
const resp2 = await client.responses.create({
    model: "qwen3.7-plus",
    input: "What is its population?",
    previous_response_id: resp1.id
});
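
For conversations longer than two turns, the id bookkeeping can be wrapped in a small helper. The following is a minimal sketch, not an SDK feature: the Conversation class is hypothetical, and it takes the create function as an argument so that it runs offline with a stand-in here.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional


class Conversation:
    """Chains requests by remembering the id of the last response."""

    def __init__(self, create: Callable[..., Any], model: str) -> None:
        self._create = create
        self._model = model
        self.last_response_id: Optional[str] = None

    def ask(self, text: str) -> Any:
        kwargs = {"model": self._model, "input": text}
        # The first turn has no predecessor, so the parameter is omitted.
        if self.last_response_id is not None:
            kwargs["previous_response_id"] = self.last_response_id
        response = self._create(**kwargs)
        self.last_response_id = response.id
        return response


# Stand-in for client.responses.create so the sketch runs without network access.
calls = []


def fake_create(**kwargs):
    calls.append(kwargs)
    return SimpleNamespace(id=f"resp_{len(calls)}", output_text="(stub)")


chat = Conversation(fake_create, model="qwen3.7-plus")
chat.ask("What is the capital of France?")
chat.ask("What is its population?")
print(calls[1]["previous_response_id"])  # prints resp_1
```

With a real client, the same class would be constructed as Conversation(client.responses.create, model="qwen3.7-plus"). Because a response id is valid for 7 days, a stored last_response_id older than that cannot be used to continue the conversation.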

4. Use built-in tools

The Responses API includes built-in tools. Specify them in the tools parameter. The Code Interpreter and web search tools are free for a limited time. See tool calling.

Python

# Chat Completions - you need to implement tools yourself
def web_search(query):
    # Need to implement web search logic yourself
    import requests
    # Pass the query via params so that spaces and "?" are URL-encoded.
    r = requests.get("https://api.example.com/search", params={"q": query})
    return r.json().get("results", [])

completion = client.chat.completions.create(
    model="qwen3.7-plus",
    messages=[{"role": "user", "content": "Who is the current president of France?"}],
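    # "functions" is deprecated in the OpenAI API; declare function tools via "tools".
    tools=[{
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for information",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"]
            }
        }
    }]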
)

# Responses API - use built-in tools directly
response = client.responses.create(
    model="qwen3.7-plus",
    input="Who is the current president of France?",
    tools=[{"type": "web_search"}]  # Enable web search directly
)
print(response.output_text)

Node.js

// Chat Completions - you need to implement tools yourself
async function web_search(query) {
    const fetch = (await import('node-fetch')).default;
    // Encode the query so that spaces and "?" do not break the URL.
    const res = await fetch(`https://api.example.com/search?q=${encodeURIComponent(query)}`);
    const data = await res.json();
    return data.results;
}

const completion = await client.chat.completions.create({
    model: "qwen3.7-plus",
    messages: [{ role: "user", content: "Who is the current president of France?" }],
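    // "functions" is deprecated in the OpenAI API; declare function tools via "tools".
    tools: [{
        type: "function",
        function: {
            name: "web_search",
            description: "Search the web for information",
            parameters: {
                type: "object",
                properties: { query: { type: "string" } },
                required: ["query"]
            }
        }
    }]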
});

// Responses API - use built-in tools directly
const response = await client.responses.create({
    model: "qwen3.7-plus",
    input: "Who is the current president of France?",
    tools: [{ type: "web_search" }]  // Enable web search directly
});
console.log(response.output_text);

Curl

# Chat Completions - need to implement tools yourself
# Example of calling an external search API
curl https://api.example.com/search \
  -G \
  --data-urlencode "q=current president of France" \
  --data-urlencode "key=$SEARCH_API_KEY"

# Responses API - use built-in tools directly
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.7-plus",
    "input": "Who is the current president of France?",
    "tools": [{"type": "web_search"}]
}'

FAQ

Q: How do I pass multi-turn conversation context?

A: Pass the id from the previous successful model response as the previous_response_id parameter in your next conversation request.

Q: Why can't I print 'output_text'?

A: This attribute is missing in some versions of the OpenAI Python SDK, such as 1.99.2. To resolve this error, update the SDK to the latest version.
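
If upgrading is not immediately possible, the text can be rebuilt from the output array instead. The following is a minimal sketch of such a fallback, assuming the response object exposes output items with the type, content, and text attributes shown in the Responses API example. The function get_output_text is a hypothetical helper, and SimpleNamespace stands in for a real SDK response so that the sketch runs offline.

```python
from types import SimpleNamespace


def get_output_text(response) -> str:
    """Return response.output_text if available, else join the text parts manually."""
    text = getattr(response, "output_text", None)
    if isinstance(text, str):
        return text
    parts = []
    for item in getattr(response, "output", None) or []:
        # Skip reasoning items and tool calls; only messages carry the answer text.
        if getattr(item, "type", None) != "message":
            continue
        for part in getattr(item, "content", None) or []:
            if getattr(part, "type", None) == "output_text":
                parts.append(part.text)
    return "".join(parts)


# Stand-in for a response from an SDK version without the output_text attribute.
stub = SimpleNamespace(
    output=[
        SimpleNamespace(type="reasoning", summary=[]),
        SimpleNamespace(
            type="message",
            content=[SimpleNamespace(type="output_text", text="Hello!")],
        ),
    ]
)
print(get_output_text(stub))  # prints Hello!
```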
