OpenAI Responses API Compatibility

The Qwen models on Alibaba Cloud Model Studio (Bailian) support an OpenAI-compatible Responses API. As the evolution of the Chat Completions API, the Responses API delivers agent-native capabilities in a more concise way.

Advantages over the OpenAI Chat Completions API:

  • Built-in tools: web search, web page extraction, code interpreter, text-to-image search, image-to-image search, and more, for better results on complex tasks. See Calling built-in tools for details.

  • More flexible input: accepts a plain string as model input, and remains compatible with Chat-format message arrays.

  • Simplified context management: pass the previous_response_id of the previous response instead of manually building the full message history array.

For descriptions of the input and output parameters, see the OpenAI Responses API reference.

Prerequisites

You must first obtain an API key and configure it as an environment variable. To call the API through the OpenAI SDK, you also need to install the SDK.

Supported models

qwen3-max, qwen3-max-2026-01-23, qwen3.5-plus, qwen3.5-plus-2026-02-15, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash

Service endpoints

China North 2 (Beijing)

base_url for SDK calls: https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

HTTP endpoint: POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses

Singapore

base_url for SDK calls: https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

HTTP endpoint: POST https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses

Code examples

Basic call

The simplest form of call: send one message and get the model's reply.

Python

import os
from openai import OpenAI

client = OpenAI(
    # If no environment variable is configured, replace the next line with your Model Studio API key: api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-plus",
    input="你能做些什么?"
)

# Get the model reply (uncomment the next line to inspect the full raw response)
# print(response.model_dump_json())
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    // If no environment variable is configured, replace the next line with your Model Studio API key: apiKey: "sk-xxx"
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "你能做些什么?"
    });

    // Get the model reply
    console.log(response.output_text);
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "你能做些什么?"
}'

Response example

The complete response returned by the API:
{
    "created_at": 1771225825,
    "id": "0c842a11-c7d1-45da-b7ec-4e668c389xxx",
    "model": "qwen3.5-plus",
    "object": "response",
    "output": [
        {
            "id": "msg_0bdb8ab9-f1de-4db6-82c8-6c1185b91xxx",
            "summary": [
                {
                    "text": "Thinking Process:\n\n1.  **Analyze the Request ...",
                    "type": "summary_text"
                }
            ],
            "type": "reasoning"
        },
        {
            "content": [
                {
                    "annotations": [],
                    "text": "你好!作为一个人工智能助手,我可以协助你完成多种任务。以下是我的一些主要能力:\n\n1.  **文字创作与编辑**\n    *   撰写邮件、文章、报告、故事或社交媒体文案。\n    *   润色、改写或总结现有的文本内容。\n\n2.  **编程与技术支持**\n    *   编写、调试或解释代码(支持多种编程语言,如 Python、JavaScript、C++ 等)。\n    *   提供技术概念的解释和解决方案建议。\n\n3.  **知识问答与学习**\n    *   回答各领域的问题(我的知识更新至 2026 年)。\n    *   协助学习新概念、制定学习计划或解答习题。\n\n4.  **语言翻译**\n    *   支持多种语言之间的互译,帮助你跨越语言障碍。\n\n5.  **数据分析与整理**\n    *   协助整理信息、提取关键点或进行逻辑分析。\n    *   帮助格式化数据或生成表格结构。\n\n6.  **创意与头脑风暴**\n    *   提供创意灵感、策划方案或建议。\n    *   陪你聊天,提供情感支持或生活建议。\n\n有什么具体需要我帮忙的吗?随时告诉我!",
                    "type": "output_text"
                }
            ],
            "id": "msg_c8bb3db1-d235-44e7-9704-55b584022xxx",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 49,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 1384,
        "output_tokens_details": {
            "reasoning_tokens": 1110
        },
        "total_tokens": 1433,
        "x_details": [
            {
                "input_tokens": 49,
                "output_tokens": 1384,
                "output_tokens_details": {
                    "reasoning_tokens": 1110
                },
                "total_tokens": 1433,
                "x_billing_type": "response_api"
            }
        ]
    }
}

Multi-turn conversation

Link context automatically with the previous_response_id parameter instead of building the message history by hand. A response id stays valid for 7 days.

previous_response_id must be the top-level id of the previous response (f0dbb153-117f-9bbf-8176-5284b47f3xxx, UUID format), not the id of a message inside the output array (msg_56c860c4-3ad8-4a96-8553-d2f94c259xxx).

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

# Round 1
response1 = client.responses.create(
    model="qwen3.5-plus",
    input="我的名字是张三,请记住。"
)
print(f"第一轮回复: {response1.output_text}")

# Round 2: link context via previous_response_id (a response id is valid for 7 days)
response2 = client.responses.create(
    model="qwen3.5-plus",
    input="你还记得我的名字吗?",
    previous_response_id=response1.id
)
print(f"第二轮回复: {response2.output_text}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    // Round 1
    const response1 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "我的名字是张三,请记住。"
    });
    console.log(`Round 1 reply: ${response1.output_text}`);

    // Round 2: link context via previous_response_id (a response id is valid for 7 days)
    const response2 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "你还记得我的名字吗?",
        previous_response_id: response1.id
    });
    console.log(`Round 2 reply: ${response2.output_text}`);
}

main();

curl

# Round 1
curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "我的名字是张三,请记住。"
}'

# Round 2: use the id returned by round 1 as previous_response_id
curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "你还记得我的名字吗?",
    "previous_response_id": "第一轮返回的响应id"
}'

Round 2 response example

{
  "id": "f0dbb153-117f-9bbf-8176-5284b47f3xxx",
  "created_at": 1769169951.0,
  "model": "qwen3.5-plus",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "msg_56c860c4-3ad8-4a96-8553-d2f94c259xxx",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "当然记得,你的名字是张三!",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 73,
    "output_tokens": 10,
    "total_tokens": 83,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}

Note: the round-2 input_tokens is 73 because it includes the round-1 context; the model successfully remembered the name "张三".

Streaming output

Receive generated content in real time via streaming; well suited to long text generation.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

stream = client.responses.create(
    model="qwen3.5-plus",
    input="请简单介绍一下人工智能。",
    stream=True
)

print("开始接收流式输出:")
for event in stream:
    # print(event.model_dump_json())  # 取消注释以查看原始事件响应
    if event.type == 'response.output_text.delta':
        # 实时打印增量文本
        print(event.delta, end='', flush=True)
    elif event.type == 'response.completed':
        print("\n流式输出完成")
        print(f"总 Token 数: {event.response.usage.total_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "请简单介绍一下人工智能。",
        stream: true
    });

    console.log("开始接收流式输出:");
    for await (const event of stream) {
        // console.log(JSON.stringify(event));  // 取消注释以查看原始事件响应
        if (event.type === 'response.output_text.delta') {
            process.stdout.write(event.delta);
        } else if (event.type === 'response.completed') {
            console.log("\n流式输出完成");
            console.log(`总 Token 数: ${event.response.usage.total_tokens}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "请简单介绍一下人工智能。",
    "stream": true
}'

Response example

{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"queued","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":0,"type":"response.created"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"","object":"response","output":[],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"in_progress","text":null,"top_logprobs":null,"truncation":null,"usage":null,"user":null},"sequence_number":1,"type":"response.in_progress"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[],"role":"assistant","status":"in_progress","type":"message"},"output_index":0,"sequence_number":2,"type":"response.output_item.added"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"","type":"output_text","logprobs":null},"sequence_number":3,"type":"response.content_part.added"}
{"content_index":0,"delta":"人工智能","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":4,"type":"response.output_text.delta"}
{"content_index":0,"delta":"(Art","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":5,"type":"response.output_text.delta"}
{"content_index":0,"delta":"ificial Intelligence,","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":6,"type":"response.output_text.delta"}
{"content_index":0,"delta":"简称 AI)","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":7,"type":"response.output_text.delta"}
... (intermediate events omitted) ...
{"content_index":0,"delta":"领域,正在深刻改变我们的","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":38,"type":"response.output_text.delta"}
{"content_index":0,"delta":"生活和工作方式","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":39,"type":"response.output_text.delta"}
{"content_index":0,"delta":"。","item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":40,"type":"response.output_text.delta"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","logprobs":[],"output_index":0,"sequence_number":41,"text":"人工智能(Artificial Intelligence,简称 AI)是指由计算机系统模拟人类智能行为的技术和科学。xxxx","type":"response.output_text.done"}
{"content_index":0,"item_id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","output_index":0,"part":{"annotations":[],"text":"人工智能(Artificial Intelligence,简称 AI)是指由计算机系统模拟人类智能行为的技术和科学。xxx","type":"output_text","logprobs":null},"sequence_number":42,"type":"response.content_part.done"}
{"item":{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"人工智能(Artificial Intelligence,简称 AI)是指由计算机系统模拟人类智能行为的技术和科学。它旨在让机器能够执行通常需要人类智能才能完成的任务,例如:\n\n- **学习**(如通过数据训练模型)  \n- **推理**(如逻辑判断和问题求解)  \n- **感知**(如识别图像、语音或文字)  \n- **理解语言**(如自然语言处理)  \n- **决策**(如在复杂环境中做出最优选择)\n\n人工智能可分为**弱人工智能**(专注于特定任务,如语音助手、推荐系统)和**强人工智能**(具备类似人类的通用智能,目前尚未实现)。\n\n当前,AI 已广泛应用于医疗、金融、交通、教育、娱乐等多个领域,正在深刻改变我们的生活和工作方式。","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"},"output_index":0,"sequence_number":43,"type":"response.output_item.done"}
{"response":{"id":"47a71e7d-868c-4204-9693-ef8ff9058xxx","created_at":1769417481.0,"error":null,"incomplete_details":null,"instructions":null,"metadata":null,"model":"qwen3.5-plus","object":"response","output":[{"id":"msg_16db29d6-c1d3-47d7-9177-0fba81964xxx","content":[{"annotations":[],"text":"人工智能(Artificial Intelligence,简称 AI)是xxxxxx","type":"output_text","logprobs":null}],"role":"assistant","status":"completed","type":"message"}],"parallel_tool_calls":false,"temperature":null,"tool_choice":"auto","tools":[],"top_p":null,"background":null,"completed_at":null,"conversation":null,"max_output_tokens":null,"max_tool_calls":null,"previous_response_id":null,"prompt":null,"prompt_cache_key":null,"prompt_cache_retention":null,"reasoning":null,"safety_identifier":null,"service_tier":null,"status":"completed","text":null,"top_logprobs":null,"truncation":null,"usage":{"input_tokens":37,"input_tokens_details":{"cached_tokens":0},"output_tokens":166,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":203},"user":null},"sequence_number":44,"type":"response.completed"}

Deep thinking

With deep thinking enabled, the model reasons before answering; the reasoning content is returned in output items of type reasoning. Suitable for problems that require complex reasoning.

The thinking_budget parameter for capping the maximum reasoning length is not supported.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-plus",
    input="9.9和9.11谁大?",
    extra_body={"enable_thinking": True}
)

# Process the output items
for item in response.output:
    if item.type == "reasoning":
        print("=== Reasoning ===")
        for summary in item.summary:
            print(summary.text)
    elif item.type == "message":
        print("\n=== Final answer ===")
        print(item.content[0].text)

# Inspect the reasoning token count
print(f"\nReasoning tokens: {response.usage.output_tokens_details.reasoning_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "9.9和9.11谁大?",
        enable_thinking: true
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("=== Reasoning ===");
            for (const summary of item.summary) {
                console.log(summary.text);
            }
        } else if (item.type === "message") {
            console.log("\n=== Final answer ===");
            console.log(item.content[0].text);
        }
    }

    // Inspect the reasoning token count
    console.log(`\nReasoning tokens: ${response.usage.output_tokens_details.reasoning_tokens}`);
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "9.9和9.11谁大?",
    "enable_thinking": true
}'

Response example

{
    "created_at": 1774498317,
    "id": "resp_xxx",
    "model": "qwen3.5-plus",
    "object": "response",
    "output": [
        {
            "id": "msg_xxx",
            "summary": [
                {
                    "text": "Thinking Process:\n\n1.  **Analyze the Request:**\n    *   Question: \"9.9 和 9.11 谁大?\" (Which is larger, 9.9 or 9.11?)\n    *   Context: The user is asking a simple mathematical comparison question.\n    *   Current Date: Thursday, March 26, 2026 (provided in the system prompt).\n    *   Knowledge Cutoff: 2026 (provided in the system prompt).\n\n2.  **Evaluate the Numbers:**\n    *   Number A: 9.9\n    *   Number B: 9.11\n    *   These are decimal numbers.\n    *   Compare the integer part: Both are 9.\n    *   Compare the tenths place (first decimal digit):\n        *   9.9 has 9 in the tenths place.\n        *   9.11 has 1 in the tenths place.\n    *   Since 9 > 1, 9.9 is greater than 9.11.\n\n3.  **Consider Potential Ambiguities:**\n    *   Could this be version numbering? (e.g., software versions). In versioning, 9.11 is often \"newer\" or \"higher\" than 9.9. However, mathematically, 9.9 > 9.11.\n    *   Could this be dates? (September 9th vs September 11th). 11th is later.\n    *   Standard interpretation: Without context, decimal numbers are assumed to be mathematical values.\n    *   Common pitfall: Some people mistakenly treat decimals like whole numbers (where 11 > 9), leading them to think 9.11 > 9.9. This is a known cognitive bias or misconception in elementary math.\n    *   Decision: Provide the mathematical answer clearly, but perhaps acknowledge the versioning context if relevant (though usually, for this specific question, it's a math test). Given the simplicity, stick to the mathematical truth first.\n\n4.  **Formulate the Answer:**\n    *   Direct answer: 9.9 is larger.\n    *   Explanation: Compare place values. 9.9 = 9.90, 9.11 = 9.11. 90 hundredths > 11 hundredths.\n    *   Tone: Helpful, clear.\n\n5.  **Drafting the Response (Internal Monologue/Draft):**\n    *   Okay, mathematically 9.9 is bigger. 9.9 is 9 and 9 tenths. 9.11 is 9 and 11 hundredths (or 1 tenth and 1 hundredth). 9 tenths is more than 1 tenth.\n    *   So, 9.9 > 9.11.\n    *   I should answer in Chinese since the question is in Chinese.\n    *   \"9.9 更大。\" (9.9 is larger.)\n    *   Add a brief explanation to avoid confusion. \"因为 9.9 等于 9.90,而 9.90 大于 9.11。\" (Because 9.9 equals 9.90, and 9.90 is greater than 9.11.)\n\n6.  **Refining based on \"Version Number\" possibility:**\n    *   Sometimes this is a trick question about software versions. In semver, 9.11 > 9.9.\n    *   However, usually, when asked plainly \"which is larger\", it refers to numerical value.\n    *   I will provide the numerical answer as primary, maybe add a note about versions if it feels necessary. But keeping it simple is better. Most likely, this is testing the decimal comparison logic.\n    *   Let's stick to the mathematical value as the default.\n\n7.  **Final Polish:**\n    *   Answer: 9.9 大。\n    *   Reasoning: 比较小数时,先看整数部分,再看十分位。9.9 的十分位是 9,9.11 的十分位是 1。9 > 1,所以 9.9 > 9.11。\n\n8.  **Output Generation:** (Matching the thought process)\n    *   State clearly which is larger.\n    *   Explain why.\n\n    *Self-Correction on Date/Time:* The system prompt mentions the current date is 2026. This doesn't affect the math question, but I should ensure I don't reference anything post-2026 if I were adding context (not needed here).\n\n    *Final Decision:* Just answer the math question directly.\n\n    \"从数值上看,9.9 更大。\" (Numerically, 9.9 is larger.)\n    Explain: 9.9 = 9.90, 9.11 = 9.11. 90 > 11.\n\n    Wait, is there any other interpretation?\n    - Dates? 9/9 vs 9/11. 11th is later.\n    - Versions? 
9.11 is newer.\n    - But \"谁大\" (who is big/larger) usually implies magnitude.\n    - I will answer based on numerical magnitude but briefly mention the version context if it adds value without confusing. Actually, better to be decisive. Numerical value is the standard interpretation for \"who is larger\" with numbers.\n\n    Let's go with the numerical answer.\n\n    **Plan:**\n    1. State 9.9 is larger numerically.\n    2. Explain the decimal comparison.\n    3. (Optional but helpful) Mention that if it's version numbers, 9.11 might be considered \"higher\". But prioritize the math.\n    Actually, for a simple query like this, over-explaining versions might be confusing. I'll stick to the math but acknowledge the common confusion.\n\n    *Revised Plan:*\n    1. Direct answer: 9.9 is larger (mathematically).\n    2. Explanation: Place value.\n    3. Note: Unless it's a version number.\n\n    Let's keep it concise.\n\n    \"在数学数值上,9.9 更大。\"\n    \"因为 9.9 等于 9.90,比 9.11 大。\"\n    \"如果是版本号,则 9.11 版本通常高于 9.9 版本。\"\n    This covers both bases.\n\n    Wait, looking at common AI training, this is often a \"trap\" question to see if the AI treats decimals like integers. The correct mathematical answer is crucial.\n\n    Okay, final output structure:\n    1. Mathematical answer.\n    2. Brief explanation.\n    3. Contextual note (versions).\n\n    Let's write it.\n    \"从数学数值上看,9.9 更大。\"\n    \"比较小数时,先看整数部分(都是 9),再看十分位:9.9 的十分位是 9,而 9.11 的十分位是 1。因为 9 > 1,所以 9.9 > 9.11。\"\n    \"不过,如果是软件版本号,通常 9.11 版会比 9.9 版更新。\"\n\n    This is balanced.\n\n    Wait, I need to check the instruction again. \"Please remember the current actual time...\". This is just system context. I don't need to mention the date in the answer unless relevant. It's not relevant here.\n\n    Okay, ready to generate.\n\n    Wait, one more check. Is there any cultural context? In some contexts, people might say 9.11 is bigger because 11 > 9. This is the misconception I need to correct.\n    So, emphasizing the mathematical rule is key.\n\n    Final check:\n    9.9 = 9 + 9/10 = 9.90\n    9.11 = 9 + 1/10 + 1/100 = 9.11\n    9.90 > 9.11. Correct.\n\n    Response will be in Chinese.\n    \"从数值大小来看,9.9 更大。\"\n    Explain why.\n    Mention version exception.\n    Done.\n",
                    "type": "summary_text"
                }
            ],
            "type": "reasoning"
        },
        {
            "content": [
                {
                    "annotations": [],
                    "text": "从**数学数值**上看,**9.9 更大**。\n\n原因如下:\n比较小数时,先看整数部分,都是 9;再看十分位(小数点后第一位):\n*   9.9 的十分位是 **9**\n*   9.11 的十分位是 **1**\n\n因为 9 大于 1,所以 **9.9 > 9.11**(可以把 9.9 看作 9.90 来比较)。\n\n**注意**:如果是**软件版本号**,通常 9.11 版会比 9.9 版更新(更高),但在纯数字大小比较中,9.9 更大。",
                    "type": "output_text"
                }
            ],
            "id": "msg_xxx",
            "role": "assistant",
            "status": "completed",
            "type": "message"
        }
    ],
    "parallel_tool_calls": false,
    "status": "completed",
    "tool_choice": "auto",
    "tools": [],
    "usage": {
        "input_tokens": 57,
        "input_tokens_details": {
            "cached_tokens": 0
        },
        "output_tokens": 2018,
        "output_tokens_details": {
            "reasoning_tokens": 1861
        },
        "total_tokens": 2075,
        "x_details": [
            {
                "input_tokens": 57,
                "output_tokens": 2018,
                "output_tokens_details": {
                    "reasoning_tokens": 1861
                },
                "total_tokens": 2075,
                "x_billing_type": "response_api"
            }
        ]
    }
}

Calling built-in tools

Enabling built-in tools yields better results on complex tasks. The web page extraction and code interpreter tools are currently free for a limited time; for the supported tools, see Tool calling.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
)

response = client.responses.create(
    model="qwen3.5-plus",
    input="帮我找一下阿里云官网,并提取首页的关键信息",
    # Enabling several built-in tools together is recommended for best results
    tools=[
        {"type": "web_search"},
        {"type": "code_interpreter"},
        {"type": "web_extractor"}
    ],
    extra_body={"enable_thinking": True}
)

# Uncomment the next line to inspect the intermediate process output
# print(response.output)
print(response.output_text)

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "帮我找一下阿里云官网,并提取首页的关键信息",
        tools: [
            { type: "web_search" },
            { type: "code_interpreter" },
            { type: "web_extractor" }
        ],
        enable_thinking: true
    });

    for (const item of response.output) {
        if (item.type === "reasoning") {
            console.log("Model is thinking...");
        } else if (item.type === "web_search_call") {
            console.log(`Search query: ${item.action.query}`);
        } else if (item.type === "web_extractor_call") {
            console.log("Extracting web page content...");
        } else if (item.type === "message") {
            console.log(`Reply: ${item.content[0].text}`);
        }
    }
}

main();

curl

curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "帮我找一下阿里云官网,并提取首页的关键信息",
    "tools": [
        {
            "type": "web_search"
        },
        {
            "type": "code_interpreter"
        },
        {
            "type": "web_extractor"
        }
    ],
    "enable_thinking": true
}'

Response example

{
    "id": "69258b21-5099-9d09-92e8-8492b1955xxx",
    "object": "response",
    "status": "completed",
    "output": [
        {
            "type": "reasoning",
            "summary": [
                {
                    "type": "summary_text",
                    "text": "用户要求找阿里云官网并提取信息..."
                }
            ]
        },
        {
            "type": "web_search_call",
            "status": "completed",
            "action": {
                "query": "阿里云官网",
                "type": "search",
                "sources": [
                    {
                        "type": "url",
                        "url": "https://cn.aliyun.com/"
                    },
                    {
                        "type": "url",
                        "url": "https://www.alibabacloud.com/zh"
                    }
                ]
            }
        },
        {
            "type": "reasoning",
            "summary": [
                {
                    "type": "summary_text",
                    "text": "搜索结果显示阿里云官网URL..."
                }
            ]
        },
        {
            "type": "web_extractor_call",
            "status": "completed",
            "goal": "提取阿里云官网首页的关键信息",
            "output": "通义大模型、完整产品体系、AI解决方案...",
            "urls": [
                "https://cn.aliyun.com/"
            ]
        },
        {
            "type": "message",
            "role": "assistant",
            "status": "completed",
            "content": [
                {
                    "type": "output_text",
                    "text": "阿里云官网关键信息:通义大模型,云计算服务..."
                }
            ]
        }
    ],
    "usage": {
        "input_tokens": 40836,
        "output_tokens": 2106,
        "total_tokens": 42942,
        "output_tokens_details": {
            "reasoning_tokens": 677
        },
        "x_tools": {
            "web_extractor": {
                "count": 1
            },
            "web_search": {
                "count": 1
            }
        }
    }
}

Session cache

Overview

Session cache is a caching mode for multi-turn conversations over the Responses API. Unlike explicit caching, which requires manually adding cache_control markers, session caching is handled automatically on the server side: you toggle it with an HTTP header and call the API as in any normal multi-turn conversation.

When multi-turn conversations use previous_response_id with session cache enabled, the server automatically caches the conversation context, reducing inference latency and cost.

How to use

Add the following header field to the request to toggle session cache:

  • x-dashscope-session-cache: enable: turn session cache on.

  • x-dashscope-session-cache: disable: turn session cache off; implicit caching applies if the model supports it.

With the SDK, pass the header via default_headers (Python) or defaultHeaders (Node.js); with curl, pass it with -H.

Supported models

qwen3-max, qwen3.5-plus, qwen3.5-flash, qwen-plus, qwen-flash, qwen3-coder-plus, qwen3-coder-flash

Session cache applies only to the Responses API (OpenAI-compatible Responses); it does not apply to the Chat Completions API.

Code examples

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
    # Enable session cache via default_headers
    default_headers={"x-dashscope-session-cache": "enable"}
)

# Build a long text over 1024 tokens to ensure cache creation is triggered (if the first request is under 1024 tokens, the cache is created once the accumulated conversation context exceeds 1024 tokens)
long_context = "人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。" * 50

# Round 1
response1 = client.responses.create(
    model="qwen3.5-plus",
    input=long_context + "\n\n基于以上背景知识,请简短介绍机器学习中的随机森林算法。",
)
print(f"第一轮回复: {response1.output_text}")

# Round 2: link context via previous_response_id; caching is handled automatically on the server
response2 = client.responses.create(
    model="qwen3.5-plus",
    input="它和 GBDT 有什么主要区别?",
    previous_response_id=response1.id,
)
print(f"第二轮回复: {response2.output_text}")

# Check the cache hit
usage = response2.usage
print(f"Input tokens: {usage.input_tokens}")
print(f"Cached tokens: {usage.input_tokens_details.cached_tokens}")

Node.js

import OpenAI from "openai";

const openai = new OpenAI({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1",
    // Enable session cache via defaultHeaders
    defaultHeaders: {"x-dashscope-session-cache": "enable"}
});

// Build a long text over 1024 tokens to ensure cache creation is triggered (if the first request is under 1024 tokens, the cache is created once the accumulated conversation context exceeds 1024 tokens)
const longContext = "人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。".repeat(50);

async function main() {
    // Round 1
    const response1 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: longContext + "\n\n基于以上背景知识,请简短介绍机器学习中的随机森林算法,包括基本原理和应用场景。"
    });
    console.log(`Round 1 reply: ${response1.output_text}`);

    // Round 2: link context via previous_response_id; caching is handled automatically on the server
    const response2 = await openai.responses.create({
        model: "qwen3.5-plus",
        input: "它和 GBDT 有什么主要区别?",
        previous_response_id: response1.id
    });
    console.log(`Round 2 reply: ${response2.output_text}`);

    // Check the cache hit
    console.log(`Input tokens: ${response2.usage.input_tokens}`);
    console.log(`Cached tokens: ${response2.usage.input_tokens_details.cached_tokens}`);
}

main();

curl

# Round 1
# Replace input with a long text over 1024 tokens to ensure cache creation is triggered
curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.5-plus",
    "input": "人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。人工智能是计算机科学的一个重要分支,致力于研究和开发能够模拟、延伸和扩展人类智能的理论、方法、技术及应用系统。\n\n基于以上背景知识,请简短介绍机器学习中的随机森林算法,包括基本原理和应用场景。"
}'

# Round 2: use the id returned by round 1 as previous_response_id
curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-H "x-dashscope-session-cache: enable" \
-d '{
    "model": "qwen3.5-plus",
    "input": "它和 GBDT 有什么主要区别?",
    "previous_response_id": "第一轮返回的响应id"
}'

Round 2 response example (cache hit)

In the round-2 response, the usage.input_tokens_details.cached_tokens field reports the number of tokens served from the cache.

{
    "id": "145584fd-3dce-4890-99dc-e3896d7f5a42",
    "created_at": 1772440976.0,
    "error": null,
    "incomplete_details": null,
    "instructions": null,
    "metadata": null,
    "model": "qwen3.5-plus",
    "object": "response",
    "output": [
        {
            "id": "msg_62a4e323-d78c-46c7-8469-2ad50f8af4b1",
            "summary": [
                {
                    "text": "Thinking Process:\n\n1.  **Analyze the Request:**\n    *   User asks for the main differences between Random Forest (RF) and GBDT (Gradient Boosting Decision Tree).\n    *   Context: The previous turn provided a definition of AI (repeated many times) and then a brief introduction to Random Forest.\n    *   Constraint: Keep it concise (implied by \"简短介绍\" in the first turn, though this turn asks for \"主要区别\" - main differences). The tone should be informative and professional.\n    *   Current Date: Monday, March 02, 2026.\n    *   Knowledge Cutoff: 2026.\n\n2.  **Recall Knowledge about RF vs. GBDT:**\n    *   **Bagging vs. Boosting:** RF is Bagging (Parallel), GBDT is Boosting (Sequential).\n    *   **Bias/Variance:** RF reduces variance, GBDT reduces bias.\n    *   **Sample/Feature Selection:** RF uses bootstrap sampling (row sampling) and feature subsampling. GBDT usually uses all samples (weighted) and often all features (though variants like XGBoost/LightGBM introduce feature sampling).\n    *   **Tree Structure:** RF trees are usually deep and independent. GBDT trees are usually shallow (weak learners) and dependent (each corrects the previous).\n    *   **Sensitivity to Outliers:** GBDT is more sensitive.\n    *   **Parallelization:** RF is easily parallelized. GBDT is inherently sequential (though tree construction within a step can be parallelized).\n\n3.  **Draft the Response:**\n    *   Structure: Core difference (Bagging vs. Boosting), Training method, Performance characteristics, Sensitivity.\n    *   Language: Chinese.\n    *   Tone: Professional, clear.\n\n4.  **Refine the Content (Self-Correction/Optimization):**\n    *   Make sure to highlight the *ensemble strategy* (Bagging vs. Boosting) as the fundamental difference.\n    *   Mention parallel vs. sequential training.\n    *   Mention bias/variance trade-off.\n    *   Keep it brief but comprehensive enough for a \"main differences\" question.\n\n5.  **Final Polish (incorporating constraints):**\n    *   Ensure no hallucination.\n    *   Ensure alignment with the 2026 knowledge cutoff (nothing has changed fundamentally in RF vs. GBDT theory).\n    *   Check against the repeated text in the first prompt (ignore the repetition, focus on the actual question).\n\n6.  **Drafting the response (internal monologue):**\n    *   Start with the ensemble type: RF = Bagging, GBDT = Boosting.\n    *   Training: RF = Parallel, independent trees. GBDT = Sequential, dependent trees (fitting residuals).\n    *   Goal: RF = Reduce variance. GBDT = Reduce bias.\n    *   Sensitivity: GBDT more sensitive to outliers.\n    *   Summary table or bullet points for clarity.\n\n7.  **Final Output Generation:** (Matches the drafted points).\n\n    *   **集成方式**:RF 是 Bagging,GBDT 是 Boosting。\n    *   **训练过程**:RF 并行,GBDT 串行。\n    *   **侧重点**:RF 降方差,GBDT 降偏差。\n    *   **异常值**:GBDT 更敏感。\n\n    Let's write it out clearly.\n\n    *   **核心集成策略**:随机森林基于 **Bagging** 思想,树之间相互独立;GBDT 基于 **Boosting** 思想,树之间呈强依赖关系(后一棵树拟合前一棵树的残差)。\n    *   **训练方式**:随机森林支持**并行训练**,效率较高;GBDT 必须**串行训练**,难以并行(尽管工程实现如 XGBoost 做了优化)。\n    *   **模型目标**:随机森林主要降低**方差**(防止过拟合);GBDT 主要降低**偏差**(提升拟合能力)。\n    *   **异常值敏感度**:GBDT 对异常值更敏感,随机森林相对鲁棒。\n\n    Looks good. Keep it concise.\n\n8.  **Final Review:** Does this answer the user's question directly? Yes. Is it accurate? Yes. Is it appropriate for the context? 
Yes.\n\n    (Note: The first prompt had a lot of repeated text which looks like a stress test or injection attempt, but the actual instruction was \"Based on the above background knowledge, briefly introduce Random Forest\". The second prompt is a follow-up. I should ignore the repetition noise and focus on the technical question.)\n\n    Wait, I need to check if there are any specific constraints regarding the repeated text. The user just pasted the same sentence many times. It doesn't change the factual knowledge about RF vs GBDT. I will proceed with the technical answer.\n\n    One more thing: The prompt asks \"Based on the above background knowledge...\" in the first turn. The second turn just asks \"What are the main differences with GBDT?\". I should answer directly.\n\n    Let's formulate the response.\n\n    **Structure:**\n    1.  Ensemble Strategy (Bagging vs Boosting)\n    2.  Training Process (Parallel vs Serial)\n    3.  Bias/Variance Focus\n    4.  Sensitivity\n\n    Ready to write.cw",
                    "type": "summary_text"
                }
            ],
            "type": "reasoning",
            "content": null,
            "encrypted_content": null,
            "status": null
        },
        {
            "id": "msg_560e34a6-1bdf-42ae-993e-590b38249146",
            "content": [
                {
                    "annotations": [],
                    "text": "随机森林(Random Forest)与 GBDT(Gradient Boosting Decision Tree)虽然都是基于决策树的集成算法,但存在以下主要区别:\n\n1.  **集成策略不同**\n    *   **随机森林**:基于 **Bagging** 思想。每棵树独立训练,彼此之间没有依赖关系。\n    *   **GBDT**:基于 **Boosting** 思想。树之间呈强依赖关系,后一棵树旨在拟合前一棵树预测结果的残差(负梯度)。\n\n2.  **训练方式不同**\n    *   **随机森林**:支持**并行训练**,因为树之间独立,计算效率通常较高。\n    *   **GBDT**:必须**串行训练**,因为后一棵树依赖前一棵树的输出,难以天然并行(尽管工程实现如 XGBoost 在特征粒度上做了并行优化)。\n\n3.  **优化目标不同**\n    *   **随机森林**:主要通过平均多个模型来降低**方差**(Variance),防止过拟合,提升稳定性。\n    *   **GBDT**:主要通过逐步修正错误来降低**偏差**(Bias),提升模型的拟合能力和精度。\n\n4.  **对异常值的敏感度**\n    *   **随机森林**:相对鲁棒,对异常值不敏感。\n    *   **GBDT**:对异常值较为敏感,因为异常值会产生较大的残差,影响后续树的拟合方向。\n\n总结来说,随机森林胜在稳定和并行效率,而 GBDT 通常在精度上表现更优,但调参更复杂且训练较慢。",
                    "type": "output_text",
                    "logprobs": null
                }
            ],
            "role": "assistant",
            "status": "completed",
            "type": "message",
            "phase": null
        }
    ],
    "parallel_tool_calls": false,
    "temperature": null,
    "tool_choice": "auto",
    "tools": [],
    "top_p": null,
    "background": null,
    "completed_at": null,
    "conversation": null,
    "max_output_tokens": null,
    "max_tool_calls": null,
    "previous_response_id": null,
    "prompt": null,
    "prompt_cache_key": null,
    "prompt_cache_retention": null,
    "reasoning": null,
    "safety_identifier": null,
    "service_tier": null,
    "status": "completed",
    "text": null,
    "top_logprobs": null,
    "truncation": null,
    "usage": {
        "input_tokens": 1524,
        "input_tokens_details": {
            "cached_tokens": 1305
        },
        "output_tokens": 1534,
        "output_tokens_details": {
            "reasoning_tokens": 1187
        },
        "total_tokens": 3058,
        "x_details": [
            {
                "input_tokens": 1524,
                "output_tokens": 1534,
                "output_tokens_details": {
                    "reasoning_tokens": 1187
                },
                "prompt_tokens_details": {
                    "cache_creation": {
                        "ephemeral_5m_input_tokens": 213
                    },
                    "cache_creation_input_tokens": 213,
                    "cache_type": "ephemeral",
                    "cached_tokens": 1305
                },
                "total_tokens": 3058,
                "x_billing_type": "response_api"
            }
        ]
    },
    "user": null
}

The round-2 input_tokens is 1524, of which cached_tokens is 1305, meaning the round-1 context was served from the cache, effectively reducing inference latency and cost.

Billing

Session cache billing follows the same rules as explicit caching (a worked sketch follows the list):

  • Cache creation: billed at 125% of the standard input token unit price.

  • Cache hit: billed at 10% of the standard input token unit price.

    The number of cache-hit tokens is reported in usage.input_tokens_details.cached_tokens.
  • Other tokens: tokens that neither hit nor create a cache are billed at the standard price.
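
To make the rule concrete, below is a minimal sketch that estimates the input-token cost of a single request from its usage payload. The token counts are taken from the round-2 session-cache example above; unit_price is a hypothetical placeholder, not an actual list price.

def estimate_input_cost(input_tokens: int, cached_tokens: int,
                        cache_creation_tokens: int, unit_price: float) -> float:
    # Tokens that neither hit nor created a cache are billed at the standard price.
    regular_tokens = input_tokens - cached_tokens - cache_creation_tokens
    return (cache_creation_tokens * unit_price * 1.25   # cache creation: 125%
            + cached_tokens * unit_price * 0.10         # cache hit: 10%
            + regular_tokens * unit_price)              # other tokens: 100%

# Figures from the round-2 example: input_tokens=1524, cached_tokens=1305,
# cache_creation_input_tokens=213, with a hypothetical unit_price of 1.0 per token.
print(estimate_input_cost(1524, 1305, 213, unit_price=1.0))  # 402.75, vs. 1524.0 uncached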

Constraints

  • The minimum cacheable prompt length is 1024 tokens.

  • A cache entry is valid for 5 minutes; the timer resets on each hit.

  • Applies only to the Responses API and requires multi-turn conversation via the previous_response_id parameter.

  • Session cache is mutually exclusive with explicit and implicit caching; when it is enabled, the other two modes do not take effect.

Migrating from Chat Completions to the Responses API

If you currently use the OpenAI Chat Completions API, you can migrate to the Responses API with the steps below. The Responses API offers a more concise interface and more powerful capabilities while remaining compatible with Chat Completions.

1. Update the endpoint path and base_url

Both of the following must be updated:

  • Endpoint path: from /v1/chat/completions to /v1/responses

  • base_url:

    • China North 2 (Beijing): from https://dashscope.aliyuncs.com/compatible-mode/v1 to https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

    • Singapore: from https://dashscope-intl.aliyuncs.com/compatible-mode/v1 to https://dashscope-intl.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1

Python

# Chat Completions API
completion = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=[
        {"role": "system", "content": "你是一个有帮助的助手。"},
        {"role": "user", "content": "你好!"}
    ]
)
print(completion.choices[0].message.content)

# Responses API: the same message format also works
response = client.responses.create(
    model="qwen3.5-plus",
    input=[
        {"role": "system", "content": "你是一个有帮助的助手。"},
        {"role": "user", "content": "你好!"}
    ]
)
print(response.output_text)

# Responses API: or use the simpler string form
response = client.responses.create(
    model="qwen3.5-plus",
    input="你好!"
)
print(response.output_text)

Node.js

// Chat Completions API
const completion = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages: [
        { role: "system", content: "你是一个有帮助的助手。" },
        { role: "user", content: "你好!" }
    ]
});
console.log(completion.choices[0].message.content);

// Responses API: the same message format also works
const response = await client.responses.create({
    model: "qwen3.5-plus",
    input: [
        { role: "system", content: "你是一个有帮助的助手。" },
        { role: "user", content: "你好!" }
    ]
});
console.log(response.output_text);

// Responses API: or use the simpler string form
const response2 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "你好!"
});
console.log(response2.output_text);

curl

# Chat Completions API
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "messages": [
        {"role": "system", "content": "你是一个有帮助的助手。"},
        {"role": "user", "content": "你好!"}
    ]
}'

# Responses API: using the simpler string form
curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "你好!"
}'

2. Update response handling

The Responses API response structure differs. Use the output_text convenience property to get the text output, or access the details through the output array (see the sketch after the comparison below).

Response comparison

# Chat Completions response
{
  "id": "chatcmpl-416b0ea5-e362-9fec-97c5-0a60b5d7xxx",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "你好!很高兴见到你~ 有什么我可以帮你的吗?",
        "refusal": null,
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1769416269,
  "model": "qwen3.5-plus",
  "object": "chat.completion",
  "service_tier": null,
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 14,
    "prompt_tokens": 22,
    "total_tokens": 36,
    "prompt_tokens_details": {
      "cached_tokens": 0
    }
  }
}
# Responses API response
{
  "id": "d69c735d-0f5e-4b6c-9c2a-8cab5eb14xxx",
  "created_at": 1769416269.0,
  "model": "qwen3.5-plus",
  "object": "response",
  "status": "completed",
  "output": [
    {
      "id": "msg_3426d3e5-8da7-4dd8-a6a5-7c2cd866xxx",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "你好!今天是2026126日,星期一。有什么我可以帮你的吗?",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 34,
    "output_tokens": 25,
    "total_tokens": 59,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}
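
The field mapping between the two, as a minimal sketch; it assumes the completion and response objects from the SDK examples above:

# Chat Completions: text lives under choices, usage under prompt/completion tokens
text = completion.choices[0].message.content
prompt_tokens = completion.usage.prompt_tokens

# Responses API: output_text is a convenience property over the output array;
# usage renames prompt_tokens/completion_tokens to input_tokens/output_tokens
text = response.output_text
input_tokens = response.usage.input_tokens
output_tokens = response.usage.output_tokens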

3. Simplify multi-turn conversation management

Chat Completions requires manually maintaining the message history array, whereas the Responses API provides the previous_response_id parameter to link context automatically. A response id is valid for 7 days.

Python

# Chat Completions: the message history must be managed manually
messages = [
    {"role": "system", "content": "你是一个有帮助的助手。"},
    {"role": "user", "content": "法国的首都是哪里?"}
]
res1 = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=messages
)

# Manually append the response to the history
messages.append(res1.choices[0].message)
messages.append({"role": "user", "content": "它的人口是多少?"})

res2 = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=messages
)
# Responses API: link context automatically with previous_response_id
res1 = client.responses.create(
    model="qwen3.5-plus",
    input="法国的首都是哪里?"
)

# Just pass the previous turn's id
res2 = client.responses.create(
    model="qwen3.5-plus",
    input="它的人口是多少?",
    previous_response_id=res1.id
)

Node.js

// Chat Completions: the message history must be managed manually
let messages = [
    { role: "system", content: "你是一个有帮助的助手。" },
    { role: "user", content: "法国的首都是哪里?" }
];
const res1 = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages
});

// Manually append the response to the history
messages = messages.concat([res1.choices[0].message]);
messages.push({ role: "user", content: "它的人口是多少?" });

const res2 = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages
});
// Responses API: link context automatically with previous_response_id
const res1 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "法国的首都是哪里?"
});

// Just pass the previous turn's id
const res2 = await client.responses.create({
    model: "qwen3.5-plus",
    input: "它的人口是多少?",
    previous_response_id: res1.id
});

4. Use built-in tools

The Responses API ships with several built-in tools that you don't have to implement yourself; just list them in the tools parameter. The code interpreter and web page extraction tools are currently free for a limited time. See Tool calling for details.

Python

# Chat Completions: you must implement the tool function yourself
def web_search(query):
    # Implement your own web search logic here
    import requests
    r = requests.get(f"https://api.example.com/search?q={query}")
    return r.json().get("results", [])

completion = client.chat.completions.create(
    model="qwen3.5-plus",
    messages=[{"role": "user", "content": "法国现任总统是谁?"}],
    functions=[{
        "name": "web_search",
        "description": "搜索网络信息",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"]
        }
    }]
)
# Responses API: use the built-in tool directly
response = client.responses.create(
    model="qwen3.5-plus",
    input="法国现任总统是谁?",
    tools=[{"type": "web_search"}]  # 直接启用网络搜索
)
print(response.output_text)

Node.js

// Chat Completions: you must implement the tool function yourself
async function web_search(query) {
    const fetch = (await import('node-fetch')).default;
    const res = await fetch(`https://api.example.com/search?q=${query}`);
    const data = await res.json();
    return data.results;
}

const completion = await client.chat.completions.create({
    model: "qwen3.5-plus",
    messages: [{ role: "user", content: "法国现任总统是谁?" }],
    functions: [{
        name: "web_search",
        description: "搜索网络信息",
        parameters: {
            type: "object",
            properties: { query: { type: "string" } },
            required: ["query"]
        }
    }]
});
// Responses API: use the built-in tool directly
const response = await client.responses.create({
    model: "qwen3.5-plus",
    input: "法国现任总统是谁?",
    tools: [{ type: "web_search" }]  // 直接启用网络搜索
});
console.log(response.output_text);

curl

# Chat Completions: you must implement the tool yourself
# Example of calling an external search API
curl https://api.example.com/search \
  -G \
  --data-urlencode "q=法国现任总统" \
  --data-urlencode "key=$SEARCH_API_KEY"
# Responses API: use the built-in tool directly
curl -X POST https://dashscope.aliyuncs.com/api/v2/apps/protocols/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.5-plus",
    "input": "法国现任总统是谁?",
    "tools": [{"type": "web_search"}]
}'

FAQ

Q: How do I pass the context of a multi-turn conversation?

A: When sending a new turn, pass the id returned in the previous successful model response as the previous_response_id parameter.

Q: Why can't output_text be printed?

A: Some versions of the OpenAI Python SDK (e.g. 1.99.2) mistakenly removed this property; update the SDK to the latest version to avoid the error.
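
If upgrading is not possible right away, a minimal fallback sketch (assuming the response object shape shown in the examples above) is to join the text parts of the output array yourself:

def get_output_text(response):
    # Fallback for SDK versions missing response.output_text:
    # join all output_text parts from message items in the output array.
    return "".join(
        part.text
        for item in response.output
        if item.type == "message"
        for part in item.content
        if part.type == "output_text"
    )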