Anthropic-compatible Messages

This topic describes the request and response parameters of the Anthropic-compatible Messages API in Model Studio, with code examples.

To migrate an existing Anthropic application to Model Studio, change three settings:

  • api_key: Replace with the Model Studio API key.

  • base_url: Replace with a Model Studio endpoint listed below.

  • model: Replace with a supported model name, such as qwen3.7-plus.

Important

Model Studio has released a workspace-specific domain for the Singapore region. The new dedicated domain delivers better performance and higher stability for inference requests. We recommend migrating to the new domain:

  • Singapore: from https://dashscope-intl.aliyuncs.com to https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com

{WorkspaceId} is your workspace ID, which can be found on the Workspace Details page in the Model Studio console. The existing domain remains fully functional.

China (Beijing)

SDK base_url: https://dashscope.aliyuncs.com/apps/anthropic

HTTP request URL: POST https://dashscope.aliyuncs.com/apps/anthropic/v1/messages

Singapore

SDK base_url: https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/apps/anthropic

HTTP request URL: POST https://{WorkspaceId}.ap-southeast-1.maas.aliyuncs.com/apps/anthropic/v1/messages

Replace {WorkspaceId} with your actual workspace ID.

US (Virginia)

SDK base_url: https://dashscope-us.aliyuncs.com/apps/anthropic

HTTP request URL: POST https://dashscope-us.aliyuncs.com/apps/anthropic/v1/messages

Germany (Frankfurt)

SDK base_url: https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/apps/anthropic

HTTP request URL: POST https://{WorkspaceId}.eu-central-1.maas.aliyuncs.com/apps/anthropic/v1/messages

Replace {WorkspaceId} with your actual workspace ID.

Japan (Tokyo)

SDK base_url: https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/apps/anthropic

HTTP request URL: POST https://{WorkspaceId}.ap-northeast-1.maas.aliyuncs.com/apps/anthropic/v1/messages

Replace {WorkspaceId} with your actual workspace ID.
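The endpoint list above can be captured in a small lookup helper. The following is a minimal sketch: the hosts and paths are copied from the list, while the region keys (such as "cn-beijing") and the function names are illustrative labels, not identifiers defined by Model Studio.

```python
# Hosts copied from the endpoint list above. The dictionary keys are
# illustrative labels chosen for this sketch, not official region codes.
REGION_HOSTS = {
    "cn-beijing": "dashscope.aliyuncs.com",
    "us-virginia": "dashscope-us.aliyuncs.com",
    "singapore": "{workspace_id}.ap-southeast-1.maas.aliyuncs.com",
    "germany": "{workspace_id}.eu-central-1.maas.aliyuncs.com",
    "japan": "{workspace_id}.ap-northeast-1.maas.aliyuncs.com",
}


def build_base_url(region, workspace_id=None):
    """Return the SDK base_url for a region."""
    host = REGION_HOSTS[region]
    if "{workspace_id}" in host:
        # Workspace-specific domains need the workspace ID filled in.
        if not workspace_id:
            raise ValueError(f"region {region!r} requires a workspace ID")
        host = host.format(workspace_id=workspace_id)
    return f"https://{host}/apps/anthropic"


def build_messages_url(region, workspace_id=None):
    """Return the full HTTP request URL for POST requests."""
    return build_base_url(region, workspace_id) + "/v1/messages"
```

Pass the result of build_base_url to the SDK client, or send raw HTTP requests to the result of build_messages_url.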

Authentication: Pass your Model Studio API key in either the x-api-key header or the Authorization: Bearer header.
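For raw HTTP clients, the two authentication options can be sketched as follows. The helper name auth_headers and its use_bearer flag are hypothetical; only the header names come from the text above.

```python
def auth_headers(api_key, use_bearer=False):
    """Build request headers carrying the Model Studio API key."""
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        # Option 2: standard bearer token in the Authorization header.
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Option 1: the x-api-key header, as used in the curl examples.
        headers["x-api-key"] = api_key
    return headers
```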

Request Body

Basic Call

Python

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/apps/anthropic",
)

message = client.messages.create(
    model="qwen3.7-plus",
    max_tokens=1024,
    system="You are a helpful assistant",
    messages=[
        {
            "role": "user",
            "content": "Who are you?"
        }
    ],
    thinking={"type": "disabled"},
)

print(message.content[0].text)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: "https://dashscope.aliyuncs.com/apps/anthropic",
});

async function main() {
  const message = await anthropic.messages.create({
    model: "qwen3.7-plus",
    max_tokens: 1024,
    system: "You are a helpful assistant",
    messages: [{
      role: "user",
      content: "Who are you?"
    }],
    thinking: { type: "disabled" },
  });

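  // content is a union of block types; narrow to a text block before reading .text
  const first = message.content[0];
  if (first?.type === "text") {
    console.log(first.text);
  }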
}

main().catch(console.error);

curl

curl -X POST "https://dashscope.aliyuncs.com/apps/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DASHSCOPE_API_KEY" \
  -d '{
    "model": "qwen3.7-plus",
    "max_tokens": 1024,
    "system": "You are a helpful assistant",
    "messages": [
        {
            "role": "user",
            "content": "Who are you?"
        }
    ],
    "thinking": {"type": "disabled"}
}'

Streaming

Python

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/apps/anthropic",
)

stream = client.messages.create(
    model="qwen3.7-plus",
    max_tokens=1024,
    stream=True,
    messages=[
        {
            "role": "user",
            "content": "Give a brief introduction to artificial intelligence."
        }
    ],
    thinking={"type": "disabled"},
)

for chunk in stream:
    if chunk.type == "content_block_delta":
        if hasattr(chunk.delta, 'text'):
            print(chunk.delta.text, end="", flush=True)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

async function main() {
  const anthropic = new Anthropic({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/apps/anthropic",
  });

  const stream = await anthropic.messages.create({
    model: "qwen3.7-plus",
    max_tokens: 1024,
    stream: true,
    messages: [{
      role: "user",
      content: "Give a brief introduction to artificial intelligence."
    }],
    thinking: { type: "disabled" },
  });

  for await (const chunk of stream) {
    if (chunk.type === "content_block_delta" && 'text' in chunk.delta) {
      process.stdout.write(chunk.delta.text);
    }
  }
}

main().catch(console.error);

curl

curl -X POST "https://dashscope.aliyuncs.com/apps/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DASHSCOPE_API_KEY" \
  --no-buffer \
  -d '{
    "model": "qwen3.7-plus",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
        {
            "role": "user",
            "content": "Give a brief introduction to artificial intelligence."
        }
    ],
    "thinking": {"type": "disabled"}
}'

Extended Thinking

Python

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/apps/anthropic",
)

stream = client.messages.create(
    model="qwen3.7-plus",
    max_tokens=2048,
    stream=True,
    thinking={
        "type": "enabled",
        "budget_tokens": 1024
    },
    messages=[
        {
            "role": "user",
            "content": "Analyze the future prospects of quantum computing."
        }
    ]
)

for chunk in stream:
    if chunk.type == "content_block_delta":
        if hasattr(chunk.delta, 'thinking'):
            print(chunk.delta.thinking, end="", flush=True)
        elif hasattr(chunk.delta, 'text'):
            print(chunk.delta.text, end="", flush=True)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

async function main() {
  const anthropic = new Anthropic({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/apps/anthropic",
  });

  const stream = await anthropic.messages.create({
    model: "qwen3.7-plus",
    max_tokens: 2048,
    stream: true,
    thinking: { type: "enabled", budget_tokens: 1024 },
    messages: [{
      role: "user",
      content: "Analyze the future prospects of quantum computing."
    }]
  });

  for await (const chunk of stream) {
    if (chunk.type === "content_block_delta") {
      if ('thinking' in chunk.delta) {
        process.stdout.write(chunk.delta.thinking);
      } else if ('text' in chunk.delta) {
        process.stdout.write(chunk.delta.text);
      }
    }
  }
}

main().catch(console.error);

curl

curl -X POST "https://dashscope.aliyuncs.com/apps/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DASHSCOPE_API_KEY" \
  -d '{
    "model": "qwen3.7-plus",
    "max_tokens": 2048,
    "stream": true,
    "thinking": {
        "type": "enabled",
        "budget_tokens": 1024
    },
    "messages": [
        {
            "role": "user",
            "content": "Analyze the future prospects of quantum computing."
        }
    ]
}'

Image Understanding

Python

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/apps/anthropic",
)

stream = client.messages.create(
    model="qwen3.7-plus",
    max_tokens=1024,
    stream=True,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250414/mqqmiy/animal_01.jpg",
                    },
                },
                {
                    "type": "text",
                    "text": "Describe the content of this image."
                },
            ],
        }
    ],
    thinking={"type": "disabled"},
)

for chunk in stream:
    if chunk.type == "content_block_delta":
        if hasattr(chunk.delta, 'text'):
            print(chunk.delta.text, end="", flush=True)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

async function main() {
  const anthropic = new Anthropic({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/apps/anthropic",
  });

  const stream = await anthropic.messages.create({
    model: "qwen3.7-plus",
    max_tokens: 1024,
    stream: true,
    messages: [{
      role: "user",
      content: [
        {
          type: "image",
          source: {
            type: "url",
            url: "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250414/mqqmiy/animal_01.jpg",
          },
        },
        { type: "text", text: "Describe the content of this image." },
      ],
    }],
    thinking: { type: "disabled" },
  });

  for await (const chunk of stream) {
    if (chunk.type === "content_block_delta" && 'text' in chunk.delta) {
      process.stdout.write(chunk.delta.text);
    }
  }
}

main().catch(console.error);

curl

curl -X POST "https://dashscope.aliyuncs.com/apps/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DASHSCOPE_API_KEY" \
  -d '{
    "model": "qwen3.7-plus",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "url",
                        "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250414/mqqmiy/animal_01.jpg"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe the content of this image."
                }
            ]
        }
    ],
    "thinking": {"type": "disabled"}
}'

Video Understanding

Python

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/apps/anthropic",
)

stream = client.messages.create(
    model="qwen3.7-plus",
    max_tokens=1024,
    stream=True,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "video",
                    "source": {
                        "type": "url",
                        "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20251208/zpupby/3e81ef38-98f0-4d55-bbb6-259334ca18d0.mp4",
                    },
                },
                {
                    "type": "text",
                    "text": "Describe the content of this video."
                },
            ],
        }
    ],
    thinking={"type": "disabled"},
)

for chunk in stream:
    if chunk.type == "content_block_delta":
        if hasattr(chunk.delta, 'text'):
            print(chunk.delta.text, end="", flush=True)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

async function main() {
  const anthropic = new Anthropic({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/apps/anthropic",
  });

  const stream = await anthropic.messages.create({
    model: "qwen3.7-plus",
    max_tokens: 1024,
    stream: true,
    messages: [{
      role: "user",
      content: [
        {
          type: "video",
          source: {
            type: "url",
            url: "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20251208/zpupby/3e81ef38-98f0-4d55-bbb6-259334ca18d0.mp4",
          },
        },
        { type: "text", text: "Describe the content of this video." },
      ],
    }],
    thinking: { type: "disabled" },
  });

  for await (const chunk of stream) {
    if (chunk.type === "content_block_delta" && 'text' in chunk.delta) {
      process.stdout.write(chunk.delta.text);
    }
  }
}

main().catch(console.error);

curl

curl -X POST "https://dashscope.aliyuncs.com/apps/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DASHSCOPE_API_KEY" \
  -d '{
    "model": "qwen3.7-plus",
    "max_tokens": 1024,
    "stream": true,
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "type": "video",
                    "source": {
                        "type": "url",
                        "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20251208/zpupby/3e81ef38-98f0-4d55-bbb6-259334ca18d0.mp4"
                    }
                },
                {
                    "type": "text",
                    "text": "Describe the content of this video."
                }
            ]
        }
    ],
    "thinking": {"type": "disabled"}
}'

Function Calling

Python

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/apps/anthropic",
)

tools = [
    {
        "name": "get_weather",
        "description": "Get weather information for a specified city",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["city"]
        }
    }
]

message = client.messages.create(
    model="qwen3.7-plus",
    max_tokens=1024,
    tools=tools,
    messages=[
        {
            "role": "user",
            "content": "What's the weather like in Hangzhou today?"
        }
    ]
)

print(message.content)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

async function main() {
  const anthropic = new Anthropic({
    apiKey: process.env.DASHSCOPE_API_KEY,
    baseURL: "https://dashscope.aliyuncs.com/apps/anthropic",
  });

  const message = await anthropic.messages.create({
    model: "qwen3.7-plus",
    max_tokens: 1024,
    tools: [
      {
        name: "get_weather",
        description: "Get weather information for a specified city",
        input_schema: {
          type: "object",
          properties: {
            city: { type: "string", description: "City name" }
          },
          required: ["city"],
        },
      },
    ],
    messages: [{
      role: "user",
      content: "What's the weather like in Hangzhou today?"
    }],
  });

  console.log(JSON.stringify(message.content, null, 2));
}

main().catch(console.error);

curl

curl -X POST "https://dashscope.aliyuncs.com/apps/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DASHSCOPE_API_KEY" \
  -d '{
    "model": "qwen3.7-plus",
    "max_tokens": 1024,
    "tools": [
        {
            "name": "get_weather",
            "description": "Get weather information for a specified city",
            "input_schema": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name"
                    }
                },
                "required": ["city"]
            }
        }
    ],
    "messages": [
        {
            "role": "user",
            "content": "What's the weather like in Hangzhou today?"
        }
    ]
}'

Prompt Caching

Python

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/apps/anthropic",
)

# Simulate code repository content. Must reach minimum cacheable length (1024 tokens)
long_text_content = "<Your Code Here>" * 400


def get_completion(user_input):
    response = client.messages.create(
        # Choose a model that supports prompt caching
        model="qwen3.7-plus",
        max_tokens=1024,
        system=[
            {
                "type": "text",
                "text": long_text_content,
                # Add cache_control on a text block to mark a cache breakpoint. Can also be placed on content blocks in the messages array
                "cache_control": {"type": "ephemeral"},
            }
        ],
        messages=[
            {"role": "user", "content": user_input},
        ],
    )
    return response


# First request: Create cache
first = get_completion("What does this code do?")
print(f"Cache creation tokens: {first.usage.cache_creation_input_tokens}")
print(f"Cache read tokens: {first.usage.cache_read_input_tokens}")
print("=" * 20)
# Second request: Same long content, different question -> Cache hit
second = get_completion("How can this code be optimized?")
print(f"Cache creation tokens: {second.usage.cache_creation_input_tokens}")
print(f"Cache read tokens: {second.usage.cache_read_input_tokens}")

TypeScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: "https://dashscope.aliyuncs.com/apps/anthropic",
});

// Simulate code repository content. Must reach minimum cacheable length (1024 tokens)
const longTextContent = "<Your Code Here>".repeat(400);

async function getCompletion(userInput: string) {
  return client.messages.create({
    // Choose a model that supports prompt caching
    model: "qwen3.7-plus",
    max_tokens: 1024,
    system: [
      {
        type: "text",
        text: longTextContent,
        // Add cache_control on a text block to mark a cache breakpoint. Can also be placed on content blocks in the messages array
        cache_control: { type: "ephemeral" },
      },
    ],
    messages: [{ role: "user", content: userInput }],
  });
}

// First request: Create cache
const first = await getCompletion("What does this code do?");
console.log(`Cache creation tokens: ${first.usage.cache_creation_input_tokens}`);
console.log(`Cache read tokens: ${first.usage.cache_read_input_tokens}`);
console.log("=".repeat(20));
// Second request: Same long content, different question -> Cache hit
const second = await getCompletion("How can this code be optimized?");
console.log(`Cache creation tokens: ${second.usage.cache_creation_input_tokens}`);
console.log(`Cache read tokens: ${second.usage.cache_read_input_tokens}`);

curl

curl -X POST "https://dashscope.aliyuncs.com/apps/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DASHSCOPE_API_KEY" \
  -d '{
    "model": "qwen3.7-plus",
    "max_tokens": 1024,
    "system": [
      {
        "type": "text",
        "text": "<Place cacheable content here with at least 1024 tokens>",
        "cache_control": {"type": "ephemeral"}
      }
    ],
    "messages": [
      {"role": "user", "content": "What does this code do?"}
    ]
}'

Structured Outputs

Python

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope.aliyuncs.com/apps/anthropic",
)

message = client.messages.create(
    model="deepseek-v4-pro",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Extract key info from this email: John Smith (john@example.com) is interested in the Enterprise plan and wants to schedule a demo for next Tuesday at 2pm."
        }
    ],
    output_config={
        "format": {
            "type": "json_schema",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "email": {"type": "string"},
                    "plan_interest": {"type": "string"},
                    "demo_requested": {"type": "boolean"}
                },
                "required": ["name", "email", "plan_interest", "demo_requested"],
                "additionalProperties": False
            }
        }
    },
)

print(message.content[0].text)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  apiKey: process.env.DASHSCOPE_API_KEY,
  baseURL: "https://dashscope.aliyuncs.com/apps/anthropic",
});

async function main() {
  const message = await anthropic.messages.create({
    model: "deepseek-v4-pro",
    max_tokens: 1024,
    messages: [{
      role: "user",
      content: "Extract key info from this email: John Smith (john@example.com) is interested in the Enterprise plan and wants to schedule a demo for next Tuesday at 2pm."
    }],
    output_config: {
      format: {
        type: "json_schema",
        schema: {
          type: "object",
          properties: {
            name: { type: "string" },
            email: { type: "string" },
            plan_interest: { type: "string" },
            demo_requested: { type: "boolean" }
          },
          required: ["name", "email", "plan_interest", "demo_requested"],
          additionalProperties: false
        }
      }
    }
  });

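  // content is a union of block types; narrow to a text block before reading .text
  const first = message.content[0];
  if (first?.type === "text") {
    console.log(first.text);
  }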
}

main().catch(console.error);

curl

curl -X POST "https://dashscope.aliyuncs.com/apps/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DASHSCOPE_API_KEY" \
  -d '{
    "model": "deepseek-v4-pro",
    "max_tokens": 1024,
    "messages": [
        {
            "role": "user",
            "content": "Extract key info from this email: John Smith (john@example.com) is interested in the Enterprise plan and wants to schedule a demo for next Tuesday at 2pm."
        }
    ],
    "output_config": {
        "format": {
            "type": "json_schema",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "email": {"type": "string"},
                    "plan_interest": {"type": "string"},
                    "demo_requested": {"type": "boolean"}
                },
                "required": ["name", "email", "plan_interest", "demo_requested"],
                "additionalProperties": false
            }
        }
    }
}'

model string (Required)

Model name. Supported models:

Supported Models

Qwen-Max: qwen3.7-max, qwen3.7-max-2026-05-20, qwen3.7-max-2026-06-08, qwen3.6-max-preview, qwen3-max, qwen3-max-2026-01-23, qwen3-max-preview

Qwen-Plus: qwen3.7-plus, qwen3.7-plus-2026-05-26, qwen3.6-plus, qwen3.6-plus-2026-04-02, qwen3.5-plus, qwen3.5-plus-2026-04-20, qwen3.5-plus-2026-02-15, qwen-plus, qwen-plus-latest, qwen-plus-2025-09-11

Qwen-Flash: qwen3.6-flash, qwen3.6-flash-2026-04-16, qwen3.5-flash, qwen3.5-flash-2026-02-23, qwen-flash, qwen-flash-2025-07-28

Qwen-Turbo: qwen-turbo

Qwen-Coder: qwen3-coder-next, qwen3-coder-plus, qwen3-coder-plus-2025-09-23, qwen3-coder-flash

Qwen-VL: qwen3-vl-plus, qwen3-vl-flash, qwen-vl-max, qwen-vl-plus

Qwen Open-Source Models: qwen3.6-27b, qwen3.5-397b-a17b, qwen3.5-122b-a10b, qwen3.5-27b, qwen3.5-35b-a3b

Third-Party Models

deepseek-v4-pro, deepseek-v4-flash, kimi-k2.5, kimi-k2-thinking, glm-5.1, glm-5, glm-4.7, glm-4.6, MiniMax-M2.5, MiniMax-M2.1

max_tokens integer (Required)

Maximum number of tokens for the reply content. If the generated content exceeds this value, generation stops early and stop_reason is max_tokens.

max_tokens does not limit the length of the thinking process. When extended thinking is enabled, the thinking tokens are controlled separately by thinking.budget_tokens.

system string or array (Optional)

System prompt that defines model behavior. system is a top-level parameter — the messages array does not accept a system role.

A string is equivalent to a single block with type="text". Pass an array to mark prompt caching breakpoints.

Properties

type string (Required)

Fixed value: text.

text string (Required)

The system prompt text.

cache_control object (Optional)

Prompt caching breakpoint. On a cache hit, subsequent requests are billed at the cache read rate. Contains only type, fixed to ephemeral.
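The equivalence between the string and array forms of system can be sketched as a small normalizer. The function name system_blocks and its cache_last flag are hypothetical; the block fields follow the properties listed above.

```python
def system_blocks(system, cache_last=False):
    """Normalize `system` to the array form.

    A plain string becomes a single text block. With cache_last=True,
    the final block is marked as a prompt caching breakpoint.
    """
    if isinstance(system, str):
        blocks = [{"type": "text", "text": system}]
    else:
        # Copy each block so the caller's list is not modified.
        blocks = [dict(block) for block in system]
    if cache_last and blocks:
        blocks[-1]["cache_control"] = {"type": "ephemeral"}
    return blocks
```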

messages array (Required)

The message array, arranged in alternating user/assistant turns.

messages array element

role string (Required)

The message role. Valid values: user, assistant.

content string or array (Required)

A plain text string or a structured content array. A string is equivalent to a single content block with type="text".

content array element types

Text

Properties

type string (Required)

Fixed value: text.

text string (Required)

The text content.

cache_control object (Optional)

Prompt caching breakpoint. Contains only type, fixed to ephemeral.

Image (requires vision model)

Properties

type string (Required)

Fixed value: image.

source object (Required)

The source of the image data.

Properties

type string (Required)

Valid values: url (public image URL), base64 (Base64-encoded).

url string

The public URL of the image. Required when type is url.

media_type string

The MIME type of the image, such as image/jpeg. Required when type is base64.

data string

The Base64-encoded image data. Required when type is base64.
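The two image source types can be built as shown in this sketch. The helper names are illustrative; the fields match the properties above, and the Base64 encoding uses the Python standard library.

```python
import base64


def image_block_from_url(url):
    """Image block that references a public image URL."""
    return {"type": "image", "source": {"type": "url", "url": url}}


def image_block_from_bytes(data, media_type="image/jpeg"):
    """Image block that embeds raw image bytes as Base64."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            # The API expects a Base64 string, not raw bytes.
            "data": base64.b64encode(data).decode("ascii"),
        },
    }
```

To send a local file, read it in binary mode and pass the bytes to image_block_from_bytes together with the matching MIME type.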

Video (requires vision model)

Properties

type string (Required)

Fixed value: video.

source object (Required)

The source of the video data.

Properties

type string (Required)

Valid values: url (public video URL), base64 (Base64-encoded).

url string

The public URL of the video. Required when type is url.

media_type string

The MIME type of the video, such as video/mp4. Required when type is base64.

data string

The Base64-encoded video data. Required when type is base64.

Tool use (assistant role; tool call instruction returned by the model)

Properties

type string (Required)

Fixed value: tool_use.

id string (Required)

The unique identifier of the tool call, used to associate the result in a subsequent tool_result.

name string (Required)

The name of the called tool.

input object (Required)

The input parameters of the tool call. The structure is determined by the input_schema of the corresponding tool in tools.

cache_control object (Optional)

Prompt caching breakpoint. Contains only type, fixed to ephemeral. The tool call content participates in the cache prefix.

Tool result (user role; execution result of a tool sent back to the model)

Properties

type string (Required)

Fixed value: tool_result.

tool_use_id string (Required)

Corresponds to the id in the tool_use block.

content string (Required)

Content returned by the tool.

cache_control object (Optional)

Prompt caching breakpoint. Contains only type, fixed to ephemeral.
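The tool_use and tool_result blocks pair up through the id and tool_use_id fields. The sketch below builds the follow-up message list after the model requests a tool call. It works on plain dictionaries; the function name and the results mapping are hypothetical, and it assumes the assistant content can be sent back unchanged.

```python
def tool_result_messages(history, assistant_content, results):
    """Return the message list for the request that follows a tool call.

    history: messages sent in the previous request
    assistant_content: the content array from the response (list of dicts)
    results: maps each tool_use id to the string returned by that tool
    """
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block["id"],  # must match the tool_use id
            "content": results[block["id"]],
        }
        for block in assistant_content
        if block["type"] == "tool_use"
    ]
    return history + [
        {"role": "assistant", "content": assistant_content},
        {"role": "user", "content": tool_results},
    ]
```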

stream boolean (Optional)

Whether to enable streaming. Default value: false.

temperature number (Optional)

Controls the diversity of generated text. Value range: [0, 2). Higher values produce more random results.

Note

This range is different from the official Anthropic range of [0.0, 1.0]. When migrating from Anthropic, verify the value of this parameter.
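A client-side range check can catch out-of-range values before a request is sent. This is a minimal sketch with a hypothetical function name; the bounds come from the range stated above.

```python
def check_temperature(value):
    """Validate temperature against the range [0, 2)."""
    # The upper bound is exclusive: 2 itself is rejected.
    if not 0 <= value < 2:
        raise ValueError(f"temperature must be in [0, 2), got {value}")
    return value
```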

top_p number (Optional)

Nucleus sampling probability threshold.

Both temperature and top_p can control the diversity of generated text. We recommend setting only one of them. For more information, see Overview.

top_k integer (Optional)

Candidate set size during sampling.

stop_sequences array (Optional)

Text sequences that trigger generation to stop. Output ends before the matched sequence.

Note

After a match, the stop_reason in the response is still end_turn, and the response does not include the matched sequence.

thinking object (Optional)

Extended thinking configuration. When enabled, the model reasons before responding, and the response includes thinking-type content blocks. Not all models support thinking mode.

Properties

type string (Required)

Valid values: enabled (enable thinking mode), disabled (disable thinking mode).

budget_tokens integer (Optional)

Maximum number of tokens for the thinking process. Independent of max_tokens: this parameter limits the thinking portion, while max_tokens limits the final reply. A larger budget allows more thorough analysis of complex questions. Takes effect only when type is enabled.

reasoning_effort string (Optional)

Controls the reasoning intensity of the model. Valid values: high, max. Default value: max. Supported models: deepseek-v4-pro, deepseek-v4-flash.

Note

When set to low or medium, it is mapped to high. When set to xhigh, it is mapped to max.
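The mapping described in the note can be written out as a lookup table. This sketch only mirrors the documented behavior on the client side; the names are illustrative, and raising KeyError for values not listed above is an assumption, because the text does not say how the API treats them.

```python
# Documented mapping: low and medium map to high, xhigh maps to max.
EFFORT_ALIASES = {
    "low": "high",
    "medium": "high",
    "high": "high",
    "xhigh": "max",
    "max": "max",
}


def effective_reasoning_effort(value=None):
    """Return the effort level that takes effect for a requested value."""
    if value is None:
        return "max"  # documented default
    # Values outside the table raise KeyError (an assumption of this sketch).
    return EFFORT_ALIASES[value]
```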

tools array (Optional)

Tool definitions for function calling.

tools array element

name string (Required)

The tool name.

description string (Optional)

The description of the tool function.

input_schema object (Required)

The JSON Schema definition of the tool input parameters.

tool_choice object (Optional)

Tool selection strategy:

  • {"type": "auto"}: The model decides whether to call a tool (default).

  • {"type": "any"}: Force the model to call any tool.

  • {"type": "none"}: Prohibit the model from calling tools.

  • {"type": "tool", "name": "tool_name"}: Force the model to call a specified tool.

output_config object (Optional)

Structured output configuration. When enabled, the model returns a JSON string. Behavior varies by model:

  • Strict structured outputs: Available for deepseek and glm series models. The model strictly follows the provided JSON Schema, guaranteeing the same field types and hierarchy.

  • Regular structured outputs: For all other models, schema field constraints are not enforced — the API automatically falls back to a plain JSON mode (only guaranteeing that the output is a valid JSON string). In this fallback mode, the request must satisfy both of the following: (1) the output_config parameter is explicitly provided; (2) the system or messages content contains the keyword "JSON" (case-insensitive). If the keyword "JSON" is missing, the API throws: 'messages' must contain the word 'json' in some form.

Properties

format object (Required)

The output format definition.

Properties

type string (Required)

Fixed value: json_schema.

schema object (Required)

A JSON Schema object that follows the standard JSON Schema specification. The schema should include type (data type), properties (field definitions), required (array of required field names), and additionalProperties (must be set to false).
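In fallback mode, a request without the keyword "JSON" is rejected. A client can check for the keyword before sending, as in this sketch. The function name is hypothetical, and the substring test is only an approximation of the server-side rule, which the text describes as case-insensitive without further detail.

```python
def mentions_json(system, messages):
    """Return True if system or messages text contains "json" in any case."""

    def texts(content):
        # content is either a plain string or an array of content blocks.
        if isinstance(content, str):
            yield content
        else:
            for block in content or []:
                if block.get("type") == "text":
                    yield block.get("text", "")

    parts = list(texts(system))
    for message in messages:
        parts.extend(texts(message["content"]))
    return any("json" in part.lower() for part in parts)
```

If the check fails, add an instruction such as "Reply in JSON." to the system prompt or the user message before sending the request.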

Non-streaming Response

Response Example

{
  "id": "msg_e2898f19-fc0e-4cb3-bd9b-5b7dc4ea3bc9",
  "type": "message",
  "role": "assistant",
  "model": "qwen3.7-plus",
  "content": [
    {
      "type": "thinking",
      "thinking": "Let me analyze this problem...",
      "signature": ""
    },
    {
      "type": "text",
      "text": "Hello! I am Qwen..."
    }
  ],
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 22,
    "output_tokens": 223,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0
  }
}

id string

Unique message identifier.

type string

Fixed value: message.

role string

Fixed value: assistant.

model string

The model used for generation.

content array

The content array.

content array element types

Text

Properties

type string

Fixed value: text.

text string

The text response generated by the model.

Thinking (returned when Extended Thinking is enabled)

Properties

type string

Fixed value: thinking.

thinking string

The model's reasoning before the final response.

signature string

Currently fixed as an empty string.

Tool use (function call scenario)

Properties

type string

Fixed value: tool_use.

id string

Unique tool call identifier, used to match the tool_result.

name string

The name of the called tool.

input object

The input parameters of the tool call.

stop_reason string

Reason generation stopped. Valid values: end_turn (normal completion), max_tokens (token limit reached), tool_use (tool call).

stop_sequence string

Always null.

usage object

Token usage statistics.

Note

In streaming calls, the usage field of the message_start event contains only input_tokens and output_tokens. All four fields are returned in the message_delta event.

Properties

input_tokens integer

Input tokens.

output_tokens integer

Output tokens.

cache_creation_input_tokens integer

Tokens consumed for cache creation.

cache_read_input_tokens integer

Tokens consumed from cache reads.
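Because the content array can hold thinking, text, and tool_use blocks in one response, reading content[0] alone is fragile when extended thinking is enabled. The sketch below groups the blocks of a response by type. It assumes the response has been parsed into a plain dictionary, and the function name is illustrative.

```python
def split_content(message):
    """Group the content blocks of a non-streaming response by type."""
    parts = {"thinking": [], "text": [], "tool_use": []}
    for block in message.get("content", []):
        kind = block.get("type")
        if kind == "thinking":
            parts["thinking"].append(block["thinking"])
        elif kind == "text":
            parts["text"].append(block["text"])
        elif kind == "tool_use":
            parts["tool_use"].append(
                {"id": block["id"], "name": block["name"], "input": block["input"]}
            )
    return parts
```

Callers can then check stop_reason: when it is tool_use, the tool_use list holds the calls to execute.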

Streaming Response

Streaming Response Example

{"type":"message_start","message":{"id":"msg_xxx","type":"message","role":"assistant","model":"qwen3.7-plus","content":[],"usage":{"input_tokens":15,"output_tokens":0}}}
{"type":"content_block_start","index":0,"content_block":{"type":"thinking","thinking":"","signature":""}}
{"type":"content_block_delta","index":0,"delta":{"type":"thinking_delta","thinking":"Here's a thinking process:\n\n1. **Analyze User Input:**\n   - **Topic:** Artificial Intelligence (AI)\n   - **Request:** Give a brief introduction to artificial intelligence."}}
{"type":"content_block_delta","index":0,"delta":{"type":"signature_delta","signature":""}}
{"type":"content_block_stop","index":0}
{"type":"content_block_start","index":1,"content_block":{"type":"text","text":""}}
{"type":"content_block_delta","index":1,"delta":{"type":"text_delta","text":"Artificial intelligence (AI) is an important branch of computer science..."}}
{"type":"content_block_stop","index":1}
{"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"input_tokens":15,"output_tokens":1078,"cache_creation_input_tokens":0,"cache_read_input_tokens":0}}
{"type":"message_stop"}

message_start

First stream event, marks message start.

Properties

type string

Fixed value: message_start.

message object

The initial message object. content is an empty array, and usage contains only input_tokens and output_tokens.

content_block_start

Marks the start of a content block.

Properties

type string

Fixed value: content_block_start.

index integer

0-based index corresponding to position in the content array.

content_block object

The initial object of the content block. The type value is text, thinking, or tool_use. For the tool_use type, the input field is an empty object in this event, and the complete input parameters are assembled from subsequent content_block_delta deltas.

content_block_delta

Incremental content block update. Multiple deltas sent per block.

Properties

type string

Fixed value: content_block_delta.

index integer

The index of the associated content block.

delta object

Delta object. type values:

  • text_delta: Text delta, containing the text field.

  • thinking_delta: Thinking delta, containing the thinking field.

  • signature_delta: Signature delta, containing the signature field (currently fixed as an empty string).

  • input_json_delta: Tool call input parameter delta, containing the partial_json field.

content_block_stop

Marks the end of a content block.

Properties

type string

Fixed value: content_block_stop.

index integer

The index of the ended content block.

message_delta

Sent after all content blocks end. Contains stop reason and final token usage.

Properties

type string

Fixed value: message_delta.

delta object

Contains stop_reason and stop_sequence. For valid values, see the Non-streaming Response section above.

usage object

Complete token usage statistics, including input_tokens, output_tokens, cache_creation_input_tokens, and cache_read_input_tokens.

message_stop

Final event, marks message end.

Properties

type string

Fixed value: message_stop.

In addition, streaming responses periodically send ping events ({"type":"ping"}) to keep the connection alive. Clients can ignore them.
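The event sequence above can be folded back into a complete message. The following sketch works on events that are already parsed into dictionaries, so it does not depend on a specific SDK or HTTP client. The function name accumulate is illustrative. It follows the event descriptions above: deltas are appended per block index, tool_use input is assembled from partial_json fragments, and ping events are ignored.

```python
import json


def accumulate(events):
    """Fold a sequence of streaming events into one message dictionary."""
    blocks = {}        # content blocks keyed by index
    partial_json = {}  # raw tool input fragments keyed by index
    message = {"content": [], "stop_reason": None, "usage": {}}

    for event in events:
        kind = event["type"]
        if kind == "message_start":
            message["usage"].update(event["message"].get("usage", {}))
        elif kind == "content_block_start":
            blocks[event["index"]] = dict(event["content_block"])
            partial_json[event["index"]] = ""
        elif kind == "content_block_delta":
            block = blocks[event["index"]]
            delta = event["delta"]
            if delta["type"] == "text_delta":
                block["text"] += delta["text"]
            elif delta["type"] == "thinking_delta":
                block["thinking"] += delta["thinking"]
            elif delta["type"] == "signature_delta":
                block["signature"] = delta["signature"]
            elif delta["type"] == "input_json_delta":
                partial_json[event["index"]] += delta["partial_json"]
        elif kind == "content_block_stop":
            index = event["index"]
            if blocks[index]["type"] == "tool_use":
                # The fragments form one JSON document once the block ends.
                raw = partial_json[index]
                blocks[index]["input"] = json.loads(raw) if raw else {}
        elif kind == "message_delta":
            message["stop_reason"] = event["delta"].get("stop_reason")
            message["usage"].update(event.get("usage", {}))
        # ping and message_stop carry no content and are skipped.

    message["content"] = [blocks[i] for i in sorted(blocks)]
    return message
```

The usage totals in the result come from the message_delta event, which overrides the partial counts sent in message_start.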