Knowledge retrieval (Alibaba Cloud Model Studio)

Large language models (LLMs) cannot answer questions about private data that was not part of their training data. The knowledge retrieval tool retrieves relevant content from a knowledge base and supplies it to the model, which lets the model generate more accurate and relevant answers.

How to use

You can call the knowledge retrieval feature through the Responses API by adding the file_search tool to the tools parameter and specifying the knowledge base ID in the vector_store_ids parameter.

Before you start, create a knowledge base and obtain its ID. Currently, the vector_store_ids parameter accepts only one knowledge base ID.

# Import dependencies and create a client...
response = client.responses.create(
    model="qwen3.8-max",
    input="Tell me about the Alibaba Cloud Model Studio X1 phone",
    tools=[
        {
            "type": "file_search",
            # Replace this with your knowledge base ID. Only one is currently supported.
            "vector_store_ids": ["your_knowledge_base_id"]
        }
    ]
)

print(response.output_text)

Supported models

Qwen-Max: Qwen3.8-Max series and Qwen3.7-Max series
Qwen-Plus: Qwen3.7-Plus series, Qwen3.6-Plus series, and Qwen3.5-Plus series
Qwen-Flash: Qwen3.7-Flash series, Qwen3.6-Flash series, and Qwen3.5-Flash series
Qwen3.6 open-source series (except qwen3.6-27b)
Qwen3.5 open-source series

This feature can only be called through the Responses API.

Prerequisites

Obtain an API key and configure it as an environment variable.
Create a knowledge base and obtain its ID. You can create a knowledge base in one of the following ways:
- Create in the console: Create a knowledge base on the knowledge base page in the Model Studio console. For more information, see Create and use a knowledge base.
- Create using the API: You can call the API through the Alibaba Cloud Model Studio software development kit (SDK) to create a knowledge base. For more information, see Knowledge Base API Guide.
  
  Before you create a knowledge base using the API, obtain the workspace ID (`workspace_id`) from the Model Studio console. Only two types of knowledge bases are supported: document search and data query. These types are suitable for basic document Q&A pairs and do not support responses with rich text or images.
You can find the knowledge base ID on the knowledge base details page in the Model Studio console.
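
The prerequisites above can be checked in code before any request is sent. The following is a minimal sketch, not part of the Model Studio SDK: the helper name `load_client_settings` and the `env` parameter are hypothetical, and the URL pattern is the China (Beijing) one used in the samples in this document, so other regions need a different host.

```python
import os

# URL pattern taken from the China (Beijing) samples in this document.
# Other regions use different hosts, so treat this as an assumption.
BEIJING_URL_TEMPLATE = (
    "https://{workspace_id}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
)


def load_client_settings(workspace_id, env=None):
    """Return (api_key, base_url), failing early if a prerequisite is missing.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if env is None else env
    api_key = (env.get("DASHSCOPE_API_KEY") or "").strip()
    if not api_key:
        raise RuntimeError("DASHSCOPE_API_KEY is not set")
    if not isinstance(workspace_id, str) or not workspace_id.strip():
        raise ValueError("workspace_id must be a non-empty string")
    base_url = BEIJING_URL_TEMPLATE.format(workspace_id=workspace_id.strip())
    return api_key, base_url
```

The returned pair can be passed to `OpenAI(api_key=..., base_url=...)` in the same way as the hard-coded values in the samples.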

Getting started

Run the following code to call the knowledge retrieval tool through the Responses API. The tool retrieves relevant content from the specified knowledge base to generate a response.

In the sample code, replace the placeholder value in vector_store_ids with your knowledge base ID, and replace {WorkspaceId} in the URL with your workspace ID.

Python

import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured the environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx" (not recommended)
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
)

response = client.responses.create(
    model="qwen3.8-max",
    input="Tell me about the Alibaba Cloud Model Studio X1 phone",
    tools=[
        {
            "type": "file_search",
            # Replace this with your knowledge base ID. Only one is currently supported.
            "vector_store_ids": ["your_knowledge_base_id"]
        }
    ]
)

print("[Model Response]")
print(response.output_text)
print(f"\n[Token Usage] Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}, Total: {response.usage.total_tokens}")

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    // If you have not configured the environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx",
    apiKey: process.env.DASHSCOPE_API_KEY,
    // The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const response = await openai.responses.create({
        model: "qwen3.8-max",
        input: "Tell me about the Alibaba Cloud Model Studio X1 phone",
        tools: [
            {
                type: "file_search",
                // Replace this with your knowledge base ID. Only one is currently supported.
                vector_store_ids: ["your_knowledge_base_id"]
            }
        ]
    });

    console.log("[Model Response]");
    console.log(response.output_text);

    const usage = response.usage;
    console.log(`\n[Token Usage] Input: ${usage.input_tokens}, Output: ${usage.output_tokens}, Total: ${usage.total_tokens}`);
}

main();

curl

# The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.8-max",
    "input": "Tell me about the Alibaba Cloud Model Studio X1 phone",
    "tools": [
        {
            "type": "file_search",
            "vector_store_ids": ["your_knowledge_base_id"]
        }
    ]
}'

Running the code returns a response similar to the following:

[Model Response]
Based on the content in the knowledge base, the key points of this product include the following:

1. **Core features**: The product provides...
2. **Scenarios**: Suitable for...
3. **Technical attributes**: Based on...

...

[Token Usage] Input: 1568, Output: 1724, Total: 3292
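
The printed answer alone does not show whether the knowledge base was actually consulted. One way to check is to look at the types of the items in `response.output`. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and `"file_search_call"` is the item type used by OpenAI's Responses API, so verify that Model Studio returns the same name before relying on it. A stand-in object is used here in place of a real response.

```python
from types import SimpleNamespace


def output_item_types(response):
    """Return the `type` of each item in response.output, in order."""
    items = getattr(response, "output", None) or []
    return [getattr(item, "type", None) for item in items]


def used_tool(response, tool_call_type="file_search_call"):
    # "file_search_call" is the name used by OpenAI's Responses API.
    # Whether Model Studio emits the same item type is an assumption.
    return tool_call_type in output_item_types(response)


# Stand-in for a real response object, for illustration only.
fake_response = SimpleNamespace(
    output=[
        SimpleNamespace(type="file_search_call"),
        SimpleNamespace(type="message"),
    ]
)
print(output_item_types(fake_response))
print(used_tool(fake_response))
```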

Streaming output

The knowledge retrieval tool performs a semantic search in the knowledge base, which can take some time to process. You can enable streaming output to obtain intermediate results in real time.

Python

import os
from openai import OpenAI

client = OpenAI(
    # If you have not configured the environment variable, replace the following line with your Model Studio API key: api_key="sk-xxx" (not recommended)
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
    base_url="https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
)

stream = client.responses.create(
    model="qwen3.8-max",
    input="Tell me about the Alibaba Cloud Model Studio X1 phone",
    tools=[
        {
            "type": "file_search",
            # Replace this with your knowledge base ID. Only one is currently supported.
            "vector_store_ids": ["your_knowledge_base_id"]
        }
    ],
    stream=True
)

for event in stream:
    # The model response starts
    if event.type == "response.content_part.added":
        print("[Model Response]")
    # Print the model response in a stream
    elif event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
    # When the response is complete, print the token usage
    elif event.type == "response.completed":
        usage = event.response.usage
        print(f"\n\n[Token Usage] Input: {usage.input_tokens}, Output: {usage.output_tokens}, Total: {usage.total_tokens}")

Node.js

import OpenAI from "openai";
import process from 'process';

const openai = new OpenAI({
    // If you have not configured the environment variable, replace the following line with your Model Studio API key: apiKey: "sk-xxx",
    apiKey: process.env.DASHSCOPE_API_KEY,
    // The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
    baseURL: "https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1"
});

async function main() {
    const stream = await openai.responses.create({
        model: "qwen3.8-max",
        input: "Tell me about the Alibaba Cloud Model Studio X1 phone",
        tools: [
            {
                type: "file_search",
                // Replace this with your knowledge base ID. Only one is currently supported.
                vector_store_ids: ["your_knowledge_base_id"]
            }
        ],
        stream: true
    });

    for await (const event of stream) {
        // The model response starts
        if (event.type === "response.content_part.added") {
            console.log("[Model Response]");
        }
        // Print the model response in a stream
        else if (event.type === "response.output_text.delta") {
            process.stdout.write(event.delta);
        }
        // When the response is complete, print the token usage
        else if (event.type === "response.completed") {
            const usage = event.response.usage;
            console.log(`\n\n[Token Usage] Input: ${usage.input_tokens}, Output: ${usage.output_tokens}, Total: ${usage.total_tokens}`);
        }
    }
}

main();

curl

# The following URL is for the China (Beijing) region. Replace {WorkspaceId} with your Workspace ID. URLs vary by region.
curl -X POST https://{WorkspaceId}.cn-beijing.maas.aliyuncs.com/compatible-mode/v1/responses \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
    "model": "qwen3.8-max",
    "input": "Tell me about the Alibaba Cloud Model Studio X1 phone",
    "tools": [
        {
            "type": "file_search",
            "vector_store_ids": ["your_knowledge_base_id"]
        }
    ],
    "stream": true
}'
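
If you need the full text and the token usage after the stream ends, rather than only printing deltas, the event loop can be wrapped in a small helper. This is a sketch under stated assumptions: `collect_stream` is a hypothetical name, it relies only on the event types used in the samples above (`response.output_text.delta` and `response.completed`), and the fake events stand in for a real stream.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate streamed text and the final usage from Responses API events."""
    chunks = []
    usage = None
    for event in events:
        if event.type == "response.output_text.delta":
            chunks.append(event.delta)
        elif event.type == "response.completed":
            usage = event.response.usage
        # Other event types are ignored in this sketch.
    return "".join(chunks), usage


# Fake events that mimic the shape of a real stream, for illustration only.
fake_events = [
    SimpleNamespace(type="response.content_part.added"),
    SimpleNamespace(type="response.output_text.delta", delta="Hello, "),
    SimpleNamespace(type="response.output_text.delta", delta="world"),
    SimpleNamespace(
        type="response.completed",
        response=SimpleNamespace(
            usage=SimpleNamespace(input_tokens=10, output_tokens=2, total_tokens=12)
        ),
    ),
]

text, usage = collect_stream(fake_events)
print(text)
print(usage.total_tokens)
```

With a real client, pass the object returned by `client.responses.create(..., stream=True)` as `events`.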

Parameters

The file_search tool supports the following parameters:

| Parameter | Required | Description |
| --- | --- | --- |
| type | Yes | Must be set to "file_search". |
| vector_store_ids | Yes | A list of knowledge base IDs. Currently, you can pass only one knowledge base ID. You can find the knowledge base ID on the knowledge base details page in the Model Studio console or get it when you create a knowledge base using the API. |

Make sure to pass a valid knowledge base ID. If you pass an empty array or an invalid ID, the knowledge retrieval tool does not run and no error is returned. The model instead generates a response from its own knowledge.
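
Because an empty or invalid ID fails silently, it can help to validate the value on the client side before building the tool entry. The following is a minimal sketch with a hypothetical helper name, `build_file_search_tool`. It can only catch obviously wrong input (non-string, empty, or the placeholder from the samples); it cannot confirm that an ID exists in your workspace.

```python
PLACEHOLDER_ID = "your_knowledge_base_id"  # placeholder used in the samples


def build_file_search_tool(knowledge_base_id):
    """Return a file_search tool entry for exactly one knowledge base ID."""
    if not isinstance(knowledge_base_id, str):
        raise TypeError("knowledge_base_id must be a string")
    kb_id = knowledge_base_id.strip()
    if not kb_id or kb_id == PLACEHOLDER_ID:
        raise ValueError(
            "pass a real knowledge base ID, not an empty value or the placeholder"
        )
    # Only one ID is currently supported, so the list always has one element.
    return {"type": "file_search", "vector_store_ids": [kb_id]}
```

The result can be passed directly as `tools=[build_file_search_tool(kb_id)]` in the calls shown earlier.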

Billing

Billing involves the following aspects:

Model call fees: Content retrieved from the knowledge base is added to the prompt. This increases the number of input tokens for the model. You are charged based on the model's standard pricing. For pricing details, see the Model Studio console.
Tool call fees: For more information, see Knowledge base billing.
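
To get a rough sense of how much retrieval adds to the model call fee, you can compare `usage.input_tokens` for the same question with and without the file_search tool. The sketch below is illustrative only: the function names are hypothetical, the baseline token count and the price are made-up placeholders, and tool call fees are not included. Use the actual prices from the Model Studio console.

```python
def retrieval_overhead_tokens(input_tokens_with_tool, input_tokens_without_tool):
    """Input tokens attributable to retrieved content (never negative)."""
    return max(input_tokens_with_tool - input_tokens_without_tool, 0)


def extra_input_cost(overhead_tokens, price_per_1k_input_tokens):
    """Extra model call fee for the retrieved content, in the price's currency."""
    return overhead_tokens / 1000 * price_per_1k_input_tokens


# 1568 is the input token count from the sample output in this document.
# 40 is a hypothetical count for the same question sent without the tool.
overhead = retrieval_overhead_tokens(1568, 40)

# 0.5 per 1,000 input tokens is a placeholder, not a real Model Studio price.
cost = extra_input_cost(overhead, price_per_1k_input_tokens=0.5)
print(overhead, round(cost, 6))
```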