Long context (Qwen-Long)

Qwen-Long handles documents up to 10 million tokens through a file upload and reference mechanism, overcoming standard model context limits.

Note

This document applies only to the Chinese mainland (Beijing) region. To use the model, you must use an API key from the Chinese mainland (Beijing) region.

How to use

Use Qwen-Long in two steps: upload files, then call the API.

  1. File upload and parsing:

    • Upload a file using the API. For details about supported file formats and size limits, see Supported formats.

    • After a successful upload, the system returns a unique file-id for your account and starts parsing. No fees are charged for file upload, storage, or parsing.

  2. API call and billing:

    • When you call the model, reference one or more file-ids in the system message.

    • The model performs inference based on the text content associated with the file-id.

    • For each API call, the number of tokens in the referenced file content is counted as input tokens for that request.

This avoids transferring large files in each request, but note that file tokens are billed per API call.
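
To make the billing rule concrete, here is a minimal sketch of how input tokens add up when the same file is referenced in several calls. The function name and the token figures are illustrative assumptions, not values from the service; actual counts come from the usage field of each response.

```python
def estimate_input_tokens(file_tokens: int, prompt_tokens_per_call: list[int]) -> int:
    """Estimate total input tokens when one file is referenced in every call.

    Hypothetical helper: the file's tokens are counted again for each call,
    in addition to the tokens of that call's own messages.
    """
    return sum(file_tokens + prompt for prompt in prompt_tokens_per_call)


# Illustrative numbers only: a 200,000-token file queried three times.
total = estimate_input_tokens(200_000, [50, 80, 120])
print(total)
```

Because the file's tokens are counted on every call, asking several questions in one request can cost less than spreading them across many requests.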

Getting started

Prerequisites

Upload a document

This example uploads Model_Studio_Phone_Product_Introduction.docx to Model Studio's secure storage via the OpenAI-compatible interface and gets a file-id. See the API documentation for upload parameters.

Python

import os
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If not configured, replace with your API key.
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # Enter the base_url of the DashScope service.
)

file_object = client.files.create(file=Path("Model_Studio_Phone_Product_Introduction.docx"), purpose="file-extract")
print(file_object.id)

Java

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.files.*;

import java.nio.file.Path;
import java.nio.file.Paths;

public class Main {
    public static void main(String[] args) {
        // Create a client and use the API key from the environment variable.
        OpenAIClient client = OpenAIOkHttpClient.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();
        // Set the file path. Modify the path and filename as needed.
        Path filePath = Paths.get("src/main/java/org/example/Model_Studio_Phone_Product_Introduction.docx");
        // Create file upload parameters.
        FileCreateParams fileParams = FileCreateParams.builder()
                .file(filePath)
                .purpose(FilePurpose.of("file-extract"))
                .build();

        // Upload the file and print the file-id.
        FileObject fileObject = client.files().create(fileParams);
        System.out.println(fileObject.id());
    }
}

curl

curl --location --request POST 'https://dashscope.aliyuncs.com/compatible-mode/v1/files' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --form 'file=@"Model_Studio_Phone_Product_Introduction.docx"' \
  --form 'purpose="file-extract"'

Run the code to obtain the file-id for the uploaded file.

Pass information and chat using a file ID

Pass the file-id in a system message. The first system message defines the role, the second contains the file-id in the form fileid://{FILE_ID}, and the user message carries the question.

Longer documents need more parsing time. Wait for parsing to complete before calling.
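
One way to handle the wait is to retry the first call with a delay until it succeeds. The sketch below is a generic retry helper, not part of any official SDK: call_when_ready and its parameters are hypothetical names. This section does not specify how the service signals that parsing is still in progress, so the exception types to retry on are supplied by the caller.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_when_ready(
    request: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 5,
    delay_seconds: float = 2.0,
) -> T:
    """Call `request` until it succeeds or `attempts` is exhausted.

    Hypothetical helper: it retries only on the exception types listed in
    `retry_on` and re-raises the last error if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
    # Reached only when attempts < 1, because the loop body never runs.
    raise ValueError("attempts must be at least 1")
```

In practice, request would wrap a client.chat.completions.create(...) call like the ones in this section, and retry_on could be (BadRequestError,), the exception those examples catch. Retrying on that type also retries genuinely invalid requests, so keep attempts small.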

Python

import os
from openai import OpenAI, BadRequestError

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If not configured, replace with your API key.
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # Enter the base_url of the DashScope service.
)
# Replace with the file-id returned by the upload step.
FILE_ID = "file-fe-xxx"

try:
    # Initialize messages list.
    completion = client.chat.completions.create(
        model="qwen-long",
        messages=[
            # sys1: Role definition.
            {'role': 'system', 'content': 'You are a helpful assistant.'},
            # sys2: Document content (plain text or file-id).
            {'role': 'system', 'content': f'fileid://{FILE_ID}'},
            # When the request includes a second system message, the user message content is limited to 9,000 tokens.
            {'role': 'user', 'content': 'What is this article about?'}
        ],
        # All examples use streaming output to show the model's response process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
        stream=True,
        stream_options={"include_usage": True}
    )

    full_content = ""
    for chunk in completion:
        if chunk.choices and chunk.choices[0].delta.content:
            # Concatenate the output content.
            full_content += chunk.choices[0].delta.content
            print(chunk.model_dump())
        
        # Get token usage.
        if chunk.usage:
            print(f"Total tokens: {chunk.usage.total_tokens}")

    print(full_content)

except BadRequestError as e:
    print(f"Error: {e}")
    print("See documentation: https://help.aliyun.com/en/model-studio/developer-reference/error-code")

Java

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;
import com.openai.models.chat.completions.*;

public class Main {
    public static void main(String[] args) {
        // Create a client and use the API key from the environment variable.
        OpenAIClient client = OpenAIOkHttpClient.builder()
                // If you have not configured the environment variable, replace the following line with: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();

        // Create a chat request.
        ChatCompletionCreateParams chatParams = ChatCompletionCreateParams.builder()
                //sys1: Role definition.
                .addSystemMessage("You are a helpful assistant.")
                //sys2: Document content (plain text or file-id).
                //Replace '{FILE_ID}' with the file-id used in your conversation.
                .addSystemMessage("fileid://{FILE_ID}")
                //When the request includes a second system message, the user message content is limited to 9,000 tokens.
                .addUserMessage("What is this article about?")
                .model("qwen-long")
                .build();

        StringBuilder fullResponse = new StringBuilder();

        // All examples use streaming output to show the model's response process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
        try (StreamResponse<ChatCompletionChunk> streamResponse = client.chat().completions().createStreaming(chatParams)) {
            streamResponse.stream().forEach(chunk -> {
                // Print and concatenate the content of each chunk.
                System.out.println(chunk);
                String content = chunk.choices().get(0).delta().content().orElse("");
                if (!content.isEmpty()) {
                    fullResponse.append(content);
                }
            });
            System.out.println(fullResponse);
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            System.err.println("See documentation: https://help.aliyun.com/en/model-studio/error-code");
        }
    }
}

curl

curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-long",
    "messages": [
        {"role": "system","content": "You are a helpful assistant."},
        {"role": "system","content": "fileid://file-fe-xxx"},
        {"role": "user","content": "What is this article about?"}
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'

Pass multiple documents

Pass multiple file-ids in a single system message, separated by commas, or add a separate system message for each document.
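
Both layouts can be produced by a small helper. This is a sketch only: build_file_messages and its separate flag are hypothetical names, and the file-ids are placeholders in the file-fe-xxx style used by the curl examples.

```python
def build_file_messages(file_ids, question, separate=False,
                        role_prompt="You are a helpful assistant."):
    """Build a messages list that references one or more uploaded files.

    Hypothetical helper. With separate=False, all file-ids go into one
    comma-separated system message; with separate=True, each file-id gets
    its own system message.
    """
    if not file_ids:
        raise ValueError("file_ids must contain at least one file-id")

    messages = [{"role": "system", "content": role_prompt}]
    refs = [f"fileid://{file_id}" for file_id in file_ids]
    if separate:
        messages.extend({"role": "system", "content": ref} for ref in refs)
    else:
        messages.append({"role": "system", "content": ",".join(refs)})
    messages.append({"role": "user", "content": question})
    return messages


# Placeholder file-ids; replace them with the ids returned by the upload step.
for message in build_file_messages(["file-fe-xxx1", "file-fe-xxx2"],
                                   "What are these two articles about?"):
    print(message)
```

The returned list can be passed as the messages argument of a chat completion request.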

Pass multiple documents

Python

import os
from openai import OpenAI, BadRequestError

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If not configured, replace with your API key.
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # Enter the base_url of the DashScope service.
)
# Replace with the file-ids returned by the upload step.
FILE_ID1 = "file-fe-xxx1"
FILE_ID2 = "file-fe-xxx2"

try:
    # Initialize messages list.
    completion = client.chat.completions.create(
        model="qwen-long",
        messages=[
            {'role': 'system', 'content': 'You are a helpful assistant.'},
            # Multiple file-ids in one system message, separated by commas.
            {'role': 'system', 'content': f"fileid://{FILE_ID1},fileid://{FILE_ID2}"},
            {'role': 'user', 'content': 'What are these articles about?'}
        ],
        # All examples use streaming output to show the model's response process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
        stream=True,
        stream_options={"include_usage": True}
    )
    
    full_content = ""
    for chunk in completion:
        if chunk.choices and chunk.choices[0].delta.content:
            # Concatenate the output content.
            full_content += chunk.choices[0].delta.content
            print(chunk.model_dump())
        
        # Get token usage.
        if chunk.usage:
            print(f"Total tokens: {chunk.usage.total_tokens}")
    
    print(full_content)

except BadRequestError as e:
    print(f"Error: {e}")
    print("See documentation: https://help.aliyun.com/en/model-studio/developer-reference/error-code")

Java

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;
import com.openai.models.chat.completions.*;

public class Main {
    public static void main(String[] args) {
        // Create a client and use the API key from the environment variable.
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();

        // Create a chat request.
        ChatCompletionCreateParams chatParams = ChatCompletionCreateParams.builder()
                .addSystemMessage("You are a helpful assistant.")
                //Replace '{FILE_ID1}' and '{FILE_ID2}' with the file-ids used in your conversation.
                .addSystemMessage("fileid://{FILE_ID1},fileid://{FILE_ID2}")
                .addUserMessage("What are these two articles about?")
                .model("qwen-long")
                .build();

        StringBuilder fullResponse = new StringBuilder();

        // All examples use streaming output to show the model's response process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
        try (StreamResponse<ChatCompletionChunk> streamResponse = client.chat().completions().createStreaming(chatParams)) {
            streamResponse.stream().forEach(chunk -> {
                // The content of each chunk.
                System.out.println(chunk);
                String content = chunk.choices().get(0).delta().content().orElse("");
                if (!content.isEmpty()) {
                    fullResponse.append(content);
                }
            });
            System.out.println("\nFull response content:");
            System.out.println(fullResponse);
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            System.err.println("See documentation: https://help.aliyun.com/en/model-studio/error-code");
        }
    }
}

curl

curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-long",
    "messages": [
        {"role": "system","content": "You are a helpful assistant."},
        {"role": "system","content": "fileid://file-fe-xxx1"},
        {"role": "system","content": "fileid://file-fe-xxx2"},
        {"role": "user","content": "What are these two articles about?"}
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'

Append document

Python

import os
from openai import OpenAI, BadRequestError

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If not configured, replace with your API key.
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # Enter the base_url of the DashScope service.
)
# Replace with the file-id returned by the upload step.
FILE_ID1 = "file-fe-xxx1"

# Initialize the messages list.
messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'system', 'content': f'fileid://{FILE_ID1}'},
    {'role': 'user', 'content': 'What is this article about?'}
]

try:
    # First-round response
    completion_1 = client.chat.completions.create(
        model="qwen-long",
        messages=messages,
        stream=False
    )
    # Print first-round response.
    # To stream: set stream=True, concatenate segments, and pass to assistant_message content.
    print(f"First-round response: {completion_1.choices[0].message.model_dump()}")
except BadRequestError as e:
    print(f"Error: {e}")
    print("See documentation: https://help.aliyun.com/en/model-studio/error-code")
    # Stop here: completion_1 is undefined if the first request failed.
    raise SystemExit(1)

# Construct the assistant_message.
assistant_message = {
    "role": "assistant",
    "content": completion_1.choices[0].message.content}

# Add assistant_message to messages.
messages.append(assistant_message)

# Add the file-id of the appended document to messages.
# Replace with the file-id of the document you want to append.
FILE_ID2 = "file-fe-xxx2"
system_message = {'role': 'system', 'content': f'fileid://{FILE_ID2}'}
messages.append(system_message)

# Add the user's question.
messages.append({'role': 'user', 'content': 'What are the similarities and differences between the methods discussed in these two articles?'})

# Response after appending the document.
completion_2 = client.chat.completions.create(
    model="qwen-long",
    messages=messages,
    # All code examples use streaming output to clearly and intuitively show the model output process. For non-streaming output examples, see https://help.aliyun.com/en/model-studio/text-generation
    stream=True,
    stream_options={
        "include_usage": True
    }
)

# Stream and print the response after appending the document.
print("Response after appending the document:")
for chunk in completion_2:
    print(chunk.model_dump())

Java

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.chat.completions.*;
import com.openai.core.http.StreamResponse;

import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();
        // Initialize messages list.
        List<ChatCompletionMessageParam> messages = new ArrayList<>();
        
        // Add information for role setting.
        ChatCompletionSystemMessageParam roleSet = ChatCompletionSystemMessageParam.builder()
                .content("You are a helpful assistant.")
                .build();
        messages.add(ChatCompletionMessageParam.ofSystem(roleSet));
        
        // Replace '{FILE_ID1}' with the file-id used in your conversation.
        ChatCompletionSystemMessageParam systemMsg1 = ChatCompletionSystemMessageParam.builder()
                .content("fileid://{FILE_ID1}")
                .build();
        messages.add(ChatCompletionMessageParam.ofSystem(systemMsg1));

        // User question message (USER role).
        ChatCompletionUserMessageParam userMsg1 = ChatCompletionUserMessageParam.builder()
                .content("Please summarize the article content.")
                .build();
        messages.add(ChatCompletionMessageParam.ofUser(userMsg1));

        // Construct the first-round request and handle exceptions.
        ChatCompletion completion1;
        try {
            completion1 = client.chat().completions().create(
                    ChatCompletionCreateParams.builder()
                            .model("qwen-long")
                            .messages(messages)
                            .build()
            );
        } catch (Exception e) {
            System.err.println("Request error. See error code page:");
            System.err.println("https://help.aliyun.com/en/model-studio/error-code");
            System.err.println("Error details: " + e.getMessage());
            e.printStackTrace(); 
            return; 
        }

        // First-round response.
        String firstResponse = completion1 != null ? completion1.choices().get(0).message().content().orElse("") : "";
        System.out.println("First-round response: " + firstResponse);

        // Construct AssistantMessage.
        ChatCompletionAssistantMessageParam assistantMsg = ChatCompletionAssistantMessageParam.builder()
                .content(firstResponse)
                .build();
        messages.add(ChatCompletionMessageParam.ofAssistant(assistantMsg));

        // Replace '{FILE_ID2}' with the file-id used in your conversation.
        ChatCompletionSystemMessageParam systemMsg2 = ChatCompletionSystemMessageParam.builder()
                .content("fileid://{FILE_ID2}")
                .build();
        messages.add(ChatCompletionMessageParam.ofSystem(systemMsg2));

        // Second-round user question (USER role).
        ChatCompletionUserMessageParam userMsg2 = ChatCompletionUserMessageParam.builder()
                .content("Please compare the structural differences between the two articles.")
                .build();
        messages.add(ChatCompletionMessageParam.ofUser(userMsg2));

        // All examples use streaming output to show the model's response process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
        StringBuilder fullResponse = new StringBuilder();
        try (StreamResponse<ChatCompletionChunk> streamResponse = client.chat().completions().createStreaming(
                ChatCompletionCreateParams.builder()
                        .model("qwen-long")
                        .messages(messages)
                        .build())) {

            streamResponse.stream().forEach(chunk -> {
                String content = chunk.choices().get(0).delta().content().orElse("");
                if (!content.isEmpty()) {
                    fullResponse.append(content);
                }
            });
            System.out.println("\nFinal response:");
            System.out.println(fullResponse.toString().trim());
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            System.err.println("See documentation: https://help.aliyun.com/en/model-studio/error-code");
        }
    }
}

curl

curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-long",
    "messages": [
        {"role": "system","content": "You are a helpful assistant."},
        {"role": "system","content": "fileid://file-fe-xxx1"},
        {"role": "user","content": "What is this article about?"},
        {"role": "system","content": "fileid://file-fe-xxx2"},
        {"role": "user","content": "What are the similarities and differences between the methods discussed in these two articles?"}
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'

Pass information as plain text

Instead of using file-ids, you can pass document content directly as a string. Put the role definition in the first system message and the document content in a separate system message so that the model does not confuse the two.

Because the API limits the size of the request body, use a file ID instead if the document content exceeds 1 million tokens.
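
The choice between the two approaches can be expressed as a simple threshold check. In this sketch, document_system_message is a hypothetical helper, and the token count is supplied by the caller because this section does not describe how to count tokens locally.

```python
PLAIN_TEXT_TOKEN_LIMIT = 1_000_000  # Plain-text threshold stated in this section.


def document_system_message(text, token_count, file_id=None):
    """Return the system message that carries a document.

    Hypothetical helper: documents at or under the limit are passed as
    plain text; larger ones must be referenced by file-id.
    """
    if token_count <= PLAIN_TEXT_TOKEN_LIMIT:
        return {"role": "system", "content": text}
    if file_id is None:
        raise ValueError(
            "document exceeds the plain-text limit; upload it and pass file_id"
        )
    return {"role": "system", "content": f"fileid://{file_id}"}


# Illustrative call with a short document and a made-up token count.
print(document_system_message("Alibaba Cloud Model Studio X1 ...", 1_200))
```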

Simple example

Pass the document content directly in a system message.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # Replace your API key here if you haven't set the environment variable
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # Set the DashScope service base_url
)
# Initialize the messages list
completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'system', 'content': 'Alibaba Cloud Model Studio smartphone product introduction: Alibaba Cloud Model Studio X1 —————— Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen...'},
        {'role': 'user', 'content': 'What does the article talk about?'}
    ],
    # All code examples use streaming output to clearly and intuitively show the model's output process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
    stream=True,
    stream_options={"include_usage": True}
)

full_content = ""
for chunk in completion:
    if chunk.choices and chunk.choices[0].delta.content:
        # Append output content
        full_content += chunk.choices[0].delta.content
        print(chunk.model_dump())

print(full_content)

Java

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;
import com.openai.models.chat.completions.*;


public class Main {
    public static void main(String[] args) {
        // Create a client using the API key from the environment variable
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();

        // Create a chat request
        ChatCompletionCreateParams chatParams = ChatCompletionCreateParams.builder()
                .addSystemMessage("You are a helpful assistant.")
                .addSystemMessage("Alibaba Cloud Model Studio smartphone product introduction: Alibaba Cloud Model Studio X1 —————— Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen...")
                .addUserMessage("What does this article talk about?")
                .model("qwen-long")
                .build();

        StringBuilder fullResponse = new StringBuilder();

        // All examples use streaming output to show the model's response process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
        try (StreamResponse<ChatCompletionChunk> streamResponse = client.chat().completions().createStreaming(chatParams)) {
            streamResponse.stream().forEach(chunk -> {
                // Print and append each chunk's content
                System.out.println(chunk);
                String content = chunk.choices().get(0).delta().content().orElse("");
                if (!content.isEmpty()) {
                    fullResponse.append(content);
                }
            });
            System.out.println(fullResponse);
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            System.err.println("For more information, see https://help.aliyun.com/en/model-studio/error-code");
        }
    }
}

curl

curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-long",
    "messages": [
        {"role": "system","content": "You are a helpful assistant."},
        {"role": "system","content": "Alibaba Cloud Model Studio X1 —— Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen with a 120Hz refresh rate, ..."},
        {"role": "user","content": "What does this article talk about?"}
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'

Pass multiple documents

Place each document's content in a separate system message.

Python

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # Replace your API key here if you haven't set the environment variable
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # Set the DashScope service base_url
)
# Initialize the messages list
completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {'role': 'system', 'content': 'You are a helpful assistant.'},
        {'role': 'system', 'content': 'Alibaba Cloud Model Studio X1————Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen with a 120Hz refresh rate...'},
        {'role': 'system', 'content': 'Stardust S9 Pro —— A revolutionary visual feast: breakthrough 6.9-inch 1440 x 3088 pixel under-display camera design...'},
        {'role': 'user', 'content': 'What are the similarities and differences between the products discussed in these two articles?'}
    ],
    # All code examples use streaming output to clearly and intuitively show the model's output process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
    stream=True,
    stream_options={"include_usage": True}
)
full_content = ""
for chunk in completion:
    if chunk.choices and chunk.choices[0].delta.content:
        # Append output content
        full_content += chunk.choices[0].delta.content
        print(chunk.model_dump())

print(full_content)

Java

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;

import com.openai.models.chat.completions.*;


public class Main {
    public static void main(String[] args) {
        // Create a client using the API key from the environment variable
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();

        // Create a chat request
        ChatCompletionCreateParams chatParams = ChatCompletionCreateParams.builder()
                .addSystemMessage("You are a helpful assistant.")
                .addSystemMessage("Alibaba Cloud Model Studio smartphone product introduction: Alibaba Cloud Model Studio X1 —————— Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen...")
                .addSystemMessage("Stardust S9 Pro —— A revolutionary visual feast: breakthrough 6.9-inch 1440 x 3088 pixel under-display camera design...")
                .addUserMessage("What are the similarities and differences between the products discussed in these two articles?")
                .model("qwen-long")
                .build();

        StringBuilder fullResponse = new StringBuilder();

        // All examples use streaming output to show the model's response process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
        try (StreamResponse<ChatCompletionChunk> streamResponse = client.chat().completions().createStreaming(chatParams)) {
            streamResponse.stream().forEach(chunk -> {
                // Print and append each chunk's content
                System.out.println(chunk);
                String content = chunk.choices().get(0).delta().content().orElse("");
                if (!content.isEmpty()) {
                    fullResponse.append(content);
                }
            });
            System.out.println(fullResponse);
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            System.err.println("For more information, see https://help.aliyun.com/en/model-studio/error-code");
        }
    }
}

curl

curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-long",
    "messages": [
        {"role": "system","content": "You are a helpful assistant."},
        {"role": "system","content": "Alibaba Cloud Model Studio X1 —— Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen with a 120Hz refresh rate..."},
        {"role": "system","content": "Stardust S9 Pro —— A revolutionary visual feast: breakthrough 6.9-inch 1440 x 3088 pixel under-display camera design..."},
        {"role": "user","content": "What are the similarities and differences between the products discussed in these two articles?"}
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'

Append documents

To add new documents during conversation, append them as system messages to the messages array.

Python

import os
from openai import OpenAI, BadRequestError

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # Replace your API key here if you haven't set the environment variable
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # Set the DashScope service base_url
)
# Initialize the messages list
messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'system', 'content': 'Alibaba Cloud Model Studio X1 —— Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen with a 120Hz refresh rate...'},
    {'role': 'user', 'content': 'What does this article talk about?'}
]

try:
    # First-round response
    completion_1 = client.chat.completions.create(
        model="qwen-long",
        messages=messages,
        stream=False
    )
    # Print the first-round response
    # For streaming output in the first round, set stream=True and concatenate each segment's content. Pass the concatenated string as the content when constructing assistant_message
    print(f"First-round response: {completion_1.choices[0].message.model_dump()}")
except BadRequestError as e:
    print(f"Error: {e}")
    print("For more information, see https://help.aliyun.com/en/model-studio/error-code")
    # Stop here: completion_1 is undefined if the first request failed.
    raise SystemExit(1)

# Construct assistant_message
assistant_message = {
    "role": "assistant",
    "content": completion_1.choices[0].message.content}

# Append assistant_message to messages
messages.append(assistant_message)
# Append new document content to messages
system_message = {
    'role': 'system',
    'content': 'Stardust S9 Pro —— A revolutionary visual feast: breakthrough 6.9-inch 1440 x 3088 pixel under-display camera design, delivering an immersive visual experience...'}
messages.append(system_message)

# Add user question
messages.append({
    'role': 'user',
    'content': 'What are the similarities and differences between the products discussed in these two articles?'
})

# Response after appending the document
completion_2 = client.chat.completions.create(
    model="qwen-long",
    messages=messages,
    # All code examples use streaming output to clearly and intuitively show the model's output process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
    stream=True,
    stream_options={"include_usage": True}
)

# Stream and print the response after appending the document
print("Response after appending the document:")
for chunk in completion_2:
    print(chunk.model_dump())
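
The comment in the example above mentions concatenating each streamed segment so the text can be reused as the assistant message. A minimal sketch of that step follows, assuming chunks expose `choices[0].delta.content` as in the OpenAI-compatible SDK. The helper name `collect_stream_text` and the stand-in chunks are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def collect_stream_text(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # With include_usage enabled, the final chunk carries usage data
        # and an empty choices list, so there is no delta to read.
        if not chunk.choices:
            continue
        delta_text = chunk.choices[0].delta.content
        if delta_text:  # skip None and empty deltas
            parts.append(delta_text)
    return "".join(parts)


def _fake_chunk(text):
    # Stand-in object that mimics the attribute layout of a streamed chunk.
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


demo_chunks = [
    _fake_chunk("The X1 "),
    _fake_chunk(None),
    _fake_chunk("has a 6.7-inch screen."),
    SimpleNamespace(choices=[]),  # usage-only chunk at the end of the stream
]
demo_text = collect_stream_text(demo_chunks)
print(demo_text)
```

In a real call, the iterable returned by `client.chat.completions.create(..., stream=True)` would take the place of `demo_chunks`, and the returned string would become the `content` of the assistant message.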

Java

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.chat.completions.*;
import com.openai.core.http.StreamResponse;

import java.util.ArrayList;
import java.util.List;

public class Main {
    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();
        // Initialize the messages list
        List<ChatCompletionMessageParam> messages = new ArrayList<>();
        
        // Add role-setting information
        ChatCompletionSystemMessageParam roleSet = ChatCompletionSystemMessageParam.builder()
                .content("You are a helpful assistant.")
                .build();
        messages.add(ChatCompletionMessageParam.ofSystem(roleSet));
        
        // First-round content
        ChatCompletionSystemMessageParam systemMsg1 = ChatCompletionSystemMessageParam.builder()
                .content("Alibaba Cloud Model Studio X1 —— Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen with a 120Hz refresh rate, 256GB storage, 12GB RAM, and a 5000mAh long-lasting battery...")
                .build();
        messages.add(ChatCompletionMessageParam.ofSystem(systemMsg1));

        // User question (USER role)
        ChatCompletionUserMessageParam userMsg1 = ChatCompletionUserMessageParam.builder()
                .content("Please summarize the article content")
                .build();
        messages.add(ChatCompletionMessageParam.ofUser(userMsg1));

        // Build the first-round request and handle exceptions
        ChatCompletion completion1;
        try {
            completion1 = client.chat().completions().create(
                    ChatCompletionCreateParams.builder()
                            .model("qwen-long")
                            .messages(messages)
                            .build()
            );
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            System.err.println("For more information, see https://help.aliyun.com/en/model-studio/error-code");
            e.printStackTrace(); 
            return; 
        }

        // First-round response
        String firstResponse = completion1 != null ? completion1.choices().get(0).message().content().orElse("") : "";
        System.out.println("First-round response: " + firstResponse);

        // Construct AssistantMessage
        ChatCompletionAssistantMessageParam assistantMsg = ChatCompletionAssistantMessageParam.builder()
                .content(firstResponse)
                .build();
        messages.add(ChatCompletionMessageParam.ofAssistant(assistantMsg));

        // Second-round content
        ChatCompletionSystemMessageParam systemMsg2 = ChatCompletionSystemMessageParam.builder()
                .content("Stardust S9 Pro —— A revolutionary visual feast: breakthrough 6.9-inch 1440 x 3088 pixel under-display camera design, delivering an immersive visual experience...")
                .build();
        messages.add(ChatCompletionMessageParam.ofSystem(systemMsg2));

        // Second-round user question (USER role)
        ChatCompletionUserMessageParam userMsg2 = ChatCompletionUserMessageParam.builder()
                .content("Please compare the structural differences between the two descriptions")
                .build();
        messages.add(ChatCompletionMessageParam.ofUser(userMsg2));

        // All examples use streaming output to show the model's response process. For non-streaming examples, see https://help.aliyun.com/en/model-studio/text-generation
        StringBuilder fullResponse = new StringBuilder();
        try (StreamResponse<ChatCompletionChunk> streamResponse = client.chat().completions().createStreaming(
                ChatCompletionCreateParams.builder()
                        .model("qwen-long")
                        .messages(messages)
                        .build())) {

            streamResponse.stream().forEach(chunk -> {
                String content = chunk.choices().get(0).delta().content().orElse("");
                if (!content.isEmpty()) {
                    fullResponse.append(content);
                }
            });
            System.out.println("\nFinal response:");
            System.out.println(fullResponse.toString().trim());
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
            System.err.println("For more information, see https://help.aliyun.com/en/model-studio/error-code");
        }
    }
}

curl

curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--data '{
    "model": "qwen-long",
    "messages": [
            {"role": "system","content": "You are a helpful assistant."},
            {"role": "system","content": "Alibaba Cloud Model Studio X1 —— Enjoy an ultimate visual experience: features a 6.7-inch 1440 x 3200 pixel ultra-clear screen with a 120Hz refresh rate..."},
            {"role": "user","content": "What does this article talk about?"},
            {"role": "system","content": "Stardust S9 Pro —— A revolutionary visual feast: breakthrough 6.9-inch 1440 x 3088 pixel under-display camera design, delivering an immersive visual experience..."},
            {"role": "user","content": "What are the similarities and differences between the products discussed in these two articles?"}
        ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'

Model pricing

Chinese mainland

If you select the Chinese mainland deployment scope, model inference compute resources are restricted to the Chinese mainland. Static data is stored in your selected region. Supported region: China (Beijing).

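| Model | Version | Context window (tokens) | Max input (tokens) | Max output (tokens) | Input cost (per 1M tokens) | Output cost (per 1M tokens) | Free quota |
|---|---|---|---|---|---|---|---|
| qwen-long (batch calls at half price) | Stable | 10,000,000 | 10,000,000 | 32,768 | 0.5 CNY | 2 CNY | 1 million tokens each, valid for 90 days after activating Model Studio |
| qwen-long-latest (always matches the latest snapshot version; batch calls at half price) | Latest | 10,000,000 | 10,000,000 | 32,768 | 0.5 CNY | 2 CNY | 1 million tokens each, valid for 90 days after activating Model Studio |
| qwen-long-2025-01-25 (also known as qwen-long-0125) | Snapshot | 10,000,000 | 10,000,000 | 32,768 | 0.5 CNY | 2 CNY | 1 million tokens each, valid for 90 days after activating Model Studio |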
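
Because file tokens count as input tokens on every call, it can help to estimate the cost of a request before sending it. The sketch below applies the prices listed above (0.5 CNY per 1M input tokens, 2 CNY per 1M output tokens, batch calls at half price). The function name `estimate_cost_cny` is hypothetical, and the estimate ignores the free quota.

```python
from decimal import Decimal

INPUT_CNY_PER_MILLION = Decimal("0.5")   # input price from the pricing table
OUTPUT_CNY_PER_MILLION = Decimal("2")    # output price from the pricing table
MILLION = Decimal(1_000_000)


def estimate_cost_cny(input_tokens, output_tokens, batch=False):
    """Estimate the cost of one call in CNY, ignoring any free quota."""
    cost = (
        Decimal(input_tokens) * INPUT_CNY_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_CNY_PER_MILLION
    ) / MILLION
    if batch:
        cost = cost / 2  # batch calls are billed at half price
    return cost


# A call that references 2,000,000 tokens of file content and returns 1,000 tokens
single_call = estimate_cost_cny(2_000_000, 1_000)
batch_call = estimate_cost_cny(2_000_000, 1_000, batch=True)
print(f"real-time: {single_call:.3f} CNY, batch: {batch_call:.3f} CNY")
```

Asking a second question about the same file in a new call bills the file tokens again, so the estimate applies per call and not per uploaded file.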

On the Qwen-Long Playground page, you can upload documents and ask questions.

FAQ

  1. Does the Qwen-Long model support submitting batch jobs?

    Yes. Qwen-Long supports the OpenAI Batch API at 50% of real-time call rates. Submit batch jobs as files; jobs run asynchronously and return results on completion or timeout.

  2. Where are files saved after they are uploaded using the OpenAI-compatible file API?

    Files are uploaded to your Model Studio bucket at no cost. See the OpenAI File API for querying and managing files.

  3. What is qwen-long-2025-01-25?

    It is a snapshot of the model frozen at a specific point in time. It is more stable than qwen-long-latest and has no expiration date.

  4. How can I know when a file has finished parsing?

    Call the model with the file-id. If parsing is incomplete, you'll get error 400: "File parsing in progress, please try again later." A successful response means parsing is complete.

  5. How can I ensure the model outputs a JSON string in a standard format?

    qwen-long and all snapshots support structured output. Specify a JSON Schema to ensure valid JSON that matches your structure.
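
The parsing check described in FAQ 4 amounts to retrying the model call until the "File parsing in progress" error stops appearing. A minimal sketch of that loop follows. The names `call_when_parsed` and `looks_like_parsing_error`, the attempt count, and the wait time are all assumptions chosen for illustration, and the predicate assumes the error text contains the message quoted in FAQ 4.

```python
import time


def looks_like_parsing_error(exc):
    # Assumption: the error text contains the message quoted in FAQ 4.
    return "File parsing in progress" in str(exc)


def call_when_parsed(call_model, is_parsing_error=looks_like_parsing_error,
                     max_attempts=10, wait_seconds=5.0, sleep=time.sleep):
    """Run call_model(), retrying while the file is still being parsed."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call_model()
        except Exception as exc:
            # Re-raise unrelated errors, and give up after the last attempt.
            if not is_parsing_error(exc) or attempt == max_attempts:
                raise
            sleep(wait_seconds)


# Stand-in for a model call that succeeds on the third attempt
attempts = {"count": 0}


def fake_model_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("File parsing in progress, please try again later.")
    return "parsed and answered"


result = call_when_parsed(fake_model_call, wait_seconds=0)
print(result, attempts["count"])
```

With the OpenAI-compatible SDK, `call_model` could be a lambda that wraps `client.chat.completions.create(...)` with the file-id in the system message.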

API reference

Refer to Qwen API details for the input and output parameters of the Qwen-Long model.

Error codes

If the model call fails and returns an error message, see Error messages for resolution.

Limits

  • SDK dependencies:

    • File operations (upload, delete, query) require an OpenAI-compatible SDK.

    • Invoke models using an OpenAI-compatible SDK or the DashScope SDK.

  • File upload:

    • Supported formats: TXT, DOCX, PDF, XLSX, EPUB, MOBI, MD, CSV, JSON, BMP, PNG, JPG/JPEG, and GIF.

    • File size: The maximum size for image files is 20 MB. The maximum size for other file formats is 150 MB.

    • Account quota: Maximum 10,000 files or 100 GB per account. Uploads fail when either limit is reached. Delete files to free quota. See OpenAI compatible - File.

    • Storage period: Currently, there is no expiration limit for stored files.

  • API inputs:

    • The first system message defines the role. The second contains document content or fileid://xxx. The user message contains the query.

    • When referencing files using a file-id, a single request can reference a maximum of 100 files.

    • When a second system message is present, the user message is limited to 9,000 tokens. With only one system message, this limit does not apply.

    • The total context length is limited to 10 million tokens.

  • API outputs:

    • The maximum output length is 32,768 tokens.

  • File sharing:

    • file-ids are account-specific and cannot be used cross-account or with RAM user API keys.

  • Free quota: The free quota of 1 million tokens is valid for 90 days after you activate Alibaba Cloud Model Studio. Usage that exceeds the free quota is charged based on the corresponding input and output costs.

  • Throttling: For information about model throttling conditions, see Throttling.
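
The format and size limits above can be checked locally before an upload is attempted. The sketch below is one way to do that. The function name `check_upload` is hypothetical, and it assumes that 1 MB means 1024 × 1024 bytes, which the limits above do not specify.

```python
from pathlib import Path

# Formats and size limits taken from the "File upload" limits above
IMAGE_EXTENSIONS = {".bmp", ".png", ".jpg", ".jpeg", ".gif"}
DOCUMENT_EXTENSIONS = {
    ".txt", ".docx", ".pdf", ".xlsx", ".epub", ".mobi", ".md", ".csv", ".json",
}
MB = 1024 * 1024  # assumption: the documentation does not define the unit
MAX_IMAGE_BYTES = 20 * MB
MAX_DOCUMENT_BYTES = 150 * MB


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the local checks pass."""
    extension = Path(filename).suffix.lower()
    if extension in IMAGE_EXTENSIONS:
        limit = MAX_IMAGE_BYTES
    elif extension in DOCUMENT_EXTENSIONS:
        limit = MAX_DOCUMENT_BYTES
    else:
        return [f"unsupported format: {extension or '(no extension)'}"]
    if size_bytes > limit:
        return [f"file is {size_bytes} bytes, limit is {limit} bytes"]
    return []


print(check_upload("Model_Studio_Phone_Product_Introduction.docx", 3 * MB))
print(check_upload("screenshot.PNG", 21 * MB))
print(check_upload("archive.zip", 1 * MB))
```

These checks only cover the per-file limits. The account quota (10,000 files or 100 GB) and the limit of 100 file-ids per request still need to be tracked separately.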