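The Qwen data mining model extracts structured information from documents and performs well in areas such as data annotation and content moderation.
Model overview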
Model name | Context length (tokens) | Max input (tokens) | Max output (tokens) | Input cost (per 1,000 tokens) | Output cost (per 1,000 tokens) | Free quota
qwen-doc-turbo | 262,144 | 253,952 | 8,192 | 0.0006 CNY | 0.001 CNY | None
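To make the pricing concrete, the cost of one request can be estimated from its token counts. The sketch below is illustrative only: the helper name estimate_cost_cny is hypothetical, and it assumes the prices in the table above apply linearly, with no tiering or discounts.

```python
# Prices and limits copied from the table above (CNY per 1,000 tokens).
INPUT_PRICE_PER_1K = 0.0006
OUTPUT_PRICE_PER_1K = 0.001
MAX_INPUT_TOKENS = 253_952


def estimate_cost_cny(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the cost of one request (hypothetical helper, assumes linear pricing)."""
    if prompt_tokens > MAX_INPUT_TOKENS:
        raise ValueError("prompt exceeds the maximum input length")
    input_cost = prompt_tokens / 1000 * INPUT_PRICE_PER_1K
    output_cost = completion_tokens / 1000 * OUTPUT_PRICE_PER_1K
    return input_cost + output_cost


# Example: 5,395 input tokens and 71 output tokens.
print(f"{estimate_cost_cny(5395, 71):.6f} CNY")  # prints 0.003308 CNY
```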
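Quick start
Before you begin, obtain an API key and set it as an environment variable. If you call the model through the OpenAI SDK or the DashScope SDK, you also need to install the SDK.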
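Choosing how to upload document content
When choosing how to upload document content, consider the following factors:
Upload by file ID
Recommended: suited to documents that need to be referenced and managed frequently. It reduces text input errors and is simple to use.
The model can parse TXT, DOCX, DOC, PPTX, PPT, XLSX, XLS, and MD files, either as extracted plain text or in structured form. A single file cannot exceed 150 MB. A single Alibaba Cloud account can upload at most 10,000 files, with a total size of no more than 100 GB. If any of these limits is exceeded, delete some files or file content to get back within the limits before uploading again. For details, see OpenAI file interface compatibility.
Upload as plain text
Use case: suited to small documents or temporary content. If the document is short and does not need long-term storage, you can choose this method.
Choose the upload method that best fits your needs and the characteristics of your documents. We recommend considering file ID upload first for the best experience.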
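The format and size limits for file ID upload can be checked locally before calling the upload interface. The following is a minimal sketch, not part of any SDK: check_uploadable is a hypothetical helper, and it assumes that "150 MB" means 150 × 1024 × 1024 bytes. The per-account limits (10,000 files, 100 GB) are not checked here because they depend on what is already stored in the account.

```python
from pathlib import Path
from typing import List

# Supported formats and the per-file size limit, taken from the text above.
SUPPORTED_SUFFIXES = {".txt", ".docx", ".doc", ".pptx", ".ppt", ".xlsx", ".xls", ".md"}
MAX_FILE_BYTES = 150 * 1024 * 1024  # assumption: 150 MB is interpreted as 150 MiB


def check_uploadable(path: Path) -> List[str]:
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
    if not path.is_file():
        problems.append("file does not exist")
    elif path.stat().st_size > MAX_FILE_BYTES:
        problems.append("file is larger than 150 MB")
    return problems


# Usage: inspect the example document before uploading it.
print(check_uploadable(Path("阿里云百炼系列手机产品介绍.docx")))
```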
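Passing document information by file ID
You can upload a document through the OpenAI-compatible interface and put the returned file ID in a System Message so that the model refers to the document when it replies.
File IDs can currently be used only with the Qwen-Long and Qwen-Doc-Turbo models and with Batch interface calls.
The Qwen-Doc-Turbo model can reply based on the documents you upload. The example file used here is 阿里云百炼系列手机产品介绍.docx.
Upload the file to the Alibaba Cloud Model Studio (百炼) platform through the OpenAI-compatible interface. Once the file is saved to the platform's secure storage, you receive a file ID. For detailed parameter descriptions and calling methods of the file upload interface, see the API documentation page.
Python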
import os
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If you have not set the environment variable, replace this with your API key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # base_url of the DashScope service
)

# Upload the document and print the returned file ID
file_object = client.files.create(file=Path("阿里云百炼系列手机产品介绍.docx"), purpose="file-extract")
print(file_object.id)
Java
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.*;

import java.nio.file.Path;
import java.nio.file.Paths;

public class Main {
    public static void main(String[] args) {
        // Create the client using the API key from the environment variable
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();
        // Set the file path; change the path and file name as needed
        Path filePath = Paths.get("src/main/java/org/example/阿里云百炼系列手机产品介绍.docx");
        // Build the file upload parameters
        FileCreateParams fileParams = FileCreateParams.builder()
                .file(filePath)
                .purpose(FilePurpose.of("file-extract"))
                .build();
        // Upload the file and print the file ID
        FileObject fileObject = client.files().create(fileParams);
        System.out.println(fileObject.id());
    }
}
curl
curl --location --request POST 'https://dashscope.aliyuncs.com/compatible-mode/v1/files' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --form 'file=@"阿里云百炼系列手机产品介绍.docx"' \
  --form 'purpose="file-extract"'
Running the code above returns the file ID of the uploaded file. Pass the file ID in a System Message and put your question in the User Message.
Python
import os
from openai import OpenAI, BadRequestError

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If you have not set the environment variable, replace this with your API key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

try:
    completion = client.chat.completions.create(
        model="qwen-doc-turbo",
        messages=[
            {'role': 'system', 'content': 'You are a helpful assistant.'},
            # Replace 'file-fe-xxx' with the file ID used in your own conversation
            {'role': 'system', 'content': 'fileid://file-fe-xxx'},
            {'role': 'user', 'content': '阿里云百炼都有哪些手机?'}
        ],
        # All code examples use streaming output to show the model's output process clearly and intuitively.
        # For non-streaming examples, see https://help.aliyun.com/zh/model-studio/text-generation
        stream=True,
        stream_options={"include_usage": True}
    )

    full_content = ""
    for chunk in completion:
        if chunk.choices and chunk.choices[0].delta.content:
            # Append the streamed text and print the raw chunk
            full_content += chunk.choices[0].delta.content
            print(chunk.model_dump())

    print(full_content)

except BadRequestError as e:
    print(f"Error: {e}")
    print("See the documentation: https://help.aliyun.com/zh/model-studio/developer-reference/error-code")
Java
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;
import com.openai.models.*;

public class Main {
    public static void main(String[] args) {
        // Create the client using the API key from the environment variable
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();

        ChatCompletionCreateParams chatParams = ChatCompletionCreateParams.builder()
                .addSystemMessage("You are a helpful assistant.")
                // Replace 'file-fe-xxx' with the file ID used in your own conversation
                .addSystemMessage("fileid://file-fe-xxx")
                .addUserMessage("阿里云百炼都有哪些手机?")
                .model("qwen-doc-turbo")
                .build();

        // Stream the reply and print each piece of content as it arrives
        try (StreamResponse<ChatCompletionChunk> streamResponse = client.chat().completions().createStreaming(chatParams)) {
            streamResponse.stream().forEach(chunk -> {
                String content = chunk.choices().get(0).delta().content().orElse("");
                if (!content.isEmpty()) {
                    System.out.print(content);
                }
            });
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
        }
    }
}
curl
curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "qwen-doc-turbo",
    "messages": [
        {"role": "system","content": "You are a helpful assistant."},
        {"role": "system","content": "fileid://file-fe-xxx"},
        {"role": "user","content": "阿里云百炼都有哪些手机?"}
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'
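The requests above all send the same three-message layout: a role-setting System Message, a second System Message holding the fileid:// reference, and the User Message with the question. As a rough sketch, that layout can be built by a small helper. The name build_file_messages is hypothetical, and the lowercase model ID "qwen-long" is an assumption (only "qwen-doc-turbo" appears verbatim in this document).

```python
from typing import Dict, List

# Assumption: lowercase IDs of the two models that this document says accept file IDs.
FILE_ID_MODELS = {"qwen-long", "qwen-doc-turbo"}


def build_file_messages(model: str, file_id: str, question: str,
                        role_prompt: str = "You are a helpful assistant.") -> List[Dict[str, str]]:
    """Build the messages list for a file-ID request (hypothetical helper)."""
    if model not in FILE_ID_MODELS:
        raise ValueError(f"{model} does not accept file IDs")
    return [
        {"role": "system", "content": role_prompt},            # role setting
        {"role": "system", "content": f"fileid://{file_id}"},  # reference to the uploaded document
        {"role": "user", "content": question},                 # the actual question
    ]


messages = build_file_messages("qwen-doc-turbo", "file-fe-xxx", "阿里云百炼都有哪些手机?")
print(messages[1]["content"])  # prints fileid://file-fe-xxx
```

The returned list can be passed as the messages argument of client.chat.completions.create.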
By setting the stream and stream_options parameters, the Qwen-Doc-Turbo model streams its reply and reports token usage in the usage field of the last returned object.
Python
{'id': 'chatcmpl-ddbb10e9-dba2-930e-9e5b-xxxxxxxxxxxx', 'choices': [{'delta': {'content': '这篇文章', 'function_call': None, 'refusal': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1753076578, 'model': 'qwen-doc-turbo', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-ddbb10e9-dba2-930e-9e5b-xxxxxxxxxxxx', 'choices': [{'delta': {'content': '是', 'function_call': None, 'refusal': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1753076578, 'model': 'qwen-doc-turbo', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
...
{'id': 'chatcmpl-ddbb10e9-dba2-930e-9e5b-xxxxxxxxxxxx', 'choices': [{'delta': {'content': '方面的强大功能和', 'function_call': None, 'refusal': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1753076578, 'model': 'qwen-doc-turbo', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-ddbb10e9-dba2-930e-9e5b-xxxxxxxxxxxx', 'choices': [{'delta': {'content': '高性价比,吸引', 'function_call': None, 'refusal': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1753076578, 'model': 'qwen-doc-turbo', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-ddbb10e9-dba2-930e-9e5b-xxxxxxxxxxxx', 'choices': [{'delta': {'content': '潜在消费者的关注。', 'function_call': None, 'refusal': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1753076578, 'model': 'qwen-doc-turbo', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
这篇文章是关于“百炼系列手机产品介绍”的内容。文章详细介绍了百炼系列的不同手机型号,包括它们的屏幕尺寸、分辨率、刷新率、存储空间、RAM、电池容量、摄像头配置、特色功能以及参考售价等信息。每一款手机都强调了其在特定领域的优势,如视觉体验、摄影体验、游戏性能、轻薄便携性以及折叠屏创新等。文章旨在展示百炼手机系列在各个方面的强大功能和高性价比,吸引潜在消费者的关注。
Java
ChatCompletionChunk{id=chatcmpl-2a9ebb9d-93ed-9342-8139-xxxxxxxxxxxx, choices=[Choice{delta=Delta{content=, functionCall=, refusal=, role=assistant, toolCalls=, additionalProperties={}}, finishReason=null, index=0, logprobs=null, additionalProperties={}}], created=1744943511, model=qwen-doc-turbo, object_=chat.completion.chunk, serviceTier=, systemFingerprint=null, usage=null, additionalProperties={}}
ChatCompletionChunk{id=chatcmpl-2a9ebb9d-93ed-9342-8139-xxxxxxxxxxxx, choices=[Choice{delta=Delta{content=这篇文章, functionCall=, refusal=, role=, toolCalls=, additionalProperties={}}, finishReason=null, index=0, logprobs=null, additionalProperties={}}], created=1744943511, model=qwen-doc-turbo, object_=chat.completion.chunk, serviceTier=, systemFingerprint=null, usage=null, additionalProperties={}}
ChatCompletionChunk{id=chatcmpl-2a9ebb9d-93ed-9342-8139-xxxxxxxxxxxx, choices=[Choice{delta=Delta{content=介绍了, functionCall=, refusal=, role=, toolCalls=, additionalProperties={}}, finishReason=null, index=0, logprobs=null, additionalProperties={}}], created=1744943511, model=qwen-doc-turbo, object_=chat.completion.chunk, serviceTier=, systemFingerprint=null, usage=null, additionalProperties={}}
...
ChatCompletionChunk{id=chatcmpl-2a9ebb9d-93ed-9342-8139-xxxxxxxxxxxx, choices=[Choice{delta=Delta{content=手中的“科技艺术品, functionCall=, refusal=, role=, toolCalls=, additionalProperties={}}, finishReason=null, index=0, logprobs=null, additionalProperties={}}], created=1744943511, model=qwen-doc-turbo, object_=chat.completion.chunk, serviceTier=, systemFingerprint=null, usage=null, additionalProperties={}}
ChatCompletionChunk{id=chatcmpl-2a9ebb9d-93ed-9342-8139-xxxxxxxxxxxx, choices=[Choice{delta=Delta{content=”。, functionCall=, refusal=, role=, toolCalls=, additionalProperties={}}, finishReason=null, index=0, logprobs=null, additionalProperties={}}], created=1744943511, model=qwen-doc-turbo, object_=chat.completion.chunk, serviceTier=, systemFingerprint=null, usage=null, additionalProperties={}}
ChatCompletionChunk{id=chatcmpl-2a9ebb9d-93ed-9342-8139-xxxxxxxxxxxx, choices=[Choice{delta=Delta{content=, functionCall=, refusal=, role=, toolCalls=, additionalProperties={}}, finishReason=stop, index=0, logprobs=null, additionalProperties={}}], created=1744943511, model=qwen-doc-turbo, object_=chat.completion.chunk, serviceTier=, systemFingerprint=null, usage=null, additionalProperties={}}
这篇文章介绍了多个品牌的手机产品,具体描述了每一款手机的主要特点和卖点,包括屏幕尺寸.....每款手机都强调了自己的独特之处,力求成为用户手中的“科技艺术品”。
curl
data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: {"choices":[{"finish_reason":null,"delta":{"content":"这篇文章"},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: {"choices":[{"delta":{"content":"是"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: {"choices":[{"delta":{"content":"关于"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
.....
data: {"choices":[{"delta":{"content":"描述了每款"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: {"choices":[{"delta":{"content":"手机的主要特点和"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: {"choices":[{"delta":{"content":"规格,并提供了参考"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: {"choices":[{"delta":{"content":"售价信息。"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: {"choices":[{"finish_reason":"stop","delta":{"content":""},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":5395,"completion_tokens":71,"total_tokens":5466},"created":1728649489,"system_fingerprint":null,"model":"qwen-doc-turbo","id":"chatcmpl-e2434284-140a-9e3a-8ca5-f81e65e98d01"}
data: [DONE]
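When the raw stream is consumed without an SDK, each data: line has to be parsed by hand. The sketch below shows one possible way to do that with the standard library only: it joins the streamed content, picks up the usage object from the final chunk (whose choices list is empty), and stops at [DONE]. The function name collect_stream is hypothetical, and the sample lines are abbreviated versions of the curl output above.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join streamed content and return it together with the usage dict, if any."""
    parts = []
    usage = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]  # only the last chunk carries token usage
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage


# Abbreviated sample of the stream shown above.
sample = [
    'data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0}],"usage":null}',
    'data: {"choices":[{"delta":{"content":"这篇文章"},"index":0}],"usage":null}',
    'data: {"choices":[{"delta":{"content":"是"},"index":0}],"usage":null}',
    'data: {"choices":[],"usage":{"prompt_tokens":5395,"completion_tokens":71,"total_tokens":5466}}',
    'data: [DONE]',
]
text, usage = collect_stream(sample)
print(text, usage["total_tokens"])  # prints 这篇文章是 5466
```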
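Passing document information as plain text
In addition to passing document information by file ID, you can pass the document content directly as a string. With this method, to keep the model from confusing the role setting with the document content, make sure the role-setting message is the first message in messages.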
Python
import os
from openai import OpenAI, BadRequestError

client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # If you have not set the environment variable, replace this with your API key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

try:
    completion = client.chat.completions.create(
        model="qwen-doc-turbo",
        messages=[
            # The role-setting message must come first
            {'role': 'system', 'content': 'You are a helpful assistant.'},
            # The document content is passed as a plain string
            {'role': 'system', 'content': '阿里云百炼手机产品介绍 阿里云百炼X1 ——————畅享极致视界:搭载6.7英寸1440 x 3200像素超清屏幕...'},
            {'role': 'user', 'content': '阿里云百炼都有哪些手机?'}
        ],
        # All code examples use streaming output to show the model's output process clearly and intuitively.
        # For non-streaming examples, see https://help.aliyun.com/zh/model-studio/text-generation
        stream=True,
        stream_options={"include_usage": True}
    )

    full_content = ""
    for chunk in completion:
        if chunk.choices and chunk.choices[0].delta.content:
            # Append the streamed text and print the raw chunk
            full_content += chunk.choices[0].delta.content
            print(chunk.model_dump())

    print(full_content)

except BadRequestError as e:
    print(f"Error: {e}")
    print("See the documentation: https://help.aliyun.com/zh/model-studio/developer-reference/error-code")
Java
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;
import com.openai.models.*;

public class Main {
    public static void main(String[] args) {
        // Create the client using the API key from the environment variable
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .baseUrl("https://dashscope.aliyuncs.com/compatible-mode/v1")
                .build();

        ChatCompletionCreateParams chatParams = ChatCompletionCreateParams.builder()
                // The role-setting message must come first
                .addSystemMessage("You are a helpful assistant.")
                // The document content is passed as a plain string
                .addSystemMessage("阿里云百炼手机产品介绍 阿里云百炼X1 ——————畅享极致视界:搭载6.7英寸1440 x 3200像素超清屏幕...")
                .addUserMessage("阿里云百炼都有哪些手机?")
                .model("qwen-doc-turbo")
                .build();

        // Stream the reply and print each piece of content as it arrives
        try (StreamResponse<ChatCompletionChunk> streamResponse = client.chat().completions().createStreaming(chatParams)) {
            streamResponse.stream().forEach(chunk -> {
                String content = chunk.choices().get(0).delta().content().orElse("");
                if (!content.isEmpty()) {
                    System.out.print(content);
                }
            });
        } catch (Exception e) {
            System.err.println("Error: " + e.getMessage());
        }
    }
}
curl
curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
  --header "Authorization: Bearer $DASHSCOPE_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "qwen-doc-turbo",
    "messages": [
        {"role": "system","content": "You are a helpful assistant."},
        {"role": "system","content": "阿里云百炼X1 —— 畅享极致视界:搭载6.7英寸1440 x 3200像素超清屏幕,搭配120Hz刷新率,..."},
        {"role": "user","content": "阿里云百炼都有哪些手机?"}
    ],
    "stream": true,
    "stream_options": {
        "include_usage": true
    }
}'
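For documents kept in local text files, the plain-text approach amounts to reading the file and placing its content after the role-setting message. The following is a minimal sketch under stated assumptions: build_text_messages and load_text are hypothetical helpers, and the file is assumed to be UTF-8 encoded.

```python
from pathlib import Path
from typing import Dict, List


def load_text(path: Path) -> str:
    """Read a local text file (assumes UTF-8 encoding)."""
    return path.read_text(encoding="utf-8")


def build_text_messages(document_text: str, question: str,
                        role_prompt: str = "You are a helpful assistant.") -> List[Dict[str, str]]:
    """Role setting first, then the document, then the question (hypothetical helper)."""
    if not document_text.strip():
        raise ValueError("document text is empty")
    return [
        {"role": "system", "content": role_prompt},    # must be the first message
        {"role": "system", "content": document_text},  # document content as a plain string
        {"role": "user", "content": question},
    ]


# Usage with an inline string; load_text(Path(...)) can supply the text instead.
messages = build_text_messages("阿里云百炼手机产品介绍 ...", "阿里云百炼都有哪些手机?")
print([m["role"] for m in messages])  # prints ['system', 'system', 'user']
```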
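FAQ
Is calling through the DashScope SDK supported?
Yes. Model calls are compatible with the DashScope SDK, but file upload is supported only through the OpenAI SDK.
Can a file ID be shared across different API keys?
A file ID can be shared only among API keys that belong to the same Alibaba Cloud account.
After a file is uploaded through the OpenAI-compatible file interface, where is it stored?
It is saved in the Model Studio storage space of the current Alibaba Cloud account and incurs no charge. To query and manage uploaded files, see OpenAI file interface.
Can a file ID be used for conversations with other models or for other features?
File IDs can currently be used only for conversations with the Qwen-Long and Qwen-Doc-Turbo models and for batch calls through the Batch interface.
Is there a sample request that uses non-streaming output?
See the non-streaming output examples.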
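API reference
For the input and output parameters of the Qwen-Doc-Turbo model, see the Qwen API details.
Error codes
If a model call fails and returns an error message, see Error messages to resolve it.
Rate limiting
For the conditions that trigger model rate limiting, see Rate limiting.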