如何使用Qwen-Long API_模型服务灵积(DashScope)-阿里云帮助中心

qwen-long是通义千问模型家族中，提供具备强大长文本处理能力的模型，最大可支持千万tokens的对话窗口，并通过与OpenAI兼容的模式提供API服务。参考Chat，只需配置DashScope的API key以及服务的base_url，即可访问（注：Dashscope SDK调用的方式仍然兼容）。

模型概览

模型名	模型简介
qwen-long	通义千问超大规模语言模型，支持长文本上下文，以及基于长文档、多文档等多个场景的对话功能。具体支持的文档格式与限制，可参见File。

使用说明

模型能力限制

qwen-long可支持最大10,000,000 tokens（包含您的问答历史及上传文档的总tokens）的上下文，请参照以下说明选择适合您的使用方式。

场景示例

为了帮助您选择合适的模型使用方式，我们提供以下几种较为常见的场景及模型的使用方法供您参考。

单文档对话

单文档对话	方法介绍	方法说明
方式1	通过文件服务上传文件获取fileid 放在system message中进行对话，参考单文档。	推荐使用该方法。
方式2	可直接将文档内容放在system message中，参考单文档。	1M tokens以下的文档可选用该方法。

多文档对话

当您在本轮对话时明确有多个文档需要对话时

多文档对话	方法介绍	方法说明
方式1	通过文件服务将所有需要对话的文档上传，并将所有的文档id输入system message，参考多文档。	推荐使用该方法。
方式2	直接每个文档内容放进一个system message中，一并传给大模型，可参考多文档。	不推荐使用该方法，直接输入多文档内容进行对话。受API调用请求大小所限，大量输入文本内容（超过1M tokens）可能会受到限制。

当您需要在对话中追加文档进行对话时

追加文档对话	方法介绍	方法说明
方式1	持续通过文件服务上传文档，而后将待追加的文档id填入system message中继续对话，参考追加文档。	推荐使用该方法。
方式2	持续将待追加的文件内容直接放入system message中继续对话，参考追加文档。	不推荐使用该方法，直接输入多文档内容进行对话。受API调用请求大小所限，大量输入文本内容（超过1M tokens）可能会受到限制。

通过OpenAI SDK调用

前提条件

已开通服务并获得API-KEY：API-KEY的获取与配置。
安装OpenAI SDK，注意OpenAI SDK版本需不低于1.0.0

重要

安装方式：pip install --upgrade 'openai>=1.0'

检查版本：python -c 'import openai; print("version =",openai.__version__)'

使用方式

qwen-long支持长文本（文档）对话，文档内容需放在role为system的message中，有以下两种方式可将文档信息输入给模型：

在提前上传文件获取文件id（fileid）后，可以直接提供fileid。其中上传文件的接口和操作方法可参考File。支持在对话中，使用一个或多个fileid。
直接输入需要处理的文本格式的文档内容（file content）。

重要

请避免直接将文档内容放在role为user的message中，role为user的message及用于role play的system message限制输入最长为9K tokens。

重要

使用qwen-long时，通过system message提供文档信息时，还必须同时提供一个正常role-play的system message，默认为"You are a helpful assistant."，您也可以根据实际需求进行自定义修改，例如“你是一个文本解读专家。”等等。请参照文档中的范例代码作为参考。

示例代码

以文档id（fileid）方式输入文档：

单文档

流式输出
使用方法：stream 设置为True

from pathlib import Path
from openai import OpenAI

client = OpenAI(
    api_key="$your-dashscope-api-key",  # 替换成真实DashScope的API_KEY
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # 填写DashScopebase_url
)

# data.pdf 是一个示例文件
file = client.files.create(file=Path("data.pdf"), purpose="file-extract")

# 新文件上传后需要等待模型解析，首轮rt可能较长
completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'system',
            'content': f'fileid://{file.id}'
        },
        {
            'role': 'user',
            'content': '这篇文章讲了什么？'
        }
    ],
    stream=True
)
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].dict())

非流式输出
使用方法：stream 设置为False

from pathlib import Path
from openai import OpenAI

client = OpenAI(
    api_key="$your-dashscope-api-key",  # 替换成真实DashScope的API_KEY
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # 填写DashScopebase_url
)

# data.pdf 是一个示例文件
file = client.files.create(file=Path("data.pdf"), purpose="file-extract")

# 新文件上传后需要等待模型解析，首轮rt可能较长
completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'system',
            'content': f'fileid://{file.id}'
        },
        {
            'role': 'user',
            'content': '这篇文章讲了什么？'
        }
    ],
    stream=False
)

print(completion.choices[0].message.dict())

多文档

流式输出
使用方法：stream 设置为True

from openai import OpenAI

client = OpenAI(
    api_key="$your-dashscope-api-key",  # 替换成真实DashScope的API_KEY
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # 填写DashScopebase_url
)

# 首次对话会等待文档解析完成，首次rt可能较长
completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'system',
            'content': 'fileid://file-fe-xxx,fileid://file-fe-yyy,fileid://file-fe-zzz'
        },
        {
            'role': 'user',
            'content': '这几篇文章讲了什么？'
        }
    ],
    stream=True
)
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].dict())

非流式输出
参数stream置为False即可。

追加文档

说明

可在file system message中提供文档内容进行对话。包括直接提供文本以及通过文档服务上传文档后提供文档id来进行对话的两种方式。这两种文档输入方式在messages中暂不支持混合使用。

流式输出
使用方法：stream 设置为True

from pathlib import Path
from openai import OpenAI

client = OpenAI(
    api_key="$your-dashscope-api-key",  # 替换成真实DashScope的API_KEY
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # 填写DashScopebase_url
)

# 首次对话会等待文档解析完成，首次rt可能较长
completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'system',
            'content': 'fileid://file-fe-xxx'
        },
        {
            'role': 'user',
            'content': '这篇文章讲了什么？'
        },
        {
            'role': 'assistant',
            'content': '这篇文章主题为大模型预训练方法，主要内容是xxx'
        },
        {
            'role': 'system',
            'content': 'fileid://file-fe-yyy'
        },
        {
            'role': 'user',
            'content': '这两篇文章讨论的方法有什么异同点？'
        },
    ],
    stream=True
)
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].dict())

非流式输出
参数stream置为False即可。

以文本方式直接输入文档内容：

单文档

流式输出
stream 设置为True

from openai import OpenAI

client = OpenAI(
    api_key="$your-dashscope-api-key",  # 替换成真实DashScope的API_KEY
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # 填写DashScopeendpoint
)

completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'system',
            'content': '大型语言模型(llm)已经彻底改变了人工智能领域，使以前被认为是人类独有的自然语言处理任务成为可能...'
        },
        {
            'role': 'user',
            'content': '文章讲了什么？'
        }
    ],
    stream=True
)
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].dict())

非流式输出
参数stream置为False即可。

多文档

说明

您可直接输入多文档内容进行对话。受API调用请求大小所限，大量输入文本内容可能会受到限制（当前的报文长度可支持约1M tokens的content长度，超过限制时无法保证）。为保证您的对话体验，我们建议您在需要基于大量文本内容进行对话时，尤其是使用了多文档场景时，尽量单独上传文档，再根据文档id进行对话。

流式输出
stream 设置为True

from openai import OpenAI

client = OpenAI(
    api_key="$your-dashscope-api-key",  # 替换成真实DashScope的API_KEY
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # 填写DashScopeendpoint
)

completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'system',
            'content': '大型语言模型(llm)已经彻底改变了人工智能领域，使以前被认为是人类独有的自然语言处理任务成为可能...'
        },
        {
            'role': 'system',
            'content': '大型语言模型的训练分为两个阶段:...'
        },
        {
            'role': 'user',
            'content': '这两篇文章讨论的方法有什么异同点？'
        },
    ],
    stream=True
)
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].dict())

非流式输出
参数stream置为False即可。

追加文档

说明

流式输出
stream 设置为True

from openai import OpenAI

client = OpenAI(
    api_key="$your-dashscope-api-key",  # 替换成真实DashScope的API_KEY
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # 填写DashScopeendpoint
)

completion = client.chat.completions.create(
    model="qwen-long",
    messages=[
        {
            'role': 'system',
            'content': 'You are a helpful assistant.'
        },
        {
            'role': 'system',
            'content': '大型语言模型(llm)已经彻底改变了人工智能领域，使以前被认为是人类独有的自然语言处理任务成为可能...'
        },
        {
            'role': 'user',
            'content': '文章讲了什么？'
        },
        {
            'role': 'assistant',
            'content': '这篇文章主题为大模型预训练方法，主要内容是...'
        },
        {
            'role': 'system',
            'content': '大型语言模型的训练分为两个阶段:...'
        },
        {
            'role': 'user',
            'content': '这两篇文章讨论的方法有什么异同点？'
        },
    ],
    stream=True
)
for chunk in completion:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].dict())

非流式输出
参数stream置为False即可。

参数配置与OpenAI的接口参数对齐，当前已支持的参数如下，更多参数支持添加中，详情可参考输入参数配置

参数	类型	默认值	说明
model	string	-	当前模型为qwen-long
messages	list	-	用户与模型的对话历史。list中的每个元素形式为{"role":角色, "content": 内容}。角色当前可选值：system、user、assistant。其中user和assistant需要交替出现。 messages如出现多个role为system的message，第一个会作为系统设置，第二个及以后的system message会作为file system输入。
top_p (可选）	float	-	生成过程中的核采样方法概率阈值，例如，取值为0.8时，仅保留概率加起来大于等于0.8的最可能token的最小集合作为候选集。取值范围为（0,1.0)，取值越大，生成的随机性越高；取值越低，生成的确定性越高。
temperature（可选）	float	-	用于控制模型回复的随机性和多样性。具体来说，temperature值控制了生成文本时对每个候选词的概率分布进行平滑的程度。较高的temperature值会降低概率分布的峰值，使得更多的低概率词被选择，生成结果更加多样化；而较低的temperature值则会增强概率分布的峰值，使得高概率词更容易被选择，生成结果更加确定。取值范围： [0, 2)，不建议取值为0，无意义。
max_tokens（可选）	integer	2000	指定模型可生成的最大token个数。例如模型最大输出长度为2k，您可以设置为1k，防止模型输出过长的内容。不同的模型有不同的输出上限，例如qwen-max输出上限为2k，qwen-plus输出上限为8k。
stream (可选）	boolean	False	用于控制是否使用流式输出。当以stream模式输出结果时，接口返回结果为generator，需要通过迭代获取结果，默认每次输出为当前生成的整个序列，最后一次输出为最终全部生成结果。
stop (可选）	string or array	None	stop参数用于实现内容生成过程的精确控制，在模型生成的内容即将包含指定的字符串或token_id时自动停止。stop可以为string类型或array类型。 string类型当模型将要生成指定的stop词语时停止。例如将stop指定为"你好"，则模型将要生成“你好”时停止。 array类型 array中的元素可以为token_id或者字符串，或者元素为token_id的array。当模型将要生成的token或其对应的token_id在stop中时，模型生成将会停止。以下为stop为array时的示例（tokenizer对应模型为qwen-turbo）： 1.元素为token_id： token_id为108386和104307分别对应token为“你好”和“天气”，设定stop为[108386,104307]，则模型将要生成“你好”或者“天气”时停止。 2.元素为字符串：设定stop为["你好","天气"]，则模型将要生成“你好”或者“天气”时停止。 3.元素为array： token_id为108386和103924分别对应token为“你好”和“啊”，token_id为35946和101243分别对应token为“我”和“很好”。设定stop为[[108386, 103924],[35946, 101243]]，则模型将要生成“你好啊”或者“我很好”时停止。说明 stop为array类型时，不可以将token_id和字符串同时作为元素输入，比如不可以指定stop为["你好",104307]。

返回结果

非stream返回结果示例

Python

{
  'content': '文章探讨了大型语言模型训练的两个阶段：无监督预训练和大规模指令微调与强化学习，并提出了一种名为LIMA的语言模型，它是一个基于LLaMa的650亿参数模型，仅通过1000个精心挑选的提示和响应进行标准监督损失微调，未涉及强化学习或人类偏好建模。LIMA展示了强大的性能，能从少数示例中学习到特定的响应格式，处理包括规划旅行行程到推测替代历史等复杂查询，并能较好地泛化到未见过的任务上。\n\n通过对比实验，发现LIMA在43%的情况下，其生成的回复与GPT-4相比要么等同要么更受欢迎；与Bard相比这一比例为58%，与经过人类反馈训练的DaVinci003相比更是高达65%。这些结果强烈表明，大型语言模型中的几乎所有知识都是在预训练阶段学到的，而只需要有限的指令微调数据即可教会模型产生高质量的输出。\n\n此外，文章还涵盖了关于时间旅行的虚构故事创作、对古埃及文明的兴趣描述、以及如何帮助聪明孩子交友的建议等内容，展示了语言模型在多样任务上的应用能力。同时，提到了一个关于营销策略的计划概要，以及美国总统拜登面临的经济挑战与就业市场分析。最后，还包含了有关回答质量影响的测试、如何以乔治·卡林风格编写单口相声段子的示例，以及如何制作shakshuka食谱的指导。',
  'role': 'assistant',
  'function_call': None,
  'tool_calls': None
}

stream返回结果

{'delta': {'content': '文章', 'function_call': None, 'role': 'assistant', 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '主要', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '探讨', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '了一种名为LIMA', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '的语言模型的训练方法及其对齐', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '能力的评估。LIMA是一个拥有', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '650亿参数的大型语言', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '模型，它仅通过在10', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '00个精心挑选的提示和', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '响应上进行标准监督微调来', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '完成训练，过程中并未采用强化学习', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '或人类偏好建模。研究结果显示', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '，尽管训练数据有限，LIMA', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '仍能展现出强大的性能，能够从', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '少数示例中学习到特定的', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '响应格式，并泛化到未见过', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '的任务上。\n\n对比实验表明，在控制', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '条件下，相对于GPT-4、', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': 'Bard和DaVinci00', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '3等其他模型，人们更倾向于', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': 'LIMA生成的回复，分别有', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '43%、58%和', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '65%的情况下认为LIMA的表现', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '更好或至少相当。这表明大型', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '语言模型在预训练阶段已经学', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '到了大量的知识，而只需少量的', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '指令微调数据即可让模型产生', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '高质量的输出，强调了“少', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '即是多”（Less is More）', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '的理念在模型对齐上的有效性。', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '\n\n此外，文章还提及了关于模型', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '输出质量的测试，以及使用不同', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '数量的示例进行微调时', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '模型稳定性的观察，进一步证明了', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '即使是小规模的、经过筛选的数据', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '也能显著提升模型性能。同时，', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '文中还包含了一些示例输出，', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '如有关如何帮助聪明的孩子交友的', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '建议、模仿乔治·卡林风格', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '的单口相声段子，以及', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '如何制作北非风味的shak', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': 'shuka菜谱等，展示了模型', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '根据特定要求生成多样化内容的能力。', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'null', 'index': 0, 'logprobs': None}
{'delta': {'content': '', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'stop', 'index': 0, 'logprobs': None}

返回参数详情可参考返回参数

通过HTTP接口调用

您可以通过HTTP接口来调用服务，获得与通过HTTP接口调用OpenAI服务相同结构的返回结果。

前提条件

已开通DashScope并获得API-KEY：API-KEY的获取与配置。
我们推荐您将API-KEY配置到环境变量中以降低API-KEY的泄漏风险，配置方法可参考通过环境变量配置API-KEY。您也可以在代码中配置API-KEY，但是泄漏风险会提高。

提交接口调用

POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions

请求示例

以下示例展示通过CURL命令来调用API的脚本。

说明

需要使用您的API-KEY替换示例中的 $DASHSCOPE_API_KEY。

非流式输出

Shell

curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen-long",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "system",
            "content": "fileid://file-fe-xxx"
        },
        {
            "role": "user",
            "content": "文章讲了什么？"
        }
    ]
}'

运行命令可得到以下结果：

{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "文章主要探讨了大型语言模型的训练方法及其对齐（alignment）问题，特别是关于如何使这些模型更好地服务于最终任务和用户偏好。研究通过一个名为LIMA的项目进行，该项目基于一个650亿参数的LLaMa语言模型，仅使用精心挑选的1000个提示及其响应进行微调，未采用强化学习或人类反馈直接指导。结果显示，即使在如此有限的指令调整数据下，LIMA仍能展现出强劲的性能，能够从少数示例中学习到特定的响应格式，并泛化到未见过的任务上。\n\n研究强调了预训练阶段对于模型获取广泛知识的重要性，表明几乎所有知识都是在这一无监督阶段习得的，而后续的指令调优仅需少量数据即可引导模型产生高质量输出。此外，文中还提到了一些实验细节，比如使用过滤与未过滤数据源训练模型产生的质量差异，以及模型在不同场景下的应用示例，如提供建议、编写故事、讽刺喜剧等，进一步证明了模型的有效性和灵活性。"
      },
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null
    }
  ],
  "object": "chat.completion",
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  },
  "created": 1715324557,
  "system_fingerprint": "",
  "model": "qwen-long",
  "id": "chatcmpl-07b1b68992a091a08d7e239bd5a4a566"
}

流式输出

如果您需要使用流式输出，请在请求体中指定stream参数为True。

curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen-long",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "system",
            "content": "fileid://file-fe-xxx"
        },
        {
            "role": "user",
            "content": "文章讲了什么？"
        }
    ],
    "stream":true
}'

运行命令可得到以下结果：

data:{"choices":[{"delta":{"content":"文章","role":"assistant"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"探讨"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"了"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"大型语言模型的训练"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"方法及其对齐（alignment）的重要性"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"，主要分为两个阶段：无监督"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"预训练和大规模指令微调及"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"强化学习。研究通过一个名为L"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"IMA的650亿参数语言"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"模型实验，该模型仅使用精心"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"挑选的1000个提示"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"及其响应进行标准监督损失微调"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"，未采用任何强化学习或人类"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"偏好建模。LIMA展现出强大的"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"性能，能从少数示例中"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"学习到特定的回答格式，处理复杂"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"查询，并且在未见过的任务上"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"表现出了良好的泛化能力。\n\n对比"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"人类评估显示，相比GPT-"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"4、Bard和DaVinci"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"003（后者经过人类反馈"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"训练），参与者更偏好或认为L"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"IMA的回答等同的比例分别达到了4"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"3%、58%和6"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"5%。这表明大型语言模型"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"在预训练阶段几乎学到了所有"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"知识，只需有限的指令微调"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"数据即可产生高质量输出。此外，"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"文中还提到了质量对模型输出"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"的影响，以及不同类型的生成任务示"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"例，如关于育儿建议、模仿"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"乔治·卡林风格的单口"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"相声、处理职场情感问题的建议"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"和制作北非风味菜肴shak"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"shuka的食谱。这些示"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"例进一步证明了模型在多样任务"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":"中的应用潜力与灵活性。"},"finish_reason":"null","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

data:{"choices":[{"delta":{"content":""},"finish_reason":"stop","index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1715334042,"system_fingerprint":"","model":"qwen-long","id":"chatcmpl-35e3c387759692f98041ed0a5dd9b72a"}

异常响应示例

在访问请求出错的情况下，输出的结果中会通过 code 和 message 指明出错原因。

{
    "error": {
        "message": "Incorrect API key provided. ",
        "type": "invalid_request_error",
        "param": null,
        "code": "invalid_api_key"
    }
}

状态码说明

错误码	说明
400 - Invalid file [id:xxx].	提供的文件id存在问题
400 - Too many files provided.	提供的对话文档数量大于等于100
400 - File [id:xxx] cannot be found.	输入的文件已经被删除
400 - File [id:xxx] exceeds size limit.	文档大小超限
400 - File [id:xxx] exceeds page limits (15000 pages).	文档页数超限
400 - Multiple types of file system prompt detected, please do not mix file-id and text content in one request.	输入的文件中包含了file id 和文件内容两种方式，当前暂不支持两种方式混用
400 - File [id:xxx] format is not supported.	文档格式不支持
400 - File [id:xxx] content blank.	文档内容为空
400 - Total message token length exceed model limit （10000000 tokens）.	输入的messages 总token数超过了10M
400 - Single round file-content exceeds token limit, please use fileid to supply lengthy input.	输入的单条message token数超过了9K
400 - Role specification invalid, please refer to API documentation for usage.	messages组装格式存在问题，请参考上述参数描述与示例代码进行参考
400 - File parsing in progress, please try again later.	文档解析中，请稍后再试
400 - Input data may contain inappropriate content.	数据检查错误，输入包含疑似敏感内容被绿网拦截
429 - You exceeded your current requests list.	您超出了对模型访问的限流值，请稍后再试
500 - File [id:xxx] parsing error.	文档解析失败
500 - File [id:xxx] prasing timeout.	文档解析超时
500 - Preprocessor error.	大模型前处理错误
500 - Postprocessor error.	大模型后处理错误
500 - File content conversion error.	文档message处理错误
500 - An unspecified internal error has occured.	调用大模型出现异常
500 - Response timeout.	处理超时，可尝试重试
503 - The engine is currently overloaded, please try again later.	服务端负载过高，可重试