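The Qwen-VL (通义千问VL) model answers questions based on the images or videos you pass in.
You can try image understanding online in the model plaza (模型广场). Video understanding is currently available only through the API.
Application examples
Qwen-VL model
How to use
You need to have obtained an API key and set it as an environment variable (the samples below read it from DASHSCOPE_API_KEY). If you call the model through the OpenAI SDK or the DashScope SDK, you also need to install that SDK.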
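Before running the samples, it can help to confirm that the key is actually visible to the process. The following is a minimal sketch, not part of either SDK: `get_api_key` is a hypothetical helper name, and the only thing taken from the samples is the environment variable name `DASHSCOPE_API_KEY`.

```python
import os


def get_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    `env` is any mapping; it defaults to os.environ. This helper is
    illustrative only and is not provided by the OpenAI or DashScope SDK.
    """
    env = os.environ if env is None else env
    key = env.get("DASHSCOPE_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "DASHSCOPE_API_KEY is not set; configure it before calling the model."
        )
    return key
```

Calling `get_api_key()` once at startup surfaces a missing key as an explicit error instead of an authentication failure on the first request.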
Simple example
OpenAI compatible
You can call the Qwen-VL model through the OpenAI SDK or the OpenAI-compatible HTTP interface.
Python
Sample code
from openai import OpenAI
import os
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="qwen-vl-max",
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
},
},
{"type": "text", "text": "这是什么"},
],
}
],
)
print(completion.model_dump_json())
Response
{
"id": "chatcmpl-4b5a3bb9-221f-9687-bdd7-a7d56aae44df",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "这张图片展示了一位女士和一只狗在海滩上互动。女士坐在沙滩上,微笑着与狗握手。背景是海浪和天空,阳光洒在她们身上,营造出温馨的氛围。狗戴着项圈,显得很温顺。",
"role": "assistant",
"function_call": null,
"tool_calls": null
}
}
],
"created": 1725948492,
"model": "qwen-vl-max",
"object": "chat.completion",
"service_tier": null,
"system_fingerprint": null,
"usage": {
"completion_tokens": 55,
"prompt_tokens": 1270,
"total_tokens": 1325
}
}
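In most applications only the reply text and the token usage are needed rather than the full JSON. The sketch below pulls both out of a payload shaped like the response above. The payload is a trimmed copy with the reply shortened, and `extract_reply` is a hypothetical helper name. With the OpenAI SDK object the same fields are reachable as `completion.choices[0].message.content` and `completion.usage.total_tokens`.

```python
import json

# Trimmed copy of the response above; the reply text is shortened here.
RAW_RESPONSE = """
{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {"role": "assistant", "content": "这张图片展示了一位女士和一只狗在海滩上互动。"}
    }
  ],
  "model": "qwen-vl-max",
  "usage": {"completion_tokens": 55, "prompt_tokens": 1270, "total_tokens": 1325}
}
"""


def extract_reply(response):
    """Return (reply_text, total_tokens) from a chat.completion payload."""
    choice = response["choices"][0]
    return choice["message"]["content"], response["usage"]["total_tokens"]


reply, total_tokens = extract_reply(json.loads(RAW_RESPONSE))
print(reply)
print(total_tokens)
```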
curl
Sample code
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
  "model": "qwen-vl-max",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image_url",
          "image_url": {
            "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
          }
        },
        {
          "type": "text",
          "text": "这是什么"
        }
      ]
    }
  ]
}'
Response
{
"choices": [
{
"message": {
"content": "这张图片展示了一位女士和一只狗在海滩上互动。女士坐在沙滩上,微笑着与狗握手。背景是大海和天空,阳光洒在她们身上,营造出温暖的氛围。狗戴着项圈,显得很温顺。",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 1270,
"completion_tokens": 54,
"total_tokens": 1324
},
"created": 1725948561,
"system_fingerprint": null,
"model": "qwen-vl-max",
"id": "chatcmpl-0fd66f46-b09e-9164-a84f-3ebbbedbac15"
}
DashScope
You can call the Qwen-VL model through the DashScope SDK or over HTTP.
Python
Sample code
from http import HTTPStatus
import dashscope
messages = [
{
"role": "user",
"content": [
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
{"text": "这是什么?"}
]
}
]
response = dashscope.MultiModalConversation.call(
model='qwen-vl-max',
messages=messages
)
if response.status_code == HTTPStatus.OK:
    print(response)
else:
    print(response.code)
    print(response.message)
Response
{
"status_code": 200,
"request_id": "3a031529-707f-9b7d-968c-172e7533debc",
"code": "",
"message": "",
"output": {
"text": null,
"finish_reason": null,
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "这是一张在海滩上拍摄的照片。照片中有一位女士和一只狗。女士穿着格子衬衫,坐在沙滩上,微笑着与狗互动。狗戴着项圈,似乎在与女士握手。背景是大海和天空,阳光洒在她们身上,营造出温暖的氛围。"
}
]
}
}
]
},
"usage": {
"input_tokens": 1271,
"output_tokens": 63,
"image_tokens": 1247
}
}
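Unlike the OpenAI-compatible response, the DashScope response returns `message.content` as a list of parts rather than a single string. The sketch below joins the text parts of a payload shaped like the response above. The sample dictionary is a trimmed copy with the reply shortened, and `extract_text` is a hypothetical helper name, not an SDK function.

```python
def extract_text(response):
    """Join the text parts of the first choice in a DashScope-style payload."""
    parts = response["output"]["choices"][0]["message"]["content"]
    # Each part is a dict; only parts that carry a "text" key are joined.
    return "".join(part["text"] for part in parts if "text" in part)


# Trimmed copy of the response above; the reply text is shortened here.
sample = {
    "status_code": 200,
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "role": "assistant",
                    "content": [{"text": "这是一张在海滩上拍摄的照片。"}],
                },
            }
        ]
    },
    "usage": {"input_tokens": 1271, "output_tokens": 63, "image_tokens": 1247},
}

text = extract_text(sample)
print(text)
```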
Java
Sample code
// Copyright (c) Alibaba, Inc. and its affiliates.
import java.util.Arrays;
import java.util.Collections;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;
public class Main {
public static void simpleMultiModalConversationCall()
throws ApiException, NoApiKeyException, UploadFileException {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(
Collections.singletonMap("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"),
Collections.singletonMap("text", "这是什么?"))).build();
MultiModalConversationParam param = MultiModalConversationParam.builder()
.model("qwen-vl-max")
.message(userMessage)
.build();
MultiModalConversationResult result = conv.call(param);
System.out.println(JsonUtils.toJson(result));
}
public static void main(String[] args) {
try {
simpleMultiModalConversationCall();
} catch (ApiException | NoApiKeyException | UploadFileException e) {
System.out.println(e.getMessage());
}
System.exit(0);
}
}
Response
{
"requestId": "dcb38a0f-fd69-9071-bcde-c4530f9a7559",
"usage": {
"input_tokens": 1271,
"output_tokens": 58
},
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "这是一张在海滩上拍摄的照片。照片中有一位女士和一只狗。女士穿着格子衬衫,坐在沙滩上,与狗互动。狗戴着项圈,看起来很开心。背景是大海和天空,阳光洒在她们身上,营造出温暖的氛围。"
}
]
}
}
]
}
}
curl
Sample code
curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"model": "qwen-vl-max",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
{"text": "这是什么?"}
]
}
]
}
}'
Response
{
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "这是一张在海滩上拍摄的照片。照片中有一个穿着格子衬衫的人和一只戴着项圈的狗。他们坐在沙滩上,背景是大海和天空。阳光从画面的右侧照射过来,给整个场景增添了一种温暖的氛围。"
}
]
}
}
]
},
"usage": {
"output_tokens": 55,
"input_tokens": 1271,
"image_tokens": 1247
},
"request_id": "ccf845a3-dc33-9cda-b581-20fe7dc23f70"
}
Multiple image input
You can pass multiple images to the Qwen-VL model in a single request. The following code shows how to pass them in.
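The two interfaces differ only in how each image entry is written: the OpenAI-compatible format wraps the URL in an `image_url` object, while the DashScope format uses a bare `image` key. As a rough sketch, the content list for either format can be built from a list of URLs. Both function names are hypothetical helpers, not SDK functions.

```python
def openai_image_content(urls, question):
    """Build an OpenAI-compatible content list: images first, then the question."""
    content = [{"type": "image_url", "image_url": {"url": url}} for url in urls]
    content.append({"type": "text", "text": question})
    return content


def dashscope_image_content(urls, question):
    """Build a DashScope content list: images first, then the question."""
    content = [{"image": url} for url in urls]
    content.append({"text": question})
    return content


urls = [
    "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg",
    "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png",
]
openai_messages = [{"role": "user", "content": openai_image_content(urls, "这些是什么")}]
dashscope_messages = [{"role": "user", "content": dashscope_image_content(urls, "这些是什么?")}]
```

The resulting lists have the same shape as the `messages` values in the samples that follow.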
OpenAI compatible
You can call the Qwen-VL model through the OpenAI SDK or the OpenAI-compatible HTTP interface.
Python
Sample code
from openai import OpenAI
import os
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="qwen-vl-max",
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
}
},
{
"type": "image_url",
"image_url": {
"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"
}
},
{
"type": "text",
"text": "这些是什么"
}
]
}
]
)
print(completion.model_dump_json())
Response
{
"id": "chatcmpl-4b5a3bb9-221f-9687-bdd7-a7d56aae44df",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "图1中是一位女士和一只拉布拉多犬在海滩上互动的场景。女士穿着格子衬衫,坐在沙滩上,与狗进行握手的动作,背景是海浪和天空,整个画面充满了温馨和愉快的氛围。\n\n图2中是一只老虎在森林中行走的场景。老虎的毛色是橙色和黑色相间的条纹,它正向前迈步,周围是茂密的树木和植被,地面上覆盖着落叶,整个画面给人一种野生自然的感觉。",
"role": "assistant",
"function_call": null,
"tool_calls": null
}
}
],
"created": 1725948492,
"model": "qwen-vl-max",
"object": "chat.completion",
"service_tier": null,
"system_fingerprint": null,
"usage": {
"completion_tokens": 106,
"prompt_tokens": 2497,
"total_tokens": 2603
}
}
curl
Sample code
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"model": "qwen-vl-max",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
}
},
{
"type": "image_url",
"image_url": {
"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"
}
},
{
"type": "text",
"text": "这些是什么"
}
]
}
]
}'
Response
{
"choices": [
{
"message": {
"content": "图1中是一位女士和一只拉布拉多犬在海滩上互动的场景。女士穿着格子衬衫,坐在沙滩上,与狗进行握手的动作,背景是海景和日落的天空,整个画面显得非常温馨和谐。\n\n图2中是一只老虎在森林中行走的场景。老虎的毛色是橙色和黑色条纹相间,它正向前迈步,周围是茂密的树木和植被,地面上覆盖着落叶,整个画面充满了自然的野性和生机。",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 2497,
"completion_tokens": 109,
"total_tokens": 2606
},
"created": 1725948561,
"system_fingerprint": null,
"model": "qwen-vl-max",
"id": "chatcmpl-0fd66f46-b09e-9164-a84f-3ebbbedbac15"
}
DashScope
You can call the Qwen-VL model through the DashScope SDK or over HTTP.
Python
Sample code
from http import HTTPStatus
import dashscope
messages = [
{
"role": "user",
"content": [
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"},
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/rabbit.png"},
{"text": "这些是什么?"}
]
}
]
response = dashscope.MultiModalConversation.call(
model='qwen-vl-plus',
messages=messages
)
if response.status_code == HTTPStatus.OK:
    print(response)
else:
    print(response.code)
    print(response.message)
Response
{
"status_code": 200,
"request_id": "3a031529-707f-9b7d-968c-172e7533debc",
"code": "",
"message": "",
"output": {
"text": null,
"finish_reason": null,
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "图1中是一名女子和狗在沙滩上玩耍。\n图2是孟加拉虎的插画,它正向镜头走来。\n图3里是一只可爱的小白兔。"
}
]
}
}
]
},
"usage": {
"input_tokens": 3743,
"output_tokens": 41,
"image_tokens": 3697
}
}
Java
Sample code
import java.util.Arrays;
import java.util.Collections;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;
public class Main {
public static void simpleMultiModalConversationCall()
throws ApiException, NoApiKeyException, UploadFileException {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(
Collections.singletonMap("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"),
Collections.singletonMap("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"),
Collections.singletonMap("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/rabbit.png"),
Collections.singletonMap("text", "这些是什么?"))).build();
MultiModalConversationParam param = MultiModalConversationParam.builder()
.model("qwen-vl-plus")
.message(userMessage)
.build();
MultiModalConversationResult result = conv.call(param);
System.out.println(JsonUtils.toJson(result));
}
public static void main(String[] args) {
try {
simpleMultiModalConversationCall();
} catch (ApiException | NoApiKeyException | UploadFileException e) {
System.out.println(e.getMessage());
}
System.exit(0);
}
}
Response
{
"requestId": "dcb38a0f-fd69-9071-bcde-c4530f9a7559",
"usage": {
"input_tokens": 3740,
"output_tokens": 48
},
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "图1中是一名女子和一只大金毛在沙滩上玩耍。\n图2是孟加拉虎的写实照片,老虎正向镜头走来。\n图3是一幅插画,主要展示了一只兔子。"
}
]
}
}
]
}
}
curl
Sample code
curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen-vl-plus",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"},
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/rabbit.png"},
{"text": "这些是什么?"}
]
}
]
}
}'
Response
{
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "这张图片显示了一位女士和她的狗在海滩上。她们似乎正在享受彼此的陪伴,狗狗坐在沙滩上伸出爪子与女士握手或互动。背景是美丽的日落景色,海浪轻轻拍打着海岸线。\n\n请注意,我提供的描述基于图像中可见的内容,并不包括任何超出视觉信息之外的信息。如果您需要更多关于这个场景的具体细节,请告诉我!"
}
]
}
}
]
},
"usage": {
"output_tokens": 81,
"input_tokens": 1277,
"image_tokens": 1247
},
"request_id": "ccf845a3-dc33-9cda-b581-20fe7dc23f70"
}
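Multi-turn conversation
The Qwen-VL model can use the conversation history when replying. The sample code below shows how to call the model through the OpenAI-compatible interface or DashScope to implement multi-turn conversation.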
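The core of a multi-turn call is bookkeeping: after each reply, the assistant message and the next user question are appended to the history, and the whole list is sent again. The sketch below shows that step in isolation, using the OpenAI-compatible message shape. `append_turn` is a hypothetical helper name, and the reply text is a placeholder rather than real model output.

```python
def append_turn(messages, assistant_text, next_question):
    """Return a new history with the assistant reply and the next question added.

    The input list is not modified, so earlier histories stay reusable.
    """
    return messages + [
        {"role": "assistant", "content": [{"type": "text", "text": assistant_text}]},
        {"role": "user", "content": [{"type": "text", "text": next_question}]},
    ]


history = [
    {
        "role": "user",
        "content": [
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
                },
            },
            {"type": "text", "text": "这是什么"},
        ],
    }
]

# Placeholder reply text; in practice this comes from the first model response.
history_round_two = append_turn(history, "这是一个女孩和一只狗。", "写一首七言绝句描述这个场景")
```

The image appears only in the first user message; the follow-up question carries text alone and relies on the history for context.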
OpenAI compatible
You can call the Qwen-VL model through the OpenAI SDK or the OpenAI-compatible HTTP interface to try multi-turn conversation.
Python
Sample code
from openai import OpenAI
import os
def get_response():
    client = OpenAI(
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
    )
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
                    }
                },
                {
                    "type": "text",
                    "text": "这是什么"
                }
            ]
        }
    ]
    completion = client.chat.completions.create(
        model="qwen-vl-plus",
        messages=messages,
    )
    print(f"模型第一轮输出:\n{completion.model_dump()}")
    assistant_message = completion.choices[0].message
    messages.append(assistant_message.model_dump())
    messages.append({
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "做一首诗描述这个场景"
            }
        ]
    })
    completion = client.chat.completions.create(
        model="qwen-vl-plus",
        messages=messages,
    )
    print(f"模型第二轮输出:\n{completion.model_dump()}")


if __name__ == '__main__':
    get_response()
Response
模型第一轮输出:
{
"id": "chatcmpl-afcdac19-9bbf-942b-91ad-04252fe1722c",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": None,
"message": {
"content": "图中是一名女子和她的狗在沙滩上互动。狗狗坐在地上,伸出爪子像是要握手或者击掌的样子。这名女士穿着格子衬衫,似乎正在与狗狗进行亲密的接触,并且面带微笑。背景是海洋和日出或日落时分的天空。这是一幅描绘人与宠物之间温馨时刻的画面。",
"role": "assistant",
"function_call": None,
"tool_calls": None
}
}
],
"created": 1721820065,
"model": "qwen-vl-plus",
"object": "chat.completion",
"service_tier": None,
"system_fingerprint": None,
"usage": {
"completion_tokens": 75,
"prompt_tokens": 1276,
"total_tokens": 1351
}
}
模型第二轮输出:
{
"id": "chatcmpl-3090adc4-91da-95d9-8482-49240d47099a",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": None,
"message": {
"content": "朝阳照海涛,\n沙岸女伴犬。\n欢笑共此时,\n友情深似海。\n\n手握同游步,\n默契心间留。\n潮起潮又落,\n此景久难忘。",
"role": "assistant",
"function_call": None,
"tool_calls": None
}
}
],
"created": 1721820068,
"model": "qwen-vl-plus",
"object": "chat.completion",
"service_tier": None,
"system_fingerprint": None,
"usage": {
"completion_tokens": 44,
"prompt_tokens": 1366,
"total_tokens": 1410
}
}
curl
Sample code
curl -X POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"model": "qwen-vl-max",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
}
},
{
"type": "text",
"text": "这是什么"
}
]
},
{
"role": "assistant",
"content": [
{
"type": "text",
"text": "这是一个女孩和一只狗。"
}
]
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "写一首七言绝句描述这个场景"
}
]
}
]
}'
Response
{
"choices": [
{
"message": {
"content": "海风轻拂笑颜开, \n沙滩上与犬相陪。 \n夕阳斜照人影短, \n欢乐时光心自醉。",
"role": "assistant"
},
"finish_reason": "stop",
"index": 0,
"logprobs": null
}
],
"object": "chat.completion",
"usage": {
"prompt_tokens": 1295,
"completion_tokens": 32,
"total_tokens": 1327
},
"created": 1726324976,
"system_fingerprint": null,
"model": "qwen-vl-max",
"id": "chatcmpl-3c953977-6107-96c5-9a13-c01e328b24ca"
}
DashScope
You can call the Qwen-VL model through the DashScope SDK or over HTTP to try multi-turn conversation.
Python
Sample code
from dashscope import MultiModalConversation
def conversation_call():
    messages = [
        {
            "role": "user",
            "content": [
                {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
                {"text": "这是什么?"},
            ]
        }
    ]
    response = MultiModalConversation.call(
        model='qwen-vl-plus',
        messages=messages
    )
    print(f"模型第一轮输出:{response}")
    messages.append(response['output']['choices'][0]['message'])
    user_msg = {
        "role": "user",
        "content": [
            {
                "text": "做一首诗描述这个场景"
            }
        ]
    }
    messages.append(user_msg)
    response = MultiModalConversation.call(
        model='qwen-vl-plus',
        messages=messages
    )
    print(f"模型第二轮输出:{response}")


if __name__ == '__main__':
    conversation_call()
Response
模型第一轮输出:
{
"status_code": 200,
"request_id": "0468708b-85fb-95d8-a502-bb9098b31b37",
"code": "",
"message": "",
"output": {
"text": null,
"finish_reason": null,
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "这张图片显示了一位女士和她的狗在海滩上。她们似乎正在享受彼此的陪伴,狗狗坐在沙滩上伸出爪子与女士握手或互动。背景是美丽的日落景色,海浪轻轻拍打着海岸线。\n\n请注意,我提供的描述基于图像中可见的内容,并不包括任何超出视觉信息之外的信息。如果您需要更多关于这个场景的具体细节,请告诉我!"
}
]
}
}
]
},
"usage": {
"input_tokens": 1277,
"output_tokens": 81,
"image_tokens": 1247
}
}
模型第二轮输出:
{
"status_code": 200,
"request_id": "8f236443-7b01-9bad-87be-abff5d3887de",
"code": "",
"message": "",
"output": {
"text": null,
"finish_reason": null,
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "夕阳染红了天边,\n波涛轻抚着沙岸。\n人犬共坐此情深,\n\n手握着手心相牵。\n欢笑回荡于风间,\n这一刻永恒不变。\n爱意如潮水般涌动,\n在这片金色的大地上蔓延。"
}
]
}
}
]
},
"usage": {
"input_tokens": 1373,
"output_tokens": 59,
"image_tokens": 1247
}
}
Java
Sample code
// Copyright (c) Alibaba, Inc. and its affiliates.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;
public class Main {
private static final String modelName = "qwen-vl-plus";
public static void MultiRoundConversationCall() throws ApiException, NoApiKeyException, UploadFileException {
MultiModalConversation conv = new MultiModalConversation();
MultiModalMessage systemMessage = MultiModalMessage.builder().role(Role.SYSTEM.getValue())
.content(Arrays.asList(Collections.singletonMap("text", "You are a helpful assistant."))).build();
MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"),
Collections.singletonMap("text", "这是什么?"))).build();
List<MultiModalMessage> messages = new ArrayList<>();
messages.add(systemMessage);
messages.add(userMessage);
MultiModalConversationParam param = MultiModalConversationParam.builder()
.model(modelName)
.messages(messages)
.build();
MultiModalConversationResult result = conv.call(param);
System.out.println(JsonUtils.toJson(result));
// add the result to conversation
messages.add(result.getOutput().getChoices().get(0).getMessage());
MultiModalMessage msg = MultiModalMessage.builder().role(Role.USER.getValue())
.content(Arrays.asList(Collections.singletonMap("text", "做一首诗描述这个场景"))).build();
messages.add(msg);
// new messages
param.setMessages((List)messages);
result = conv.call(param);
System.out.println(JsonUtils.toJson(result));
}
public static void main(String[] args) {
try {
MultiRoundConversationCall();
} catch (ApiException | NoApiKeyException | UploadFileException e) {
System.out.println(e.getMessage());
}
System.exit(0);
}
}
Response
{
"requestId": "1239e1de-dbd3-9b46-b508-421023ed3053",
"usage": {
"input_tokens": 1277,
"output_tokens": 81
},
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "这张图片显示了一位女士和她的狗在海滩上。她们似乎正在享受彼此的陪伴,狗狗坐在沙滩上伸出爪子与女士握手或互动。背景是美丽的日落景色,海浪轻轻拍打着海岸线。\n\n请注意,我提供的描述基于图像中可见的内容,并不包括任何超出视觉信息之外的信息。如果您需要更多关于这个场景的具体细节,请告诉我!"
}
]
}
}
]
}
}
{
"requestId": "045fe96f-26c4-9cfd-b0ad-ec5f1f4033ce",
"usage": {
"input_tokens": 1373,
"output_tokens": 59
},
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "夕阳染红了天边,\n波涛轻抚着沙岸。\n人犬共坐此情深,\n\n手握着手心相牵。\n欢笑回荡于风间,\n这一刻永恒不变。\n爱意如潮水般涌动,\n在这片金色的大地上蔓延。"
}
]
}
}
]
}
}
curl
Sample code
curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
"model": "qwen-vl-max",
"input":{
"messages":[
{
"role": "user",
"content": [
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
{"text": "这是什么?"}
]
},
{
"role": "assistant",
"content": [
{"text": "这是一只狗和一个女孩。"}
]
},
{
"role": "user",
"content": [
{"text": "写一首七言绝句描述这个场景"}
]
}
]
}
}'
Response
{
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "海浪轻拍沙滩边,女孩与狗同嬉戏。阳光洒落笑颜开,快乐时光永铭记。"
}
]
}
}
]
},
"usage": {
"output_tokens": 27,
"input_tokens": 1298,
"image_tokens": 1247
},
"request_id": "bdf5ef59-c92e-92a6-9d69-a738ecee1590"
}
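Streaming output
A large model does not produce its final result in one step. It generates intermediate results incrementally, and the final result is the concatenation of those intermediate results. With non-streaming output, you wait until generation finishes and then receive the concatenated result in a single response. With streaming output, the intermediate results are returned in real time, so you can read the output while the model is still generating and spend less time waiting for the reply.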
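To make the concatenation concrete, the sketch below rebuilds the final text from a list of chunks. The chunks are plain dictionaries shaped like the OpenAI-compatible stream chunks, trimmed to the fields that matter; with `include_usage` enabled, the last chunk has an empty `choices` list and carries the usage. `assemble_stream` is a hypothetical helper name.

```python
def assemble_stream(chunks):
    """Concatenate delta contents and return (full_text, usage)."""
    parts = []
    usage = None
    for chunk in chunks:
        for choice in chunk["choices"]:
            content = choice["delta"].get("content")
            if content:  # skip the empty first delta and any None values
                parts.append(content)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


# Simulated chunks, trimmed to the fields used above.
chunks = [
    {"choices": [{"delta": {"content": "", "role": "assistant"}, "finish_reason": None}], "usage": None},
    {"choices": [{"delta": {"content": "图"}, "finish_reason": None}], "usage": None},
    {"choices": [{"delta": {"content": "中"}, "finish_reason": None}], "usage": None},
    {"choices": [{"delta": {"content": "是一名"}, "finish_reason": "stop"}], "usage": None},
    {"choices": [], "usage": {"completion_tokens": 75, "prompt_tokens": 1276, "total_tokens": 1351}},
]

full_text, usage = assemble_stream(chunks)
print(full_text)
print(usage["total_tokens"])
```

With the OpenAI SDK, each chunk object can be converted to this dictionary form with `chunk.model_dump()`.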
OpenAI compatible
You can call the Qwen-VL model through the OpenAI SDK or the OpenAI-compatible HTTP interface to try streaming output.
Python
Sample code
from openai import OpenAI
import os
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)
completion = client.chat.completions.create(
model="qwen-vl-plus",
messages=[
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
}
},
{
"type": "text",
"text": "这是什么"
}
]
}
],
stream=True,
stream_options={"include_usage": True}
)
for chunk in completion:
    print(chunk.model_dump())
Response
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '', 'function_call': None, 'role': 'assistant', 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '图', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '中', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '是一名', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '女子和她的狗在', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '沙滩上互动。狗狗坐在地上,', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '伸出爪子像是要握手或者击', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '掌的样子。这名女士穿着格子', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '衬衫,似乎正在与狗狗进行亲密', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '的接触,并且面带微笑。', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '背景是海洋和日出或日', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '落时分的天空。这是一', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '幅描绘人与宠物之间温馨时刻', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': None, 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [{'delta': {'content': '的画面。', 'function_call': None, 'role': None, 'tool_calls': None}, 'finish_reason': 'stop', 'index': 0, 'logprobs': None}], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': None}
{'id': 'chatcmpl-6cf91cc7-1121-9977-b4bc-5e7d1fbfd693', 'choices': [], 'created': 1721823365, 'model': 'qwen-vl-plus', 'object': 'chat.completion.chunk', 'service_tier': None, 'system_fingerprint': None, 'usage': {'completion_tokens': 75, 'prompt_tokens': 1276, 'total_tokens': 1351}}
curl
Sample code
curl --location 'https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"model": "qwen-vl-plus",
"messages": [
{
"role": "user",
"content": [
{
"type": "image_url",
"image_url": {
"url": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"
}
},
{
"type": "text",
"text": "这是什么"
}
]
}
],
"stream":true,
"stream_options":{"include_usage":true}
}'
Response
data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0,"logprobs":null,"finish_reason":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"finish_reason":null,"delta":{"content":"图"},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"中"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"是一名"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"女子和她的狗在"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"沙滩上互动。狗狗坐在地上,"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"伸出爪子像是要握手或者击"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"掌的样子。这名女士穿着格子"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"衬衫,似乎正在与狗狗进行亲密"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"的接触,并且面带微笑。"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"他们背后的海浪拍打着海岸线"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":",天空看起来很明亮但有些模糊"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":",可能是日出或日落时"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"delta":{"content":"分拍摄的照片。整体氛围显得非常"},"finish_reason":null,"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[{"finish_reason":"stop","delta":{"content":"和谐而温馨。"},"index":0,"logprobs":null}],"object":"chat.completion.chunk","usage":null,"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: {"choices":[],"object":"chat.completion.chunk","usage":{"prompt_tokens":1276,"completion_tokens":85,"total_tokens":1361},"created":1721823635,"system_fingerprint":null,"model":"qwen-vl-plus","id":"chatcmpl-9a9ec75a-3109-9910-b79e-7bcbce81c8f9"}
data: [DONE]
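Over raw HTTP the stream arrives as server-sent event lines, each prefixed with `data: ` and terminated by a `data: [DONE]` sentinel, as in the response above. The sketch below decodes such lines into JSON payloads and joins the text. The sample lines are trimmed copies of that response, and `parse_sse_lines` is a hypothetical helper name.

```python
import json


def parse_sse_lines(lines):
    """Yield decoded JSON payloads from 'data: ...' lines until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore blank lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        yield json.loads(payload)


# Trimmed copies of the streamed lines above.
sse_lines = [
    'data: {"choices":[{"delta":{"content":"","role":"assistant"},"index":0,"finish_reason":null}],"usage":null}',
    'data: {"choices":[{"delta":{"content":"图"},"index":0,"finish_reason":null}],"usage":null}',
    'data: {"choices":[{"delta":{"content":"中"},"index":0,"finish_reason":"stop"}],"usage":null}',
    'data: {"choices":[],"usage":{"prompt_tokens":1276,"completion_tokens":85,"total_tokens":1361}}',
    'data: [DONE]',
]

events = list(parse_sse_lines(sse_lines))
streamed_text = "".join(
    choice["delta"].get("content", "")
    for event in events
    for choice in event["choices"]
)
print(streamed_text)
```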
DashScope
You can call the Qwen-VL model through the DashScope SDK or over HTTP to try streaming output.
Python
Sample code
from dashscope import MultiModalConversation
def simple_multimodal_conversation_call():
    """Simple single round multimodal conversation call.
    """
    messages = [
        {
            "role": "user",
            "content": [
                {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
                {"text": "这是什么?"}
            ]
        }
    ]
    responses = MultiModalConversation.call(
        model='qwen-vl-plus',
        messages=messages,
        stream=True,
        incremental_output=True
    )
    for response in responses:
        print(response)


if __name__ == '__main__':
    simple_multimodal_conversation_call()
返回结果
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "这张"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 1, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "图片"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 2, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "显示"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 3, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "了一位女士和一只"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 8, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "狗在海滩上。她们似乎正在"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 16, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "互动,可能是在玩耍或训练中"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 24, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "握手。背景是美丽的日落景色"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 32, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": ",海浪轻轻拍打着海岸线"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 40, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "。\n\n这位女士穿着格子衬衫,并"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 48, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "且戴着一个手镯。她坐在"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 56, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "沙滩上与她的宠物进行着愉快"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 64, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "的时光。这只狗看起来是一只"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 72, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "拉布拉多犬或其他类似的品种,"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 80, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "它也戴着手套以保护它的"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 88, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "爪子并保持清洁。\n\n这个场景"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 96, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "充满了友谊、爱以及对大自然美景"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 104, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "的欣赏。这是一个温馨的画面,展示了"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 112, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "人与动物之间深厚的情感纽带。"}]}}]}, "usage": {"input_tokens": 1276, "output_tokens": 120, "image_tokens": 1247}}
{"status_code": 200, "request_id": "124a9f95-0a92-9ae7-8462-517724722b2b", "code": "", "message": "", "output": {"text": null, "finish_reason": null, "choices": [{"finish_reason": "stop", "message": {"role": "assistant", "content": []}}]}, "usage": {"input_tokens": 1276, "output_tokens": 121, "image_tokens": 1247}}
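With `incremental_output=True`, each chunk above carries only the newly generated text, and the last chunk has an empty `content` list with `finish_reason` set to `"stop"`. A minimal sketch of joining the pieces follows. It assumes each chunk has been converted to a plain dict with the shape shown above, and `join_incremental_chunks` is a hypothetical helper name.

```python
def join_incremental_chunks(chunks):
    """Concatenate the text of DashScope-style incremental chunks."""
    parts = []
    finish_reason = None
    for chunk in chunks:
        choice = chunk["output"]["choices"][0]
        for item in choice["message"]["content"]:  # may be an empty list
            parts.append(item.get("text", ""))
        finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


# Shortened copies of the first two chunks and the last chunk shown above.
chunks = [
    {"output": {"choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "这张"}]}}]}},
    {"output": {"choices": [{"finish_reason": "null", "message": {"role": "assistant", "content": [{"text": "图片"}]}}]}},
    {"output": {"choices": [{"finish_reason": "stop", "message": {"role": "assistant", "content": []}}]}},
]
text, reason = join_incremental_chunks(chunks)
print(text, reason)
```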
Java
Sample code
import java.util.Arrays;
import java.util.HashMap;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;
import io.reactivex.Flowable;

public class Main {
    public static void streamCall()
            throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        // The content maps must be mutable.
        MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
                .content(Arrays.asList(new HashMap<String, Object>(){{put("image", "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg");}},
                        new HashMap<String, Object>(){{put("text", "这是什么");}})).build();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                .model("qwen-vl-plus")
                .message(userMessage)
                .incrementalOutput(true)
                .build();
        Flowable<MultiModalConversationResult> result = conv.streamCall(param);
        // Print each streamed chunk as JSON
        result.blockingForEach(item -> {
            System.out.println(JsonUtils.toJson(item));
        });
    }

    public static void main(String[] args) {
        try {
            streamCall();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}
Response
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":1},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"这张"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":2},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"图片"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":3},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"显示"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":8},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"了一位女士和一只"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":16},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"狗在海滩上互动。她们似乎"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":24},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"正在享受彼此的陪伴,狗狗坐在"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":32},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"沙滩上伸出爪子与这位女士"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":40},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"握手或玩耍。\n\n背景中可以看到海"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":48},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"浪拍打着海岸线,并且有"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":56},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"日落时分柔和光线照射下的"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":64},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"天空。这给人一种宁静而温馨的感觉"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":72},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":",可能是在傍晚或者清晨的时候拍摄"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":80},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"的照片。这种场景通常象征着友谊"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":88},"output":{"choices":[{"finish_reason":"null","message":{"role":"assistant","content":[{"text":"、爱以及人与宠物之间的深厚"}]}}]}}
{"requestId":"8471902a-9936-9f56-9b84-e786007d633a","usage":{"input_tokens":1275,"output_tokens":92},"output":{"choices":[{"finish_reason":"stop","message":{"role":"assistant","content":[{"text":"情感连接。"}]}}]}}
curl
Sample code
curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-H 'X-DashScope-SSE: enable' \
-d '{
"model": "qwen-vl-plus",
"input":{
"messages":[
{
"role": "system",
"content": [
{"text": "You are a helpful assistant."}
]
},
{
"role": "user",
"content": [
{"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
{"text": "这个图片是哪里?"}
]
}
]
},
"parameters": {
"incremental_output": true
}
}'
Response
id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":[{"text":"这张"}],"role":"assistant"},"finish_reason":"null"}]},"usage":{"input_tokens":1278,"output_tokens":1,"image_tokens":1247},"request_id":"8b037000-c670-94cd-88d4-13318ddce1d0"}
id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":[{"text":"照片"}],"role":"assistant"},"finish_reason":"null"}]},"usage":{"input_tokens":1278,"output_tokens":2,"image_tokens":1247},"request_id":"8b037000-c670-94cd-88d4-13318ddce1d0"}
......
id:10
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":[{"text":"拍打着海岸线以及远处的地平"}],"role":"assistant"},"finish_reason":"null"}]},"usage":{"input_tokens":1278,"output_tokens":56,"image_tokens":1247},"request_id":"8b037000-c670-94cd-88d4-13318ddce1d0"}
id:11
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":[{"text":"线上有阳光照射过来。"}],"role":"assistant"},"finish_reason":"stop"}]},"usage":{"input_tokens":1278,"output_tokens":63,"image_tokens":1247},"request_id":"8b037000-c670-94cd-88d4-13318ddce1d0"}
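The HTTP response above arrives as server-sent events: each event has `id:` and `event:` fields, a comment line `:HTTP_STATUS/200`, and a `data:` line holding one JSON chunk. A minimal sketch of parsing that text is shown below, assuming the whole response body is available as a string. `parse_dashscope_sse` is a hypothetical helper name.

```python
import json


def parse_dashscope_sse(raw):
    """Yield (event_id, data_dict) pairs from DashScope-style SSE text."""
    event_id = None
    for line in raw.splitlines():
        if line.startswith("id:"):
            event_id = int(line[3:].strip())
        elif line.startswith("data:"):
            yield event_id, json.loads(line[5:])
        # "event:" lines and ":HTTP_STATUS/200" comment lines are ignored here


# Shortened copies of the first two events shown above.
raw = """id:1
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":[{"text":"这张"}],"role":"assistant"},"finish_reason":"null"}]}}
id:2
event:result
:HTTP_STATUS/200
data:{"output":{"choices":[{"message":{"content":[{"text":"照片"}],"role":"assistant"},"finish_reason":"null"}]}}
"""
events = list(parse_dashscope_sse(raw))
text = "".join(
    item["text"]
    for _, data in events
    for item in data["output"]["choices"][0]["message"]["content"]
)
print(len(events), text)
```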
Using local files
The sample code below calls the Qwen-VL model on a local file through either the OpenAI-compatible interface or DashScope. The sample image used in the code is test.png.
OpenAI compatible
Python
Sample code
from openai import OpenAI
import os
import base64


# Encode the image file as a Base64 string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')


def get_response(image_path):
    base64_image = encode_image(image_path)
    client = OpenAI(
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    )
    completion = client.chat.completions.create(
        model="qwen-vl-plus",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            # The MIME type should match the file; test.png is a PNG
                            "url": f"data:image/png;base64,{base64_image}"
                        }
                    },
                    {
                        "type": "text",
                        "text": "这是什么"  # "What is this?"
                    }
                ]
            }
        ]
    )
    print(completion.model_dump_json())


if __name__ == '__main__':
    get_response("test.png")
Response
{
"id": "chatcmpl-7399dbeb-af0c-9fcb-9083-0a836669476d",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"logprobs": null,
"message": {
"content": "这是一只在天空中飞翔的鹰。它有着广阔的翅膀,正在翱翔于云层之间。这种鸟类通常与自由、力量和高瞻远瞩等概念相关联。",
"role": "assistant",
"function_call": null,
"tool_calls": null
}
}
],
"created": 1725948726,
"model": "qwen-vl-plus",
"object": "chat.completion",
"service_tier": null,
"system_fingerprint": null,
"usage": {
"completion_tokens": 41,
"prompt_tokens": 1253,
"total_tokens": 1294
}
}
HTTP
Sample code
import os
import base64
import requests


# Encode the image file as a Base64 string
def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')


def get_response(image_path):
    base64_image = encode_image(image_path)
    api_key = os.getenv("DASHSCOPE_API_KEY")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    payload = {
        "model": "qwen-vl-plus",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": {
                            # The MIME type should match the file; test.png is a PNG
                            "url": f"data:image/png;base64,{base64_image}"
                        }
                    },
                    {
                        "type": "text",
                        "text": "这是什么"  # "What is this?"
                    }
                ]
            }
        ]
    }
    response = requests.post("https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions", headers=headers, json=payload)
    print(response.json())


if __name__ == '__main__':
    get_response(image_path="test.png")
Response
{
    "choices": [
        {
            "message": {
                "content": "这是一只在天空中飞翔的鹰。它有着广阔的翅膀,正在翱翔于云层之间。这种鸟类通常被认为是力量、自由和雄心壮志的象征,在各种文化中有重要的地位。",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "index": 0,
            "logprobs": null
        }
    ],
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 1254,
        "completion_tokens": 45,
        "total_tokens": 1299
    },
    "created": 1721732005,
    "system_fingerprint": null,
    "model": "qwen-vl-plus",
    "id": "chatcmpl-13b925d1-ef79-9c15-b890-0079a096d7d3"
}
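The OpenAI-compatible examples in this section embed the local image as a Base64 data URL. The sketch below builds such a URL and picks the MIME type from the file extension with the standard `mimetypes` module. `to_data_url` is a hypothetical helper name, and the demo file contains placeholder bytes rather than a real image.

```python
import base64
import mimetypes
import os
import tempfile


def to_data_url(image_path):
    """Return a data URL whose MIME type is guessed from the file extension."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type for {image_path!r}")
    with open(image_path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


# Demo with a throwaway file; the bytes are a placeholder, not a real image.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.png")
    with open(path, "wb") as f:
        f.write(b"placeholder")
    url = to_data_url(path)
print(url)
```

The returned string can be used as the `url` value of an `image_url` content item.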
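DashScope
Build the file path according to the SDK and operating system you use, as shown in the table below.
Operating system | SDK | File path to pass in | Example |
Linux or macOS | Python SDK | file://{absolute path of the file} | file:///home/images/test.png |
Linux or macOS | Java SDK | file://{absolute path of the file} | file:///home/images/test.png |
Windows | Python SDK | file://{absolute path of the file} | file://D:/images/test.png |
Windows | Java SDK | file:///{absolute path of the file} | file:///D:/images/test.png |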
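The prefix rules in the table can be sketched as a small helper. This is an illustration only: `dashscope_file_url` is a hypothetical name, the `sdk` and `windows` parameters are introduced here for the example, and normalizing backslashes to forward slashes is an assumption rather than documented SDK behavior.

```python
def dashscope_file_url(abs_path, sdk="python", windows=False):
    """Build the file path string following the table's prefix rules (sketch)."""
    # Assumption: Windows paths are written with forward slashes, as in the table.
    path = abs_path.replace("\\", "/")
    if windows and sdk == "java":
        return "file:///" + path  # the Java SDK on Windows uses three slashes
    return "file://" + path  # every other combination uses two slashes


print(dashscope_file_url("/home/images/test.png"))
print(dashscope_file_url("D:/images/test.png", windows=True))
print(dashscope_file_url("D:/images/test.png", sdk="java", windows=True))
```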
Python
Sample code
from dashscope import MultiModalConversation


def call_with_local_file(local_path):
    # local_path must be an absolute path; see the table above for the prefix
    image_path = f"file://{local_path}"
    messages = [{'role': 'system',
                 'content': [{'text': 'You are a helpful assistant.'}]},
                {'role': 'user',
                 'content': [{'image': image_path},
                             {'text': '这是什么'}]}]  # "What is this?"
    response = MultiModalConversation.call(model='qwen-vl-plus', messages=messages)
    print(response)


if __name__ == '__main__':
    # Example absolute path on Linux or macOS
    call_with_local_file("/home/images/test.png")
Response
{
"status_code": 200,
"request_id": "65061d9a-d4d9-9d31-9caa-d1c5d9eb3d54",
"code": "",
"message": "",
"output": {
"text": null,
"finish_reason": null,
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "这是一只在天空中飞翔的鹰。它有着广阔的翅膀,正在翱翔于云层之间。这种鸟类通常被认为是力量、自由和雄心壮志的象征,在各种文化中有重要的地位。"
}
]
}
}
]
},
"usage": {
"input_tokens": 1254,
"output_tokens": 45,
"image_tokens": 1225
}
}
Java
Sample code
// Copyright (c) Alibaba, Inc. and its affiliates.
import java.util.Arrays;
import java.util.HashMap;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;

public class Main {
    public static void callWithLocalFile(String localPath)
            throws ApiException, NoApiKeyException, UploadFileException {
        // localPath must be an absolute path; on Windows the table above calls for the "file:///" prefix
        String filePath = "file://" + localPath;
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder().role(Role.USER.getValue())
                .content(Arrays.asList(new HashMap<String, Object>(){{put("image", filePath);}},
                        new HashMap<String, Object>(){{put("text", "这是什么?");}})).build();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                .model("qwen-vl-plus")
                .message(userMessage)
                .build();
        MultiModalConversationResult result = conv.call(param);
        System.out.println(JsonUtils.toJson(result));
    }

    public static void main(String[] args) {
        try {
            // Example absolute path on Linux or macOS
            callWithLocalFile("/home/images/test.png");
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}
Response
{
"requestId": "c1fde568-c7fe-951c-a7fd-0c356fe04c1d",
"usage": {
"input_tokens": 1255,
"output_tokens": 38
},
"output": {
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "这是一只在天空中飞翔的鹰。它有着广阔的翅膀,正在翱翔于云层之间。这种景象常常象征着自由、力量和勇气等正面意义。"
}
]
}
}
]
}
}
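Video understanding
The qwen-vl-max, qwen-vl-max-0809, and qwen-vl-plus-0809 models can understand video content. You can pass in a video file directly, or pass the video as a list of images. The following limits apply:
If you pass in a list of images, the list can contain at most 768 images.
If you pass in a video file:
  Video file size: at most 150 MB.
  Video file format: MP4, AVI, MKV, MOV, FLV, WMV, and others.
  Video duration: videos of 40 seconds or less give the best results.
  Video dimensions: no limit, but the video is resized to about 600k pixels, so larger dimensions do not improve understanding.
  Understanding the audio track of a video file is not currently supported.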
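The limits above can be checked on the client before a request is sent. The sketch below is illustrative only: `check_video_input` and its `size_bytes` parameter are hypothetical, and it assumes that "150MB" means 150 × 1024 × 1024 bytes.

```python
MAX_VIDEO_BYTES = 150 * 1024 * 1024  # assumption: "150MB" read as 150 MiB
MAX_FRAMES = 768  # maximum number of images in an image list


def check_video_input(video, size_bytes=None):
    """Return a list of limit violations for a "video" argument.

    `video` is either a single path or URL (video file) or a list of image URLs.
    `size_bytes` is the file size if the caller knows it (hypothetical parameter).
    """
    problems = []
    if isinstance(video, list):
        if len(video) > MAX_FRAMES:
            problems.append(f"{len(video)} images exceeds the limit of {MAX_FRAMES}")
    elif size_bytes is not None and size_bytes > MAX_VIDEO_BYTES:
        problems.append(f"{size_bytes} bytes exceeds the 150MB limit")
    return problems


print(check_video_input(["frame.jpg"] * 769))
print(check_video_input("clip.mp4", size_bytes=10 * 1024 * 1024))
```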
To pass in a local video file, use the dashscope Python SDK, version 1.20.7 or later. For the file path format, see the DashScope file path table in the section on using local files.
Python
from http import HTTPStatus
import dashscope


def simple_multimodal_conversation_call():
    """Simple single-round multimodal conversation call."""
    messages = [
        {
            "role": "user",
            "content": [
                # Pass the video as a file
                {"video": "https://cloud.video.taobao.com/vod/S8T54f_w1rkdfLdYjL3S5zKN9CrhkzuhRwOhF313tIQ.mp4"},
                # Or pass it as a list of images
                # {"video": [
                #     "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg",
                #     "https://dashscope.oss-cn-beijing.aliyuncs.com/images/tiger.png"
                # ]},
                {"text": "视频的内容是什么?"}  # "What is the content of the video?"
            ]
        }
    ]
    response = dashscope.MultiModalConversation.call(
        model='qwen-vl-max',
        messages=messages
    )
    if response.status_code == HTTPStatus.OK:
        print(response)
    else:
        print(response.code)  # The error code.
        print(response.message)  # The error message.


if __name__ == '__main__':
    simple_multimodal_conversation_call()
curl
curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
    "model": "qwen-vl-max",
    "input": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"video": "https://cloud.video.taobao.com/vod/S8T54f_w1rkdfLdYjL3S5zKN9CrhkzuhRwOhF313tIQ.mp4"},
                    {"text": "这是什么?"}
                ]
            }
        ]
    }
}'
Running the code above returns the following result.
{
"status_code": 200,
"request_id": "a6772f55-5509-9c2c-bcca-3b9132ed6f63",
"code": "",
"message": "",
"output": {
"text": null,
"finish_reason": null,
"choices": [
{
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": [
{
"text": "视频的内容是一个人使用阿里云的通义千问模型进行对话的演示。在视频中,用户向模型输入了“你好”作为问候语,模型回应了“你好!有什么我能为你效劳的吗?”这个演示展示了通义千问模型的对话功能,以及它如何与用户进行交互。"
}
]
}
}
]
},
"usage": {
"input_tokens": 5205,
"output_tokens": 69,
"video_tokens": 5180
}
}
Supported images
Image format | Content Type | File extension |
BMP | image/bmp | .bmp |
DIB | image/bmp | .dib |
ICNS | image/icns | .icns |
ICO | image/x-icon | .ico |
JPEG | image/jpeg | .jfif, .jpe, .jpeg, .jpg |
JPEG2000 | image/jp2 | .j2c, .j2k, .jp2, .jpc, .jpf, .jpx |
PNG | image/png | .apng, .png |
SGI | image/sgi | .bw, .rgb, .rgba, .sgi |
TIFF | image/tiff | .tif, .tiff |
WEBP | image/webp | .webp |
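The table can be turned into a simple lookup that rejects unsupported files before upload. The sketch below transcribes the rows above. `content_type_for` is a hypothetical helper name, and matching extensions case-insensitively is an assumption.

```python
import os

# (Content Type, extensions) rows transcribed from the table above.
_ROWS = [
    ("image/bmp", [".bmp", ".dib"]),
    ("image/icns", [".icns"]),
    ("image/x-icon", [".ico"]),
    ("image/jpeg", [".jfif", ".jpe", ".jpeg", ".jpg"]),
    ("image/jp2", [".j2c", ".j2k", ".jp2", ".jpc", ".jpf", ".jpx"]),
    ("image/png", [".apng", ".png"]),
    ("image/sgi", [".bw", ".rgb", ".rgba", ".sgi"]),
    ("image/tiff", [".tif", ".tiff"]),
    ("image/webp", [".webp"]),
]
EXT_TO_CONTENT_TYPE = {ext: ctype for ctype, exts in _ROWS for ext in exts}


def content_type_for(filename):
    """Return the table's Content Type for a file name, or None if unlisted."""
    ext = os.path.splitext(filename)[1].lower()  # assumption: case-insensitive match
    return EXT_TO_CONTENT_TYPE.get(ext)


print(content_type_for("Photo.JPG"))
print(content_type_for("clip.gif"))
```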
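Input images are subject to the following limits:
The image file size must not exceed 10 MB.
For the qwen-vl-max, qwen-vl-max-0809, and qwen-vl-plus-0809 models, a single image can have at most 12M pixels in total, which covers standard 4K images. For the qwen-vl-max-0201 and qwen-vl-plus models, a single image can have at most 1048576 pixels in total, which equals the pixel count of a 1024 x 1024 image.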
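The size and pixel limits above can be combined into one pre-flight check. This is a sketch under stated assumptions: "10MB" is read as 10 × 1024 × 1024 bytes, "12M" is read as 12,000,000 pixels, and the function names are hypothetical. The caller supplies the image width and height.

```python
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB
HIGH_RES_MODELS = {"qwen-vl-max", "qwen-vl-max-0809", "qwen-vl-plus-0809"}
LOW_RES_MODELS = {"qwen-vl-max-0201", "qwen-vl-plus"}


def max_pixels(model):
    """Return the per-image pixel limit for the models named in this section."""
    if model in HIGH_RES_MODELS:
        return 12_000_000  # assumption: "12M" read as 12,000,000 pixels
    if model in LOW_RES_MODELS:
        return 1_048_576  # 1024 * 1024
    raise ValueError(f"no pixel limit listed here for {model!r}")


def image_within_limits(model, width, height, size_bytes):
    """Check both the file size limit and the model's pixel limit."""
    return size_bytes <= MAX_IMAGE_BYTES and width * height <= max_pixels(model)


# A 3840 x 2160 (4K) image of about 5 MB
print(image_within_limits("qwen-vl-max", 3840, 2160, 5_000_000))
print(image_within_limits("qwen-vl-plus", 3840, 2160, 5_000_000))
```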
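FAQ
Can I delete images that I have uploaded?
Answer: You do not need to delete them manually. After the model finishes generating text, the Bailian server deletes the images automatically.
API reference
For the input and output parameters of the Qwen-VL model, see "Using Qwen through the API".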