Qwen-ASR API invocation methods and parameters (Alibaba Cloud Model Studio)

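This topic describes the input and output parameters of the Qwen-ASR models.

User guide: for model descriptions and selection guidance, see Audio file recognition - Qwen.

The Qwen3-ASR-Flash and Qwen Audio ASR models must be called synchronously; the Qwen3-ASR-Flash-Filetrans model must be called asynchronously. The two call modes differ in request body, response body, and workflow, so do not mix them.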
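The pairing of model and call mode can be captured in a small lookup so that a client never sends a synchronous request body to the asynchronous endpoint or the reverse. The following is a minimal sketch, not part of any SDK: the helper name `resolve_endpoint` and the region keys are hypothetical, and the URLs are the ones listed in the synchronous and asynchronous sections of this page. Whether every model is offered in every region is not verified here.

```python
# Hypothetical helper: maps a model name to its call mode and endpoint URL.
# The paths and base URLs are copied from this page; the function itself is an illustration.
SYNC_PATH = "/services/aigc/multimodal-generation/generation"
ASYNC_PATH = "/services/audio/asr/transcription"

REGION_BASE = {
    "beijing": "https://dashscope.aliyuncs.com/api/v1",
    "singapore": "https://dashscope-intl.aliyuncs.com/api/v1",
}

MODEL_MODE = {
    "qwen3-asr-flash": "sync",
    "qwen-audio-asr": "sync",
    "qwen3-asr-flash-filetrans": "async",
}


def resolve_endpoint(model, region="beijing"):
    """Return (call_mode, url) for a model; raises KeyError for unknown names."""
    mode = MODEL_MODE[model]
    path = SYNC_PATH if mode == "sync" else ASYNC_PATH
    return mode, REGION_BASE[region] + path


if __name__ == "__main__":
    print(resolve_endpoint("qwen3-asr-flash-filetrans", "singapore"))
```

Keeping the mode next to the URL makes it easy to branch on `"sync"` versus `"async"` before building the request body.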

Synchronous call

China (Beijing): POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

International (Singapore): POST https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation

Request body

Qwen3-ASR-Flash

The following examples recognize audio from a URL. For examples that recognize a local audio file, see Getting started.

cURL

# ======= Important =======
# The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation
# API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
# === Delete these comments before running ===

curl --location --request POST 'https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
    "model": "qwen3-asr-flash",
    "input": {
        "messages": [
            {
                "content": [
                    {
                        "text": ""
                    }
                ],
                "role": "system"
            },
            {
                "content": [
                    {
                        "audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                    }
                ],
                "role": "user"
            }
        ]
    },
    "parameters": {
        "asr_options": {
            "enable_itn": false
        }
    }
}'

Java

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.Constants;
import com.alibaba.dashscope.utils.JsonUtils;

public class Main {
    public static void simpleMultiModalConversationCall()
            throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder()
                .role(Role.USER.getValue())
                .content(Arrays.asList(
                        Collections.singletonMap("audio", "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3")))
                .build();

        MultiModalMessage sysMessage = MultiModalMessage.builder().role(Role.SYSTEM.getValue())
                // Set the Context for customized recognition here
                .content(Arrays.asList(Collections.singletonMap("text", "")))
                .build();

        Map<String, Object> asrOptions = new HashMap<>();
        asrOptions.put("enable_itn", false);
        // asrOptions.put("language", "zh"); // Optional. If the audio language is known, specify it to improve recognition accuracy

        MultiModalConversationParam param = MultiModalConversationParam.builder()
                // API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
                // If the environment variable is not set, replace the next line with your Model Studio API key: .apiKey("sk-xxx")
                .apiKey(System.getenv("DASHSCOPE_API_KEY"))
                .model("qwen3-asr-flash")
                // The system message goes first in the messages list
                .message(sysMessage)
                .message(userMessage)
                .parameter("asr_options", asrOptions)
                .build();
        MultiModalConversationResult result = conv.call(param);
        System.out.println(JsonUtils.toJson(result));
    }

    public static void main(String[] args) {
        try {
            // The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1
            Constants.baseHttpApiUrl = "https://dashscope.aliyuncs.com/api/v1";
            simpleMultiModalConversationCall();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

Python

import os
import dashscope

# The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1
dashscope.base_http_api_url = 'https://dashscope.aliyuncs.com/api/v1'

messages = [
    {"role": "system", "content": [{"text": ""}]},  # Set the Context for customized recognition
    {"role": "user", "content": [{"audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"}]}
]

response = dashscope.MultiModalConversation.call(
    # API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
    # If the environment variable is not set, replace the next line with your Model Studio API key: api_key = "sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    model="qwen3-asr-flash",
    messages=messages,
    result_format="message",
    asr_options={
        # "language": "zh",  # Optional. If the audio language is known, specify it to improve recognition accuracy
        "enable_itn": False
    }
)
print(response)

Qwen Audio ASR

The following examples recognize audio from a URL. For examples that recognize a local audio file, see Getting started.

cURL

curl -X POST https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H 'Content-Type: application/json' \
-d '{
    "model": "qwen-audio-asr",
    "input": {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"}
                ]
            }
        ]
    }
}'

Java

import java.util.Arrays;
import java.util.Collections;

import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversation;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationParam;
import com.alibaba.dashscope.aigc.multimodalconversation.MultiModalConversationResult;
import com.alibaba.dashscope.common.MultiModalMessage;
import com.alibaba.dashscope.common.Role;
import com.alibaba.dashscope.exception.ApiException;
import com.alibaba.dashscope.exception.NoApiKeyException;
import com.alibaba.dashscope.exception.UploadFileException;
import com.alibaba.dashscope.utils.JsonUtils;

public class Main {
    public static void simpleMultiModalConversationCall()
            throws ApiException, NoApiKeyException, UploadFileException {
        MultiModalConversation conv = new MultiModalConversation();
        MultiModalMessage userMessage = MultiModalMessage.builder()
                .role(Role.USER.getValue())
                .content(Arrays.asList(
                        Collections.singletonMap("audio", "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3")))
                .build();
        MultiModalConversationParam param = MultiModalConversationParam.builder()
                .model("qwen-audio-asr")
                .message(userMessage)
                .build();
        MultiModalConversationResult result = conv.call(param);
        System.out.println(JsonUtils.toJson(result));
    }

    public static void main(String[] args) {
        try {
            simpleMultiModalConversationCall();
        } catch (ApiException | NoApiKeyException | UploadFileException e) {
            System.out.println(e.getMessage());
        }
        System.exit(0);
    }
}

Python

import dashscope

messages = [{"role": "user", "content": [{"audio": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"}]}]
response = dashscope.MultiModalConversation.call(
    model="qwen-audio-asr",
    messages=messages,
    result_format="message")
print(response)
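Request parameters

model `string` (required)
Model name. Applies only to the Qwen3-ASR-Flash and Qwen Audio ASR models.

messages `array` (required)
Message list. When calling over HTTP, place messages inside the input object.
Message types:
    System Message `object` (optional)
    The goal or role of the model. If you set a system message, place it first in the messages list. Only Qwen3-ASR-Flash supports this message; Qwen Audio ASR does not.
    Properties:
        content `array` (required)
        Message content.
        Properties:
            text `string`
            Specifies the context (Context). Qwen3-ASR-Flash lets you supply reference information such as background text and entity word lists along with the audio, to obtain customized recognition results. Length limit: at most 10000 tokens. For details, see Context enhancement.
        role `string` (required)
        Fixed to `system`.
    User Message `object` (required)
    The message sent by the user to the model.
    Properties:
        content `array` (required)
        Content of the user message.
        Properties:
            audio `string` (required)
            The audio to recognize. For usage, see Getting started.
            Qwen3-ASR-Flash supports three input forms: a Base64-encoded file, the absolute path of a local file, and a publicly accessible URL of the file.
            Qwen Audio ASR supports two input forms: the absolute path of a local file and a publicly accessible URL of the file.
            With the SDK, if the recording is stored in Alibaba Cloud OSS, temporary URLs prefixed with `oss://` are not supported.
            With the RESTful API, if the recording is stored in Alibaba Cloud OSS, temporary URLs prefixed with `oss://` are supported, subject to the following.
            Important: a temporary URL is valid for 48 hours and cannot be used after it expires; do not use it in production. The file upload credential API is rate-limited to 100 QPS and cannot be scaled up; do not use it in production, high-concurrency, or stress-testing scenarios. For production, use stable storage such as Alibaba Cloud OSS so that files remain available long term and rate limiting is avoided.
        role `string` (required)
        Role of the user message. Fixed to `user`.

asr_options `object` (optional)
Specifies whether certain features are enabled. Only Qwen3-ASR-Flash supports this parameter; Qwen Audio ASR does not.
Properties:
    language `string` (optional, no default value)
    If the audio language is known, specify it with this parameter to improve recognition accuracy. Only one language can be specified. If the language is uncertain, or the audio mixes several languages (for example Chinese, English, Japanese, and Korean), do not set this parameter.
    Values: zh: Chinese (Mandarin, Sichuanese, Minnan, Wu); yue: Cantonese; en: English; ja: Japanese; de: German; ko: Korean; ru: Russian; fr: French; pt: Portuguese; ar: Arabic; it: Italian; es: Spanish; hi: Hindi; id: Indonesian; th: Thai; tr: Turkish; uk: Ukrainian; vi: Vietnamese
    enable_itn `boolean` (optional, default `false`)
    Whether to enable ITN (Inverse Text Normalization). This feature applies only to Chinese and English audio.
    Values: true: enabled; false: disabled.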
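The request parameters above can be assembled into the HTTP request body programmatically. The sketch below builds the JSON payload for Qwen3-ASR-Flash only; the function name `build_sync_payload` and its keyword arguments are hypothetical, while the field names (`messages`, `asr_options`, `language`, `enable_itn`) follow this page. It places the system message first, as required, and omits `language` when the caller does not know the audio language.

```python
import json


def build_sync_payload(audio_url, context="", language=None, enable_itn=False):
    """Build the request body for a synchronous qwen3-asr-flash call (illustrative helper)."""
    asr_options = {"enable_itn": enable_itn}
    if language is not None:
        # Only set this when the audio is known to be in a single language.
        asr_options["language"] = language
    return {
        "model": "qwen3-asr-flash",
        "input": {
            "messages": [
                # The system message carries the optional Context and must come first.
                {"role": "system", "content": [{"text": context}]},
                {"role": "user", "content": [{"audio": audio_url}]},
            ]
        },
        "parameters": {"asr_options": asr_options},
    }


if __name__ == "__main__":
    payload = build_sync_payload(
        "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3",
        language="zh",
    )
    print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of the POST request to the synchronous endpoint.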

Response body

Qwen3-ASR-Flash

{
    "output": {
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "annotations": [
                        {
                            "language": "zh",
                            "type": "audio_info",
                            "emotion": "neutral"
                        }
                    ],
                    "content": [
                        {
                            "text": "欢迎使用阿里云。"
                        }
                    ],
                    "role": "assistant"
                }
            }
        ]
    },
    "usage": {
        "input_tokens_details": {
            "text_tokens": 0
        },
        "output_tokens_details": {
            "text_tokens": 6
        },
        "seconds": 1
    },
    "request_id": "568e2bf0-d6f2-97f8-9f15-a57b11dc6977"
}

Qwen Audio ASR

{
    "status_code": 200,
    "request_id": "802e87ff-1875-99cd-96c0-16a50338836a",
    "code": "",
    "message": "",
    "output": {
        "text": null,
        "finish_reason": null,
        "choices": [
            {
                "finish_reason": "stop",
                "message": {
                    "role": "assistant",
                    "content": [
                        {
                            "text": "欢迎使用阿里云"
                        }
                    ]
                }
            }
        ]
    },
    "usage": {
        "input_tokens": 74,
        "output_tokens": 7,
        "audio_tokens": 46
    }
}
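To show how the recognized text and the audio annotations sit inside a synchronous response, here is a minimal sketch that reads them out of the parsed JSON body. The function name `extract_transcript` is hypothetical, and it assumes the response is a plain dictionary (for example from `json.loads`), not an SDK result object. Because Qwen Audio ASR responses carry no `annotations`, the language and emotion fall back to `None` in that case.

```python
def extract_transcript(response):
    """Return text, language and emotion from a parsed synchronous response (illustrative)."""
    message = response["output"]["choices"][0]["message"]
    # content is a list of parts; join every text part into one string.
    text = "".join(part.get("text", "") for part in message.get("content", []))
    # annotations is only present for Qwen3-ASR-Flash; pick the audio_info entry if any.
    info = next(
        (a for a in message.get("annotations", []) if a.get("type") == "audio_info"),
        {},
    )
    return {
        "text": text,
        "language": info.get("language"),
        "emotion": info.get("emotion"),
    }


if __name__ == "__main__":
    sample = {
        "output": {
            "choices": [
                {
                    "finish_reason": "stop",
                    "message": {
                        "annotations": [
                            {"language": "zh", "type": "audio_info", "emotion": "neutral"}
                        ],
                        "content": [{"text": "欢迎使用阿里云。"}],
                        "role": "assistant",
                    },
                }
            ]
        }
    }
    print(extract_transcript(sample))
```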
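Response parameters

request_id `string`
Unique identifier of this call. The Java SDK returns this parameter as requestId.

output `object`
Call result information.
Properties:
    choices `array`
    Output of the model. Returned when result_format is message.
    Properties:
        finish_reason `string`
        One of three values: null while generation is in progress; stop when the model output ended naturally or a stop condition in the input parameters was triggered; length when generation ended because the output was too long.
        message `object`
        Message object output by the model.
        Properties:
            role `string`
            Role of the output message. Fixed to assistant.
            content `array`
            Content of the output message.
            Properties:
                text `string`
                Speech recognition result.
            annotations `array`
            Output annotation information (such as the language).
            Properties:
                language `string`
                Language of the recognized audio. If the request parameter `language` was set, this value matches it.
                Possible values: zh: Chinese (Mandarin, Sichuanese, Minnan, Wu); yue: Cantonese; en: English; ja: Japanese; de: German; ko: Korean; ru: Russian; fr: French; pt: Portuguese; ar: Arabic; it: Italian; es: Spanish; hi: Hindi; id: Indonesian; th: Thai; tr: Turkish; uk: Ukrainian; vi: Vietnamese
                type `string`
                Fixed to `audio_info`, indicating audio information.
                emotion `string`
                Emotion of the recognized audio. Supported values: `surprised`, `neutral`, `happy`, `sad`, `disgusted`, `angry`, `fearful`.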
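usage `object`
Token information for this request.
Properties:
    input_tokens_details `object`
    Qwen3-ASR-Flash input length, in tokens.
    Properties:
        text_tokens `integer`
        Length of the text supplied to Qwen3-ASR-Flash through context enhancement, in tokens. The upper limit is 10000 tokens.
    output_tokens_details `object`
    Qwen3-ASR-Flash output length, in tokens.
    Properties:
        text_tokens `integer`
        Length of the recognition result text output by Qwen3-ASR-Flash, in tokens.
    seconds `integer`
    Qwen3-ASR-Flash audio duration, in seconds.
    input_tokens `integer`
    Qwen Audio ASR input audio length, in tokens. Conversion rule: each second of audio is converted to 25 tokens, and a fraction of a second counts as one second.
    output_tokens `integer`
    Length of the recognition result text output by Qwen Audio ASR, in tokens.
    audio_tokens `integer`
    Qwen Audio ASR audio length, in tokens. Conversion rule: each second of audio is converted to 25 tokens, and a fraction of a second counts as one second.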
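The audio-to-token rule stated for Qwen Audio ASR (25 tokens per second, with any fraction of a second rounded up to a full second) can be written as a one-line calculation. This is only a sketch of that stated rule for estimating usage in advance; the function name `estimate_audio_tokens` is hypothetical, and the `usage` object returned by the service remains the authoritative figure.

```python
import math

TOKENS_PER_SECOND = 25  # conversion rate stated on this page for Qwen Audio ASR


def estimate_audio_tokens(duration_seconds):
    """Estimate audio tokens: round the duration up to whole seconds, then multiply by 25."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return math.ceil(duration_seconds) * TOKENS_PER_SECOND


if __name__ == "__main__":
    for seconds in (0.4, 1.0, 1.2, 3.0):
        print(seconds, estimate_audio_tokens(seconds))
```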

Asynchronous call

Submit a task

China (Beijing): POST https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription

International (Singapore): POST https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription

Request body

cURL

# ======= Important =======
# The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription
# API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
# === Delete these comments before running ===

curl --location --request POST 'https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "Content-Type: application/json" \
--header "X-DashScope-Async: enable" \
--data '{
    "model": "qwen3-asr-flash-filetrans",
    "input": {
        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    },
    "parameters": {
        "channel_id": [
            0
        ],
        "enable_itn": false
    }
}'

Java

import com.google.gson.Gson;
import com.google.gson.annotations.SerializedName;
import okhttp3.*;

import java.io.IOException;

public class Main {
    // The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription
    private static final String API_URL = "https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription";

    public static void main(String[] args) {
        // API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        OkHttpClient client = new OkHttpClient();
        Gson gson = new Gson();

        /* Variant with the optional language and corpus parameters:
        String payloadJson = """
                {
                    "model": "qwen3-asr-flash-filetrans",
                    "input": {
                        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                    },
                    "parameters": {
                        "channel_id": [0],
                        "enable_itn": false,
                        "language": "zh",
                        "corpus": {
                            "text": ""
                        }
                    }
                }
                """;*/
        String payloadJson = """
                {
                    "model": "qwen3-asr-flash-filetrans",
                    "input": {
                        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
                    },
                    "parameters": {
                        "channel_id": [0],
                        "enable_itn": false
                    }
                }
                """;

        RequestBody body = RequestBody.create(payloadJson, MediaType.get("application/json; charset=utf-8"));
        Request request = new Request.Builder()
                .url(API_URL)
                .addHeader("Authorization", "Bearer " + apiKey)
                .addHeader("Content-Type", "application/json")
                .addHeader("X-DashScope-Async", "enable")
                .post(body)
                .build();

        try (Response response = client.newCall(request).execute()) {
            if (response.isSuccessful() && response.body() != null) {
                String respBody = response.body().string();
                // Parse the JSON with Gson
                ApiResponse apiResp = gson.fromJson(respBody, ApiResponse.class);
                if (apiResp.output != null) {
                    System.out.println("task_id: " + apiResp.output.taskId);
                } else {
                    System.out.println(respBody);
                }
            } else {
                System.out.println("task failed! HTTP code: " + response.code());
                if (response.body() != null) {
                    System.out.println(response.body().string());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    static class ApiResponse {
        @SerializedName("request_id")
        String requestId;

        Output output;
    }

    static class Output {
        @SerializedName("task_id")
        String taskId;

        @SerializedName("task_status")
        String taskStatus;
    }
}

Python

import requests
import json
import os

# The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1/services/audio/asr/transcription
url = "https://dashscope.aliyuncs.com/api/v1/services/audio/asr/transcription"

# API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
# If the environment variable is not set, replace the next line with your Model Studio API key: DASHSCOPE_API_KEY = "sk-xxx"
DASHSCOPE_API_KEY = os.getenv("DASHSCOPE_API_KEY")

headers = {
    "Authorization": f"Bearer {DASHSCOPE_API_KEY}",
    "Content-Type": "application/json",
    "X-DashScope-Async": "enable"
}

payload = {
    "model": "qwen3-asr-flash-filetrans",
    "input": {
        "file_url": "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3"
    },
    "parameters": {
        "channel_id": [0],
        # "language": "zh",
        "enable_itn": False
        # "corpus": {
        #     "text": ""
        # }
    }
}

response = requests.post(url, headers=headers, data=json.dumps(payload))
if response.status_code == 200:
    print(f"task_id: {response.json()['output']['task_id']}")
else:
    print("task failed!")
    print(response.json())
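Request parameters

model `string` (required)
Model name. Applies only to the Qwen3-ASR-Flash-Filetrans model.

input `object` (required)
Properties:
    file_url `string` (required)
    URL of the audio file to recognize. The URL must be publicly accessible.
    With the RESTful API, if the recording is stored in Alibaba Cloud OSS, temporary URLs prefixed with `oss://` are supported, subject to the following.
    Important: a temporary URL is valid for 48 hours and cannot be used after it expires; do not use it in production. The file upload credential API is rate-limited to 100 QPS and cannot be scaled up; do not use it in production, high-concurrency, or stress-testing scenarios. For production, use stable storage such as Alibaba Cloud OSS so that files remain available long term and rate limiting is avoided.

parameters `object` (optional)
Properties:
    language `string` (optional, no default value)
    If the audio language is known, specify it with this parameter to improve recognition accuracy. Only one language can be specified. If the language is uncertain, or the audio mixes several languages (for example Chinese, English, Japanese, and Korean), do not set this parameter.
    Values: zh: Chinese (Mandarin, Sichuanese, Minnan, Wu); yue: Cantonese; en: English; ja: Japanese; de: German; ko: Korean; ru: Russian; fr: French; pt: Portuguese; ar: Arabic; it: Italian; es: Spanish; hi: Hindi; id: Indonesian; th: Thai; tr: Turkish; uk: Ukrainian; vi: Vietnamese
    enable_itn `boolean` (optional, default `false`)
    Whether to enable ITN (Inverse Text Normalization). This feature applies only to Chinese and English audio.
    Values: true: enabled; false: disabled.
    text `string`
    Specifies the context (Context). In the request examples, this field is nested inside a `corpus` object under parameters. Qwen3-ASR-Flash lets you supply reference information such as background text and entity word lists along with the audio, to obtain customized recognition results. Length limit: at most 10000 tokens. For details, see Context enhancement.
    channel_id `array` (optional, default `[0]`)
    Indexes of the audio tracks to recognize in a multi-track file. For example, [0] recognizes only the first track, and [0, 1] recognizes both the first and second tracks.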
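The parameters above can be combined into the headers and body of a task submission. The sketch below is illustrative only: the function name `build_filetrans_request` and its arguments are hypothetical, and nesting the context under `corpus.text` is an assumption taken from the commented-out lines in the request examples on this page. Optional fields are left out entirely when not provided, so the service defaults apply.

```python
def build_filetrans_request(file_url, api_key, channel_ids=(0,), language=None,
                            enable_itn=False, context=None):
    """Return (headers, payload) for submitting a qwen3-asr-flash-filetrans task (illustrative)."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # This header marks the request as asynchronous.
        "X-DashScope-Async": "enable",
    }
    parameters = {"channel_id": list(channel_ids), "enable_itn": enable_itn}
    if language is not None:
        parameters["language"] = language
    if context:
        # Assumption: the context text is nested under "corpus", as in the example comments.
        parameters["corpus"] = {"text": context}
    payload = {
        "model": "qwen3-asr-flash-filetrans",
        "input": {"file_url": file_url},
        "parameters": parameters,
    }
    return headers, payload


if __name__ == "__main__":
    headers, payload = build_filetrans_request(
        "https://dashscope.oss-cn-beijing.aliyuncs.com/audios/welcome.mp3",
        api_key="sk-xxx",
        channel_ids=(0, 1),
    )
    print(payload)
```

The returned pair can be passed to an HTTP client, for example `requests.post(url, headers=headers, json=payload)`.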

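Response body

{
    "request_id": "92e3decd-0c69-47a8-**********",
    "output": {
        "task_id": "8fab76d0-0eed-4d20-**********",
        "task_status": "PENDING"
    }
}

Response parameters

request_id `string`
Unique identifier of this call.

output `object`
Call result information.
Properties:
    task_id `string`
    Task ID. Pass this ID as the request parameter when querying the speech recognition task.
    task_status `string`
    Task status:
        PENDING: the task is queued
        RUNNING: the task is being processed
        SUCCEEDED: the task completed successfully
        FAILED: the task failed
        UNKNOWN: the task does not exist or its status is unknown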
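Because a submitted task starts in PENDING and moves through RUNNING before finishing, a client normally polls until a final status appears. The following is a minimal polling sketch under stated assumptions: `wait_for_task`, its parameters, and the choice to treat UNKNOWN as a final state are all hypothetical, and `fetch_status` stands for any callable that takes a task ID and returns the parsed JSON of the task query request.

```python
import time

# Assumption: these statuses end polling; PENDING and RUNNING mean "keep waiting".
FINAL_STATUSES = {"SUCCEEDED", "FAILED", "UNKNOWN"}


def wait_for_task(fetch_status, task_id, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll fetch_status(task_id) until the task reaches a final status (illustrative)."""
    for _ in range(max_attempts):
        result = fetch_status(task_id)
        if result["output"]["task_status"] in FINAL_STATUSES:
            return result
        sleep(interval)  # wait before asking again
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated status sequence in place of real HTTP calls.
    states = iter(["PENDING", "RUNNING", "SUCCEEDED"])

    def fake_fetch(task_id):
        return {"output": {"task_id": task_id, "task_status": next(states)}}

    print(wait_for_task(fake_fetch, "demo-task", sleep=lambda s: None))
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP library and easy to exercise without network access.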

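Get the task result

China (Beijing): GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}

International (Singapore): GET https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}

Request body

Pass the task_id returned by the task submission as the parameter to query the speech recognition result.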

cURL

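# ======= Important =======
# The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}. Replace {task_id} with the ID of the task to query.
# API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
# === Delete these comments before running ===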

curl --location --request GET 'https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}' \
--header "Authorization: Bearer $DASHSCOPE_API_KEY" \
--header "X-DashScope-Async: enable" \
--header "Content-Type: application/json"

Java

import okhttp3.*;

import java.io.IOException;

public class Main {
    public static void main(String[] args) {
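        // Replace with the actual task_id
        String taskId = "xxx";
        // API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
        // If the environment variable is not set, replace the next line with your Model Studio API key: String apiKey = "sk-xxx"
        String apiKey = System.getenv("DASHSCOPE_API_KEY");

        // The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}. Replace {task_id} with the ID of the task to query.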
        String apiUrl = "https://dashscope.aliyuncs.com/api/v1/tasks/" + taskId;

        OkHttpClient client = new OkHttpClient();

        Request request = new Request.Builder()
                .url(apiUrl)
                .addHeader("Authorization", "Bearer " + apiKey)
                .addHeader("X-DashScope-Async", "enable")
                .addHeader("Content-Type", "application/json")
                .get()
                .build();

        try (Response response = client.newCall(request).execute()) {
            if (response.body() != null) {
                System.out.println(response.body().string());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Python

import os
import requests


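# API keys differ between the Singapore and Beijing regions. To get an API key, see https://help.aliyun.com/zh/model-studio/get-api-key
# If the environment variable is not set, replace the next line with your Model Studio API key: DASHSCOPE_API_KEY = "sk-xxx"
DASHSCOPE_API_KEY = os.getenv("DASHSCOPE_API_KEY")

# Replace with the actual task_id
task_id = "xxx"
# The URL below is for the Beijing region. For models in the Singapore region, replace it with: https://dashscope-intl.aliyuncs.com/api/v1/tasks/{task_id}. Replace {task_id} with the ID of the task to query.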
url = f"https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}"

headers = {
    "Authorization": f"Bearer {DASHSCOPE_API_KEY}",
    "X-DashScope-Async": "enable",
    "Content-Type": "application/json"
}

response = requests.get(url, headers=headers)
print(response.json())

Response body

RUNNING

{
    "request_id": "6769df07-2768-4fb0-ad59-**********",
    "output": {
        "task_id": "9be1700a-0f8e-4778-be74-********",
        "task_status": "RUNNING",
        "submit_time": "2025-10-27 14:19:31.150",
        "scheduled_time": "2025-10-27 14:19:31.233",
        "task_metrics": {
            "TOTAL": 1,
            "SUCCEEDED": 0,
            "FAILED": 0
        }
    }
}

SUCCEEDED

{
    "request_id": "1dca6c0a-0ed1-4662-aa39-********",
    "output": {
        "task_id": "8fab76d0-0eed-4d20-929f-********",
        "task_status": "SUCCEEDED",
        "submit_time": "2025-10-27 13:57:45.948",
        "scheduled_time": "2025-10-27 13:57:46.018",
        "end_time": "2025-10-27 13:57:47.079",
        "result": {
            "transcription_url": "http://dashscope-result-bj.oss-cn-beijing.aliyuncs.com/pre/pre-funasr-mlt-v1/20251027/13%3A57/7a3a8236-ffd1-4099-a280-0299686ac7da.json?Expires=1761631066&OSSAccessKeyId=LTAI**********&Signature=1lKv4RgyWCarRuUdIiErOeOBnwM%3D&response-content-disposition=attachment%3Bfilename%3D7a3a8236-ffd1-4099-a280-0299686ac7da.json"
        }
    },
    "usage": {
        "seconds": 3
    }
}

FAILED

{
    "request_id": "3d141841-858a-466a-9ff9-********",
    "output": {
        "task_id": "c58c7951-7789-4557-9ea3-**********",
        "task_status": "FAILED",
        "submit_time": "2025-10-27 15:06:06.915",
        "scheduled_time": "2025-10-27 15:06:06.967",
        "end_time": "2025-10-27 15:06:07.584",
        "code": "FILE_403_FORBIDDEN",
        "message": "FILE_403_FORBIDDEN"
    }
}
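Response parameters

request_id `string`
Unique identifier of this call.

output `object`
Call result information.
Properties:
    task_id `string`
    Task ID. Pass this ID as the request parameter when querying the speech recognition task.
    task_status `string`
    Task status:
        PENDING: the task is queued
        RUNNING: the task is being processed
        SUCCEEDED: the task completed successfully
        FAILED: the task failed
        UNKNOWN: the task does not exist or its status is unknown
    result `object`
    Speech recognition result.
    Properties:
        transcription_url `string`
        Download URL of the recognition result file. The link is valid for 24 hours. After it expires, the task can no longer be queried and the result can no longer be downloaded from the earlier URL. The result is saved as a JSON file; download it from the link, or read its content directly with an HTTP request. For details, see Recognition result description.
    submit_time `string`
    Time the task was submitted.
    scheduled_time `string`
    Time the task was scheduled, that is, when execution started.
    end_time `string`
    Time the task ended.
    task_metrics `object`
    Task metrics, containing statistics on subtask status.
    Properties:
        TOTAL `integer`
        Total number of subtasks.
        SUCCEEDED `integer`
        Number of subtasks that succeeded.
        FAILED `integer`
        Number of subtasks that failed.
    code `string`
    Error code. Returned only when the task fails.
    message `string`
    Error message. Returned only when the task fails.

usage `object`
Usage information for this request.
Properties:
    seconds `integer`
    Qwen3-ASR-Flash audio duration, in seconds.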
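The three response shapes above differ in which fields are present: `result.transcription_url` appears only on success, and `code` and `message` only on failure. A minimal sketch of handling them follows; the names `TranscriptionFailed` and `transcription_url_from` are hypothetical, and the input is assumed to be the parsed JSON of a task query response.

```python
class TranscriptionFailed(Exception):
    """Raised when the queried task reports FAILED (illustrative exception type)."""


def transcription_url_from(result):
    """Return the result file URL on success, None while unfinished, raise on failure."""
    output = result["output"]
    status = output["task_status"]
    if status == "SUCCEEDED":
        return output["result"]["transcription_url"]
    if status == "FAILED":
        # code and message are only present on failed tasks.
        raise TranscriptionFailed(f"{output.get('code')}: {output.get('message')}")
    return None  # PENDING, RUNNING or UNKNOWN: no result file yet


if __name__ == "__main__":
    running = {"output": {"task_id": "demo", "task_status": "RUNNING"}}
    print(transcription_url_from(running))
```

Since the download link expires after 24 hours, a client that needs the transcript later would fetch and store the JSON file soon after the task succeeds.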

Recognition result description

{
    "file_url": "https://***.wav",
    "audio_info": {
        "format": "wav",
        "sample_rate": 16000
    },
    "transcripts": [
        {
            "channel_id": 0,
            "text": "今天天气还行吧。",
            "sentences": [
                {
                    "begin_time": 100,
                    "end_time": 3820,
                    "text": "今天天气还行吧。",
                    "sentence_id": 0,
                    "language": "zh",
                    "emotion": "neutral"
                }
            ]
        }
    ]
}
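file_url `string`
URL of the recognized audio file.

audio_info `object`
Information about the recognized audio file.
Properties:
    format `string`
    Audio format.
    sample_rate `integer`
    Audio sample rate.

transcripts `array`
Complete list of recognition results. Each element holds the recognition content of one audio track.
Properties:
    channel_id `integer`
    Audio track index, starting from 0.
    text `string`
    Recognition result text.
    sentences `array`
    List of sentence-level recognition results.
    Properties:
        begin_time `integer`
        Start timestamp of the sentence, in milliseconds.
        end_time `integer`
        End timestamp of the sentence, in milliseconds.
        text `string`
        Recognition result text.
        sentence_id `integer`
        Sentence index, starting from 0.
        language `string`
        Language of the recognized audio. If the request parameter `language` was set, this value matches it.
        Possible values: zh: Chinese (Mandarin, Sichuanese, Minnan, Wu); yue: Cantonese; en: English; ja: Japanese; de: German; ko: Korean; ru: Russian; fr: French; pt: Portuguese; ar: Arabic; it: Italian; es: Spanish; hi: Hindi; id: Indonesian; th: Thai; tr: Turkish; uk: Ukrainian; vi: Vietnamese
        emotion `string`
        Emotion of the recognized audio. Supported values: `surprised`, `neutral`, `happy`, `sad`, `disgusted`, `angry`, `fearful`.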
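As an illustration of the result file structure, the sketch below walks `transcripts` and their `sentences` and renders each sentence with its track index and millisecond timestamps converted to `HH:MM:SS.mmm`. The function names `format_ms` and `sentence_lines` and the output format are hypothetical choices; the input is assumed to be the parsed JSON of the downloaded result file.

```python
def format_ms(ms):
    """Convert a millisecond offset into HH:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def sentence_lines(result):
    """Return one formatted line per sentence, across all audio tracks (illustrative)."""
    lines = []
    for transcript in result["transcripts"]:
        channel = transcript["channel_id"]
        for sentence in transcript["sentences"]:
            start = format_ms(sentence["begin_time"])
            end = format_ms(sentence["end_time"])
            lines.append(f"[ch{channel} {start} - {end}] {sentence['text']}")
    return lines


if __name__ == "__main__":
    sample = {
        "transcripts": [
            {
                "channel_id": 0,
                "text": "今天天气还行吧。",
                "sentences": [
                    {"begin_time": 100, "end_time": 3820, "text": "今天天气还行吧。",
                     "sentence_id": 0, "language": "zh", "emotion": "neutral"}
                ],
            }
        ]
    }
    for line in sentence_lines(sample):
        print(line)
```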

