Access the Paraformer real-time speech recognition service over a WebSocket connection. This topic describes the service endpoints, request headers, and interaction flow.
User guide: For model overviews and selection guidance, see Speech-to-text. For sample code, see Real-time speech recognition.
The DashScope SDK currently supports only Java and Python. For other languages, connect to the service directly over WebSocket.
Service endpoints
Paraformer is available only in the China (Beijing) region. The WebSocket URL is fixed:
wss://dashscope.aliyuncs.com/api-ws/v1/inference
The URL must use the wss:// protocol. Provide your API key in the Authorization request header (see Request headers).
Request headers
Add the following fields to the request header:
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Authorization | string | Yes | Authentication token in the format |
| user-agent | string | No | Client identifier. Helps the server identify the source of incoming requests. |
| X-DashScope-WorkSpace | string | No | Alibaba Cloud Model Studio workspace ID. |
| X-DashScope-DataInspection | string | No | Whether to enable data inspection. Omit this header by default; set it to |
The Authorization header is verified during the WebSocket handshake. If the API key is invalid or missing, the handshake fails with an HTTP 401 or 403 error.
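The following is a minimal sketch of how a client might assemble the handshake headers described above. The function and parameter names (`check_endpoint`, `build_handshake_headers`, `authorization`, and so on) are hypothetical and not part of any SDK. The sketch assumes the caller supplies the already formatted Authorization value and data inspection value, and it leaves out optional headers that are not set.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

# Fixed endpoint taken from the "Service endpoints" section.
WS_URL = "wss://dashscope.aliyuncs.com/api-ws/v1/inference"


def check_endpoint(url: str) -> str:
    """Return the URL unchanged if it uses the wss:// scheme, else raise."""
    if urlsplit(url).scheme != "wss":
        raise ValueError("the endpoint must use the wss:// protocol")
    return url


def build_handshake_headers(
    authorization: str,
    user_agent: Optional[str] = None,
    workspace_id: Optional[str] = None,
    data_inspection: Optional[str] = None,
) -> Dict[str, str]:
    """Build the request headers for the WebSocket handshake.

    `authorization` is the complete header value, formatted by the caller
    (assumption: this helper does not format the API key itself).
    """
    if not authorization:
        # The server rejects the handshake (HTTP 401 or 403) without it.
        raise ValueError("the Authorization header is required")

    headers = {"Authorization": authorization}
    # Optional headers are added only when the caller provides a value.
    if user_agent:
        headers["user-agent"] = user_agent
    if workspace_id:
        headers["X-DashScope-WorkSpace"] = workspace_id
    if data_inspection:
        headers["X-DashScope-DataInspection"] = data_inspection
    return headers
```

The resulting dictionary can be passed to whichever WebSocket client library you use. The name of the parameter that accepts extra handshake headers differs between libraries, so check the documentation of the one you choose.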
Interaction flow
For details about client-side and server-side events, see Client events and Server-sent events.
The client and server interact in the following sequence:
- Establish the connection: The client opens a WebSocket connection to the server.
- Start the task: The client sends a run-task instruction. The server replies with a task-started event to confirm that the task has started, after which subsequent steps can proceed.
- Send the audio stream: The client streams mono binary audio. The server returns result-generated events containing the recognition results.
- Notify the server to end the task: The client sends a finish-task instruction and continues to receive result-generated events from the server.
- End the task: The client receives a task-finished event from the server, indicating that the task has ended.
- Close the connection: The client closes the WebSocket connection.
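The ordering rules in the sequence above can be sketched as a small state tracker. This is an illustration only: it performs no network I/O, the `Phase` and `RecognitionSession` names are hypothetical, and it models just the successful path (a real client may also need to close the connection when an error occurs). The message format of the instructions and events is not modeled here; see Client events and Server-sent events.

```python
from enum import Enum, auto
from typing import Any, List


class Phase(Enum):
    CONNECTED = auto()   # WebSocket connection is open
    STARTING = auto()    # run-task sent, waiting for task-started
    STREAMING = auto()   # task-started received, audio may be sent
    FINISHING = auto()   # finish-task sent, waiting for task-finished
    FINISHED = auto()    # task-finished received
    CLOSED = auto()      # connection closed


class RecognitionSession:
    """Tracks the client/server sequence and rejects out-of-order steps."""

    def __init__(self) -> None:
        self.phase = Phase.CONNECTED
        self.results: List[Any] = []

    def _require(self, *allowed: Phase) -> None:
        if self.phase not in allowed:
            raise RuntimeError(f"step not allowed in phase {self.phase.name}")

    def send_run_task(self) -> None:
        self._require(Phase.CONNECTED)
        self.phase = Phase.STARTING

    def send_audio(self, chunk: bytes) -> None:
        # Audio is only valid after the server has confirmed the task.
        self._require(Phase.STREAMING)
        if not isinstance(chunk, (bytes, bytearray)):
            raise TypeError("audio must be sent as binary data")

    def send_finish_task(self) -> None:
        self._require(Phase.STREAMING)
        self.phase = Phase.FINISHING

    def on_server_event(self, event: str, payload: Any = None) -> None:
        if event == "task-started":
            self._require(Phase.STARTING)
            self.phase = Phase.STREAMING
        elif event == "result-generated":
            # Results can still arrive after finish-task has been sent.
            self._require(Phase.STREAMING, Phase.FINISHING)
            self.results.append(payload)
        elif event == "task-finished":
            self._require(Phase.FINISHING)
            self.phase = Phase.FINISHED
        else:
            raise ValueError(f"unexpected event: {event}")

    def close(self) -> None:
        self._require(Phase.FINISHED)
        self.phase = Phase.CLOSED
```

A client built around this tracker would call `send_run_task()` right after connecting, wait for `task-started` before calling `send_audio()`, and keep feeding incoming events to `on_server_event()` after `send_finish_task()` until `task-finished` arrives.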