Enhanced audio moderation API

This topic describes how to use the Audio Moderation API for asynchronous audio file moderation and real-time audio stream moderation.

Usage

The service endpoint is https://green-cip.{region}.aliyuncs.com.

Call this API to create audio content moderation tasks. For details on constructing HTTP requests, see Make a request by using HTTP. You can also use pre-configured requests. For more information, see SDKs and best practices for the enhanced audio moderation feature of Content Moderation V2.0.

  • API operations:

    • Submit moderation task: VoiceModeration

    • Query moderation task result: VoiceModerationResult

    • Cancel moderation task: VoiceModerationCancel

  • Supported regions and endpoints:

    Region

    Public endpoint

    Internal endpoint

    Supported services

    China (Shanghai)

    green-cip.cn-shanghai.aliyuncs.com

    green-cip-vpc.cn-shanghai.aliyuncs.com

    audio_media_detection, audio_media_detection_pro, live_stream_detection, live_stream_detection_pro, and voice_aigc_detector

    China (Beijing)

    green-cip.cn-beijing.aliyuncs.com

    green-cip-vpc.cn-beijing.aliyuncs.com

    audio_media_detection and live_stream_detection

    China (Hangzhou)

    green-cip.cn-hangzhou.aliyuncs.com

    green-cip-vpc.cn-hangzhou.aliyuncs.com

    China (Shenzhen)

    green-cip.cn-shenzhen.aliyuncs.com

    green-cip-vpc.cn-shenzhen.aliyuncs.com

  • Billing:

    This is a paid service. Billing applies only to requests that return an HTTP 200 status code. Requests that return any other status code are not billed. For billing details, see Billing.

  • Performance:

    | Performance metric | Description |
    | --- | --- |
    | Audio file size | The maximum size for an audio file is 500 MB. |
    | Audio file format | Supported audio formats: MP3, WAV, AAC, WMA, OGG, M4A, and AMR. Supported video formats: AVI, FLV, MP4, MPG, ASF, WMV, MOV, RMVB, and RM. |
    | Audio live stream | Supported protocols: RTMP, HLS, HTTP-FLV, and RTSP. |
    | Requests per second (QPS) | The QPS limit for task submission is 100. |
    | Concurrent streams | The default limit for concurrent streams is 50. |
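The endpoint pattern and file limits listed above can be checked locally before a task is submitted. The following is a minimal sketch, not part of any SDK: the helper names `build_endpoint` and `check_media_file` are hypothetical, the 500 MB limit is assumed to mean 500 × 1024 × 1024 bytes, and the internal endpoint is assumed to use HTTPS like the public one.

```python
import os

# Regions and formats copied from the tables above.
SUPPORTED_REGIONS = {"cn-shanghai", "cn-beijing", "cn-hangzhou", "cn-shenzhen"}
AUDIO_FORMATS = {"mp3", "wav", "aac", "wma", "ogg", "m4a", "amr"}
VIDEO_FORMATS = {"avi", "flv", "mp4", "mpg", "asf", "wmv", "mov", "rmvb", "rm"}
MAX_FILE_BYTES = 500 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def build_endpoint(region, internal=False):
    """Return the service endpoint for a region (hypothetical helper)."""
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported region: {region}")
    prefix = "green-cip-vpc" if internal else "green-cip"
    return f"https://{prefix}.{region}.aliyuncs.com"


def check_media_file(name, size_bytes):
    """Return a list of problems; an empty list means the local checks passed."""
    problems = []
    ext = os.path.splitext(name)[1].lstrip(".").lower()
    if ext not in AUDIO_FORMATS | VIDEO_FORMATS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 500 MB")
    return problems
```

A check like this only avoids obviously invalid submissions; the service remains the authority on what it accepts.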

Submit a moderation task

Request parameters

Parameter

Type

Required

Example

Description

Service

String

Yes

live_stream_detection

The type of moderation service. Valid values:

  • audio_media_detection: Audio and video media detection

  • audio_media_detection_pro: Audio and video media detection (Pro)

  • live_stream_detection: Social entertainment live stream detection

  • live_stream_detection_pro: Social entertainment live stream detection (Pro)

  • voice_aigc_detector: AI-generated content (AIGC) voice detection

ServiceParameters

JSONString

Yes

The parameters required by the moderation service. This value must be a JSON string. For more information about the parameters, see ServiceParameters.

Table 1. ServiceParameters

Parameter

Type

Required

Example

Description

url

String

Conditional. Specify the audio file in one of three ways:

http://aliyundoc.com/test.flv

The URL of the object to moderate. The URL must be a publicly accessible HTTP or HTTPS URL.

ossBucketName

String

bucket_01

The name of the authorized OSS bucket.

Note

To use the internal endpoint of an OSS object, you must use your root account to grant permissions on the Cloud Resource Access Authorization page.

ossObjectName

String

20240307/07/28/test.flv

The name of the file in the authorized OSS bucket.

ossRegionId

String

cn-shanghai

The region where the OSS bucket is located.

callback

String

No

http://aliyundoc.com

The URL to which moderation results are sent. This can be an HTTP or HTTPS endpoint. If you do not specify this parameter, you must poll for the results.

Your callback endpoint must accept POST requests with UTF-8 encoded data and read the form parameters checksum and content.

Content Moderation populates checksum and content as described below, and then calls your callback endpoint to deliver the moderation results.

  • checksum: A string generated by concatenating user ID + seed + content and then applying the SHA256 algorithm. The user ID is your Alibaba Cloud account ID, which you can find in the Alibaba Cloud Console. To prevent tampering, when you receive a push result, you can generate a string by using the same algorithm and verify it against the checksum.

    Note

    The UID must belong to your Alibaba Cloud root account, not a RAM user.

  • content: A string in JSON format. You must parse the string to convert it into a JSON object. For an example of the content result, see the sample response for querying detection results.

Note

After your callback endpoint receives the result from Content Moderation, it must return an HTTP 200 status code to confirm receipt. Any other HTTP status code is considered a failure. On failure, Content Moderation retries up to 16 times. If all 16 retries fail, the service stops sending the result. We recommend that you check the status of your callback endpoint.

seed

String

Conditional

abc****

A random string used to generate the callback signature.

The string can contain letters, digits, and underscores (_), and must not exceed 64 characters. You can define this value to verify that the callback notification originates from Alibaba Cloud Content Moderation.

Note

This parameter is required when callback is specified.

cryptType

String

No

SHA256

When callback is specified, this parameter sets the hash algorithm that Content Moderation uses to compute the callback signature (checksum). Valid values:

  • SHA256 (default): Uses the SHA-256 hash algorithm.

  • SM3: Uses the SM3 (ShangMi 3) hash algorithm. The result is a hexadecimal string of lowercase letters and digits. For example, the SM3 digest of abc is 66c7f0f462eeedd9d1f2d46bdc10e4e24167c4875cf2f7a2297da02b8f4ba8e0.

liveId

String

No

liveId1****

The ID of the live audio stream.

This parameter is used to deduplicate audio live streaming tasks. If this parameter is passed, the system uses uid+service+liveId to check if a detection task is already in progress. If so, the system returns the taskId of the existing task instead of initiating a new one.

dataId

String

No

voice20240307***

A custom ID for the data that you are moderating.

The ID can contain uppercase and lowercase letters, digits, underscores (_), hyphens (-), and periods (.), and must not exceed 64 characters. You can use it to uniquely identify your business data.

referer

String

No

www.aliyun.com

The Referer header, used for scenarios such as hotlink protection. The value cannot exceed 256 characters.

extra

String

No

{"VolcAppId":"6fabbd****1a7e", "VolcTokenId": "User123456", "VolcToken": "6fabbd****1a7e"}

A JSON string of extended parameters, used only to specify parameters for third-party Real-Time Communication (RTC) products. For more information, see Integrate enhanced audio moderation with third-party RTC products.
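The character and length constraints in Table 1 can be enforced on the client before a request is sent. The sketch below is illustrative only: `validate_service_parameters` is a hypothetical helper, and it covers just the rules stated explicitly in the table (seed required with callback, seed and dataId character sets, referer length).

```python
import re

# Letters, digits, and underscores; at most 64 characters.
SEED_RE = re.compile(r"[A-Za-z0-9_]{1,64}")
# Letters, digits, underscores, hyphens, and periods; at most 64 characters.
DATA_ID_RE = re.compile(r"[A-Za-z0-9_.-]{1,64}")
MAX_REFERER_LENGTH = 256


def validate_service_parameters(params):
    """Return a list of error messages; an empty list means no rule was violated."""
    errors = []
    seed = params.get("seed")
    if params.get("callback") and not seed:
        errors.append("seed is required when callback is specified")
    if seed and not SEED_RE.fullmatch(seed):
        errors.append("seed must be 1-64 letters, digits, or underscores")
    data_id = params.get("dataId")
    if data_id and not DATA_ID_RE.fullmatch(data_id):
        errors.append("dataId must be 1-64 letters, digits, '_', '-', or '.'")
    if len(params.get("referer") or "") > MAX_REFERER_LENGTH:
        errors.append("referer must not exceed 256 characters")
    return errors
```

Note that the masked example values in the table (such as abc****) contain asterisks and would not pass these checks; real values are expected to follow the documented character sets.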

Response parameters

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| Code | Integer | 200 | The status code returned for the request. For details, see Status codes. |
| Data | JSONObject | {"TaskId": "AAAAA-BBBBB","DataId": "voice20240307***"} | The data returned for a successful request. |
| Message | String | SUCCESS | The response message. |
| RequestId | String | AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE**** | The unique ID generated for this request. |

Examples

Request example

{
    "Service": "audio_media_detection",
    "ServiceParameters": {
        "cryptType": "SHA256",
        "seed": "abc***123",
        "callback": "https://aliyun.com/callback",
        "url": "http://aliyundoc.com/test.flv"
    }
}

Response example

{
    "Code": 200,
    "Data": {
        "TaskId": "AAAAA-BBBBB",
        "DataId": "voice20240307***"
    },
    "Message": "SUCCESS",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
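The request parameter table types ServiceParameters as a JSON string, while the request example above shows it expanded as an object for readability. The sketch below assumes the string form is what is actually sent, and shows how the task ID might be read from the response. Both function names are hypothetical, and signing and transport are intentionally left out.

```python
import json


def build_submit_request(service, service_parameters):
    """Build the VoiceModeration request body with ServiceParameters serialized."""
    return {
        "Service": service,
        # Assumption: the nested parameters are sent as a JSON string.
        "ServiceParameters": json.dumps(service_parameters),
    }


def extract_task_id(response):
    """Return Data.TaskId from a submit response, or raise if the call failed."""
    if response.get("Code") != 200:
        raise RuntimeError(
            f"submit failed: {response.get('Code')} {response.get('Message')}"
        )
    return response["Data"]["TaskId"]
```

Keep the returned task ID: it is the only input needed to query or cancel the task later.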

Query task results

When you query a live stream moderation task, the response returns data for the most recent voice slices while the task is in progress. After the task is complete, the response includes data for all voice slices.

  • To query a moderation task, call the VoiceModerationResult operation.

  • Billing: This API operation does not incur charges.

  • Polling for results: We recommend waiting 30 seconds after submitting an asynchronous detection task before you start polling for results. Results are available for 24 hours and are then deleted.

QPS limit

Each user can send up to 100 queries per second (QPS). Calls that exceed this limit are throttled, which may affect your business, so keep your polling rate well below the limit.
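The polling guidance above can be sketched as a small loop for a file-based task. This is an illustration under stated assumptions: `query` is a hypothetical callable that wraps VoiceModerationResult and returns the parsed response, status code 280 is treated as "keep waiting", and the 10-second interval and attempt cap are arbitrary choices rather than documented values.

```python
import time

SUCCESS = 200
IN_PROGRESS = 280


def poll_result(query, task_id, initial_delay=30.0, interval=10.0,
                max_attempts=30, sleep=time.sleep):
    """Wait, then poll until the task finishes; return the Data object."""
    sleep(initial_delay)  # recommended wait before the first query
    for _ in range(max_attempts):
        response = query(task_id)
        code = response.get("Code")
        if code == SUCCESS:
            return response["Data"]
        if code != IN_PROGRESS:
            raise RuntimeError(f"query failed: {code} {response.get('Message')}")
        sleep(interval)  # fixed spacing keeps the query rate far below the QPS limit
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} queries")
```

Passing `sleep` as a parameter makes the loop testable without real delays. Because results are deleted after 24 hours, the overall polling window should stay well inside that period.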

Request parameters

| Parameter | Type | Required | Example value | Description |
| --- | --- | --- | --- | --- |
| Service | String | Yes | live_stream_detection | The moderation service type. |
| ServiceParameters | JSONString | Yes | | The parameters required for the moderation service. This value must be a JSON string. For details on each field, see ServiceParameters. |

Table 2. ServiceParameters

| Parameter | Type | Required | Example value | Description |
| --- | --- | --- | --- | --- |
| taskId | String | Yes | AAAAA-BBBBB | The ID returned when the task is submitted. |

Response parameters

| Parameter | Type | Example value | Description |
| --- | --- | --- | --- |
| Code | Integer | 200 | The status code. For more information, see Status codes. |
| Data | JSONObject | | The result of the audio content moderation. For more information, see Data. |
| Message | String | OK | The response message. |
| RequestId | String | AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE**** | The request ID. |

Table 3. Data

Parameter

Type

Example value

Description

Url

String

https://aliyundoc.com

The URL of the object being moderated.

LiveId

String

liveId1****

The ID of the live audio stream (optional).

DataId

String

voice20240307***

The data ID of the object being moderated (optional).

RiskLevel

String

high

The overall risk level of the audio, determined from all voice slices. Valid values include:

  • high: High Risk

  • medium: Medium Risk

  • low: Low Risk

  • none: No Risk Detected

Note

We recommend addressing high-risk content immediately and sending medium-risk content for manual review. Low-risk content can be treated as safe, unless your use case requires high recall.

SliceDetails

JSONArray

The detailed results for each voice slice. For more information, see SliceDetails.

ManualTaskId

String

The manual review task ID. This ID is used to query the results of a manual review. This ID is returned if human-machine review is configured and the conditions for manual review are met. For more information about the configuration, see Human-Machine Review Service Configuration.

Table 4. SliceDetails

Parameter

Type

Example value

Description

StartTime

Integer

0

The start time of the voice slice, in seconds.

EndTime

Integer

4065

The end time of the voice slice, in seconds.

StartTimestamp

Integer

1678854649720

The start timestamp of the voice slice, in milliseconds.

EndTimestamp

Integer

1678854649720

The end timestamp of the voice slice, in milliseconds.

Text

String

disgusting

The speech-to-text transcript.

Url

String

https://aliyundoc.com

If the moderated content is an audio stream, this is the temporary URL for the corresponding audio segment. This URL is valid for 30 minutes. Save the file to a different location before it expires.

Labels

String

political_content,xxxx

The labels for detected content, separated by commas.

Audio moderation labels include:

  • ad: Ad-related Promotion

  • violence: Violent and Terrorist Content

  • political_content: Politically Sensitive Content

  • specified_speaking: Specified Speaking

  • specified_lyrics: Specified Lyrics

  • sexual_content: Pornographic Content

  • sexual_sounds: Moaning

  • contraband: Contraband

  • profanity: Profanity

  • religion: Religious Content

  • cyberbullying: Cyberbullying

  • negative_content: Negative Content

  • nontalk: Muted Audio

  • C_customized: Matched in a user-defined library

AIGC voice detection labels include:

  • aigc: Suspected AI-generated audio

  • ugc: Not AI-generated audio

Note

To determine whether audio is AI-generated, check this label directly rather than relying on the risk level.

RiskLevel

String

high

The risk level of the voice slice. Valid values include:

  • high: High Risk

  • medium: Medium Risk

  • low: Low Risk

  • none: No Risk Detected

RiskWords

String

AAA,BBB,CCC

The detected risk words, separated by commas.

RiskTips

String

sexual_content_vulgar_words,sexual_content_description

The sub-labels, separated by commas.

Extend

String

{"riskTips":"sexual_content_vulgar_words","riskWords":"sex_services"}

Reserved field.
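Several SliceDetails fields (Labels, RiskWords, RiskTips) are comma-separated strings rather than arrays, so they need to be split before use. The sketch below shows one way to do that and to apply the AIGC note above by checking the label instead of the risk level. The helper names are hypothetical.

```python
def split_csv(value):
    """Split a comma-separated field into a list, ignoring blanks and None."""
    return [item.strip() for item in (value or "").split(",") if item.strip()]


def is_suspected_aigc(slice_detail):
    """Return True when the slice carries the aigc label."""
    return "aigc" in split_csv(slice_detail.get("Labels"))


def summarize_slice(slice_detail):
    """Return the slice fields most callers need, with list fields parsed."""
    return {
        "start": slice_detail.get("StartTime"),
        "end": slice_detail.get("EndTime"),
        "risk_level": slice_detail.get("RiskLevel"),
        "labels": split_csv(slice_detail.get("Labels")),
        "risk_words": split_csv(slice_detail.get("RiskWords")),
        "risk_tips": split_csv(slice_detail.get("RiskTips")),
    }
```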

Examples

Request example

{
    "Service": "live_stream_detection",
    "ServiceParameters": {
        "taskId": "AAAAA-BBBBB"
    }
}

Response example

{
    "Code": 200,
    "Data": {
        "DataId": "voice20240307***",
        "LiveId": "liveId1****",
        "RiskLevel": "high",
        "SliceDetails": [
            {
                "EndTime": 4065,
                "Labels": "political_content,xxxx",
                "RiskLevel": "high",
                "RiskTips": "contraband_prohibited_items",
                "RiskWords": "RiskWordA",
                "StartTime": 0,
                "Text": "Content Moderation product test case",
                "Url": "https://aliyundoc.com"
            }
        ]
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
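The handling recommendation for RiskLevel (act on high, review medium, treat low as safe unless high recall is needed) can be expressed as a small routing function. This is a sketch only: the action names "block", "manual_review", and "pass" and both function names are hypothetical placeholders for whatever your workflow uses.

```python
def decide_action(risk_level, high_recall=False):
    """Map a RiskLevel value to a handling action."""
    if risk_level == "high":
        return "block"
    if risk_level == "medium":
        return "manual_review"
    if risk_level == "low":
        # Low risk is treated as safe unless the use case needs high recall.
        return "manual_review" if high_recall else "pass"
    if risk_level == "none":
        return "pass"
    raise ValueError(f"unexpected risk level: {risk_level!r}")


def flagged_slices(data):
    """Return the slices whose own RiskLevel is high or medium."""
    return [
        s for s in data.get("SliceDetails", [])
        if s.get("RiskLevel") in ("high", "medium")
    ]
```

For example, applied to the Data object of the response above, `decide_action(data["RiskLevel"])` returns "block", and `flagged_slices(data)` returns the single high-risk slice so it can be shown to a reviewer.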

Cancel a moderation task

This operation cancels live stream moderation tasks only. File-based tasks cannot be canceled.

  • To cancel a moderation task, call the VoiceModerationCancel API.

  • This API operation is free of charge.

Request parameters

| Parameter | Type | Required | Example value | Description |
| --- | --- | --- | --- | --- |
| Service | String | Yes | live_stream_detection | The moderation service. |
| ServiceParameters | JSONString | Yes | | The parameter set for the moderation service, which must be a JSON string. For details on each parameter, see ServiceParameters. |

Table 5. ServiceParameters

| Parameter | Type | Required | Example value | Description |
| --- | --- | --- | --- | --- |
| taskId | String | Yes | AAAAA-BBBBB | The ID of the moderation task to be canceled. |

Response parameters

| Parameter | Type | Example value | Description |
| --- | --- | --- | --- |
| Code | Integer | 200 | The status code. See Status codes for a list of all possible values. |
| Message | String | OK | The response message. |
| RequestId | String | AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE**** | The request ID. |

Examples

Request example

{
    "Service": "live_stream_detection",
    "ServiceParameters": {
        "taskId": "AAAAA-BBBBB"
    }
}

Response example

{
    "Code": 200,
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}

Callback message format

The callback message is sent in JSON format and contains the following fields:

Parameter

Type

Description

checksum

String

The checksum is a string generated by concatenating uid + seed + content and then applying the SHA256 algorithm.

Your user ID is your Alibaba Cloud account ID, which can be found in the Alibaba Cloud console. To prevent tampering, you can compute a signature using the same algorithm when you receive the callback message and verify it against the checksum value.

Note

The user ID must be your Alibaba Cloud account ID, not the ID of a RAM user.

taskId

String

The task ID associated with the callback message.

content

String

The moderation result, serialized as a JSON string. Parse the string to obtain the JSON object. The content format is the same as the response from a task result query. For more information, see Response parameters under Query task results.
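The checksum rule above (SHA256 over uid + seed + content) can be verified on receipt as sketched below. Several details are assumptions rather than documented facts: the digest is assumed to be a lowercase hexadecimal string, the default SHA256 cryptType is assumed, and `form` stands for the already-decoded callback fields. Both function names are hypothetical.

```python
import hashlib
import hmac
import json


def verify_callback(uid, seed, checksum, content):
    """Recompute SHA256(uid + seed + content) and compare it with checksum."""
    expected = hashlib.sha256((uid + seed + content).encode("utf-8")).hexdigest()
    # Constant-time comparison on bytes; assumes a hex-encoded checksum.
    return hmac.compare_digest(expected.encode("utf-8"),
                               checksum.lower().encode("utf-8"))


def handle_callback(uid, seed, form):
    """Return (http_status, parsed_result) for one callback message."""
    checksum = form.get("checksum", "")
    content = form.get("content", "")
    if not verify_callback(uid, seed, checksum, content):
        return 403, None  # reject messages whose checksum does not match
    return 200, json.loads(content)  # 200 confirms receipt
```

Here `uid` must be the Alibaba Cloud account ID (not a RAM user ID) and `seed` the value sent when the task was submitted. Hash the content string exactly as received, before parsing it, since re-serializing the JSON could change the bytes and break the comparison.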

Status codes

This section describes the status codes returned by the API. You are billed only for requests that return status code 200.

| Code | Description |
| --- | --- |
| 200 | The request was successful. |
| 280 | Detection is in progress. |
| 400 | A request parameter is empty. |
| 401 | A request parameter is invalid. |
| 402 | The length of a request parameter is invalid. Check the parameter length and try again. |
| 403 | The number of requests exceeds the QPS limit. Reduce your request rate and try again. |
| 404 | File download failed. Check the file and try again. |
| 405 | The file download timed out, possibly because the file is inaccessible. Check the file and try again. |
| 406 | The file exceeds the size limit. Check the file size and try again. |
| 407 | The file format is not supported. Check the file format and try again. |
| 408 | The account is not authorized to call this API. This error can occur if the service has not been activated, the account has an overdue payment, or the account lacks required permissions. |
| 480 | The number of concurrent requests exceeds the limit. Reduce the number of concurrent requests and try again. |
| 500 | An internal system error occurred. |
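One way to act on these status codes is to group them by what the caller should do next. The grouping below is an assumption made for illustration, not a documented policy: throttling codes (403, 480) are retried with exponential backoff, 500 is treated as retryable, and the remaining 4xx codes are treated as problems to fix in the request or file. The function names and delay values are hypothetical.

```python
THROTTLED_CODES = {403, 480}


def classify_status(code):
    """Map a status code to a suggested next step."""
    if code == 200:
        return "success"
    if code == 280:
        return "in_progress"
    if code in THROTTLED_CODES:
        return "retry_with_backoff"
    if code == 500:
        return "retry"
    if 400 <= code <= 408:
        return "fix_request"
    return "unknown"


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Return the delay in seconds before retry number `attempt` (0-based)."""
    return min(cap, base * 2 ** attempt)
```

For a throttled call, successive retries would then wait 1, 2, 4, 8 seconds and so on, up to the 30-second cap.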