Synchronous short speech detection


The synchronous short audio moderation operation uses an HTTP or HTTPS interface to detect content in audio files. It converts audio to text in real time and returns content moderation results and risk tags to help you improve your review efficiency. This topic describes how to call the /green/voice/syncscan operation to moderate audio content.

Usage notes

API operation: /green/voice/syncscan. This operation performs synchronous audio moderation.

You can call this operation to create synchronous audio moderation tasks. For more information about how to construct an HTTP request, see Request structure. You can also use an SDK, which constructs the HTTP request for you. For more information, see SDK overview.

Note

By default, audio moderation detects Mandarin Chinese. To detect other languages or dialects, contact your account manager. Other languages include English, Japanese, Spanish, Arabic, French, Indonesian, and Vietnamese. Dialects include Cantonese, Sichuanese, Hubei dialect, Shaanxi dialect, Shanxi dialect, Henan dialect, Northeastern dialect, Tianjin dialect, Gansu dialect, Guizhou dialect, Yunnan dialect, Jiangxi dialect, Guangxi dialect, Hunan dialect, Shandong dialect, Suzhou dialect, Zhejiang dialect, Shanghainese, and Minnan.

  • Billing information:

    You are charged for calling this operation. For more information about the billing methods, see

  • Audio file requirements:
    • The size of an audio file cannot exceed 20 MB.
    • The duration of an audio file cannot exceed 1 minute.
    • Supported audio file formats: MP3, WAV, AAC, WMA, OGG, M4A, and M3U8.
    • Supported video file formats that contain audio: AVI, FLV, MP4, MPG, ASF, WMV, MOV, RMVB, and RM.
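The file requirements above can be checked on the client before a task is submitted. The following Python snippet is a minimal sketch, not part of the API. It assumes that 20 MB means 20 × 1024 × 1024 bytes, that the format can be inferred from the file extension, and that the caller already knows the size and duration. The function name check_media_file is hypothetical.

```python
import os

# Assumption: 20 MB is interpreted as 20 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 20 * 1024 * 1024
MAX_DURATION_SECONDS = 60  # 1 minute
AUDIO_EXTENSIONS = {"mp3", "wav", "aac", "wma", "ogg", "m4a", "m3u8"}
VIDEO_EXTENSIONS = {"avi", "flv", "mp4", "mpg", "asf", "wmv", "mov", "rmvb", "rm"}


def check_media_file(file_name, size_bytes, duration_seconds):
    """Return a list of problems; an empty list means the pre-check passed."""
    problems = []
    # Assumption: the extension reflects the actual container format.
    extension = os.path.splitext(file_name)[1].lstrip(".").lower()
    if extension not in AUDIO_EXTENSIONS | VIDEO_EXTENSIONS:
        problems.append("unsupported format: " + (extension or "<none>"))
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 20 MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 1 minute")
    return problems
```

A file that fails this check would be rejected or fail on the server anyway, so checking locally only saves a billed call and a round trip.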

QPS limits

You can call this operation up to 50 times per second per account. If you exceed this limit, requests are throttled, which may affect your business. Plan your call rate with this limit in mind.
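One way to stay under the limit is to throttle calls on the client. The following Python snippet is a minimal sketch of a sliding-window limiter. The class name QpsLimiter and its parameters are hypothetical. Because the limit applies per account, this sketch helps only if every call made with the account passes through the same limiter instance.

```python
import time
from collections import deque


class QpsLimiter:
    """Allow at most max_calls calls in any sliding window of period seconds."""

    def __init__(self, max_calls=50, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock    # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.calls = deque()  # timestamps of the calls still inside the window

    def acquire(self):
        """Block until one more call fits under the limit, then record it."""
        while True:
            now = self.clock()
            # Drop timestamps that have left the window.
            while self.calls and now - self.calls[0] >= self.period:
                self.calls.popleft()
            if len(self.calls) < self.max_calls:
                self.calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.calls[0]))
```

Call limiter.acquire() immediately before each request. The sketch is not thread-safe. In a multi-threaded client, guard acquire() with a lock.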

Request parameters

| Name | Type | Required | Example value | Description |
| --- | --- | --- | --- | --- |
| bizType | String | No | default | The business scenario. You can create a business scenario in the Content Moderation console. For more information, see Customize moderation policies. |
| scenes | StringArray | Yes | antispam | The detection scenario. Set the value to antispam. |
| tasks | JSONArray | Yes | | The detection objects. Each element in the JSON array is a struct for a detection task. You can specify up to 100 elements, which means you can submit up to 100 content entries for detection at a time. To submit 100 elements, you must increase the number of concurrent tasks to more than 100. For more information about the structure of each element, see task. |
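Because tasks accepts at most 100 elements, a longer list of detection objects has to be split across several requests. The following Python snippet is a minimal sketch of that batching. The function name build_request_bodies and the batch_size parameter are hypothetical, and the snippet only builds request bodies. It does not sign or send them.

```python
MAX_TASKS_PER_REQUEST = 100  # upper bound on the number of elements in "tasks"


def build_request_bodies(tasks, biz_type=None, batch_size=MAX_TASKS_PER_REQUEST):
    """Split tasks into batches and wrap each batch in a request body dict."""
    if not 1 <= batch_size <= MAX_TASKS_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 100")
    bodies = []
    for start in range(0, len(tasks), batch_size):
        body = {
            "scenes": ["antispam"],  # the only documented scenario
            "tasks": tasks[start:start + batch_size],
        }
        if biz_type is not None:
            body["bizType"] = biz_type  # optional parameter, omitted when not set
        bodies.append(body)
    return bodies
```

If the number of concurrent tasks for your account has not been increased, pass a smaller batch_size.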
Table 1. task
| Name | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| clientInfo | JSONObject | No | {"userId":"120234234","userNick":"Mike","userType":"others"} | The information about the client. For more information, see the "Common request parameters" section of Common parameters. The server determines whether to use the global clientInfo parameter or the clientInfo parameter that is described in this table. Note: The clientInfo parameter in this table takes priority over the global one. |
| dataId | String | No | abc_123 | The ID of the moderation object. The ID can contain letters, digits, underscores (_), hyphens (-), and periods (.). It can be up to 128 characters in length. This ID uniquely identifies your business data. |
| url | String | Yes | http://aliyundoc.com/test.mp3 | The URL of the object to be detected. The following types are supported: (1) A public HTTP or HTTPS URL that does not exceed 2,048 characters in length. (2) A file path provided by Alibaba Cloud OSS, in the format oss://<bucket-name>.<endpoint>/<object-name>. You must first authorize Content Moderation to access the OSS bucket. Only OSS buckets in the same region are supported. For more information, see Authorize Content Moderation to access OSS buckets. |
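The constraints on dataId and url can be enforced when each task element is built. The following Python snippet is a minimal sketch. The helper names oss_path and make_task are hypothetical, and the checks cover only the rules stated in this table. They do not verify that the URL is reachable or that the OSS bucket has been authorized.

```python
import re

# Letters, digits, underscores, hyphens, and periods; 1 to 128 characters.
DATA_ID_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,128}")
MAX_URL_LENGTH = 2048


def oss_path(bucket_name, endpoint, object_name):
    """Format an OSS file path as oss://<bucket-name>.<endpoint>/<object-name>."""
    return "oss://{}.{}/{}".format(bucket_name, endpoint, object_name)


def make_task(url, data_id=None, client_info=None):
    """Build one element of the tasks array and validate the stated constraints."""
    if url.startswith(("http://", "https://")):
        if len(url) > MAX_URL_LENGTH:
            raise ValueError("url must not exceed 2,048 characters")
    elif not url.startswith("oss://"):
        raise ValueError("url must be an HTTP/HTTPS URL or an OSS file path")
    task = {"url": url}
    if data_id is not None:
        if not DATA_ID_PATTERN.fullmatch(data_id):
            raise ValueError("dataId allows letters, digits, _ - . and up to 128 characters")
        task["dataId"] = data_id
    if client_info is not None:
        task["clientInfo"] = client_info  # takes priority over the global clientInfo
    return task
```

For example, make_task("http://aliyundoc.com/test.mp3", data_id="abc_123") returns a dict that can be placed directly in the tasks array.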

Returned data

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| code | Integer | 200 | The returned HTTP status code. For more information, see Common error codes. |
| msg | String | OK | The message returned for the request. |
| dataId | String | abc_123 | The ID of the moderation object. Note: If you set the dataId parameter in the moderation request, the value of the dataId request parameter is returned here. |
| taskId | String | vc_f_1OsjIYTukH@4@AXkIQ9xxx-1ov52Y | The ID of the detection task. |
| url | String | http://aliyundoc.com/test.mp3 | The URL of the object to be detected. The following types are supported: (1) A public HTTP or HTTPS URL that does not exceed 2,048 characters in length. (2) A file path provided by Alibaba Cloud OSS, in the format oss://<bucket-name>.<endpoint>/<object-name>. You must first authorize Content Moderation to access the OSS bucket. Only OSS buckets in the same region are supported. For more information, see Authorize Content Moderation to access OSS buckets. |
| results | JSONArray | | The detection results returned when the call is successful (code=200). The results contain one or more elements. Each element is a struct. For more information about the structure of each element, see result. |
Table 2. result
| Name | Type | Example | Description |
| --- | --- | --- | --- |
| scene | String | antispam | The detection scenario. This corresponds to the scenario in the request. The value is fixed as antispam. |
| label | String | customized | The category of the detection result. Valid values: normal (normal text), spam (spam content), ad (ads), politics (political content), terrorism (terrorist content), abuse (offensive or insulting language), porn (pornographic content), flood (excessive junk content), contraband (prohibited content), meaningless (meaningless content), harmful (undesirable scenarios for the protection of minors, including worshiping money, fan culture, negative emotions, and negative guidance), and customized (custom content, such as a hit on a custom keyword). |
| suggestion | String | block | The recommended subsequent operation. Valid values: pass (the content is normal and no further action is required), review (the result is inconclusive and requires manual review), and block (the content is non-compliant; we recommend that you delete the content or restrict its visibility). |
| rate | Float | 99.91 | The score of the confidence level. Valid values: 0 to 100. A greater value indicates a higher confidence level. If a value of pass is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content is normal. If a value of review or block is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content contains violations. Important: We recommend that you use the values that are returned for the suggestion, label, and sublabel parameters to determine whether the content contains violations. The sublabel parameter is returned by specific operations. |
| details | JSONArray | | The details of the text that corresponds to the audio. This can contain one or more elements. Each element corresponds to a sentence. For more information about the structure of each element, see detail. |
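The suggestion values map naturally onto follow-up actions in the calling application. The following Python snippet is a minimal sketch of such a mapping. The function name decide_action and the action names publish, manual_review, and restrict are hypothetical placeholders for whatever your business logic does. Following the recommendation above, the decision is based on suggestion and not on rate.

```python
def decide_action(result):
    """Map one result element to a follow-up action based on its suggestion."""
    suggestion = result.get("suggestion")
    if suggestion == "pass":
        return "publish"        # content is normal, no further action required
    if suggestion == "review":
        return "manual_review"  # inconclusive, needs a human decision
    if suggestion == "block":
        return "restrict"       # non-compliant, delete or restrict visibility
    # Assumption: unknown or missing values are sent to a human, not guessed at.
    return "manual_review"
```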
Table 3. detail
| Name | Type | Example | Description |
| --- | --- | --- | --- |
| startTime | Integer | 0 | The start timestamp of the sentence, in seconds. |
| endTime | Integer | 4065 | The end timestamp of the sentence, in seconds. |
| text | String | Disgusting | The text converted from the audio. |
| label | String | politics | The category of the detection result. Valid values: normal (normal text), spam (spam content), ad (ads), politics (political content), terrorism (terrorist content), abuse (offensive or insulting language), porn (pornographic content), flood (excessive junk content), contraband (prohibited content), meaningless (meaningless content), harmful (undesirable scenarios for the protection of minors, including worshiping money, fan culture, negative emotions, and negative guidance), and customized (custom content, such as a hit on a custom keyword). |
| persons | JSONArray | [{"name":"Sensitive Person A"}] | The voiceprint recognition result. This field is returned if the voiceprint of a sensitive person is hit. Each element contains a name field, a string that indicates the sensitive person information identified from the audio. Note: This field is not returned by default. If you need this feature, contact your account manager. |
| keyword | String | Disgusting | If a user-defined keyword is hit, the keyword is returned. |
| libName | String | test | If a user-defined keyword is hit, the corresponding thesaurus is returned. |
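When a result needs manual review, it is often useful to show only the sentences that triggered a hit. The following Python snippet is a minimal sketch that filters the details array. The function name flagged_sentences is hypothetical, and treating a sentence as flagged when its label is not normal or when persons is present is an assumption of this sketch, not a rule defined by the API.

```python
def flagged_sentences(details):
    """Return the detail elements that carry a non-normal label or a voiceprint hit."""
    flagged = []
    for detail in details:
        # Assumption: a sentence is of interest if it is not "normal"
        # or if a sensitive person was recognized in it.
        if detail.get("label") == "normal" and "persons" not in detail:
            continue
        hit = {
            "text": detail.get("text"),
            "label": detail.get("label"),
            "startTime": detail.get("startTime"),
            "endTime": detail.get("endTime"),
        }
        # keyword, libName, and persons are present only when something was hit.
        for optional in ("keyword", "libName", "persons"):
            if optional in detail:
                hit[optional] = detail[optional]
        flagged.append(hit)
    return flagged
```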

Examples

Sample request
http(s)://[Endpoint]/green/voice/syncscan
&<Common request parameters>
{
    "scenes":[
        "antispam"
    ],
    "tasks":[
        {
            "dataId":"abcd-123",
            "url":"http://aliyundoc.com/test.mp3"
        }
    ]
}
Sample response
{
    "msg":"OK",
    "code":200,
    "data":[
        {
            "code":200,
            "dataId":"abcd-123",
            "results":[
                {
                    "rate":99.91,
                    "suggestion":"block",
                    "details":[
                        {
                            "libName":"test",
                            "startTime":0,
                            "endTime":4065,
                            "label":"customized",
                            "text":"Disgusting",
                            "keyword":"Disgusting"
                        },
                        {
                            "startTime":4430,
                            "endTime":10065,
                            "label":"normal",
                            "persons": [
                                {
                                    "name": "Sensitive Person A"
                                }
                            ],
                            "text":"Hahaha"
                        },
                        {
                            "libName":"Audio",
                            "startTime":11670,
                            "endTime":14685,
                            "label":"customized",
                            "text":"Clearance sale",
                            "keyword":"Sale"
                        },
                        {
                            "startTime":14685,
                            "endTime":16065,
                            "label":"ad",
                            "text":"12345"
                        }
                    ],
                    "label":"customized"
                }
            ],
            "taskId":"vc_f_1OsjIYTukH@4@AXkIQ9xxx-1ov52Y"
        }
    ],
    "requestId":"5A7A6198-6960-4DDC-B67E-58A111A4B20F"
}
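A response such as the one above can be reduced to one decision per task. The following Python snippet is a minimal sketch that parses the response text and collects the suggestion and label of each task. The function name summarize_response is hypothetical, and keying the summary by dataId with a fallback to taskId is a choice made for this sketch.

```python
import json


def summarize_response(response_text):
    """Return {dataId or taskId: (suggestion, label)} for each task in the response."""
    response = json.loads(response_text)
    if response.get("code") != 200:
        raise RuntimeError(
            "request failed: {} {}".format(response.get("code"), response.get("msg"))
        )
    summary = {}
    for task in response.get("data", []):
        # dataId is present only if it was set in the request, so fall back to taskId.
        key = task.get("dataId") or task.get("taskId")
        if task.get("code") != 200:
            # results is returned only when the task itself succeeded.
            summary[key] = ("error", str(task.get("code")))
            continue
        for result in task.get("results", []):
            summary[key] = (result.get("suggestion"), result.get("label"))
    return summary
```

Both the top-level code and the code of each data element are checked, because results is returned only for tasks that succeed.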