Text Moderation Enhanced Edition 2.0 Multilingual PLUS


Text Moderation Enhanced Edition features an upgraded multilingual model that automatically detects the input language and supports more languages. The service also provides moderation policies and a labeling system tailored for international business. This topic describes the features and usage of the multilingual service for Text Moderation Enhanced Edition.

Features

Compared to the multilingual service in Text Moderation 1.0, Text Moderation 2.0 uses a separate policy and labeling system for global businesses. It also offers more features to simplify usage and aid manual review.

| Item | Text Moderation 2.0 | Text Moderation 1.0 |
| --- | --- | --- |
| Multilingual support | Supports 38 languages. | Supports 18 languages. |
| Moderation capability | Uses multiple models in parallel. Policies are more precise, tailored to specific languages and regions. | Uses a single model. Policies balance precision and recall based on language features. |
| Labeling system | Uses an internationalized labeling system that includes labels for profanity and region-specific content. Supports multiple risk labels and fine-grained labels. | Uses a labeling system designed for Chinese content and supports only a single risk label. |
| Detection scope | You can configure all detection scopes in the console, enabling or disabling them as needed. Each scope maps one-to-one with a detection result. | Supports general detection scopes, which do not map one-to-one with detection results. |
| API features | Provides automatic detection without requiring you to specify the input language. After moderation, the API returns the language type and a translated English version of the text to aid manual review. | Requires you to specify the input language and does not return translated content. |

Supported languages

Text Moderation Enhanced Edition supports the following 38 languages.

| Language | Language code |
| --- | --- |
| English | en |
| Simplified Chinese | zh |
| Traditional Chinese | zh-tw |
| Indonesian | id |
| Malay | ms |
| Thai | th |
| Vietnamese | vi |
| Tagalog | tl |
| Hindi | hi |
| Arabic | ar |
| Turkish | tr |
| French | fr |
| German | de |
| Russian | ru |
| Portuguese | pt |
| Spanish | es |
| Italian | it |
| Dutch | nl |
| Polish | pl |
| Japanese | ja |
| Korean | ko |
| Urdu | ur |
| Uighur | ug |
| Bengali | bn |
| Persian | fa |
| Swedish | sv |
| Danish | da |
| Norwegian | no |
| Icelandic | is |
| Finnish | fi |
| Belarusian | be |
| Lithuanian | lt |
| Czech | cs |
| Slovak | sk |
| Hungarian | hu |
| Modern Greek | el |
| Romanian | ro |
| Irish | ga |

Internationalized labels

The Content Moderation Enhanced Edition Multilingual PLUS service uses an internationalized labeling system. If content poses multiple risks, the service returns multiple labels. The supported labels are:

| Label | Confidence score | Description |
| --- | --- | --- |
| pornographic_adult | 0–100. A higher score indicates greater confidence. | Suspected pornographic content |
| sexual_terms | 0–100. A higher score indicates greater confidence. | Suspected sexual health content |
| sexual_suggestive | 0–100. A higher score indicates greater confidence. | Suspected vulgar content |
| sexual_orientation | 0–100. A higher score indicates greater confidence. | Suspected sexual orientation content |
| regional_cn | 0–100. A higher score indicates greater confidence. | Suspected domestic political content |
| regional_illegal | 0–100. A higher score indicates greater confidence. | Suspected illegal political content |
| regional_controversial | 0–100. A higher score indicates greater confidence. | Suspected political controversy |
| regional_racism | 0–100. A higher score indicates greater confidence. | Suspected racism |
| violent_extremist | 0–100. A higher score indicates greater confidence. | Suspected extremist organizations |
| violent_incidents | 0–100. A higher score indicates greater confidence. | Suspected extremist content |
| violent_weapons | 0–100. A higher score indicates greater confidence. | Suspected weapons and ammunition |
| violence_unscList | 0–100. A higher score indicates greater confidence. | United Nations Security Council Consolidated List |
| contraband_drug | 0–100. A higher score indicates greater confidence. | Suspected drug-related content |
| contraband_gambling | 0–100. A higher score indicates greater confidence. | Suspected gambling-related content |
| inappropriate_ethics | 0–100. A higher score indicates greater confidence. | Suspected unethical content |
| inappropriate_profanity | 0–100. A higher score indicates greater confidence. | Suspected abusive or insulting content |
| inappropriate_oral | 0–100. A higher score indicates greater confidence. | Suspected vulgar language |
| inappropriate_religion | 0–100. A higher score indicates greater confidence. | Suspected religious profanity |
| pt_to_contact | 0–100. A higher score indicates greater confidence. | Suspected traffic diversion using contact information |
| pt_to_sites | 0–100. A higher score indicates greater confidence. | Suspected off-site traffic diversion |
| customized | 0–100. A higher score indicates greater confidence. | Matched a keyword in a custom keyword library |

Billing

The Text Moderation Enhanced Edition service supports two billing methods: pay-as-you-go and resource plan offset.

Pay-as-you-go

After you activate the Text Moderation Enhanced Edition service, the default billing method is pay-as-you-go. You are billed daily based on your actual usage. If you do not use the service, you are not charged.

| Moderation type | Service | Unit price |
| --- | --- | --- |
| Advanced text moderation (text_advanced) | Multilingual detection for international business: comment_multilingual_pro_cb | CNY 15 per 10,000 API calls |
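As a rough illustration of the unit price above, the following sketch estimates a day's pay-as-you-go charge. It assumes the charge scales linearly with the number of successful calls and ignores any rounding or minimum-charge rules, which this topic does not describe. The function name `estimate_daily_cost_cny` is hypothetical.

```python
from decimal import Decimal

# Unit price from the table above: CNY 15 per 10,000 API calls.
PRICE_PER_10K_CALLS_CNY = Decimal("15")


def estimate_daily_cost_cny(successful_calls: int) -> Decimal:
    """Estimate one day's charge, assuming strictly linear pricing."""
    if successful_calls < 0:
        raise ValueError("successful_calls must be non-negative")
    return PRICE_PER_10K_CALLS_CNY * successful_calls / Decimal(10_000)


# 250,000 successful calls in a day.
print(estimate_daily_cost_cny(250_000))
```

`Decimal` is used instead of `float` so that currency amounts are not affected by binary rounding.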

Resource plan offset

For large or consistent moderation needs, we recommend purchasing a resource plan. Larger plans offer greater discounts. You can purchase and use multiple resource plans. For more information, see Purchase a resource plan for Content Moderation Enhanced Edition.

This resource plan offsets usage for Content Moderation Enhanced Edition and is not compatible with Content Moderation 1.0 usage plans. The following table describes the offset factors.

| Moderation type | Service | Offset factor |
| --- | --- | --- |
| Advanced text moderation (text_advanced) | Multilingual detection for international business: comment_multilingual_pro_cb | The offset factor is 2. Each successful API call deducts 2 API calls from your resource plan's usage quota. |

Note

For example, if your resource plan has a usage quota of 10 API calls, one successful API call deducts 2 API calls from the plan, leaving a balance of 8 API calls.
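The deduction in the note can be sketched as simple arithmetic. The helper names below are hypothetical, and clamping the balance at zero is an assumption: this topic does not describe how calls beyond the plan quota are handled.

```python
# Offset factor from the table above for comment_multilingual_pro_cb.
OFFSET_FACTOR = 2


def remaining_quota(plan_quota: int, successful_calls: int) -> int:
    """Return the plan balance after deducting calls at the offset factor.

    The balance is clamped at zero (an assumption for this sketch).
    """
    if plan_quota < 0 or successful_calls < 0:
        raise ValueError("arguments must be non-negative")
    return max(plan_quota - OFFSET_FACTOR * successful_calls, 0)


def calls_covered(plan_quota: int) -> int:
    """Return how many successful calls a plan quota can fully cover."""
    if plan_quota < 0:
        raise ValueError("plan_quota must be non-negative")
    return plan_quota // OFFSET_FACTOR


print(remaining_quota(10, 1))  # the case from the note: 10 - 2 * 1
print(calls_covered(10))
```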

Integration

Step 1: Activate the service

To activate the Text Moderation Enhanced Edition service, visit Activate Service.

After you activate the service, the default billing method is pay-as-you-go. Once you integrate the API, you are billed daily based on your actual usage, and you are not charged if you do not call the service. For more information, see Billing. If your usage is predictable or high, you can also purchase a resource plan, which offers tiered discounts.

Step 2: Grant permissions to a RAM user

Before you integrate the SDK or API, grant permissions to a RAM user. You can create an AccessKey for your Alibaba Cloud account or for a RAM user. An AccessKey authenticates your identity when you call Alibaba Cloud APIs. To learn how to get an AccessKey, see Obtain an AccessKey.

  1. Log on to the RAM console using your Alibaba Cloud account.

  2. Create a RAM user. For details, see Create a RAM user.

  3. Grant the AliyunYundunGreenWebFullAccess system policy to the RAM user. This policy grants full access to Content Moderation. For details, see Manage RAM user permissions.

    The RAM user can now call the Content Moderation API.

Step 3: Install and integrate the SDK

The service is available in the following regions. To integrate the SDK for Text Moderation Enhanced Edition, see the Integration Guide.

| Region | Public endpoint | Internal endpoint |
| --- | --- | --- |
| Singapore | green-cip.ap-southeast-1.aliyuncs.com | green-cip-vpc.ap-southeast-1.aliyuncs.com |
| UK (London) | green-cip.eu-west-1.aliyuncs.com | Not available |
| US (Virginia) | green-cip.us-east-1.aliyuncs.com | green-cip-vpc.us-east-1.aliyuncs.com |
| US (Silicon Valley) | green-cip.us-west-1.aliyuncs.com | Not available |
| Germany (Frankfurt) | green-cip.eu-central-1.aliyuncs.com | green-cip-vpc.eu-central-1.aliyuncs.com |

Note

Manage configurations for the UK (London) region in the Singapore region console, and for the US (Silicon Valley) region in the US (Virginia) region console.
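The endpoint table and the console note can be captured in a small lookup, sketched below. The region IDs are read off the endpoint hostnames, and the function names `endpoint_for` and `console_region_for` are hypothetical helpers, not part of any SDK.

```python
# Public endpoints, keyed by the region ID that appears in each hostname.
PUBLIC_ENDPOINTS = {
    "ap-southeast-1": "green-cip.ap-southeast-1.aliyuncs.com",  # Singapore
    "eu-west-1": "green-cip.eu-west-1.aliyuncs.com",            # UK (London)
    "us-east-1": "green-cip.us-east-1.aliyuncs.com",            # US (Virginia)
    "us-west-1": "green-cip.us-west-1.aliyuncs.com",            # US (Silicon Valley)
    "eu-central-1": "green-cip.eu-central-1.aliyuncs.com",      # Germany (Frankfurt)
}

# Internal endpoints exist only for some regions.
INTERNAL_ENDPOINTS = {
    "ap-southeast-1": "green-cip-vpc.ap-southeast-1.aliyuncs.com",
    "us-east-1": "green-cip-vpc.us-east-1.aliyuncs.com",
    "eu-central-1": "green-cip-vpc.eu-central-1.aliyuncs.com",
}

# Regions whose configuration is managed in another region's console.
CONSOLE_REGION = {"eu-west-1": "ap-southeast-1", "us-west-1": "us-east-1"}


def endpoint_for(region_id: str, internal: bool = False) -> str:
    """Return the endpoint for a region, or raise if none is listed."""
    table = INTERNAL_ENDPOINTS if internal else PUBLIC_ENDPOINTS
    if region_id not in table:
        kind = "internal" if internal else "public"
        raise ValueError(f"no {kind} endpoint listed for region {region_id!r}")
    return table[region_id]


def console_region_for(region_id: str) -> str:
    """Return the region whose console manages this region's configuration."""
    if region_id not in PUBLIC_ENDPOINTS:
        raise ValueError(f"unknown region {region_id!r}")
    return CONSOLE_REGION.get(region_id, region_id)


print(endpoint_for("eu-central-1", internal=True))
print(console_region_for("eu-west-1"))
```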

API

Usage

  • API operation: TextModerationPlus

Use this operation to create a text content moderation task. To learn how to construct an HTTP request, see Request structure. You can also use a prebuilt HTTP request. For more information, see Integration guide.

  • Billing:

    This is a paid operation. You are charged only for successful requests that return a 200 status code. Requests that return other status codes are not billed. For more information about billing, see Billing.

QPS limit

The QPS limit for this operation is 100 requests per second per user. API calls that exceed this limit are throttled, which may affect your business. Plan your calls accordingly.
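One way to stay under the limit is to pace calls on the client side. The sketch below is a minimal, assumed approach (evenly spaced call slots), not something the service provides. The class name `RateLimiter` is hypothetical, and the `clock` and `sleep` parameters exist only to make the class testable.

```python
import threading
import time


class RateLimiter:
    """Spaces calls evenly so that about max_per_second calls start per second."""

    def __init__(self, max_per_second: float, clock=time.monotonic, sleep=time.sleep):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_slot = clock()

    def acquire(self) -> float:
        """Block until the caller's slot arrives; return the seconds waited."""
        with self._lock:
            now = self._clock()
            # Reserve the next free slot, never one in the past.
            slot = max(self._next_slot, now)
            self._next_slot = slot + self._interval
        wait = slot - now
        if wait > 0:
            self._sleep(wait)
        return wait


# Leave some headroom below the documented limit of 100 requests per second.
limiter = RateLimiter(max_per_second=90)
limiter.acquire()  # call this before each TextModerationPlus request
```

Reserving the slot inside the lock and sleeping outside it lets several threads share one limiter without blocking each other while they wait.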

Request parameters

| Parameter | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| Service | String | Yes | comment_multilingual_pro_cb | The type of the moderation service. Set the value to comment_multilingual_pro_cb for multilingual moderation for international services. |
| ServiceParameters | JSONString | Yes | | The set of parameters for the moderation service, provided as a JSON string. For details about the parameters, see ServiceParameters. |

Table 1. ServiceParameters

| Parameter | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| content | String | Yes | testing content | The text content to moderate. The content can be up to 600 characters long. |
| dataId | String | No | text0424**** | The data ID of the object to moderate. The ID can contain uppercase and lowercase letters, digits, underscores (_), hyphens (-), and periods (.). The ID can be up to 64 characters long and can be used to uniquely identify your business data. |
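The constraints above can be checked before a request is sent. The following sketch validates `content` and `dataId` and serializes ServiceParameters to a JSON string. The function name `build_request` is hypothetical, and counting characters with Python's `len` is an assumption, since this topic does not say how the service counts them.

```python
import json
import re
from typing import Dict, Optional

MAX_CONTENT_CHARS = 600
# Letters, digits, underscores, hyphens, and periods; 1 to 64 characters.
DATA_ID_PATTERN = re.compile(r"[A-Za-z0-9_.\-]{1,64}")


def build_request(content: str, data_id: Optional[str] = None) -> Dict[str, str]:
    """Validate the inputs and return the two top-level request parameters."""
    if not content:
        raise ValueError("content is required")
    if len(content) > MAX_CONTENT_CHARS:
        raise ValueError(f"content exceeds {MAX_CONTENT_CHARS} characters")
    params = {"content": content}
    if data_id is not None:
        if not DATA_ID_PATTERN.fullmatch(data_id):
            raise ValueError("dataId has invalid characters or length")
        params["dataId"] = data_id
    return {
        "Service": "comment_multilingual_pro_cb",
        # ServiceParameters is passed as a JSON string, not a nested object.
        "ServiceParameters": json.dumps(params, ensure_ascii=False),
    }


print(build_request("testing content", "text0424-demo"))
```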

Response parameters

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| Code | Integer | 200 | The status code. For more information, see Code details. |
| Data | JSONObject | | The moderation result data. For more information, see Data. |
| Message | String | OK | The response message. |
| RequestId | String | AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE**** | The request ID. |

Table 2. Data

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| Result | JSONArray | | The moderation results, including risk labels, confidence scores, and other details. For more information, see Result. |
| DataId | String | text0424**** | The data ID of the moderated object. Note: This parameter is returned only if dataId is specified in the request. |
| RiskLevel | String | high | The risk level, determined by the configured risk score thresholds. Valid values: high (high risk; if the content matches a keyword in a custom keyword library, the risk level defaults to high), medium (medium risk), low (low risk), none (no risk detected). Note: We recommend handling high-risk content directly and manually reviewing medium-risk content. Process low-risk content only if a high recall rate is required; otherwise, it can be treated as risk-free. You can configure risk score thresholds in the Content Moderation console. |
| TranslatedContent | String | Translated text | The translated text content. Returned only if the text translation feature is enabled. Note: The text translation feature is currently supported only in the Singapore region. Configure it under detection rule management in the console. Enabling this feature incurs additional fees. For billing details, see Billing. |
| DetectedLanguage | String | en | The detected language. |
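The recommendation for handling each RiskLevel value could be encoded as a small routing function. This is a sketch only: the action names (`auto_handle`, `manual_review`, `pass`) and the `high_recall` flag are hypothetical, and sending unknown values to manual review is an assumption.

```python
def action_for(risk_level: str, high_recall: bool = False) -> str:
    """Map a RiskLevel value to a hypothetical downstream action."""
    level = risk_level.lower()
    if level == "high":
        return "auto_handle"      # handle high-risk content directly
    if level == "medium":
        return "manual_review"    # send medium-risk content to reviewers
    if level == "low":
        # Process low-risk content only when high recall is required.
        return "manual_review" if high_recall else "pass"
    if level == "none":
        return "pass"
    # Unknown value: fall back to manual review (assumption).
    return "manual_review"


for level in ("high", "medium", "low", "none"):
    print(level, "->", action_for(level))
```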

Table 3. Result

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| Label | String | political_xxx | The moderation label. A single piece of content may receive multiple labels and confidence scores. For a list of supported labels, see Internationalized labels. |
| Confidence | Float | 81.22 | The confidence score, ranging from 0 to 100. The value is precise to two decimal places. Some labels do not have a confidence score. |
| RiskWords | String | AA,BB,CC | The detected sensitive words. Multiple sensitive words are separated by commas. This parameter is not returned for some labels. |
| CustomizedHit | JSONArray | [{"LibName":"...","KeyWords":"..."}] | Returned if the content matches a keyword in a custom keyword library (when Label is customized). This parameter provides the name of the custom library and the matched custom keywords. For more information, see CustomizedHit. |
| Description | String | Suspected pornographic content | A description of the Label field. Important: This field provides a human-readable explanation of the Label and is subject to change. For automated processing, base your logic on the Label field, not this Description field. |
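To illustrate keying automated logic on Label rather than Description, the sketch below collects the labels from a Result array whose confidence meets a threshold. The function name `flagged_labels` and the default threshold of 80 are hypothetical, and treating entries without a confidence score as flagged is an assumption.

```python
from typing import Any, Dict, List


def flagged_labels(result: List[Dict[str, Any]], min_confidence: float = 80.0) -> List[str]:
    """Return the labels in a Result array that meet the confidence threshold."""
    labels = []
    for item in result:
        confidence = item.get("Confidence")
        # Some labels carry no confidence score; keep them (assumption).
        if confidence is None or confidence >= min_confidence:
            labels.append(item["Label"])
    return labels


sample = [
    {"Label": "inappropriate_profanity", "Confidence": 81.22, "Description": "may change"},
    {"Label": "pt_to_sites", "Confidence": 42.0},
    {"Label": "customized"},
]
print(flagged_labels(sample))
```

The Description value is never read, so a change to its wording does not affect the outcome.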

Table 4. CustomizedHit

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| LibName | String | Custom library 1 | The name of the custom keyword library. |
| KeyWords | String | Custom keyword 1,Custom keyword 2 | The matched custom keywords. Multiple keywords are separated by commas. |

Examples

Request example

{
    "Service": "comment_multilingual_pro_cb",
    "ServiceParameters": {
        "content": "testing content",
        "dataId": "text0424****"
    }
}

Response examples

System policy match:

{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Label": "political_entity",
                "Description": "Suspected political entity",
                "Confidence": 100.0,
                "RiskWords": "Word A,Word B,Word C"
            },
            {
                "Label": "political_figure",
                "Description": "Suspected political figure",
                "Confidence": 100.0,
                "RiskWords": "Word A,Word B,Word C"
            }
        ],
        "RiskLevel": "high",
        "DetectedLanguage": "en",
        "TranslatedContent": "Translated content",
        "DataId": "text0424****"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}

Custom keyword library match:

{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Description": "Matched a custom keyword library",
                "CustomizedHit": [
                    {
                        "LibName": "Custom keyword library name 1",
                        "KeyWords": "Custom keyword"
                    }
                ],
                "Confidence": 100,
                "Label": "customized"
            }
        ],
        "RiskLevel": "high",
        "DataId": "text0424****"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}

Code details

| Code | Status code | Description |
| --- | --- | --- |
| 200 | OK | The request was successful. |
| 400 | BAD_REQUEST | Invalid request. Check your request parameters. |
| 407 | NOT_SUPPORT | The language is not recognized or supported. |
| 408 | PERMISSION_DENY | Permission denied. This can occur if your account is not authorized, has an overdue payment, or if the service is inactive or blocked. |
| 500 | GENERAL_ERROR | A server-side error occurred. Retry the request. If the error persists, contact us through online support. |
| 581 | TIMEOUT | The request timed out. Retry the request. If the error persists, contact us through online support. |
| 588 | EXCEED_QUOTA | The request rate exceeds the QPS limit. |
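The codes above suggest a simple retry policy: retry on 500 and 581, and fail immediately on client-side errors such as 400. The sketch below also retries 588 after backing off, which is an assumption rather than documented guidance. `call_with_retry`, `ModerationError`, and the `send` callable (any function that returns the parsed response as a dict) are hypothetical names.

```python
import time
from typing import Any, Callable, Dict

# 500 and 581 are documented as retryable; retrying 588 is an assumption.
RETRYABLE_CODES = {500, 581, 588}


class ModerationError(Exception):
    """Raised when a request fails with a non-retryable or exhausted code."""

    def __init__(self, code: Any, message: Any):
        super().__init__(f"moderation request failed with code {code}: {message}")
        self.code = code


def call_with_retry(
    send: Callable[[], Dict[str, Any]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call send() until it returns Code 200, backing off exponentially."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        code = response.get("Code")
        if code == 200:
            return response
        if code not in RETRYABLE_CODES or attempt == max_attempts:
            raise ModerationError(code, response.get("Message"))
        # Wait 0.5 s, 1 s, 2 s, ... between attempts.
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A caller would wrap its own request function, for example `call_with_retry(lambda: my_client_call(request))`, where `my_client_call` stands for whatever function sends the request and parses the JSON body.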