Text Moderation Enhanced Edition has been upgraded to the PLUS service, allowing you to enable or disable moderation labels. This topic explains how to use the service.
Risk labels
Label description
Text Moderation Pro can return over 60 granular risk labels across 10 categories, each with a confidence score. If content contains multiple risks, the service can return multiple labels for a single request. The following table describes these risk labels, their confidence score ranges, and their meanings.
| Label type | Label value | Confidence score | Description |
| --- | --- | --- | --- |
| Text moderation risk labels | pornographic_adult | 0–100 | Indicates potentially pornographic content. |
| | sexual_terms | 0–100 | Indicates content related to sexual health. |
| | sexual_suggestive | 0–100 | Indicates potentially vulgar content. |
| | political_figure | 0–100 | Indicates content is potentially related to political figures. |
| | political_entity | 0–100 | Indicates content is potentially related to political entities. |
| | political_n | 0–100 | Indicates potentially sensitive political content. |
| | political_p | 0–100 | Indicates content is potentially related to prohibited political figures. |
| | political_a | 0–100 | Indicates enhanced protection is applied for political content. |
| | violent_extremist | 0–100 | Indicates content is potentially related to extremist organizations. |
| | violent_incidents | 0–100 | Indicates potentially extremist content. |
| | violent_weapons | 0–100 | Indicates content is potentially related to weapons and ammunition. |
| | contraband_drug | 0–100 | Indicates content is potentially related to drugs. |
| | contraband_gambling | 0–100 | Indicates content is potentially related to gambling. |
| | contraband_act | 0–100 | Indicates content is potentially related to illegal activities. |
| | contraband_entity | 0–100 | Indicates content is potentially related to illegal items. |
| | inappropriate_discrimination | 0–100 | Indicates potentially biased or discriminatory content. |
| | inappropriate_ethics | 0–100 | Indicates content with potentially harmful values. |
| | inappropriate_profanity | 0–100 | Indicates potentially insulting or abusive content. |
| | inappropriate_oral | 0–100 | Indicates potentially vulgar spoken content. |
| | inappropriate_superstition | 0–100 | Indicates potentially superstitious content. |
| | inappropriate_nonsense | 0–100 | Indicates potentially meaningless spam content. |
| | pt_to_sites | 0–100 | Indicates content that potentially diverts traffic to external sites. |
| | pt_by_recruitment | 0–100 | Indicates advertisements for online money-making schemes or part-time jobs. |
| | pt_to_contact | 0–100 | Indicates advertisements that divert traffic. |
| | religion_b | 0–100 | Indicates content is potentially related to Buddhism. |
| | religion_t | 0–100 | Indicates content is potentially related to Taoism. |
| | religion_c | 0–100 | Indicates content is potentially related to Christianity. |
| | religion_i | 0–100 | Indicates content is potentially related to Islam. |
| | religion_h | 0–100 | Indicates content is potentially related to Hinduism. |
| | ad_compliance | 0–100 | Indicates content that violates advertising laws. |
| | customized | 0–100 | Indicates a match with a keyword in a custom dictionary. |
| | nonLabel | Not applicable | Confirms that no risks were detected. |
| AIGC detection labels | aigc | 0–100 | Indicates text is potentially AI-generated. |
| | ugc | 0–100 | Indicates text is not AI-generated. |
| | nonLabel | Not applicable | Confirms that no risks were detected. |
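Many label values in the table share a prefix with related labels (for example, political_figure and political_entity), so downstream code may find it convenient to bucket results by that prefix. The following is a minimal sketch of that idea only. The prefix-based grouping and the names `label_category` and `STANDALONE_LABELS` are assumptions of this example, not something the service defines.

```python
# Hypothetical helper: buckets label values from the table above by prefix.
# The prefix convention is inferred from the label names, not documented.
STANDALONE_LABELS = {"customized", "nonLabel", "aigc", "ugc", "ad_compliance"}


def label_category(label: str) -> str:
    """Return a coarse category name for a label value."""
    if label in STANDALONE_LABELS:
        # These labels have no sibling labels, so keep them as their own bucket.
        return label
    prefix, _, _ = label.partition("_")
    return prefix


if __name__ == "__main__":
    for value in ("political_figure", "pt_to_sites", "nonLabel"):
        print(value, "->", label_category(value))
```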
Manage labels
You can enable or disable each risk label in the console. Some risk labels provide switches for more granular detection scopes. For more information, see the Content Moderation Console.
- In the left-side navigation pane, choose Content Moderation Pro > Text Moderation > Rule Configuration.
- On the Rule Management tab, for the llm_query_moderation service, click Modify Rules in the Actions column.
- Select the detection type you want to adjust, for example, Unwanted Content Detection.
- Click Edit, then modify the status of the detection item.
- Click Save. The new configuration takes about 2 to 5 minutes to take effect.
Enable a Service
On the Rule Configuration page, each Service is listed as Unused by default. You do not need to activate a Service; after configuring its detection rules, you can use it by making an API call.
- Log on to the Content Moderation Console.
- In the left-side navigation pane, choose Content Moderation Pro > Text Moderation > Rules.
- In the Service list, find the target Service, such as comment_detection_pro, and click Modify Rules in the Actions column.
- On the Detection Scope tab, enable or disable the required detection items, then click Save. The changes take effect in about 2 to 5 minutes.
- (Optional) To use a custom dictionary for keyword detection, return to the Service list and click Set Thesaurus in the Actions column.
- When making an API call, set the Service parameter to the corresponding Service name, such as comment_detection_pro.
Integration
Step 1: Activate the service
Visit Activate Service to activate the Content Moderation Enhanced Edition.
Step 2: Grant permissions to a RAM user
Before you use the SDK or call the API, you must grant permissions to a RAM user. To authenticate API calls, use an access key from either your Alibaba Cloud account or a RAM user. For more information, see Obtain an access key.
Grant permissions to a RAM user
- Log on to the RAM console using your Alibaba Cloud account.
- Create a RAM user. For details, see Create a RAM user.
- Grant the AliyunYundunGreenWebFullAccess system policy to the RAM user. This policy grants full access to Content Moderation. For details, see Manage RAM user permissions.
The RAM user can now call the Content Moderation API.
Step 3: Install and integrate the SDK
For the SDKs for the Content Moderation Enhanced Edition PLUS service, see SDKs and integration guide for the Content Moderation 2.0 PLUS service.
API
Usage
Call this operation to create a text moderation task. To learn how to build an HTTP request, see native HTTPS call. Alternatively, use a prebuilt HTTP request. For details, see getting started.
Use OpenAPI Explorer to run this API directly without having to calculate the signature. Upon a successful request, OpenAPI Explorer automatically generates SDK code examples.
- API: TextModerationPlus
- Regions and endpoints:

| Region | Public endpoint | Private endpoint | Supported services |
| --- | --- | --- | --- |
| China (Shanghai) | green-cip.cn-shanghai.aliyuncs.com | green-cip-vpc.cn-shanghai.aliyuncs.com | ugc_moderation_byllm_pro, ugc_moderation_byllm, nickname_detection_pro, chat_detection_pro, comment_detection_pro, ad_compliance_detection_pro, text_aigc_detector |
| China (Beijing) | green-cip.cn-beijing.aliyuncs.com | green-cip-vpc.cn-beijing.aliyuncs.com | |
| China (Hangzhou) | green-cip.cn-hangzhou.aliyuncs.com | green-cip-vpc.cn-hangzhou.aliyuncs.com | |
| China (Shenzhen) | green-cip.cn-shenzhen.aliyuncs.com | green-cip-vpc.cn-shenzhen.aliyuncs.com | |
| China (Chengdu) | green-cip.cn-chengdu.aliyuncs.com | N/A | |
| Singapore | green-cip.ap-southeast-1.aliyuncs.com | green-cip-vpc.ap-southeast-1.aliyuncs.com | comment_multilingual_pro_cb, ugc_moderation_byllm_cb |
| UK (London) | green-cip.eu-west-1.aliyuncs.com | N/A | comment_multilingual_pro_cb |
| US (Virginia) | green-cip.us-east-1.aliyuncs.com | green-cip-vpc.us-east-1.aliyuncs.com | |
| US (Silicon Valley) | green-cip.us-west-1.aliyuncs.com | N/A | |
| Germany (Frankfurt) | green-cip.eu-central-1.aliyuncs.com | green-cip-vpc.eu-central-1.aliyuncs.com | |

The UK (London) region reuses the console configuration of the Singapore region, and the US (Silicon Valley) region reuses that of the US (Virginia) region.
- Billing: This is a paid API. You are only charged for requests that return a 200 HTTP status code; requests that result in an error are not charged. For more information about our billing method, see the billing description.
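As an illustration of how a client might select an endpoint from the regions table, here is a small lookup sketch. The region IDs used as keys are read off the endpoint hostnames, and the names `ENDPOINTS` and `endpoint_for` are hypothetical. `None` marks regions whose private endpoint is listed as N/A.

```python
# Endpoints copied from the regions table; keys are the region IDs that
# appear inside the hostnames. None means no private endpoint is listed.
ENDPOINTS = {
    "cn-shanghai": ("green-cip.cn-shanghai.aliyuncs.com", "green-cip-vpc.cn-shanghai.aliyuncs.com"),
    "cn-beijing": ("green-cip.cn-beijing.aliyuncs.com", "green-cip-vpc.cn-beijing.aliyuncs.com"),
    "cn-hangzhou": ("green-cip.cn-hangzhou.aliyuncs.com", "green-cip-vpc.cn-hangzhou.aliyuncs.com"),
    "cn-shenzhen": ("green-cip.cn-shenzhen.aliyuncs.com", "green-cip-vpc.cn-shenzhen.aliyuncs.com"),
    "cn-chengdu": ("green-cip.cn-chengdu.aliyuncs.com", None),
    "ap-southeast-1": ("green-cip.ap-southeast-1.aliyuncs.com", "green-cip-vpc.ap-southeast-1.aliyuncs.com"),
    "eu-west-1": ("green-cip.eu-west-1.aliyuncs.com", None),
    "us-east-1": ("green-cip.us-east-1.aliyuncs.com", "green-cip-vpc.us-east-1.aliyuncs.com"),
    "us-west-1": ("green-cip.us-west-1.aliyuncs.com", None),
    "eu-central-1": ("green-cip.eu-central-1.aliyuncs.com", "green-cip-vpc.eu-central-1.aliyuncs.com"),
}


def endpoint_for(region_id: str, use_vpc: bool = False) -> str:
    """Return the public or private endpoint hostname for a region ID."""
    public, private = ENDPOINTS[region_id]
    if not use_vpc:
        return public
    if private is None:
        raise ValueError(f"no private endpoint is listed for {region_id}")
    return private


if __name__ == "__main__":
    print(endpoint_for("cn-shanghai"))
    print(endpoint_for("ap-southeast-1", use_vpc=True))
```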
QPS limit
This API is subject to a single-user QPS limit. Exceeding this limit triggers API throttling and can disrupt your service.
- AI-generated text detection (text_aigc_detector): 50 requests per second.
- LLM-based UGC text moderation service (ugc_moderation_byllm_pro, ugc_moderation_byllm, and ugc_moderation_byllm_cb): 50 requests per second.
- Other services: 100 requests per second.
The UGC text moderation large model service has a lower QPS limit than other services. If your request volume is high, implement traffic control to avoid exceeding this limit.
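One simple form of client-side traffic control is to space requests at least 1/N seconds apart, where N is the QPS limit for the service being called. The sketch below shows that approach. The `RateLimiter` class is hypothetical, is not thread-safe, and is only one of several ways to stay under the limit.

```python
import time


class RateLimiter:
    """Spaces calls at least 1 / max_per_second seconds apart (single-threaded)."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def acquire(self):
        """Block until the next request slot is available."""
        now = self._clock()
        wait = self._next_allowed - now
        if wait > 0:
            self._sleep(wait)
            now = self._next_allowed
        # Reserve the following slot one interval after this one.
        self._next_allowed = now + self._interval


if __name__ == "__main__":
    limiter = RateLimiter(50)  # 50 requests per second, as for text_aigc_detector
    for _ in range(3):
        limiter.acquire()
        print("send request at", time.monotonic())
```

Call `acquire()` immediately before each API request. For multi-threaded or multi-process clients, a shared limiter with locking would be needed instead.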
Request parameters
| Parameter | Type | Required | Example value | Description |
| --- | --- | --- | --- | --- |
| Service | String | Yes | comment_detection_pro | Note: For details on the multilingual detection service for international business, see Content Moderation Enhanced V2.0 Multilingual PLUS Service. |
| ServiceParameters | JSONString | Yes | | The required parameter set for the moderation service, formatted as a JSON string. See the ServiceParameters table for parameter descriptions. |
Table 1. ServiceParameters
| Parameter | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| content | String | Yes | Text to moderate | The text to moderate. The character limit varies by service: |
| dataId | String | No | text0424**** | A unique data ID for the content to be moderated. This ID can contain uppercase and lowercase letters, digits, underscores (_), hyphens (-), and periods (.), and must not exceed 64 characters. |
| accountId | String | No | ID0728**** | A unique account ID that identifies an end user. The platform uses this ID for record-keeping. For example, in a chat between User A and User B, pass User A's ID when moderating User A's text, and pass User B's ID when moderating User B's text. Note: The account ID can be used for context-aware moderation. To enable this feature, contact your business representative or submit a ticket. |
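Because ServiceParameters is passed as a JSON string rather than a nested object, a client typically serializes the inner parameters before sending. The following sketch assembles the two request parameters and checks dataId against the character rules stated in the table. The function name `build_request` and the choice to validate on the client side are assumptions of this example.

```python
import json
import re

# dataId rule from the table: letters, digits, "_", "-", ".", at most 64 characters.
DATA_ID_PATTERN = re.compile(r"[A-Za-z0-9_.\-]{1,64}")


def build_request(service, content, data_id=None, account_id=None):
    """Return the request parameters with ServiceParameters as a JSON string."""
    if data_id is not None and not DATA_ID_PATTERN.fullmatch(data_id):
        raise ValueError("dataId contains unsupported characters or is too long")
    params = {"content": content}
    if data_id is not None:
        params["dataId"] = data_id
    if account_id is not None:
        params["accountId"] = account_id
    return {
        "Service": service,
        # ensure_ascii=False keeps non-ASCII text readable in the payload.
        "ServiceParameters": json.dumps(params, ensure_ascii=False),
    }


if __name__ == "__main__":
    print(build_request("comment_detection_pro", "testing content", data_id="text0424"))
```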
Return parameters
| Parameter | Type | Example value | Description |
| --- | --- | --- | --- |
| Code | Integer | 200 | The status code. See Code Description. |
| Data | JSONObject | {"Result":[...]} | The moderation result data. For details, see Data. |
| Message | String | OK | The response message. |
| RequestId | String | AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE**** | The request ID. |
Table 2. Data
| Parameter | Type | Value | Description |
| --- | --- | --- | --- |
| Result | JSONArray | | The detection results, including risk tags and confidence scores. For more information, see Result. |
| dataId | String | text0424**** | The data ID of the detection object. Note: If you passed |
| accountId | String | ID0728**** | The account ID. Note: If you passed |
| riskLevel | String | high | The risk level, determined by the risk score thresholds you configure. Possible values include: Note: Handle high-risk content immediately. Send medium-risk content for manual review. Handle low-risk content only when you need high recall. Otherwise, treat it as content with no risk. You can configure score thresholds in the Content Security console. |
| manualTaskId | String | m_tx_042407280307*** | The manual review task ID. Use it to query the manual review result. This field is returned only if you enable human-machine review and the content meets the criteria for manual review. For configuration details, see Human-Machine Review Service Configuration. |
| Ext | Object | | Supplemental information for text moderation. For more information, see Ext. |
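The handling guidance in the riskLevel note can be expressed as a small dispatch function. This is a sketch under stated assumptions: only the value "high" appears in this topic, so the strings "medium", "low", and "none" are assumed here, and the action names ("block", "manual_review", "pass") and the function `action_for` are hypothetical.

```python
def action_for(risk_level: str, high_recall: bool = False) -> str:
    """Map a risk level to a handling action, following the riskLevel note."""
    if risk_level == "high":
        return "block"  # handle high-risk content immediately
    if risk_level == "medium":
        return "manual_review"  # send medium-risk content for manual review
    if risk_level == "low":
        # Low-risk content is handled only when high recall is needed.
        return "manual_review" if high_recall else "pass"
    if risk_level == "none":  # assumed value for content with no risk
        return "pass"
    raise ValueError(f"unexpected risk level: {risk_level!r}")


if __name__ == "__main__":
    for level in ("high", "medium", "low", "none"):
        print(level, "->", action_for(level))
```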
Table 3. Result
| Parameter | Type | Value | Description |
| --- | --- | --- | --- |
| Label | String | political_xxx | The label returned by the text moderation service. The service may return multiple labels and confidence scores. For a list of supported labels, see the Risk Labels section. |
| Description | String | Suspected pornographic content | A human-readable description of the Important: This field is for informational purposes only and is subject to change. For automated processing, use the |
| Confidence | Float | 81.22 | The confidence score for the detected label. The value ranges from 0 to 100 and is accurate to two decimal places. Some labels may not have a confidence score. |
| RiskWords | String | AA,BB,CC | The detected risk words. Multiple words are separated by commas. This field is not returned for all labels. |
| CustomizedHit | JSONArray | [{"LibName":"...","Keywords":"..."}] | If content matches a term in a custom library, the |
| RiskPositions | JSONArray | | The positions of the detected risk words in the text. For more information, see RiskPositions. |
Table 4. CustomizedHit
| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| LibName | String | Custom Library 1 | The name of the custom library. |
| Keywords | String | Custom Keyword 1,Custom Keyword 2 | Custom keywords, separated by commas. |
Table 5. Extension fields
| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| LlmContent | Object | | The detection results from the large language model. For more information, see LlmContent. |
Table 6. LlmContent
| Parameter | Type | Value | Description |
| --- | --- | --- | --- |
| OutputText | String | Suspected abusive or insulting content | Raw output from the text moderation large language model. |
Table 7. RiskPositions
| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| RiskWord | String | AA | The detected risk word. |
| StartPos | Integer | 10 | The start position of the RiskWord. |
| EndPos | Integer | 12 | The end position of the RiskWord. |
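One use for RiskPositions is to mark the flagged spans in the original text for reviewers. The sketch below does this with plain string slicing. This topic does not state whether offsets are zero-based or whether EndPos is inclusive, so the code assumes zero-based offsets with an exclusive EndPos. Verify that assumption against real responses before relying on it. The function name `mark_risk_words` is hypothetical, and overlapping spans are not handled.

```python
def mark_risk_words(text, risk_positions, open_mark="[", close_mark="]"):
    """Wrap each reported span in markers.

    Assumes zero-based offsets and an exclusive EndPos (not confirmed by the docs).
    """
    # Work from the rightmost span so earlier offsets remain valid after insertion.
    ordered = sorted(risk_positions, key=lambda p: p["StartPos"], reverse=True)
    for pos in ordered:
        start, end = pos["StartPos"], pos["EndPos"]
        text = text[:start] + open_mark + text[start:end] + close_mark + text[end:]
    return text


if __name__ == "__main__":
    sample = [{"RiskWord": "bad", "StartPos": 6, "EndPos": 9}]
    print(mark_risk_words("hello bad world", sample))
```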
Example
Example request
{
    "Service": "comment_detection_pro",
    "ServiceParameters": {
        "content": "testing content",
        "dataId": "text0424****"
    }
}
Response examples:
- System policy match:
{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Label": "political_entity",
                "Description": "Suspected political entity",
                "Confidence": 100.0,
                "RiskWords": "wordA,wordB",
                "RiskPositions": [
                    {
                        "EndPos": 14,
                        "RiskWord": "wordA",
                        "StartPos": 12
                    }
                ]
            },
            {
                "Label": "political_figure",
                "Description": "Suspected political figure",
                "Confidence": 100.0,
                "RiskWords": "wordB,wordC",
                "RiskPositions": [
                    {
                        "EndPos": 20,
                        "RiskWord": "wordB",
                        "StartPos": 18
                    }
                ]
            }
        ],
        "RiskLevel": "high",
        "DataId": "text0424****"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
- Custom library match:
{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Description": "Custom library match",
                "CustomizedHit": [
                    {
                        "LibName": "Custom library name 1",
                        "KeyWords": "custom keyword"
                    }
                ],
                "Confidence": 100,
                "Label": "customized"
            }
        ],
        "RiskLevel": "high",
        "DataId": "text0424****"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
- Response from large model:
{
    "Code": 200,
    "Data": {
        "Ext": {
            "LlmContent": {
                "OutputText": "Suspected abusive or insulting content"
            }
        },
        "Result": [
            {
                "Confidence": 100.0,
                "CustomizedHit": null,
                "Description": "Suspected abusive or insulting content",
                "Label": "inappropriate_profanity",
                "RiskWords": "violatingWord1,violatingWord2"
            }
        ],
        "RiskLevel": "high"
    },
    "Message": "OK",
    "RequestId": "12345-ABCDE-XXXXX-66666"
}
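To show how the response examples might be consumed, the sketch below parses a response body and collects the labels, confidence scores, and risk words. It uses the key names exactly as they appear in the examples (`RiskLevel`, `RiskWords`, and so on). The function name `summarize` and the `min_confidence` filter are assumptions of this example.

```python
import json


def summarize(response_text, min_confidence=0.0):
    """Return (risk_level, hits) extracted from a response body."""
    body = json.loads(response_text)
    if body.get("Code") != 200:
        raise RuntimeError(f"request failed: {body.get('Code')} {body.get('Message')}")
    data = body.get("Data") or {}
    hits = []
    for item in data.get("Result") or []:
        confidence = item.get("Confidence")  # some labels may have no score
        if confidence is not None and confidence < min_confidence:
            continue
        # RiskWords is a comma-separated string and may be absent.
        words = [w for w in (item.get("RiskWords") or "").split(",") if w]
        hits.append({"label": item.get("Label"), "confidence": confidence, "risk_words": words})
    return data.get("RiskLevel"), hits


if __name__ == "__main__":
    sample = json.dumps({
        "Code": 200,
        "Data": {
            "Result": [{"Label": "inappropriate_profanity", "Confidence": 100.0,
                        "RiskWords": "violatingWord1,violatingWord2"}],
            "RiskLevel": "high",
        },
        "Message": "OK",
    })
    print(summarize(sample))
```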
Code
| Code | Status code | Description |
| --- | --- | --- |
| 200 | OK | The request succeeded. |
| 400 | BAD_REQUEST | The request is invalid. This may be caused by incorrect request parameters. Please check them and try again. |
| 403 | PERMISSION_DENY | This error occurs if your account lacks the necessary permissions, has an overdue payment, is not enabled for the service, or is suspended. |
| 500 | GENERAL_ERROR | This may be a temporary server error. Retry the request. If the error persists, contact Online Support. |
| 581 | TIMEOUT | The request timed out. Retry the request. If the error persists, contact Online Support. |
| 588 | EXCEED_QUOTA | The request rate exceeds the quota. |
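The table suggests retrying on 500 and 581. A client might encode that as a small retry policy, sketched below. Treating 588 as retryable after a delay, the exponential backoff schedule, and the names `retry_delay`, `RETRYABLE_CODES`, and `THROTTLED_CODE` are all assumptions of this example rather than documented behavior.

```python
RETRYABLE_CODES = {500, 581}  # the table recommends retrying these
THROTTLED_CODE = 588          # assumption: back off and retry when throttled


def retry_delay(code, attempt, base=0.5, cap=8.0):
    """Return seconds to wait before retry number `attempt` (0-based), or None."""
    if code in RETRYABLE_CODES or code == THROTTLED_CODE:
        # Exponential backoff: base, 2*base, 4*base, ... limited to `cap`.
        return min(cap, base * (2 ** attempt))
    return None  # 200 needs no retry; 400 and 403 will not succeed on retry


if __name__ == "__main__":
    for code in (200, 400, 500, 581, 588):
        print(code, retry_delay(code, attempt=1))
```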