The Text Moderation PLUS service, an upgrade from Text Moderation Enhanced Edition, moderates prompts and generated text from large language models separately, retrieves standard answers for specific prompts, and lets you enable or disable moderation labels. This topic describes how to use the Text Moderation PLUS service.
For content safety in large models, Alibaba Cloud offers its specialized AI Safety product. We recommend using AI Safety for text moderation in AI applications.
Features
Compared to the general text moderation service (Text Moderation Enhanced Edition), the Text Moderation PLUS service offers moderation for large language models to meet compliance and specific business needs.
| Item | Text Moderation PLUS | General text moderation |
|---|---|---|
| Use cases | Detects content in large language model applications: | Detects AI-generated content (AIGC): |
| Moderation capabilities | | |
| Label system | | |
| API features | The API operation is TextModerationPlus: | The API operation is TextModeration: |
Answer library management
You can manage answer libraries on the console. For more information, see the Content Moderation console.
- In the left-side navigation pane, choose .
- On the Answer Library Management tab, you can add or modify answer libraries and their answers.
- Click Create Answer Library and enter a name for the library. To add answers right away, choose Add Answers in Batches (enter one answer per line) or Upload File (Excel files are supported). Alternatively, select Create library first and add answers later. You can add up to 10,000 answers and create a maximum of 100 answer libraries per account. Each answer cannot exceed 1,000 characters.
- In the list of answer libraries, click manage in the Actions column to open the answer maintenance page.
- Click Add to add answers in batches.
- You can add, delete, or modify the library's answers. Use the search box to find specific answers. You can also select multiple answers and click Delete in Batches.
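The limits above can be checked locally before you paste a batch into the console. The following is a minimal sketch, not part of the product: the function name `validate_batch` is hypothetical, blank lines are assumed to be ignored, and Python's `len()` is assumed to count characters the same way the console does.

```python
from typing import List

MAX_ANSWER_CHARS = 1000   # per-answer limit stated above
MAX_ANSWERS = 10_000      # answer limit stated above


def validate_batch(text: str) -> List[str]:
    """Split batch input into answers (one per line) and check the stated limits."""
    # Assumption: blank lines are skipped rather than treated as empty answers.
    answers = [line.strip() for line in text.splitlines() if line.strip()]
    if len(answers) > MAX_ANSWERS:
        raise ValueError(f"too many answers: {len(answers)} > {MAX_ANSWERS}")
    for number, answer in enumerate(answers, start=1):
        if len(answer) > MAX_ANSWER_CHARS:
            raise ValueError(
                f"answer {number} has {len(answer)} characters, limit is {MAX_ANSWER_CHARS}"
            )
    return answers
```

Calling `validate_batch` on the text you intend to submit returns the cleaned list of answers, or raises `ValueError` naming the first answer that breaks a limit.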
Custom answer library configuration
You can configure answer libraries based on labels on the console. For more information, see the Content Moderation console.
- In the left-side navigation pane, choose .
- On the Rule Management tab, for the LLM input moderation (llm_query_moderation) scenario, click Manage Detection Rules in the Actions column.
- Select the detection type that you want to adjust, such as ad content detection.
- Click Edit, then modify the settings for Custom answer library configuration.
- In the Answer library selection column, select an existing answer library or click Add answer library to create a new one. You can associate up to three answer libraries with each label.
- Click Save to save the new custom answer library configuration. The new configuration takes effect in about one minute and applies to the production environment. You can view the answer library currently assigned to each label. In the Custom answer library configuration section, the table shows each label and its meaning. Use the drop-down list on the right to select an answer library.
Billing
The Text Moderation PLUS service supports two billing methods: pay-as-you-go and resource packages.
Pay-as-you-go
After you enable the Text Moderation PLUS service, the default billing method is pay-as-you-go. You are billed daily based on your actual usage. You are not charged for the service if you do not use it.
| Type | Service scenarios | Unit price |
|---|---|---|
| advanced text moderation (text_advanced) | | CNY 15.00 per 10,000 calls |
Resource package deduction
For high-volume or predictable usage, we recommend purchasing a resource package in advance. Larger resource packages come with greater discounts. You can purchase and use multiple resource packages simultaneously. For more information, see Purchase a resource package for Text Moderation Plus.
This resource package covers Text Moderation PLUS usage and cannot be shared with the Content Moderation 1.0 traffic package. The deduction factors are as follows:
| Type | Service scenarios | Deduction factor |
|---|---|---|
| advanced text moderation (text_advanced) | | The deduction factor is 2. This means that each successful API call deducts 2 calls from your resource package quota. For example, if you purchase a resource package with a quota of 10 calls and you make one successful API call, the system deducts 2 calls from your package, leaving a remaining quota of 8 calls. |
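The deduction arithmetic can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and flooring the remaining quota at zero is an assumption, because the source does not say what happens once a package is exhausted.

```python
DEDUCTION_FACTOR = 2  # deduction factor for text_advanced, from the table above


def remaining_quota(package_quota: int, successful_calls: int,
                    factor: int = DEDUCTION_FACTOR) -> int:
    """Quota left after `successful_calls`, each deducting `factor` calls."""
    used = successful_calls * factor
    # Assumption: the remaining quota never goes below zero.
    return max(package_quota - used, 0)


def calls_covered(package_quota: int, factor: int = DEDUCTION_FACTOR) -> int:
    """Number of successful API calls a package of this size can absorb."""
    return package_quota // factor
```

With the numbers from the example above, `remaining_quota(10, 1)` gives 8, and `calls_covered(10)` gives 5, so size a package at twice the number of calls you expect to make.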
Integration
Step 1: Activate the service
Visit Activate Service to activate the Text Moderation Enhanced Edition service.
Step 2: Grant permissions to a RAM user
Before using the SDK or an API, you must grant permissions to a RAM user. API calls require an AccessKey for authentication, which you can create for your Alibaba Cloud account or a RAM user. For instructions on creating an AccessKey, see Obtain an AccessKey.
Procedure
Log in to the RAM console with your Alibaba Cloud account.
Create a RAM user.
For detailed instructions, see Create a RAM user.
Attach the AliyunYundunGreenWebFullAccess system policy to the RAM user to grant permissions. For detailed instructions, see Manage RAM user permissions.
The RAM user can now call Content Moderation APIs.
Step 3: Install and integrate the SDK
The Content Moderation service is available in the following regions. For the SDK integration guide, see Integration Guide.
| Region | Public endpoint | VPC endpoint |
|---|---|---|
| China (Beijing) | green-cip.cn-beijing.aliyuncs.com | green-cip-vpc.cn-beijing.aliyuncs.com |
| China (Shanghai) | green-cip.cn-shanghai.aliyuncs.com | green-cip-vpc.cn-shanghai.aliyuncs.com |
| China (Hangzhou) | green-cip.cn-hangzhou.aliyuncs.com | green-cip-vpc.cn-hangzhou.aliyuncs.com |
| China (Shenzhen) | green-cip.cn-shenzhen.aliyuncs.com | green-cip-vpc.cn-shenzhen.aliyuncs.com |
| China (Chengdu) | green-cip.cn-chengdu.aliyuncs.com | N/A |
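When the region is a deployment setting, a small lookup avoids hard-coding hostnames. The sketch below copies the hostnames from the table above. The region IDs used as keys (for example `cn-beijing`) are inferred from the hostnames and are an assumption, as is the helper name `endpoint_for`.

```python
from typing import Dict, Optional, Tuple

# (public endpoint, VPC endpoint); None means no VPC endpoint is listed.
ENDPOINTS: Dict[str, Tuple[str, Optional[str]]] = {
    "cn-beijing": ("green-cip.cn-beijing.aliyuncs.com", "green-cip-vpc.cn-beijing.aliyuncs.com"),
    "cn-shanghai": ("green-cip.cn-shanghai.aliyuncs.com", "green-cip-vpc.cn-shanghai.aliyuncs.com"),
    "cn-hangzhou": ("green-cip.cn-hangzhou.aliyuncs.com", "green-cip-vpc.cn-hangzhou.aliyuncs.com"),
    "cn-shenzhen": ("green-cip.cn-shenzhen.aliyuncs.com", "green-cip-vpc.cn-shenzhen.aliyuncs.com"),
    "cn-chengdu": ("green-cip.cn-chengdu.aliyuncs.com", None),
}


def endpoint_for(region_id: str, use_vpc: bool = False) -> str:
    """Return the public or VPC hostname for a region, or raise ValueError."""
    if region_id not in ENDPOINTS:
        raise ValueError(f"unsupported region: {region_id}")
    public, vpc = ENDPOINTS[region_id]
    if not use_vpc:
        return public
    if vpc is None:
        raise ValueError(f"no VPC endpoint is listed for {region_id}")
    return vpc
```

Raising an error for China (Chengdu) with `use_vpc=True` surfaces the missing VPC endpoint at startup rather than at the first request.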
API
Usage
Call this API to create a text content detection task. To construct an HTTP request, see HTTPS native call. Alternatively, use a prebuilt request as described in getting started.
You can run this API directly in OpenAPI Explorer without the hassle of calculating a signature. After a successful API call, OpenAPI Explorer automatically generates an SDK code sample.
- API: TextModerationPlus
- Available regions and endpoints:

| Region | Public endpoint | VPC endpoint |
|---|---|---|
| China (Shanghai) | https://green-cip.cn-shanghai.aliyuncs.com | https://green-cip-vpc.cn-shanghai.aliyuncs.com |
| China (Beijing) | https://green-cip.cn-beijing.aliyuncs.com | https://green-cip-vpc.cn-beijing.aliyuncs.com |
| China (Hangzhou) | https://green-cip.cn-hangzhou.aliyuncs.com | https://green-cip-vpc.cn-hangzhou.aliyuncs.com |
| China (Shenzhen) | https://green-cip.cn-shenzhen.aliyuncs.com | https://green-cip-vpc.cn-shenzhen.aliyuncs.com |
| China (Chengdu) | https://green-cip.cn-chengdu.aliyuncs.com | N/A |
- Billing: This is a billable API. Metering and billing apply only to requests that return an HTTP status code of 200. Requests that return any other status code are not charged. For more information, see the billing overview.
QPS limit
Each user can make up to 100 calls per second (QPS) to this API. Calls that exceed this limit are throttled, which may affect your business. Plan your call rate accordingly.
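One way to stay under the limit is to throttle on the client side before sending. The sketch below is a generic sliding-window limiter, not something the service provides. The class name is hypothetical, and the source does not document how the service measures its window, so treat the 1-second rolling window as an assumption.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window` seconds."""

    def __init__(self, limit: int = 100, window: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else return False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False
```

A caller checks `try_acquire()` before each request and waits briefly or queues the request when it returns `False`. If several processes share one account, each needs a proportionally smaller `limit`.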
Request parameters
| Parameter | Type | Required | Example value | Description |
|---|---|---|---|---|
| Service | String | Yes | llm_query_moderation | |
| ServiceParameters | JSON string | Yes | | The parameter set for the specified moderation service. For field descriptions, see ServiceParameters. |
Table 1. ServiceParameters
| Parameter | Type | Required | Example value | Description |
|---|---|---|---|---|
| content | String | Yes | Text to moderate | The text to be moderated. The character limits for this content are as follows: |
| accountId | String | No | 13**** | The unique ID of the account. The text moderation engine uses this ID to consider context from previous requests with the same account ID. Note: Recommended for the |
| sessionId | String | No | 14**** | The ID of the streaming session. The text moderation engine concatenates text segments from the same session and moderates the combined content. The combined content cannot exceed the service's character limit. Note: Recommended for the |
| dataId | String | No | text0424**** | A unique identifier for your business data. The ID must be 64 characters or less and can contain letters, digits, underscores (_), hyphens (-), and periods (.). |
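The two parameter tables can be combined into a small request builder. This is a sketch under stated assumptions: the function name `build_request` is hypothetical, it only assembles the two top-level parameters and does not sign or send anything, and "letters" in the dataId rule is assumed to mean ASCII letters. Because the table gives the type of ServiceParameters as a JSON string, the inner fields are serialized with `json.dumps`.

```python
import json
import re
from typing import Dict, Optional

# dataId rule from the table: up to 64 characters; letters, digits, "_", "-", ".".
# Assumption: "letters" means ASCII letters.
DATA_ID_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def build_request(content: str,
                  service: str = "llm_query_moderation",
                  account_id: Optional[str] = None,
                  session_id: Optional[str] = None,
                  data_id: Optional[str] = None) -> Dict[str, str]:
    """Assemble the Service and ServiceParameters request parameters."""
    if not content:
        raise ValueError("content is required")
    params = {"content": content}
    if account_id is not None:
        params["accountId"] = account_id
    if session_id is not None:
        params["sessionId"] = session_id
    if data_id is not None:
        if not DATA_ID_PATTERN.fullmatch(data_id):
            raise ValueError("dataId must be 1-64 characters of letters, digits, '_', '-', '.'")
        params["dataId"] = data_id
    # ServiceParameters is typed as a JSON string, so serialize the inner object.
    return {"Service": service,
            "ServiceParameters": json.dumps(params, ensure_ascii=False)}
```

Validating `dataId` locally turns a malformed identifier into an immediate `ValueError` instead of a rejected request.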
Return parameters
| Parameter | Type | Example value | Description |
|---|---|---|---|
| Code | Integer | 200 | The status code. For details, see Code Description. |
| Data | JSONObject | {"Result":[...],"Advice":[...]} | The audit result data. For details, see the Data parameter. |
| Message | String | OK | The response message. |
| RequestId | String | AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE**** | The request ID. |
Table 2. Data
| Parameter | Type | Example value | Description |
|---|---|---|---|
| Result | JSONArray | [{"Confidence":100.0,"Label":"political_entity","RiskWords":"sensitive_word_1"},{...}] | Detection results, including risk labels and confidence scores. For details, see Result. |
| RiskLevel | String | high | The risk level, determined by the risk score thresholds you configure. Valid values: high, medium, low, none. Note: Handle high-risk content immediately. Review medium-risk content manually. Address low-risk content only if you need a high recall rate; otherwise, treat it as risk-free content. You can configure the risk score thresholds in the Content Moderation Console. |
| Advice | JSONArray | [{"Answer":"This is a standard answer"}] | The |
| DataId | String | text0424**** | The data ID of the object to moderate. Note: If you pass the |
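The handling guidance for RiskLevel can be expressed as a small dispatch function. This is a sketch, not a prescribed policy. The action names `block`, `review`, and `pass` are hypothetical, the level names are taken from the note and the response examples in this topic, and treating "address low-risk content" as manual review is an assumption.

```python
def action_for(risk_level: str, high_recall: bool = False) -> str:
    """Map a RiskLevel value to a handling action (action names are hypothetical)."""
    level = risk_level.lower()
    if level == "high":
        return "block"    # handle high-risk content immediately
    if level == "medium":
        return "review"   # review medium-risk content manually
    if level == "low":
        # Act on low-risk content only when a high recall rate is needed.
        return "review" if high_recall else "pass"
    if level == "none":
        return "pass"
    raise ValueError(f"unexpected RiskLevel: {risk_level!r}")
```

Raising on unknown values, rather than defaulting to `pass`, keeps an unexpected level from silently letting content through.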
Table 3. Result
| Parameter | Type | Example value | Description |
|---|---|---|---|
| Label | String | political_xxx | The label returned by text analysis. Multiple labels and scores may be detected. For a list of supported labels, see risk labels. |
| Confidence | Float | 81.22 | The confidence score, which ranges from 0 to 100, with a precision of up to two decimal places. Some labels do not have a confidence score. |
| RiskWords | String | AA,BB,CC | A comma-separated list of sensitive words detected in the content. This field may not be returned for all labels. |
| CustomizedHit | JSONArray | [{"LibName":"...","Keywords":"..."}] | If content matches a term in a custom library, the |
Table 4. CustomizedHit
| Parameter | Type | Example value | Description |
|---|---|---|---|
| LibName | String | custom library 1 | The custom library name. |
| Keywords | String | custom keyword 1, custom keyword 2 | A comma-separated list of custom keywords. |
Table 5. Advice
| Parameter | Type | Example value | Description |
|---|---|---|---|
| Answer | String | This is a standard answer. | The moderation service returns an alternative response in the following scenarios: Note: This applies only to the llm_query_moderation service. |
| HitLabel | String | political_xxx | The highest-risk label returned by text content moderation. For a list of supported labels, see risk labels. |
| HitLibName | String | Custom Alternative Response Library 001 | The name of the custom alternative response library. |
Example
Request example
{
    "Service": "llm_query_moderation",
    "ServiceParameters": {
        "content": "testing content"
    }
}
- Response when no risks are detected.
{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Label": "nonLabel"
            }
        ],
        "RiskLevel": "none"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
- Response when the prompt matches the mandatory and alternative response library.
{
    "Code": 200,
    "Data": {
        "Advice": [
            {
                "Answer": "This is a sample standard answer."
            }
        ],
        "Result": [
            {
                "Label": "political_entity",
                "Confidence": 100.0,
                "RiskWords": "word_A,word_B,word_C"
            },
            {
                "Label": "political_figure",
                "Confidence": 100.0,
                "RiskWords": "word_A,word_B,word_C"
            }
        ],
        "RiskLevel": "high"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
- Response when the prompt matches a user-defined rejection and alternative response library.
{
    "Code": 200,
    "Data": {
        "Advice": [
            {
                "HitLabel": "political_entity",
                "Answer": "This is a sample standard answer.",
                "HitLibName": "political_entity-001"
            }
        ],
        "Result": [
            {
                "Label": "political_entity",
                "Confidence": 100.0,
                "RiskWords": "word_A,word_B,word_C"
            },
            {
                "Label": "political_figure",
                "Confidence": 100.0,
                "RiskWords": "word_A,word_B,word_C"
            }
        ],
        "RiskLevel": "high"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
- Response when the prompt matches the system rejection and alternative response library.
{
    "Code": 200,
    "Data": {
        "Advice": [
            {
                "HitLabel": "political_entity",
                "Answer": "This is a sample standard answer."
            }
        ],
        "Result": [
            {
                "Label": "political_entity",
                "Confidence": 100.0,
                "RiskWords": "word_A,word_B,word_C"
            },
            {
                "Label": "political_figure",
                "Confidence": 100.0,
                "RiskWords": "word_A,word_B,word_C"
            }
        ],
        "RiskLevel": "high"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
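The response examples above share one shape, so a single parser can cover all of them. The sketch below is illustrative only: the `ModerationOutcome` type and `parse_response` function are hypothetical names, and treating `nonLabel` as "no risk label" is inferred from the no-risk example rather than stated in the tables.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModerationOutcome:
    risk_level: Optional[str]
    labels: List[str] = field(default_factory=list)
    answer: Optional[str] = None  # standard answer from Advice, if any


def parse_response(body: str) -> ModerationOutcome:
    """Extract risk level, risk labels, and the standard answer from a response body."""
    payload = json.loads(body)
    if payload.get("Code") != 200:
        raise RuntimeError(f"request failed: {payload.get('Code')} {payload.get('Message')}")
    data = payload.get("Data") or {}
    # Assumption: "nonLabel" marks the absence of a risk label, as in the no-risk example.
    labels = [item["Label"] for item in data.get("Result", [])
              if item.get("Label") and item["Label"] != "nonLabel"]
    # Advice is absent when no answer library is matched.
    advice = data.get("Advice") or []
    answer = next((entry["Answer"] for entry in advice if entry.get("Answer")), None)
    return ModerationOutcome(data.get("RiskLevel"), labels, answer)
```

For the no-risk example this yields an empty `labels` list and `answer=None`. For the three matched examples it yields both political labels and the sample standard answer, which an application could return to the user in place of the model's own reply.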
Code
| Code | Status code | Description |
|---|---|---|
| 200 | OK | The request succeeded. |
| 400 | BAD_REQUEST | The request is invalid because a request parameter is incorrect or missing. Check the parameters and retry. |
| 408 | PERMISSION_DENY | Permission denied. This can be due to an unauthorized account, overdue payments, an inactive service, or a disabled account. |
| 500 | GENERAL_ERROR | An internal server error occurred. This may be a temporary issue. Retry the request. If the issue persists, contact Support. |
| 581 | TIMEOUT | The request timed out. Retry the request. If the issue persists, contact Support. |
| 588 | EXCEED_QUOTA | The request rate exceeds your quota. |
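The code table suggests a simple retry policy: retry 500 and 581, and do not retry 400 or 408 without fixing the underlying problem. The sketch below encodes that policy. The function name, the exponential backoff schedule, and the decision to back off and retry on 588 are assumptions, not documented behavior.

```python
from typing import Optional

RETRYABLE_CODES = {500, 581}  # the table says to retry these
THROTTLED_CODES = {588}       # assumption: back off and retry when over quota


def next_delay(code: int, attempt: int,
               base: float = 0.5, cap: float = 8.0) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (0-based), or None to stop.

    Returns None for 200 (nothing to retry) and for 400 and 408, which need a
    fix to the request or the account rather than a retry.
    """
    if code in RETRYABLE_CODES or code in THROTTLED_CODES:
        # Exponential backoff, capped so repeated failures do not wait unboundedly.
        return min(cap, base * (2 ** attempt))
    return None
```

A caller would loop while `next_delay` returns a number, sleep for that long, and give up after a fixed number of attempts so that a persistent 500 or 581 is escalated to Support as the table recommends.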