Document Moderation 2.0 detects risks and violations in common documents by using asynchronous moderation. This topic describes the operations that Document Moderation 2.0 provides.
Access guidelines
Register an Alibaba Cloud account: Register now.
Activate the pay-as-you-go billing method for Content Moderation: Make sure that the Content Moderation 2.0 service is activated. For more information, see Activate service. Activation is free. After you call API operations, the billing system charges you based on usage.
Create an AccessKey pair: Make sure that you have created an AccessKey pair as a Resource Access Management (RAM) user. For more information, see Create AccessKey. To use an AccessKey pair belonging to a RAM user, use your Alibaba Cloud account to grant the AliyunYundunGreenWebFullAccess permission to the RAM user. For more information, see RAM authorization.
Use SDKs: For more information, see Document Moderation 2.0 SDK and integration guide.
Submit a moderation task
Usage notes
Business operation: FileModeration. Only asynchronous moderation is supported.
Supported regions and access addresses:
| Region | Public network access address | Internal network access address | Supported services |
|---|---|---|---|
| Singapore | green-cip.ap-southeast-1.aliyuncs.com | green-cip-vpc.ap-southeast-1.aliyuncs.com | document_detection_global |
Billing: This operation is chargeable and billed by the number of pages processed in the document.
Moderation object: Common documents are supported.
Result delivery: Moderation results are not returned in real time. Retrieve them by polling or by enabling callback notification. Results are retained for up to 24 hours.
Callback notification: Specify a callback URL in the callback parameter when submitting the moderation task.
Polling: Leave the callback parameter blank and call the result query operation after submission.
Document requirements:
Supported protocols: HTTP and HTTPS.
Supported formats: DOC, DOCX, PPT, PPTX, PPS, PPSX, PDF, XLS, XLSX, XLTX, XLTM, HTML, and TXT (UTF-8 encoding).
Size limit: 200 MB per document. Compress or split documents that exceed this limit.
Moderation time depends on document download time. Use a stable and reliable storage service such as Alibaba Cloud OSS.
Rule configuration: Configure Document Moderation rules in the Content Moderation console before making your first call. Without this configuration, Document Moderation 2.0 defaults to standard settings.
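The document requirements above can be checked on the client before a task is submitted. The following is a minimal sketch, not part of the official SDK: the function name `check_document` and its return shape are hypothetical, the format list is copied from the supported formats above, and 200 MB is assumed to mean 200 × 1024 × 1024 bytes.

```python
import posixpath
from urllib.parse import urlparse

# Formats and size limit as listed in the document requirements above.
SUPPORTED_FORMATS = {
    "doc", "docx", "ppt", "pptx", "pps", "ppsx", "pdf",
    "xls", "xlsx", "xltx", "xltm", "html", "txt",
}
MAX_SIZE_BYTES = 200 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_document(url, size_bytes=None, doc_type=None):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme.lower() not in ("http", "https"):
        problems.append("protocol must be HTTP or HTTPS")

    # Prefer the filename extension; fall back to an explicit docType value.
    ext = posixpath.splitext(parsed.path)[1].lstrip(".").lower()
    fmt = ext or (doc_type or "").lower()
    if not fmt:
        problems.append("no filename extension; a docType value is needed")
    elif fmt not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt}")

    # The size is optional because it is often unknown for remote files.
    if size_bytes is not None and size_bytes > MAX_SIZE_BYTES:
        problems.append("document exceeds 200 MB")
    return problems
```

A check like this only catches obvious problems early; the service remains the authority on whether a document is accepted.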
QPS limit
You can call this operation up to 100 times per second per account. The system supports a maximum of 20 concurrent moderation tasks. Requests that exceed this limit are dropped, which may interrupt your service. Note this limit when calling this operation.
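One way to stay under the concurrency limit is to track open tasks on the client. The sketch below assumes that a "concurrent moderation task" is a task that has been submitted but whose final result has not yet been received; the class `TaskSlots` and its method names are hypothetical and not part of any SDK.

```python
import threading

MAX_CONCURRENT_TASKS = 20  # limit stated above


class TaskSlots:
    """Tracks open moderation tasks so that at most `limit` are in flight."""

    def __init__(self, limit=MAX_CONCURRENT_TASKS):
        self._limit = limit
        self._lock = threading.Lock()
        self._open = set()

    def try_open(self, task_key):
        """Reserve a slot before submitting; False means wait and retry later."""
        with self._lock:
            if task_key in self._open or len(self._open) >= self._limit:
                return False
            self._open.add(task_key)
            return True

    def close(self, task_key):
        """Free the slot once the final result (or a failure) is known."""
        with self._lock:
            self._open.discard(task_key)

    def open_count(self):
        with self._lock:
            return len(self._open)
```

A caller would reserve a slot with `try_open` before calling FileModeration and call `close` when the callback arrives or polling finishes.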
Debugging
Use Alibaba Cloud OpenAPI (Document Moderation enhanced edition) to debug the operation online, view sample code and SDK dependencies, and explore operation parameters.
Before calling the Content Moderation API, log on to the Content Moderation console using your Alibaba Cloud account. Fees incurred by calling the operations are billed to that account.
Request parameters
| Name | Type | Required | Example | Description |
|---|---|---|---|---|
| Service | String | Yes | document_detection_global | The moderation service type. Valid values: document_detection_global (General Document Moderation). |
| ServiceParameters | JSONString | Yes | | The parameters required by the moderation service, as a JSON string. For descriptions of each field, see ServiceParameters. |
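Because ServiceParameters has the type JSONString, the nested fields are serialized before they are placed in the request. The following sketch assumes that the field is sent as a serialized string, as its type suggests; the helper name `build_request` is hypothetical.

```python
import json


def build_request(service, service_parameters):
    """Return the two top-level fields, with ServiceParameters as a JSON string."""
    return {
        "Service": service,
        # json.dumps turns the nested dict into the string form the table describes.
        "ServiceParameters": json.dumps(service_parameters, ensure_ascii=False),
    }


request = build_request(
    "document_detection_global",
    {"url": "http://www.aliyundoc.com/a.pdf"},
)
```

The official SDKs may perform this serialization for you, so check the SDK guide before adding it to your own code.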
Table 1. ServiceParameters
| Name | Type | Required | Example | Description |
|---|---|---|---|---|
| url | String | Yes* | http://www.aliyundoc.com/a.pdf | The URL of the document to moderate. The URL must be accessible over the public network. Maximum length: 2,048 characters. The URL cannot contain Chinese characters, and only one URL is allowed per request. |
| ossBucketName | String | No* | bucket_0307 | The name of the authorized OSS bucket. Before using OSS intranet addresses, use your Alibaba Cloud account to complete authorization on the Cloud Resource Access Authorization page. |
| ossObjectName | String | No* | 20240307/07/28/test.pdf | The name of the object in the authorized OSS bucket. |
| ossRegionId | String | No* | cn-shanghai | The region of the OSS bucket. |
| docType | String | No | | The document format, required when the URL points to a file without a filename extension. Valid values: doc, docx, ppt, pptx, pps, ppsx, xls, xlsx, xltx, xltm, xlsb, xlsm, csv, pdf, html, txt. Note: For txt files, only text content is moderated; image content is not moderated by screenshot. Extract text from txt files and call the Text Moderation 2.0 service instead. |
| callback | String | No | http://www.aliyundoc.com | The callback URL for moderation result notification. Supports HTTP and HTTPS. If left blank, poll for results. The callback endpoint must support POST requests with UTF-8 encoding and accept the checksum and content form parameters. Content Moderation populates these parameters as follows. checksum: a string in UID + seed + content format, signed using the SHA256 algorithm, where UID is the Alibaba Cloud account ID (query it in the Alibaba Cloud Management Console). Verify the checksum on your server to detect data tampering. Note: The UID must be the Alibaba Cloud account ID, not a RAM user ID. content: a JSON-encoded string; parse it to retrieve the moderation result. For the content format, see the sample success responses of the result query operation. Note: If your server successfully receives a callback notification, return HTTP 200. Otherwise, Content Moderation retries the notification up to 16 times, then stops. Check the callback URL status if notifications are not received. |
| seed | String | No | abc**** | A random string used to generate the callback notification signature. Maximum length: 64 characters. Allowed characters: letters, digits, and underscores (_). Required if callback is set. |
| cryptType | String | No | SHA256 | The signing algorithm for callback notification content. Valid values: SHA256 (default): signs using the SHA256 algorithm. SM3: signs using the HMAC-SM3 algorithm and returns a hexadecimal lowercase string (for example, 66c7f0f462eeedd9d1f2d46bdc10e4e24167c4875cf2f7a2297da02b8f4ba8e0). |
| dataId | String | No | fileId**** | The ID of the object to moderate. Maximum length: 128 characters. Allowed characters: letters, digits, underscores (_), hyphens (-), and periods (.). |
| referer | String | No | www.aliyun.com | The Referer request header, used for hotlink protection. Maximum length: 256 characters. |
*The url, ossBucketName/ossObjectName/ossRegionId (OSS authorization), and local document upload (via SDK) are three mutually exclusive input methods. Choose one. For local document upload code examples, see Document Moderation 2.0 SDK and access guide.
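The checksum described for the callback parameter can be verified on the receiving server. The sketch below covers only the default SHA256 case and rests on several assumptions that the table does not spell out: the three values are concatenated directly with no separator, the string is encoded as UTF-8, and the checksum is delivered as a hexadecimal digest. The function names are hypothetical.

```python
import hashlib
import hmac


def expected_checksum(uid, seed, content):
    """SHA256 hex digest of UID + seed + content (assumed plain concatenation)."""
    payload = (uid + seed + content).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify_callback(uid, seed, content, checksum):
    """Return True when the received checksum matches the locally computed one."""
    expected = expected_checksum(uid, seed, content)
    # compare_digest avoids leaking where the two values differ via timing.
    return hmac.compare_digest(expected.encode("ascii"), checksum.lower().encode("utf-8"))
```

Here `uid` is the Alibaba Cloud account ID, `seed` is the value sent in the seed parameter, and `content` is the raw content form field exactly as received, before JSON parsing. If cryptType is set to SM3, a different digest is needed, which this sketch does not implement.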
Response parameters
| Name | Type | Example | Description |
|---|---|---|---|
| Code | Integer | 200 | The status code, consistent with HTTP status codes. For details, see Code description. |
| Data | JSONObject | | The moderation result data. |
| Data.TaskId | String | AAAAA-BBBBB | The task ID. |
| Message | String | OK | The response message. |
| RequestId | String | ABCD1234-1234-1234-1234-123**** | The request ID. |
Examples
Sample requests
{
"Service": "document_detection_global",
"ServiceParameters":
{
"url": "http://www.aliyundoc.com/a.pdf",
"dataId": "fileId-2024-0307-0728***"
}
}
Sample success responses
{
"Message": "OK",
"Code": 200,
"Data":
{
"TaskId": "AAAAA-BBBBB-CCCCCCCC"
},
"RequestId": "ABCD1234-1234-1234-1234-123****"
}
Obtain Document Moderation task results
Usage notes
Business operation: DescribeFileModerationResult. Retrieves Document Moderation task results.
Billing: This operation is free of charge.
Query timing: Query moderation results at least 30 seconds after submitting an asynchronous moderation request. Results are retained for up to 24 hours and are automatically deleted after that period.
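The query timing above suggests a simple polling loop: wait 30 seconds, then query until the code changes from 280 to 200. The following is a sketch under stated assumptions: `query` is a hypothetical callable that wraps DescribeFileModerationResult and returns the parsed response as a dict, and the 10-second interval and 30-attempt cap are illustrative values, not documented ones.

```python
import time


def poll_result(query, task_id, initial_delay=30.0, interval=10.0,
                max_attempts=30, sleep=time.sleep):
    """Poll until the task completes (Code 200) or an error code is returned."""
    sleep(initial_delay)  # results should not be queried earlier than this
    for attempt in range(max_attempts):
        response = query(task_id)
        code = int(response.get("Code") or 0)
        if code == 200:
            return response  # moderation complete
        if code != 280:
            raise RuntimeError(
                f"query failed with code {code}: {response.get('Message')}"
            )
        # Code 280: still in progress, so wait before the next query.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(f"task {task_id} still in progress after {max_attempts} queries")
```

The `sleep` argument exists so that the loop can be exercised without real delays; in production the default `time.sleep` applies.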
QPS limit
You can call this operation up to 100 times per second per account. Exceeding this limit triggers throttling, which may affect your service. Note this limit when calling this operation.
Debugging
Use Alibaba Cloud OpenAPI to debug the operation online, view sample code and SDK dependencies, and explore operation parameters.
Request parameters
| Name | Type | Required | Example | Description |
|---|---|---|---|---|
| Service | String | Yes | document_detection_global | The moderation service type. Must match the service type used when submitting the task. |
| ServiceParameters | JSONString | Yes | | The parameters required by the moderation service, as a JSON string. For descriptions of each field, see ServiceParameters. |
Table 1. ServiceParameters
| Name | Type | Required | Example | Description |
|---|---|---|---|---|
| taskId | String | Yes | abcd**** | The task ID to query. One task ID per request. Obtain the task ID from the response to the submit operation. |
Response parameters
| Name | Type | Example | Description |
|---|---|---|---|
| RequestId | String | ABCD1234-1234-1234-1234-123**** | The request ID, used to locate and troubleshoot issues. |
| Data | Object | | The document moderation results. For details, see Data. |
| Code | Integer | 200 | The status code, consistent with HTTP status codes. For details, see Code description. |
| Message | String | OK | The response message. |
Table 2. Data
| Name | Type | Example | Description |
|---|---|---|---|
| DataId | String | fileId**** | The ID of the moderated object. Returned only if dataId was specified in the request. |
| Url | String | http://www.aliyundoc.com/a.docx | The URL of the moderated object. |
| DocType | String | | The document format specified for files without a filename extension. Valid values: doc, docx, ppt, pptx, pps, ppsx, xls, xlsx, xltx, xltm, xlsb, xlsm, csv, pdf, html, txt. |
| PageSummary | Object | | A summary of the moderation results. For details, see PageSummary. |
| RiskLevel | String | high | The overall risk level, calculated from both image and text moderation results. Valid values: high (handle directly), medium (manual review recommended), low (handle when more risky content is detected), none (no risk detected; handle based on business requirements). Configure risk score thresholds in the Content Moderation console. |
| PageResult | JSONArray | | Per-page moderation results. A Code value of 280 indicates that moderation is in progress (partial results are returned); 200 indicates that moderation is complete. For details, see PageResult. |
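The RiskLevel values in Table 2 map naturally to handling decisions. The sketch below is one possible mapping; the action names (`block`, `manual_review`, `monitor`, `pass`) are hypothetical placeholders for your own business logic, and treating a missing or unrecognized level as `manual_review` is a conservative assumption.

```python
# Hypothetical action names; replace them with your own handling logic.
RISK_ACTIONS = {
    "high": "block",            # handle directly
    "medium": "manual_review",  # manual review recommended
    "low": "monitor",           # handle when more risky content is detected
    "none": "pass",             # no risk detected
}


def action_for(data):
    """Pick an action from the Data object of a result query response."""
    level = data.get("RiskLevel")
    # Missing or unknown levels fall back to manual review.
    return RISK_ACTIONS.get(level, "manual_review")
```

For example, `action_for({"RiskLevel": "high"})` returns `"block"`.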
Table 3. PageSummary
| Name | Type | Example | Description |
|---|---|---|---|
| PageSum | Integer | 10 | The total number of pages moderated. |
| ImageSummary | Object | | A summary of image moderation results. Not present for txt files. For details, see ImageSummary. |
| TextSummary | Object | | A summary of text moderation results. For details, see TextSummary. |
Table 4. ImageSummary
| Name | Type | Example | Description |
|---|---|---|---|
| RiskLevel | String | high | The image risk level, based on configured risk score thresholds. Valid values: high, medium, low, none. |
| ImageLabels | JSONArray | | A summary of image labels. For details, see ImageLabels. |
Table 5. ImageLabels
| Name | Type | Example | Description |
|---|---|---|---|
| Label | String | violent_explosion | The image risk label. For details, see Risk label interpretation table. |
| LabelSum | Integer | | The number of occurrences of the label. |
| Description | String | Fireworks content | A description of the label. This field is informational and may change. Base result processing on the Label field, not this field. |
Table 6. TextSummary
| Name | Type | Example | Description |
|---|---|---|---|
| RiskLevel | String | high | The text risk level. Valid values: high, medium, low, none. |
| TextLabels | JSONArray | | A summary of text labels. For details, see TextLabels. |
Table 7. TextLabels
| Name | Type | Example | Description |
|---|---|---|---|
| Label | String | violent_explosion | The text risk label. |
| LabelSum | Integer | | The number of times the label was matched. |
Table 8. PageResult
| Name | Type | Example | Description |
|---|---|---|---|
| PageNum | Integer | 50 | The page number of the document. |
| ImageUrl | String | http://oss.aliyundoc.com/a.png | The URL of the screenshot for the current page. |
| ImageResult | JSONArray | | Image moderation results for the current page. Not present for txt files. For details, see ImageResult. |
| TextResult | JSONArray | | Text moderation results for the current page. For details, see TextResult. |
Table 9. ImageResult
| Name | Type | Example | Description |
|---|---|---|---|
| Description | String | Moderation of the image content of the document page | A description of the image moderation scope. |
| Service | String | baselineCheck | The service called for image moderation. |
| RiskLevel | String | high | The image risk level, based on configured risk score thresholds. Valid values: high, medium, low, none. |
| Location | JSONObject | {"x":0,"y":0,"w":100,"h":100} | (Reserved) The coordinates of the image area. |
| LabelResult | JSONArray | | The labels returned for the image. For details, see LabelResult. |
Table 10. LabelResult
| Name | Type | Example | Description |
|---|---|---|---|
| Label | String | violent_explosion | The label returned for the image. Multiple labels may be returned for the same screenshot. For details, see Risk label interpretation table. |
| Confidence | Float | 81.22 | The confidence score. Valid values: 0 to 100, accurate to two decimal places. |
| Description | String | Fireworks content | A description of the label. This field is informational and may change. Base result processing on the Label field, not this field. |
Table 11. TextResult
| Name | Type | Example | Description |
|---|---|---|---|
| Description | String | Moderation of the text content of the document page. | A description of the text moderation scope. |
| Service | String | pgc_detection | The service called for text moderation. |
| Text | String | This is the text part | The text content of the moderated section. |
| Labels | String | ad_compliance,C_customized | The labels returned for the text. |
| RiskWords | String | Risk word A, Risk word B | The risk words detected in the text. |
| RiskTips | String | Advertising Law_General Prohibition of Extreme Words | The sub-labels returned for the text. |
| RiskLevel | String | high | The text risk level, based on the calculated text risk. Valid values: high, medium, low, none. |
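The per-page structure in Tables 8 to 11 can be flattened into a list of pages that carry risk labels. The following sketch assumes that `nonLabel` marks an image with no risk, as the sample response suggests, and that text labels arrive as a comma-separated string; the function name `flagged_pages` is hypothetical.

```python
def flagged_pages(data):
    """Return {page number: sorted labels} for pages with at least one risk label."""
    flagged = {}
    for page in data.get("PageResult", []):
        labels = set()
        # Image labels: one entry per item in LabelResult.
        for image in page.get("ImageResult", []):
            for item in image.get("LabelResult", []):
                label = item.get("Label")
                if label and label != "nonLabel":  # assumption: nonLabel means no risk
                    labels.add(label)
        # Text labels: a comma-separated string such as "contraband,sexual_content".
        for text in page.get("TextResult", []):
            for label in (text.get("Labels") or "").split(","):
                if label.strip():
                    labels.add(label.strip())
        if labels:
            flagged[page.get("PageNum")] = sorted(labels)
    return flagged
```

Pages that return only `nonLabel` and an empty Labels string are omitted from the result, so an empty dict means that no page was flagged.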
Examples
Sample requests
{
"Service": "document_detection_global",
"ServiceParameters": {
"taskId": "abcd****"
}
}
Sample success responses
{
"Code": 200,
"Data": {
"DataId": "fileId-2024-0307-0728***",
"PageResult": [
{
"ImageResult": [
{
"Description": "Moderation of the image content of the document page",
"LabelResult": [
{
"Label": "nonLabel"
}
],
"Service": "baselineCheck_global"
}
],
"ImageUrl": "http://oss.aliyundoc.com/a.png",
"PageNum": 1,
"TextResult": [
{
"Description": "Moderation of the text content of the document page",
"Labels": "",
"RiskTips": "",
"RiskWords": "",
"Service": "comment_multilingual_global",
"Text": "Content Moderation product test case a"
}
]
},
...
{
"ImageResult": [
{
"Description": "Moderation of the image content of the document page",
"LabelResult": [
{
"Confidence": 89.01,
"Label": "pornographic_adultContent_tii"
}
],
"Service": "baselineCheck_global"
}
],
"ImageUrl": "http://oss.aliyundoc.com/b.png",
"PageNum": 10,
"TextResult": [
{
"Description": "Moderation of the text content of the document page",
"Labels": "contraband,sexual_content",
"RiskTips": "Prohibited_Prohibited goods, Pornographic_Film resources, Pornographic_Vulgar",
"RiskWords": "Risk word A, Risk word B",
"Service": "comment_multilingual_global",
"Text": "Content Moderation product test case b"
}
]
}
],
"Url": "http://www.aliyundoc.com/a.docx"
},
"Message": "SUCCESS",
"RequestId": "1D0854A7-AAAAA-BBBBBBB-CC8292AE5"
}
Code description
Only requests with code 200 or 280 are measured and billed. Other codes are not billed.
| Code | Description |
|---|---|
| 200 | The request succeeded or the moderation is complete. |
| 280 | Moderation is in progress. |
| 400 | Not all required request parameters are configured. |
| 401 | The request parameters are invalid. |
| 402 | Invalid request parameters. Check and modify them, then try again. |
| 403 | The QPS of requests exceeds the upper limit. Reduce the number of requests sent at a time. |
| 404 | The file failed to download. Check the file URL and try again. |
| 405 | File download or conversion timed out. The URL may be inaccessible. Check and adjust the file, then try again. |
| 406 | The file is too large. Check and adjust the file size, then try again. |
| 407 | The file format is not supported. Check and change the file format, then try again. |
| 408 | Insufficient permissions. The account may not be activated, may have overdue payments, or may not be authorized to call this operation. |
| 409 | The specified RequestId does not exist. The moderation results may have exceeded the 24-hour validity period. |
| 480 | The number of concurrent moderation tasks exceeds the upper limit. Reduce the number of concurrent tasks. |
| 500 | A system error occurred. |
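Client code often groups these codes into a few handling buckets. The sketch below shows one possible grouping; the bucket names and the decision to treat 403 and 480 as "back off and retry" are assumptions, not documented behavior. The billed set comes from the statement above that only codes 200 and 280 are billed.

```python
BILLED_CODES = {200, 280}  # per the note above the code table


def classify(code):
    """Map a response code to a hypothetical handling bucket."""
    code = int(code)
    if code == 200:
        return "done"
    if code == 280:
        return "in_progress"
    if code in (403, 480):
        return "back_off"       # QPS or concurrency limit exceeded
    if code == 500:
        return "server_error"
    return "check_request"      # remaining codes point at the request or file


def is_billed(code):
    """True when a response with this code counts toward billing."""
    return int(code) in BILLED_CODES
```

For example, a 406 response (file too large) falls into `check_request`, which signals that retrying the same request unchanged is unlikely to help.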