API integration guide (AI Guardrails)

The TextModerationPlus API operation screens text content for compliance risks, sensitive data, and prompt injection attacks as a standalone check, without bundling the check with model inference.

Important

If you have already integrated the enhanced PLUS edition of the Guardrails service, upgrade the software development kit (SDK) to call this API operation. If you are starting fresh, integrate this API directly. You can reuse it later to moderate AI-generated images and files. For details, see the Multimodal API integration guide.

Prerequisites

Before you begin, decide the following:

Which content to check: user inputs (query_security_check_intl), LLM outputs (response_security_check_intl), or both
How to handle each risk level: block high-risk content automatically, route medium-risk content to human review, and treat low-risk content as safe unless you need high recall

Then make sure you have:

An Alibaba Cloud account with the
A Resource Access Management (RAM) user with the AliyunYundunGreenWebFullAccess system policy and an AccessKey pair (see Set up a RAM user)

Set up a RAM user

The AccessKey pair is used for identity verification when calling Alibaba Cloud API operations.

Log on to the RAM console with your Alibaba Cloud account.
Create a RAM user. For details, see Create a RAM user.
Grant the AliyunYundunGreenWebFullAccess system policy to the RAM user. For details, see Grant permissions to a RAM user.
Create an AccessKey pair for the RAM user. For details, see Obtain an AccessKey pair.
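Once the AccessKey pair exists, a common practice is to keep it out of source code and read it from the environment at startup. The sketch below assumes the variable names `ALIBABA_CLOUD_ACCESS_KEY_ID` and `ALIBABA_CLOUD_ACCESS_KEY_SECRET`; the helper name `load_access_key` is hypothetical and not part of any SDK.

```python
import os


def load_access_key(env=None):
    """Return (access_key_id, access_key_secret) read from environment variables.

    The variable names are an assumption for this sketch; use whatever names
    your deployment defines.
    """
    env = os.environ if env is None else env
    key_id = env.get("ALIBABA_CLOUD_ACCESS_KEY_ID")
    key_secret = env.get("ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    if not key_id or not key_secret:
        # Fail early so a missing credential is not mistaken for an API error.
        raise RuntimeError("AccessKey pair is not set in the environment")
    return key_id, key_secret
```

Passing a plain dictionary as `env` makes the helper easy to exercise in tests without touching the real environment.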

Install the SDK

For SDK installation and setup, see the SDK Reference.

API reference

Endpoint

Region	Public endpoint	Internal endpoint
China (Shanghai)	https://green-cip.cn-shanghai.aliyuncs.com	https://green-cip-vpc.cn-shanghai.aliyuncs.com
China (Chengdu)	https://green-cip.cn-chengdu.aliyuncs.com	Not available
China (Shenzhen)	https://green-cip.cn-shenzhen.aliyuncs.com	https://green-cip-vpc.cn-shenzhen.aliyuncs.com
China (Hangzhou)	https://green-cip.cn-hangzhou.aliyuncs.com	https://green-cip-vpc.cn-hangzhou.aliyuncs.com
China (Beijing)	https://green-cip.cn-beijing.aliyuncs.com	https://green-cip-vpc.cn-beijing.aliyuncs.com
Singapore	https://green-cip.ap-southeast-1.aliyuncs.com	https://green-cip-vpc.ap-southeast-1.aliyuncs.com
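The endpoint table can be captured as a small lookup so that application code selects a host by region and network type. This is a minimal sketch: the region IDs are taken from the hostnames above, and the function name `resolve_endpoint` is hypothetical.

```python
# (public host, internal host) per region, copied from the endpoint table.
ENDPOINTS = {
    "cn-shanghai": ("green-cip.cn-shanghai.aliyuncs.com", "green-cip-vpc.cn-shanghai.aliyuncs.com"),
    "cn-chengdu": ("green-cip.cn-chengdu.aliyuncs.com", None),  # no internal endpoint listed
    "cn-shenzhen": ("green-cip.cn-shenzhen.aliyuncs.com", "green-cip-vpc.cn-shenzhen.aliyuncs.com"),
    "cn-hangzhou": ("green-cip.cn-hangzhou.aliyuncs.com", "green-cip-vpc.cn-hangzhou.aliyuncs.com"),
    "cn-beijing": ("green-cip.cn-beijing.aliyuncs.com", "green-cip-vpc.cn-beijing.aliyuncs.com"),
    "ap-southeast-1": ("green-cip.ap-southeast-1.aliyuncs.com", "green-cip-vpc.ap-southeast-1.aliyuncs.com"),
}


def resolve_endpoint(region_id, internal=False):
    """Return the endpoint host for a region, or raise ValueError if unavailable."""
    if region_id not in ENDPOINTS:
        raise ValueError(f"unsupported region: {region_id}")
    public_host, internal_host = ENDPOINTS[region_id]
    host = internal_host if internal else public_host
    if host is None:
        raise ValueError(f"no internal endpoint for region: {region_id}")
    return host
```

Raising on a missing internal endpoint, rather than silently falling back to the public one, keeps traffic from leaving the VPC unexpectedly.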

Usage notes

QPS limit: 50 requests per second per user. Requests that exceed this limit are throttled.
Content limit: 2,000 characters per request.
Billing: Only requests that return HTTP status code 200 are billed. See Billing overview for details.
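Text longer than the content limit has to be split before it is submitted. The sketch below shows one possible approach. It assumes that the limit counts characters the same way Python's `len` does, and it offers an optional overlap so that a phrase straddling a boundary still appears whole in one chunk. The name `split_content` is hypothetical.

```python
MAX_CHARS = 2000  # content limit per request, from the usage notes


def split_content(text, limit=MAX_CHARS, overlap=0):
    """Split text into chunks of at most `limit` characters.

    Consecutive chunks share `overlap` characters. How the service counts
    characters is an assumption here; adjust `limit` if it counts differently.
    """
    if limit <= 0 or not 0 <= overlap < limit:
        raise ValueError("require limit > 0 and 0 <= overlap < limit")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + limit])
        if start + limit >= len(text):
            break  # the last chunk reached the end of the text
        start += limit - overlap
    return chunks
```

Each chunk is a separate request, so splitting also increases the request rate and the number of billed calls.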

Request parameters

Parameter	Type	Required	Description
`Service`	String	Yes	The moderation use case. Valid values: `query_security_check` (AI input content security check), `response_security_check` (AI-generated content security check), `query_security_check_cb` (AI input content security check for services outside China), `response_security_check_cb` (AI-generated content security check for services outside China), `query_security_check_intl` (AI input check), and `response_security_check_intl` (AI-generated content check).
`ServiceParameters`	JSONString	Yes	A JSON string containing the content to moderate. See the table below for fields.

ServiceParameters fields

Field	Type	Required	Description
`content`	String	At least one field required	The text to moderate. Maximum 2,000 characters.
`chatId`	String	No	A unique ID for an interaction record, pairing a user input with an LLM output.
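Because `ServiceParameters` is a JSON string rather than a nested object, the fields have to be serialized before they are placed in the request. The following sketch builds the two request parameters as a plain dictionary; the helper name `build_request` is hypothetical, and how the dictionary is passed to the SDK depends on the SDK version you use.

```python
import json


def build_request(service, content, chat_id=None):
    """Build the Service and ServiceParameters values for one moderation call."""
    if len(content) > 2000:
        raise ValueError("content exceeds the 2,000-character limit")
    params = {"content": content}
    if chat_id is not None:
        # Reuse the same chatId for a user input and the matching LLM output.
        params["chatId"] = chat_id
    return {
        "Service": service,
        # Serialize to a JSON string; ensure_ascii=False keeps non-ASCII text readable.
        "ServiceParameters": json.dumps(params, ensure_ascii=False),
    }
```

For example, `build_request("query_security_check_intl", "testing content", "ABC123")` yields a dictionary whose `ServiceParameters` value is a string, not a nested object.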

Response parameters

Parameter	Type	Description
`Code`	Integer	The HTTP status code. See Status codes.
`Data`	JSONObject	The moderation result. See the table below for fields.
`Message`	String	The response message.
`RequestId`	String	The request ID.

Data fields

Field	Type	Description
`RiskLevel`	String	The overall compliance risk level: `high`, `medium`, `low`, or `none`. Determined by the configured risk score thresholds. If a custom dictionary is hit, the risk level is `high` by default. Configure thresholds in the Guardrails console.
`Result`	JSONArray	Compliance risk labels with confidence scores. See Result fields.
`SensitiveLevel`	String	The overall sensitive content level: `S0` (none detected) through `S4` (highest).
`SensitiveResult`	JSONArray	Sensitive content detection results. See SensitiveResult fields.
`AttackLevel`	String	The overall attack detection level: `high`, `medium`, `low`, or `none`.
`AttackResult`	JSONArray	Prompt injection detection results. See AttackResult fields.

Result fields

Field	Type	Description
`Label`	String	The compliance risk label (e.g., `political_entity`, `political_figure`, `customized`). Multiple labels may be returned.
`Confidence`	Float	The confidence score, from 0 to 100 with two decimal places. Not all labels include a score.
`RiskWords`	String	Detected sensitive words, comma-separated. Not all labels include this field.
`CustomizedHit`	JSONArray	Populated when `Label` is `customized`. Contains the matched custom dictionary name and keywords. See CustomizedHit fields.
`Description`	String	A human-readable explanation of the label. This field may change — use `Label` to drive your business logic, not `Description`.

CustomizedHit fields

Field	Type	Description
`LibName`	String	The name of the matched custom dictionary.
`KeyWords`	String	The matched custom words, comma-separated.

SensitiveResult fields

Field	Type	Description
`Label`	String	The sensitive content label (e.g., `1780`).
`SensitiveLevel`	String	The sensitivity level: `S0` (none) through `S4` (highest).
`SensitiveData`	JSONArray	Detected sensitive samples (0–5 items).
`Description`	String	A human-readable explanation of the label. Use `Label` to drive your business logic, not `Description`.
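One way to act on `SensitiveResult` is to mask the detected samples before the text is stored or forwarded. This is a sketch under two assumptions: that each `SensitiveData` entry is an exact substring of the submitted content, and that masking at or above a chosen level suits your policy. The names `level_rank` and `redact_sensitive` are hypothetical.

```python
def level_rank(level):
    """Convert a level such as 'S0' or 'S3' into an integer for comparison."""
    return int(level[1:])


def redact_sensitive(content, sensitive_result, min_level="S1", mask_char="*"):
    """Mask detected samples whose SensitiveLevel is at least `min_level`."""
    redacted = content
    for item in sensitive_result:
        if level_rank(item.get("SensitiveLevel", "S0")) < level_rank(min_level):
            continue  # below the masking threshold
        for sample in item.get("SensitiveData") or []:
            if sample:
                # Replace every occurrence with a mask of the same length.
                redacted = redacted.replace(sample, mask_char * len(sample))
    return redacted
```

Since at most five samples are returned per label, masking only the returned samples may leave further occurrences of other values untouched.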

AttackResult fields

Field	Type	Description
`Label`	String	The attack type (e.g., `Indirect Prompt Injection`).
`AttackLevel`	String	The attack level: `high`, `medium`, `low`, or `none`.
`Confidence`	Float	The confidence score, from 0 to 100.
`Description`	String	A human-readable explanation of the label. Use `Label` to drive your business logic, not `Description`.

Handle moderation results

Use the top-level fields (RiskLevel, SensitiveLevel, AttackLevel) to route content. Drill into Result, SensitiveResult, and AttackResult arrays for the specific labels and confidence scores that explain the decision.

Level	Recommended action
`high`	Block the content automatically.
`medium`	Route to human review.
`low`	Treat as safe unless your application requires high recall.
`none`	No action required.

A custom dictionary match sets RiskLevel to high by default.
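The recommended actions in the table can be sketched as a small routing function. This version takes the more severe of `RiskLevel` and `AttackLevel`; it leaves `SensitiveLevel` out because the table does not map the `S0` to `S4` scale to an action. The function name `decide_action` and the action strings are hypothetical.

```python
_SEVERITY = {"none": 0, "low": 1, "medium": 2, "high": 3}


def decide_action(data, high_recall=False):
    """Map the Data object of a response to 'block', 'review', or 'allow'."""
    levels = []
    for field in ("RiskLevel", "AttackLevel"):
        level = data.get(field) or "none"
        if level not in _SEVERITY:
            raise ValueError(f"unexpected {field}: {level!r}")
        levels.append(level)
    worst = max(levels, key=_SEVERITY.get)  # the more severe of the two levels
    if worst == "high":
        return "block"
    if worst == "medium":
        return "review"
    if worst == "low" and high_recall:
        return "review"  # applications that need high recall also review low-risk content
    return "allow"
```

Rejecting unknown level strings, instead of defaulting them to safe, avoids silently passing content if the set of returned values ever differs from the four documented here.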

Example

Request

{
    "Service": "query_security_check",
    "ServiceParameters": {
        "content": "testing content",
        "chatId":"ABC123"
    }
}

Response (system policy matched)

{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Label": "political_entity",
                "Description":"Suspected political entity",
                "Confidence": 100.0,
                "RiskWords": "Word A,Word B,Word C"
            },
            {
                "Label": "political_figure",
                "Description":"Suspected political figure",
                "Confidence": 100.0,
                "RiskWords": "Word A,Word B,Word C"
            },
            {
                "Label": "customized",
                "Description": "Hit custom dictionary",
                "Confidence": 100.0,
                "CustomizedHit": [
                    {
                        "LibName": "Custom Dictionary Name 1",
                        "KeyWords": "Custom Keyword"
                    }
                ]
            }
        ],
        "SensitiveResult": [
            {
                "Label": "1780",
                "SensitiveLevel": "S4",
                "Description": "Credit card number",
                "SensitiveData": ["6201112223455"]
            }
        ],
        "AttackResult": [
            {
                "Label": "Indirect Prompt Injection",
                "AttackLevel": "high",
                "Description":"Indirect prompt injection",
                "Confidence": 100.0
            }
        ],
        "RiskLevel": "high",
        "SensitiveLevel": "S4",
        "AttackLevel": "high"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
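A response like the one above can be flattened into a compact summary for logging or review queues. The sketch below assumes the response has already been parsed into a Python dictionary and that field names use the casing shown in the example (`RiskWords`, `KeyWords`). The name `summarize_response` is hypothetical.

```python
def summarize_response(response):
    """Flatten a parsed TextModerationPlus response into a summary dictionary."""
    if response.get("Code") != 200:
        raise ValueError(f"moderation failed: {response.get('Code')} {response.get('Message')}")
    data = response.get("Data") or {}

    def split_words(value):
        # Comma-separated strings such as "Word A,Word B" become lists.
        return [word.strip() for word in (value or "").split(",") if word.strip()]

    risk_words, custom_hits = [], []
    for item in data.get("Result") or []:
        risk_words.extend(split_words(item.get("RiskWords")))
        for hit in item.get("CustomizedHit") or []:
            custom_hits.append((hit.get("LibName"), split_words(hit.get("KeyWords"))))

    return {
        "risk_level": data.get("RiskLevel"),
        "sensitive_level": data.get("SensitiveLevel"),
        "attack_level": data.get("AttackLevel"),
        "risk_labels": [item.get("Label") for item in data.get("Result") or []],
        "risk_words": risk_words,
        "custom_hits": custom_hits,
        "attack_labels": [item.get("Label") for item in data.get("AttackResult") or []],
    }
```

The summary keys off `Label` and the level fields only, in line with the guidance not to build logic on `Description`.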

Status codes

Code	Status	Description
200	OK	The request was successful.
400	BAD_REQUEST	The request is invalid. Check the request parameters.
408	PERMISSION_DENY	The account is not authorized, has an overdue payment, has not activated the service, or is banned.
500	GENERAL_ERROR	A temporary server-side error occurred. Retry the request. If this code persists, contact online support.
581	TIMEOUT	The request timed out. Retry the request. If this code persists, contact online support.
588	EXCEED_QUOTA	The request frequency exceeds the quota.
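The retry guidance for codes 500 and 581 can be sketched as a wrapper with exponential backoff. The wrapper takes any zero-argument callable `send` that returns a parsed response dictionary, so it does not depend on a particular SDK. The attempt count and delays are illustrative assumptions. Whether to also retry 588 after a pause is a policy choice that the table does not specify, so it is not retried here.

```python
import time

RETRYABLE_CODES = {500, 581}  # the codes the table says to retry


def call_with_retry(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `send()` until it returns Code 200 or retries are exhausted."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        code = response.get("Code")
        if code == 200:
            return response
        if code not in RETRYABLE_CODES or attempt == max_attempts:
            raise RuntimeError(f"request failed with code {code}: {response.get('Message')}")
        # Exponential backoff: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

Injecting `sleep` as a parameter lets tests run without real delays; production code can rely on the default `time.sleep`.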

What's next

Multimodal API integration guide — extend moderation to AI-generated images and files
SDK Reference — SDK installation and usage
Billing overview — understand how requests are billed