API integration guide

The TextModerationPlus API operation screens text content for compliance risks, sensitive data, and prompt injection attacks as a standalone check, without bundling it with model inference.

Important

If you have already integrated the enhanced PLUS edition of the Guardrails service, upgrade the software development kit (SDK) to call this API operation. If you are starting fresh, integrate this API directly. You can reuse it later to moderate AI-generated images and files. For details, see the Multimodal API integration guide.

Prerequisites

Before you begin, decide the following:

  • Which content to check: user inputs (query_security_check_intl), LLM outputs (response_security_check_intl), or both

  • How to handle each risk level: block high-risk content automatically, route medium-risk content to human review, and treat low-risk content as safe unless you need high recall

Then make sure you have:

  • An Alibaba Cloud account with the

  • A Resource Access Management (RAM) user with the AliyunYundunGreenWebFullAccess system policy and an AccessKey pair (see Set up a RAM user)

Set up a RAM user

The AccessKey pair is used for identity verification when calling Alibaba Cloud API operations.

  1. Log on to the RAM console with your Alibaba Cloud account.

  2. Create a RAM user. For details, see Create a RAM user.

  3. Grant the AliyunYundunGreenWebFullAccess system policy to the RAM user. For details, see Grant permissions to a RAM user.

  4. Create an AccessKey pair for the RAM user. For details, see Obtain an AccessKey pair.
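
Once the AccessKey pair exists, it is usually safer to load it from the environment than to hard-code it. The following is a minimal sketch; the variable names ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET are an assumption (this guide does not prescribe names), and load_credentials is a hypothetical helper.

```python
import os


def load_credentials(env=os.environ):
    """Return (access_key_id, access_key_secret) read from the environment.

    The variable names are assumed for illustration; adjust them to
    whatever your deployment uses.
    """
    key_id = env.get("ALIBABA_CLOUD_ACCESS_KEY_ID")
    key_secret = env.get("ALIBABA_CLOUD_ACCESS_KEY_SECRET")
    if not key_id or not key_secret:
        raise RuntimeError("AccessKey pair not found in environment")
    return key_id, key_secret
```

Passing a plain dict as env makes the helper easy to exercise in tests without touching the real environment.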

Install the SDK

For SDK installation and setup, see the SDK Reference.

API reference

Endpoint

| Region | Public endpoint | Internal endpoint |
| --- | --- | --- |
| China (Shanghai) | https://green-cip.cn-shanghai.aliyuncs.com | https://green-cip-vpc.cn-shanghai.aliyuncs.com |
| China (Chengdu) | https://green-cip.cn-chengdu.aliyuncs.com | Not available |
| China (Shenzhen) | https://green-cip.cn-shenzhen.aliyuncs.com | https://green-cip-vpc.cn-shenzhen.aliyuncs.com |
| China (Hangzhou) | https://green-cip.cn-hangzhou.aliyuncs.com | https://green-cip-vpc.cn-hangzhou.aliyuncs.com |
| China (Beijing) | https://green-cip.cn-beijing.aliyuncs.com | https://green-cip-vpc.cn-beijing.aliyuncs.com |
| Singapore | https://green-cip.ap-southeast-1.aliyuncs.com | https://green-cip-vpc.ap-southeast-1.aliyuncs.com |
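
The endpoint list can be kept as a small lookup so that callers choose a region and a network type instead of pasting URLs. This is a sketch only: the region IDs are read off the hostnames, the https scheme on the Singapore hosts is assumed to match the other regions, and resolve_endpoint is a hypothetical helper name.

```python
# (public endpoint, internal endpoint or None) keyed by region ID.
ENDPOINTS = {
    "cn-shanghai": ("https://green-cip.cn-shanghai.aliyuncs.com",
                    "https://green-cip-vpc.cn-shanghai.aliyuncs.com"),
    "cn-chengdu": ("https://green-cip.cn-chengdu.aliyuncs.com", None),
    "cn-shenzhen": ("https://green-cip.cn-shenzhen.aliyuncs.com",
                    "https://green-cip-vpc.cn-shenzhen.aliyuncs.com"),
    "cn-hangzhou": ("https://green-cip.cn-hangzhou.aliyuncs.com",
                    "https://green-cip-vpc.cn-hangzhou.aliyuncs.com"),
    "cn-beijing": ("https://green-cip.cn-beijing.aliyuncs.com",
                   "https://green-cip-vpc.cn-beijing.aliyuncs.com"),
    "ap-southeast-1": ("https://green-cip.ap-southeast-1.aliyuncs.com",
                       "https://green-cip-vpc.ap-southeast-1.aliyuncs.com"),
}


def resolve_endpoint(region_id, use_vpc=False):
    """Return the public or internal endpoint for a region."""
    if region_id not in ENDPOINTS:
        raise ValueError(f"unsupported region: {region_id}")
    public, internal = ENDPOINTS[region_id]
    if not use_vpc:
        return public
    if internal is None:
        raise ValueError(f"no internal endpoint in region: {region_id}")
    return internal
```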

Usage notes

  • QPS limit: 50 requests per second per user. Requests that exceed this limit are throttled.

  • Content limit: 2,000 characters per request.

  • Billing: Only requests that return HTTP status code 200 are billed. See Billing overview for details.
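
Text longer than the content limit has to be split before submission. A minimal sketch, assuming the service counts characters the same way Python counts code points (not confirmed by this guide); split_content is a hypothetical helper.

```python
MAX_CHARS = 2000  # content limit per request, from the usage notes


def split_content(text, limit=MAX_CHARS):
    """Split text into consecutive pieces of at most `limit` characters."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [text[i:i + limit] for i in range(0, len(text), limit)]
```

Fixed-size splitting can cut a risky phrase across two pieces. If that matters for your content, consider splitting on sentence boundaries or overlapping the pieces slightly, and remember that each piece counts as a separate request against the QPS limit.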

Request parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Service | String | Yes | The moderation use case. Valid values: query_security_check_intl (AI input check) and response_security_check_intl (AI-generated content check). |
| ServiceParameters | JSONString | Yes | A JSON string containing the content to moderate. See the table below for fields. |

ServiceParameters fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| content | String | At least one field required | The text to moderate. Maximum 2,000 characters. |
| chatId | String | No | A unique ID for an interaction record, pairing a user input with an LLM output. |
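
Because ServiceParameters is typed as a JSON string, the inner fields need to be serialized before they are placed in the request. The sketch below builds the two request parameters as a plain dict; build_request is a hypothetical helper, and how the dict is passed to the SDK depends on the SDK version you install.

```python
import json

VALID_SERVICES = {"query_security_check_intl", "response_security_check_intl"}
MAX_CHARS = 2000


def build_request(service, content, chat_id=None):
    """Return the Service and ServiceParameters request parameters."""
    if service not in VALID_SERVICES:
        raise ValueError(f"unknown service: {service}")
    if len(content) > MAX_CHARS:
        raise ValueError("content exceeds 2,000 characters")
    params = {"content": content}
    if chat_id is not None:
        params["chatId"] = chat_id
    # ServiceParameters is sent as a JSON string, not a nested object.
    return {
        "Service": service,
        "ServiceParameters": json.dumps(params, ensure_ascii=False),
    }
```

Reusing the same chatId for the input check and the later output check is what pairs the two records.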

Response parameters

| Parameter | Type | Description |
| --- | --- | --- |
| Code | Integer | The HTTP status code. See Status codes. |
| Data | JSONObject | The moderation result. See the table below for fields. |
| Message | String | The response message. |
| RequestId | String | The request ID. |
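
A caller typically checks Code before touching Data. The following sketch assumes the response has already been parsed into a Python dict shaped like the table above; ModerationError and unwrap_response are hypothetical names, not part of the SDK.

```python
class ModerationError(Exception):
    """Raised when the response code is not 200."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code} {message} (RequestId={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def unwrap_response(response):
    """Return Data for a successful response, otherwise raise."""
    code = response.get("Code")
    if code != 200:
        raise ModerationError(code, response.get("Message"), response.get("RequestId"))
    return response.get("Data", {})
```

Keeping RequestId on the exception makes it easier to quote when contacting support.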

Data fields

| Field | Type | Description |
| --- | --- | --- |
| RiskLevel | String | The overall compliance risk level: high, medium, low, or none. Determined by the configured risk score thresholds. If a custom dictionary is hit, the risk level is high by default. Configure thresholds in the Guardrails console. |
| Result | JSONArray | Compliance risk labels with confidence scores. See Result fields. |
| SensitiveLevel | String | The overall sensitive content level: S0 (none detected) through S4 (highest). |
| SensitiveResult | JSONArray | Sensitive content detection results. See SensitiveResult fields. |
| AttackLevel | String | The overall attack detection level: high, medium, low, or none. |
| AttackResult | JSONArray | Prompt injection detection results. See AttackResult fields. |

Result fields

| Field | Type | Description |
| --- | --- | --- |
| Label | String | The compliance risk label (e.g., political_entity, political_figure, customized). Multiple labels may be returned. |
| Confidence | Float | The confidence score, from 0 to 100 with two decimal places. Not all labels include a score. |
| RiskWords | String | Detected sensitive words, comma-separated. Not all labels include this field. |
| CustomizedHit | JSONArray | Populated when Label is customized. Contains the matched custom dictionary name and keywords. See CustomizedHit fields. |
| Description | String | A human-readable explanation of the label. This field may change, so use Label to drive your business logic, not Description. |

CustomizedHit fields

| Field | Type | Description |
| --- | --- | --- |
| LibName | String | The name of the matched custom dictionary. |
| KeyWords | String | The matched custom words, comma-separated. |
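
To see which labels and custom dictionary entries triggered a decision, the Result array can be flattened as follows. This is an illustrative sketch: summarize_result is a hypothetical helper, and the field casing (KeyWords) follows the example response in this guide.

```python
def summarize_result(result_items):
    """Return (labels, custom_hits) from the Data.Result array.

    custom_hits is a list of (dictionary name, [keywords]) tuples.
    """
    labels = []
    custom_hits = []
    for item in result_items:
        labels.append(item["Label"])
        # CustomizedHit is only populated when Label is "customized".
        for hit in item.get("CustomizedHit", []):
            keywords = [k for k in hit.get("KeyWords", "").split(",") if k]
            custom_hits.append((hit.get("LibName"), keywords))
    return labels, custom_hits
```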

SensitiveResult fields

| Field | Type | Description |
| --- | --- | --- |
| Label | String | The sensitive content label (e.g., 1780). |
| SensitiveLevel | String | The sensitivity level: S0 (none) through S4 (highest). |
| SensitiveData | JSONArray | Detected sensitive samples (0–5 items). |
| Description | String | A human-readable explanation of the label. Use Label to drive your business logic, not Description. |
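
Sensitivity levels are ordered by their numeric suffix, so the most severe entry in SensitiveResult can be found with a simple comparison. A sketch, assuming every entry carries a SensitiveLevel of the form S followed by a digit; highest_sensitive_level is a hypothetical helper.

```python
def highest_sensitive_level(sensitive_items):
    """Return the most severe SensitiveLevel in the list, or "S0" if empty."""
    levels = [item["SensitiveLevel"] for item in sensitive_items]
    # "S3" -> 3, so the numeric suffix decides the ordering.
    return max(levels, key=lambda level: int(level[1:]), default="S0")
```

This can be useful for logging or for cross-checking the top-level SensitiveLevel field; the guide does not state how the service itself derives that field.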

AttackResult fields

| Field | Type | Description |
| --- | --- | --- |
| Label | String | The attack type (e.g., Indirect Prompt Injection). |
| AttackLevel | String | The attack level: high, medium, low, or none. |
| Confidence | Float | The confidence score, from 0 to 100. |
| Description | String | A human-readable explanation of the label. Use Label to drive your business logic, not Description. |

Handle moderation results

Use the top-level fields (RiskLevel, SensitiveLevel, AttackLevel) to route content. Drill into Result, SensitiveResult, and AttackResult arrays for the specific labels and confidence scores that explain the decision.

| Level | Recommended action |
| --- | --- |
| high | Block the content automatically. |
| medium | Route to human review. |
| low | Treat as safe unless your application requires high recall. |
| none | No action required. |

A custom dictionary match sets RiskLevel to high by default.
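
The recommended actions can be expressed as a small routing function over the Data object. This is a sketch under stated assumptions: route_content and the returned action names are hypothetical, low-risk content is sent to review when high_recall is set (one possible reading of the guidance), unknown level strings are treated as high so the function fails closed, and SensitiveLevel is left out because this guide gives no action mapping for S0 through S4.

```python
LEVEL_RANK = {"none": 0, "low": 1, "medium": 2, "high": 3}


def route_content(data, high_recall=False):
    """Return "block", "review", or "pass" from RiskLevel and AttackLevel."""
    rank = max(
        # A missing field counts as "none"; an unrecognized value counts as high.
        LEVEL_RANK.get(data.get(field, "none"), LEVEL_RANK["high"])
        for field in ("RiskLevel", "AttackLevel")
    )
    if rank >= LEVEL_RANK["high"]:
        return "block"
    if rank == LEVEL_RANK["medium"]:
        return "review"
    if rank == LEVEL_RANK["low"] and high_recall:
        return "review"
    return "pass"
```

Taking the worse of the two levels means a prompt injection hit blocks content even when the compliance check is clean.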

Example

Request

{
    "Service": "query_security_check_intl",
    "ServiceParameters": {
        "content": "testing content",
        "chatId":"ABC123"
    }
}

Response (system policy matched)

{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Label": "political_entity",
                "Description":"Suspected political entity",
                "Confidence": 100.0,
                "RiskWords": "Word A,Word B,Word C"
            },
            {
                "Label": "political_figure",
                "Description":"Suspected political figure",
                "Confidence": 100.0,
                "RiskWords": "Word A,Word B,Word C"
            },
            {
                "Label": "customized",
                "Description": "Hit custom dictionary",
                "Confidence": 100.0,
                "CustomizedHit": [
                     {
                        "LibName": "Custom Dictionary Name 1",
                        "KeyWords": "Custom Keyword"
                     }
                ]
             }
        ],
         "SensitiveResult": [
            {
                "Label": "1780",
                "SensitiveLevel": "S4",
                "Description":"Credit card number",
                "SensitiveData": ["6201112223455"]
            }
        ],
         "AttackResult": [
            {
                "Label": "Indirect Prompt Injection",
                "AttackLevel": "high",
                "Description":"Indirect prompt injection",
                "Confidence": 100.0
            }
        ],
        "RiskLevel": "high",
        "SensitiveLevel": "S4",
        "AttackLevel": "high"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}

Status codes

| Code | Status | Description |
| --- | --- | --- |
| 200 | OK | The request was successful. |
| 400 | BAD_REQUEST | The request is invalid. Check the request parameters. |
| 408 | PERMISSION_DENY | The account is not authorized, has an overdue payment, has not activated the service, or is banned. |
| 500 | GENERAL_ERROR | A temporary server-side error occurred. Retry the request. If this code persists, contact online support. |
| 581 | TIMEOUT | The request timed out. Retry the request. If this code persists, contact online support. |
| 588 | EXCEED_QUOTA | The request frequency exceeds the quota. |
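
The status codes suggest a simple retry policy: retry 500 and 581, and return everything else to the caller. The sketch below also backs off and retries on 588, which is an assumption rather than documented guidance. call_with_retry is a hypothetical wrapper, and send stands for whatever function performs one API call and returns the parsed response as a dict.

```python
import time

RETRYABLE = {500, 581}  # the guide says to retry these
THROTTLED = {588}       # retrying after a pause is an assumption


def call_with_retry(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-retryable code or attempts run out.

    Waits base_delay, then 2x, 4x, ... between attempts.
    Returns the last response received.
    """
    response = None
    for attempt in range(1, max_attempts + 1):
        response = send()
        code = response.get("Code")
        if code not in RETRYABLE and code not in THROTTLED:
            return response
        if attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))
    return response
```

Codes 400 and 408 are returned immediately because retrying cannot fix a bad request or an account problem.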

What's next