Synchronous scan

This topic explains how to use the synchronous image detection API (/green/image/scan) for image moderation. Image moderation helps you detect risky or illegal content in images for scenarios such as pornography detection, terrorism and politically sensitive content detection, ad violation detection, quick response (QR) code detection, undesirable scene detection, and logo detection.

Usage notes

The /green/image/scan API operation performs synchronous image moderation.

Call this operation to create synchronous image moderation tasks. To learn how to construct an HTTP request, see request structure. Alternatively, use a pre-constructed request as shown in the SDK overview.

  • Billing:

    This is a paid API operation. For more information about billing, see Content Moderation Pricing.

  • Response timeout:

    A synchronous detection request runs for at most 6 seconds. If detection does not finish within that time, a timeout error is returned. Set the client-side timeout for these calls to 6 seconds. If you do not need real-time results, you can use asynchronous detection instead. Otherwise, use synchronous detection, because it is simpler to call.

  • Returned results:

    Synchronous detection requests typically return a result within one second. However, the response time may increase in specific scenarios, such as high system load, large image size, or a large amount of text for optical character recognition (OCR).

  • Image requirements:
    • The image URL must use the HTTP or HTTPS protocol.

    • Supported image formats: PNG, JPG, JPEG, BMP, GIF, and WEBP.

    • The image size cannot exceed 20 MB for both synchronous and asynchronous calls. The height or width cannot exceed 30,000 pixels (px), and the total number of pixels cannot exceed 250 million (px).

      Note

      For GIF images, the total number of pixels cannot exceed 4,194,304 (px), and the height or width cannot exceed 30,000 pixels (px).

    • The image must be downloaded within 3 seconds. If the download time exceeds 3 seconds, a download timeout error is returned.

    • For optimal performance, we recommend that the image resolution be at least 256x256 pixels. A lower resolution may affect the detection accuracy.

    • The response time of the image detection API depends on the image download time. Ensure that the storage service where the image is stored is stable and reliable. For best performance, use Alibaba Cloud Object Storage Service (OSS) or a Content Delivery Network (CDN).
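
The limits above can be checked on the client before a task is submitted, which avoids paying for a request that is bound to fail. The following is a minimal sketch, not part of the API: the function name check_image and its arguments are hypothetical, the caller is assumed to already know the image's format, byte size, and dimensions, and 20 MB is assumed to mean 20 × 1024 × 1024 bytes.

```python
# Limits taken from the image requirements listed above.
MAX_BYTES = 20 * 1024 * 1024      # assumption: 20 MB means 20 * 1024 * 1024 bytes
MAX_SIDE_PX = 30_000              # maximum height or width
MAX_TOTAL_PX = 250_000_000        # maximum total pixels for non-GIF images
MAX_GIF_TOTAL_PX = 4_194_304      # maximum total pixels for GIF images
SUPPORTED_FORMATS = {"png", "jpg", "jpeg", "bmp", "gif", "webp"}


def check_image(fmt, size_bytes, width, height):
    """Return a list of problems; an empty list means the documented limits are met."""
    problems = []
    fmt = fmt.lower()
    if fmt not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 20 MB")
    if width > MAX_SIDE_PX or height > MAX_SIDE_PX:
        problems.append("height or width exceeds 30,000 px")
    # GIF images have a much lower total-pixel cap than other formats.
    pixel_cap = MAX_GIF_TOTAL_PX if fmt == "gif" else MAX_TOTAL_PX
    if width * height > pixel_cap:
        problems.append(f"total pixels exceed {pixel_cap}")
    return problems
```

For example, check_image("gif", 1000, 3000, 3000) reports one problem, because 9,000,000 pixels is above the GIF cap even though the same dimensions are acceptable for a PNG.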

Table 1. Scenario descriptions
Scenario Description Detection categories
pornography detection Detects pornographic or sexually suggestive content in images. normal, pornographic, sexually suggestive
terrorist content detection Detects terrorist or politically sensitive content in images. normal, bloody, explosions and smoke/flashes, special attire, special symbols, weapons, politics, fighting, gatherings, marches, traffic accident scenes, flags, landmarks
ad violation detection Detects policy-violating ads or text in images. normal, text contains politically sensitive content, text contains pornographic content, text contains abusive content, text contains terrorist content, text contains prohibited content, text contains other spam content, small ad stickers, contains QR codes, contains mini program codes, other ads
Note Configure the detection categories based on your business requirements. For more information, see custom machine-assisted moderation policy.
QR code detection Detects QR codes or mini program codes in images. normal, contains QR codes, contains mini program codes
Note Configure the detection categories based on your business requirements. For more information, see custom machine-assisted moderation policy.
undesirable scene detection Detects undesirable scenes in images, such as black screens, black borders, dim footage, picture-in-picture, smoking, or live streaming inside a vehicle. normal, no content in the image (for example, a black screen or a white screen), picture-in-picture, smoking, live streaming inside a vehicle
logo detection Detects logos in images, such as TV station logos and trademarks. normal, contains controlled logos, contains trademarks

QPS limit

The queries per second (QPS) limit for this API is 50 per user. Exceeding this limit triggers throttling, which can impact your business. Plan your calls accordingly.
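
One way to stay under the limit is to throttle calls on the client. The sketch below is an illustration only: the class name QpsLimiter is hypothetical, and it assumes the limit is enforced over a sliding one-second window, which this topic does not specify.

```python
import time
from collections import deque


class QpsLimiter:
    """Allow at most max_qps calls in any one-second window (client-side only)."""

    def __init__(self, max_qps=50, clock=time.monotonic, sleep=time.sleep):
        self.max_qps = max_qps
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of the calls made in the last second

    def acquire(self):
        """Block until one more call fits within the per-second budget."""
        while True:
            now = self._clock()
            # Forget calls that are at least one second old.
            while self._sent and now - self._sent[0] >= 1.0:
                self._sent.popleft()
            if len(self._sent) < self.max_qps:
                self._sent.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(1.0 - (now - self._sent[0]))
```

Call acquire() immediately before each request. A limiter like this covers a single process only; if several processes share one account, the budget has to be divided among them.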

Request parameters

Parameter Type Required Example Description
bizType String No default

This field identifies your business scenario. You can create a business scenario in the Content Moderation console. For more information, see Customize moderation rules.

scenes StringArray Yes ["porn","terrorism","ad","live","qrcode","logo"] Specifies the moderation scenario. Valid values:
  • porn: pornography detection
  • terrorism: terrorist content detection
  • ad: ad violation detection
  • qrcode: QR code detection
  • live: undesirable scene detection
  • logo: logo detection
You can specify multiple scenarios. For example, ["porn", "terrorism"] indicates that the image is moderated for both pornographic and terrorist content.
Note If you specify multiple scenarios for moderation, you are charged the cumulative fee for all scenarios. The fee for each scenario is calculated by multiplying the number of moderated images by the unit price of the scenario.
tasks JSONArray Yes Specifies the moderation tasks to submit as an array of task objects. You can submit up to 100 tasks in a single request. To submit 100 tasks at once, the concurrency limit must be set to 100 or higher. For details about the object structure, see task.
Table 2. Task
Parameter Type Required Example Description
clientInfo JSONObject No {"userId":"12023****","userNick":"Mike","userType":"others"}

The client information. For more information, see the common query parameters in Common parameters.

The server merges the global clientInfo with the individual clientInfo specified for the request.

Note

The individual clientInfo has a higher priority.

dataId String No cfd33235-71a4-468b-8137-a5ffe323****

The data ID of the detection object.

This ID can contain uppercase and lowercase letters, digits, underscores (_), hyphens (-), and periods (.), and must be 128 characters or less. Use it to uniquely identify your business data.

url String Yes http://www.aliyundoc.com/xxx.jpg

The URL of the object to detect.

  • A public HTTP or HTTPS URL. The URL cannot exceed 2,048 characters in length.

  • The path of a file stored in an Alibaba Cloud OSS bucket. You must first grant Content Moderation permissions to access the OSS bucket. The OSS bucket must be in the same region as the Content Moderation service. For more information, see Authorize Content Moderation to access an OSS bucket.

    File path format: oss://<bucket-name>.<endpoint>/<object-name>

extras JSONObject No {"hitLibInfo":[{"context":"Haokan Video","libCode":"2144002","libName":"Pre-release Test Ad Similar Text Librarya"}]} Additional parameters for the API call. This parameter is not required for image moderation scenarios.
interval Integer No 2 The frame capture interval. This parameter is used for moderating GIF and long images.
  • For GIF images, this parameter specifies the interval for capturing frames for moderation. Frame capture occurs only if this parameter is set.
  • Long images are classified as tall (portrait) or wide (landscape).
    • For tall images (height > 400 pixels and height-to-width ratio > 2.5), the image is segmented, and the total number of frames is calculated by rounding the result of height/width.
    • For wide images (width > 400 pixels and width-to-height ratio > 2.5), the image is segmented, and the total number of frames is calculated by rounding the result of width/height.

By default, only the first frame of a GIF or long image is moderated. Use the interval parameter to enable interval-based frame capture to help reduce moderation costs.

Note You must use the interval parameter with the maxFrames parameter. For example, if you set interval to 2 and maxFrames to 10 for a GIF or long image, the system moderates one of every two frames, up to a maximum of 10 frames. You are billed based on the actual number of frames moderated.
maxFrames Integer No 10

The maximum number of frames to capture. This parameter is used only for GIF and long image detection. Default value: 1.

If interval * maxFrames is less than the total number of frames in the GIF or long image, the interval is automatically adjusted to (Total frames / maxFrames) to improve overall detection coverage.
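
The frame rules for interval and maxFrames can be worked through in code to estimate how many frames a request will moderate. This is a rough sketch under stated assumptions: the function names are hypothetical, Python's built-in round() stands in for the unspecified rounding mode, and the moderated-frame count is an estimate derived from the description above rather than the service's exact behavior.

```python
import math


def long_image_frame_count(width, height):
    """Number of segments for a long image, or 1 if the image is not long."""
    if height > 400 and height / width > 2.5:   # tall (portrait) image
        return round(height / width)
    if width > 400 and width / height > 2.5:    # wide (landscape) image
        return round(width / height)
    return 1


def effective_interval(total_frames, interval, max_frames):
    """Interval after the automatic adjustment described above."""
    if interval * max_frames < total_frames:
        return total_frames / max_frames
    return interval


def estimated_frames_moderated(total_frames, interval, max_frames):
    """Estimate of moderated frames: one per interval, capped at max_frames."""
    step = effective_interval(total_frames, interval, max_frames)
    return min(max_frames, math.ceil(total_frames / step))
```

With interval set to 2 and maxFrames set to 10, a 12-frame GIF gives an estimate of 6 frames, while a 30-frame GIF has its interval widened to 3 and gives an estimate of 10 frames.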

Response parameters

Parameter Type Example Description
code Integer 200

The error code. It is the same as the HTTP status code.

For more information, see Common error codes.

msg String OK The response message.
dataId String cfd33235-71a4-468b-8137-a5ffe323****

The data ID of the detection object.

Note

If dataId was passed in the detection request, the same dataId is returned here.

taskId String img4wlJcb7p4wH4lAP3111111-123456 The ID of the moderation task.
url String http://www.aliyundoc.com/xxx.jpg

The URL of the object to detect.

  • A public HTTP or HTTPS URL. The URL cannot exceed 2,048 characters in length.

  • The path of a file stored in an Alibaba Cloud OSS bucket. You must first grant Content Moderation permissions to access the OSS bucket. The OSS bucket must be in the same region as the Content Moderation service. For more information, see Authorize Content Moderation to access an OSS bucket.

    File path format: oss://<bucket-name>.<endpoint>/<object-name>

storedUrl String http://www.aliyundoc.com If you enable the evidence storage feature and the moderation task meets the configured rules, the image is saved to your Alibaba Cloud OSS bucket and the corresponding HTTP URL is returned.
extras JSONObject {"hitLibInfo":[{"context":"Haokan Video","libCode":"2144002","libName":"Ad text library for pre-release testing a"}]} Additional information.

In the ad violation (ad) scenario, this parameter may return the following content.

hitLibInfo: If the text in an image hits a custom text library, this parameter returns an array that contains information about the hit text library. For details about the structure, see hitLibInfo.

results JSONArray The moderation results. If the call is successful (code=200), this parameter contains an array of one or more result objects. For more information about the structure of the object, see result.
Table 3. result
Parameter Type Example Description
scene String porn The image moderation scenario. This parameter value matches the scenario that you specified in the request. Valid values:
  • porn: pornography detection
  • terrorism: terrorist content detection
  • ad: ad violation detection
  • qrcode: QR code detection
  • live: undesirable scene detection
  • logo: logo detection
label String sexy The category of the moderation result. The categories vary based on the moderation scenario. Valid values:
  • For porn (pornographic content detection):
    • normal: normal content
    • sexy: sexy content
    • porn: pornographic content
  • For terrorism (terrorism and political content detection):
    • normal: normal content
    • bloody: bloody content
    • explosion: explosion and smoke
    • outfit: special costume
    • logo: special logo
    • weapon: weapon
    • politics: political content
    • violence: violence
    • crowd: crowd
    • parade: parade
    • carcrash: car crash
    • flag: flag
    • location: landmark
    • drug: drug-related content
    • gamble: gambling
    • others: other specified content
  • For ad (ad violation):
    • normal: normal content
    • ad: other ads
    • politics: political content in text
    • porn: pornographic content in text
    • abuse: abuse in text
    • terrorism: terrorist content in text
    • contraband: prohibited content in text
    • spam: junk content in text
    • npx: overlay ad
    • qrcode: QR code
    • programCode: mini program code
  • For qrcode (QR code detection):
    • normal: normal content
    • qrcode: QR code
    • programCode: mini program code
  • For live (undesirable scene detection):
    • normal: normal content
    • meaningless: no content in the image, such as a black or white screen
    • PIP: picture-in-picture
    • smoking: smoking
    • drivelive: streaming while driving
    • drug: drug-related content
    • gamble: gambling
  • For logo (logo detection):
    • normal: normal content
    • TV: logo of banned media
    • trademark: trademark
sublabel String porn

If the requested scenes include porn (pornography detection) or terrorism (terrorism and political content detection), this field can return a fine-grained label for the detection result.

This field is not returned by default. To enable this feature, join the DingTalk group (ID: 35573806) and contact our technical experts.

suggestion String block The recommended subsequent action. Valid values:
  • pass: The content is normal. No further action is required.
  • review: The result is uncertain. Perform a manual review.
  • block: The content is in violation. Delete the content or restrict its public access.
rate Float 91.54

The confidence score. Valid values: 0 (lowest confidence) to 100 (highest confidence).

If the suggestion is pass, the higher the confidence score, the more likely the content is compliant. If the suggestion is review or block, the higher the confidence score, the more likely the content is non-compliant.

Important

We recommend that you use the suggestion and label (or sublabel, for some API operations) fields to determine whether the content is in violation.

frames JSONArray If the moderated image is too long and is truncated, this parameter returns the temporary URL of each frame in the truncated image. For more information about the structure, see frame.
hintWordsInfo JSONArray If the image contains ad violations, this parameter returns information about the risk keywords matched in the ad text. For more information about the structure, see hintWordsInfo.
Note This parameter is returned only for the ad (ad violation) scenario.
Example:
"hintWordsInfo":[{"context":"Sensitive word"}]
qrcodeData StringArray ["http://www.aliyundoc.com/01ZZOliO"] If the image contains a QR code, this parameter returns the text content of all detected QR codes.
Note This parameter is returned only for the qrcode (QR code detection) scenario.
qrcodeLocations JSONArray The coordinates of the QR codes detected in the image. For more information about the structure, see qrcodeLocation.
programCodeData JSONArray If the image contains a mini program code, this parameter returns the location of the code. For more information about the structure, see programCodeData.
Note This parameter is returned only for the qrcode (QR code detection) scenario and only if you enable mini program code recognition.
logoData JSONArray If the image contains a logo, this parameter returns information about the detected logo. For more information about the structure, see logoData.
Note This parameter is returned only for the logo (logo detection) scenario.
sfaceData JSONArray If the image contains terrorist or political content, this parameter returns information about the detected faces. For more information about the structure, see sfaceData.
Note This parameter is returned only for the terrorism (terrorism and political content detection) scenario.
ocrData Array Haokan Video The full text recognized in the image.
Note This parameter is not returned by default. To use this feature, join the DingTalk group (ID: 35573806) to contact our product and technical experts.
Table 4. frame
Parameter Type Example Description
rate Float 89.85

The confidence score. Valid values: 0 to 100. A higher confidence score indicates a higher probability that the detection result is accurate. Avoid using this score in your business logic.

url String http://www.aliyundoc.com/xxx-0.jpg The temporary URL of the truncated image frame. The URL is valid for 5 minutes.
Table 5. programCodeData
Parameter Type Example Description
x Float 11.0 The x-coordinate of the upper-left corner of the mini program code area. The origin (0,0) is the upper-left corner of the image. Unit: pixels.
y Float 0.0 The y-coordinate of the upper-left corner of the mini program code area. The origin (0,0) is the upper-left corner of the image. Unit: pixels.
w Float 402.0 The width of the mini program code area. Unit: pixels.
h Float 413.0 The height of the mini program code area. Unit: pixels.
Table 6. logoData
Parameter Type Example Description
type String TV The type of the detected logo. The value is TV, which indicates a TV station logo.
name String xxx TV The name of the detected logo.
x Float 140 The x-coordinate of the upper-left corner of the logo area. The origin (0,0) is the upper-left corner of the image. Unit: pixels.
y Float 68 The y-coordinate of the upper-left corner of the logo area. The origin (0,0) is the upper-left corner of the image. Unit: pixels.
w Float 106 The width of the logo area. Unit: pixels.
h Float 106 The height of the logo area. Unit: pixels.
Table 7. sfaceData
Parameter Type Example Description
x Float 49 The x-coordinate of the upper-left corner of the face area. The origin (0,0) is the upper-left corner of the image. Unit: pixels.
y Float 39 The y-coordinate of the upper-left corner of the face area. The origin (0,0) is the upper-left corner of the image. Unit: pixels.
w Float 97 The width of the face area. Unit: pixels.
h Float 131 The height of the face area. Unit: pixels.
faces JSONArray [{"name":"Matched person","rate":91.54,"id":"AliFace_0123****"}] Information about the detected faces. Each object in the array contains the following fields:
  • name: String. The name of the matched person.
  • rate: Float. The confidence score. The value ranges from 0 (lowest confidence) to 100 (highest confidence). A higher score indicates a higher probability that the face recognition result is accurate.
  • id: String. The face ID.
Table 8. hitLibInfo
Parameter Type Example Description
context String Haokan Video The matched content from the custom text library.
libCode String 123456 The code of the matched custom text library.
libName String abc The name of the matched custom text library.
Table 9. hintWordsInfo
Parameter Type Example Description
context String Haokan Video The matched risk keyword.
Table 10. qrcodeLocation
Parameter Type Example Description
x Float 11.0 The x-coordinate of the upper-left corner of the QR code area. The origin (0,0) is the upper-left corner of the image. Unit: pixels.
y Float 0.0 The y-coordinate of the upper-left corner of the QR code area. The origin (0,0) is the upper-left corner of the image. Unit: pixels.
w Float 402.0 The width of the QR code area. Unit: pixels.
h Float 413.0 The height of the QR code area. Unit: pixels.
qrcode String http://www.aliyundoc.com/0.ZZOliO The URL to which the detected QR code points.
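
The programCodeData, logoData, sfaceData, and qrcodeLocation objects all describe a region by its upper-left corner plus a width and height. Many imaging tools expect corner coordinates instead, so a small conversion helper can be handy. The helper name to_box below is hypothetical and only illustrates the arithmetic.

```python
def to_box(region):
    """Convert an {x, y, w, h} region to (left, top, right, bottom) in pixels."""
    left, top = region["x"], region["y"]
    return (left, top, left + region["w"], top + region["h"])


# Region values taken from the qrcodeLocation example above.
box = to_box({"x": 11.0, "y": 0.0, "w": 402.0, "h": 413.0})
print(box)  # (11.0, 0.0, 413.0, 413.0)
```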

Examples

Sample request
http(s)://[Endpoint]/green/image/scan
&<common request parameters>
{
    "scenes": [
        "porn",
        "terrorism",
        "ad",
        "live",
        "qrcode",
        "logo"
    ],
    "tasks": [
        {
            "dataId": "uuid-xxxx-xxxx-1234",
            "url": "http://www.aliyundoc.com/xxx.jpg"
        }
    ]
}
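
A request body like the one above can be assembled programmatically. The following sketch covers only the JSON body: the function name build_scan_body is hypothetical, generating dataId with uuid4 is just one way to get a unique ID, and signing and sending the request (the common request parameters) are outside its scope.

```python
import uuid

# Scenario values as used in the sample request above.
VALID_SCENES = {"porn", "terrorism", "ad", "qrcode", "live", "logo"}
MAX_TASKS = 100  # a single request accepts up to 100 tasks


def build_scan_body(urls, scenes, biz_type=None):
    """Build the JSON body for /green/image/scan from image URLs and scenes."""
    unknown = set(scenes) - VALID_SCENES
    if unknown:
        raise ValueError(f"unknown scenes: {sorted(unknown)}")
    if not 1 <= len(urls) <= MAX_TASKS:
        raise ValueError(f"expected 1 to {MAX_TASKS} URLs, got {len(urls)}")
    body = {
        "scenes": list(scenes),
        # One task per image; dataId lets you match results back to your data.
        "tasks": [{"dataId": str(uuid.uuid4()), "url": url} for url in urls],
    }
    if biz_type is not None:
        body["bizType"] = biz_type
    return body
```

Serialize the returned dictionary with json.dumps before sending it. Because each listed scenario is billed separately, pass only the scenes you need.
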
Sample success response
{
    "msg": "OK",
    "code": 200,
    "data": [
        {
            "msg": "OK",
            "code": 200,
            "dataId": "cfd33235-71a4-468b-8137-a5ffe323****",
            "extras": {

            },
            "results": [
                {
                    "rate": 99.63,
                    "suggestion": "block",
                    "label": "sexy",
                    "scene": "porn"
                },
                {
                    "label": "politics",
                    "rate": 91.54,
                    "scene": "terrorism",
                    "sfaceData": [
                        {
                            "faces": [
                                {
                                    "id": "AliFace_0123****",
                                    "name": "matched name",
                                    "rate": 91.54
                                }
                            ],
                            "h": 131,
                            "w": 97,
                            "x": 49,
                            "y": 39
                        }
                    ],
                    "suggestion": "block"
                },
                {
                    "extras": {
                        "qrcodes": "http://www.aliyundoc.com/0.ZZOliO",
                        "npx": "72.01",
                        "hitCustomLibCode": "8012345000",
                        "hitCustomLibName": "Name of the custom image library",
                        "hitLibInfo": [
                            {
                                "context": "matched text",
                                "libCode": "123456",
                                "libName": "Name of the text library"
                            }
                        ]
                    },
                    "programCodeData": [
                        {
                            "w": 402.0,
                            "h": 413.0,
                            "x": 11.0,
                            "y": 0.0
                        }
                    ],
                    "frames": [
                        {
                            "rate": 89.85,
                            "url": "http://www.aliyundoc.com/xxx-0.jpg"
                        },
                        {
                            "rate": 68.06,
                            "url": "http://www.aliyundoc.com/xxx-1.jpg"
                        }
                    ],
                    "rate": 99.91,
                    "suggestion": "block",
                    "label": "ad",
                    "scene": "ad"
                },
                {
                    "rate": 99.91,
                    "suggestion": "block",
                    "label": "drug",
                    "scene": "live"
                },
                {
                    "qrcodeData": [
                        "http://www.aliyundoc.com/01ZZOliO"
                    ],
                    "rate": 99.91,
                    "suggestion": "review",
                    "label": "qrcode",
                    "scene": "qrcode"
                },
                {
                    "logoData": [
                        {
                            "name": "xxx TV",
                            "type": "TV",
                            "x": 140,
                            "y": 68,
                            "w": 106,
                            "h": 106
                        }
                    ],
                    "rate": 99.9,
                    "suggestion": "block",
                    "label": "TV",
                    "scene": "logo"
                }
            ],
            "taskId": "img4wlJcb7p4wH4lAP3111111-123456",
            "url": "http://www.aliyundoc.com/xxx.jpg"
        }
    ],
    "requestId": "69B41AE8-1234-1234-1234-12D395695D2D"
}
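
A response like the one above can be reduced to one decision per image. The sketch below follows the recommendation to rely on suggestion rather than rate. The function name summarize is hypothetical, and two choices in it are assumptions rather than documented behavior: it keeps the strictest suggestion across all scenes, and it treats a task with no results as review.

```python
# Order suggestions from least to most severe.
SEVERITY = {"pass": 0, "review": 1, "block": 2}


def summarize(response):
    """Map each task's dataId to its strictest suggestion across all scenes."""
    if response.get("code") != 200:
        raise RuntimeError(f"request failed: {response.get('code')} {response.get('msg')}")
    decisions = {}
    for task in response.get("data", []):
        data_id = task.get("dataId")
        if task.get("code") != 200:
            # The request succeeded but this individual task did not.
            decisions[data_id] = "error"
            continue
        suggestions = [result["suggestion"] for result in task.get("results", [])]
        decisions[data_id] = max(suggestions, key=SEVERITY.__getitem__, default="review")
    return decisions
```

Applied to the sample response above, the single task maps to block, because at least one scene returned block. Checking both the top-level code and each task's code matters, since a request can succeed overall while an individual task fails.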