Integrating visual intelligence services

Serverless Workflow is integrated with Alibaba Cloud visual intelligence services. You can use Serverless Workflow to orchestrate the APIs of these services. This topic describes how to integrate visual intelligence services.

Background information

In a task step of a Serverless Workflow flow, you can specify a visual intelligence service API operation as the resource type and provide the required parameters. Serverless Workflow calls the specified API operation and uses the result of the call as the output of the step. You can use values from the output in subsequent steps. If an error occurs during the call, you can catch the error in the flow and apply a policy, such as a retry or a redirect, based on the error type.

Prerequisites

Procedure

  1. Define the API operation to call in the flow.
    You can call the API operations of visual intelligence services in a task step. In the task step, specify the action parameter instead of the resourceArn parameter, in the following format:
    action: {serviceName}:{apiName}
  2. Specify API parameters.

    Specify the API call parameters in the serviceParams section. Serverless Workflow validates the parameters against the API definition for the service.

    For example, consider the ClassifyCommodity API of the visual intelligence service. The following code shows the corresponding task in Flow Definition Language (FDL). For more information about the API, see Product categorization.

    ...
      - type: task
        name: APIClassifyCommodity
        action: goodstech:ClassifyCommodity
        inputMappings:
          - target: image
            source: $input.imageURL
        serviceParams: # Describes the parameters required for ClassifyCommodity. For the parameter list, see the API description.
          ImageURL: $.image

    In this example, action specifies the service name and API name for the call. The serviceParams section contains the parameters for the corresponding API.

  3. Handle errors.
    If an API call fails, Serverless Workflow returns the error and cause keys in the output of the step. You can match on the error key to catch a specific error type and handle it, for example with a retry or a redirect. As the catch identifier, Serverless Workflow uses the original error code returned by the service, prefixed with {serviceName}. (the service name followed by a period). The following example uses the error codes of goodstech:ClassifyCommodity. For more information, see Error codes.
    ...
    steps:
      - type: task
        name: APIClassifyCommodity
        action: goodstech:ClassifyCommodity # Format: {serviceName}:{apiName}. See the API list at the end of this topic.
        ...
        retry: # Retry after catching an error.
          - errors:
            - goodstech.InternalError.Busy
            intervalSeconds: 10
            maxAttempts: 2
            multiplier: 2
        catch: # Redirect after catching an error.
          - errors:
            - goodstech.CommonError
            - goodstech.IllegalUrlParameter
            - goodstech.InvalidParameter
            goto: xxxxx
    In this case, InternalError.Busy is the original service error code. To catch this error code in the flow, you must add the service prefix, which gives goodstech.InternalError.Busy.
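
The prefixing rule and the retry settings from step 3 can be sketched in Python. This is only an illustration and not part of Serverless Workflow: the helper names catch_identifier and retry_delays are hypothetical, and the delay calculation assumes that the wait starts at intervalSeconds and is multiplied by multiplier after each attempt.

```python
def catch_identifier(service_name, error_code):
    """Build the identifier used under retry/catch: {serviceName}.{errorCode}."""
    return f"{service_name}.{error_code}"


def retry_delays(interval_seconds, max_attempts, multiplier):
    """Assumed schedule: wait interval_seconds before the first retry,
    then multiply the wait by `multiplier` for each further retry."""
    return [interval_seconds * multiplier ** attempt for attempt in range(max_attempts)]


if __name__ == "__main__":
    # Values taken from the retry block in the example above.
    print(catch_identifier("goodstech", "InternalError.Busy"))
    print(retry_delays(interval_seconds=10, max_attempts=2, multiplier=2))
```

Under that assumption, the settings in the example (intervalSeconds: 10, maxAttempts: 2, multiplier: 2) would give waits of 10 and 20 seconds before the error is surfaced to any catch block.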

Example: Orchestrate a visual intelligence image recognition API

This example uses a visual intelligence API for image recognition to categorize products in an image. For more information, see Product categorization.

version: v1 
type: flow
steps:
  - type: task
    name: APIClassifyCommodity
    action: goodstech:ClassifyCommodity # Format: {serviceName}:{apiName}. See the API list at the end of this topic.
    inputMappings: # Map the input variable for the ClassifyCommodity parameters.
      - target: image
        source: $input.imageURL
    outputMappings: # Map the output to a local variable.
      - target: classifyCommodity_Categories
        source: $local.Data.Categories
    serviceParams: # Describes the parameters required for ClassifyCommodity. For the parameter list, see the API description.
      ImageURL: $.image
    retry: # Retry after catching an error.
      - errors:
        - goodstech.InternalError.Busy
        intervalSeconds: 10
        maxAttempts: 2
        multiplier: 2
    catch: # Redirect after catching an error.
      - errors:
        - goodstech.CommonError
        - goodstech.IllegalUrlParameter
        - goodstech.InvalidParameter
        goto: ExecFailed
  - type: foreach # According to the ClassifyCommodity response parameters, the API returns an array. Process this type of data in a foreach step in the subsequent flow.
    name: ElemDealt
    iterationMapping:
      collection: $.classifyCommodity_Categories
      item: category
    steps:
      - type: pass
        name: pass1
        inputMappings:
          # The iteration index is available from the step context.
          - target: index
            source: $context.step.iterationIndex
          - target: category
            source: $input.category
    end: true
  - type: fail
    name: ExecFailed

Create the flow in Serverless Workflow and use the following input to start the execution. The imageURL value is the URL of the image to process.

{
    "imageURL": "https://viapi-demo.oss-cn-shanghai.aliyuncs.com/viapi-demo/images/DetectImageElements/detect-elements-src.png"
}
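
To make the data flow of this example easier to follow, the mappings can be traced locally in Python. This is a rough simulation under stated assumptions, not how Serverless Workflow is implemented: run_classify_step, run_foreach_step, and fake_api are hypothetical names, the fake response only mirrors the Data.Categories path used in outputMappings, its field values are invented placeholders, and the iteration index is assumed to start at 0.

```python
def run_classify_step(flow_input, call_api):
    # inputMappings: $input.imageURL -> local variable "image"
    local = {"image": flow_input["imageURL"]}
    # serviceParams: ImageURL: $.image
    service_params = {"ImageURL": local["image"]}
    # The API response becomes the local output of the step.
    response = call_api("goodstech:ClassifyCommodity", service_params)
    # outputMappings: $local.Data.Categories -> classifyCommodity_Categories
    return {"classifyCommodity_Categories": response["Data"]["Categories"]}


def run_foreach_step(step_output):
    # iterationMapping: iterate over the collection and expose each item
    # together with its iteration index (assumed to be 0-based).
    collection = step_output["classifyCommodity_Categories"]
    return [{"index": i, "category": item} for i, item in enumerate(collection)]


def fake_api(action, params):
    # Hypothetical stand-in for the real call; the values are placeholders.
    return {"Data": {"Categories": [{"name": "placeholder-a"}, {"name": "placeholder-b"}]}}


if __name__ == "__main__":
    output = run_classify_step({"imageURL": "https://example.com/image.png"}, fake_api)
    for iteration in run_foreach_step(output):
        print(iteration)
```

The sketch covers only the success path. In the real flow, an error listed under catch would redirect execution to the ExecFailed step instead of reaching the foreach step.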

List of supported visual intelligence services and API operations

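Each service below is listed by service name, followed by its activation link and then one line per API operation in the form: API name, feature description.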
facebody

Activate the face and body service

DetectFace Detects human faces in an image and returns the coordinates of the bounding boxes. It can detect up to thousands of faces at once. It supports faces with 360-degree planar rotation and up to 90-degree side profiles. It also extracts 105 key facial points for localization in milliseconds.
RecognizeExpression Recognizes facial expressions in an image.
RecognizeFace Performs high-performance facial recognition based on face detection. Our algorithm achieves a recognition accuracy of 99.58% on the LFW public test dataset.
CompareFace Detects faces in two input images, selects the largest face in each for comparison, and determines if they belong to the same person. It also returns the coordinates of the bounding boxes for both faces, the comparison confidence level, and confidence level thresholds for different false acceptance rates.
DetectBodyCount Detects and counts the number of human bodies in an image. This feature is mainly for indoor scenarios.
CreateFaceDb Creates a face database.
ListFaceDbs Lists the face databases.
AddFaceEntity Adds a face entity to a face database.
GetFaceEntity Queries a face entity in a face database.
ListFaceEntities Lists the face entities in a face database.
UpdateFaceEntity Updates a face entity in a face database.
AddFace Adds a face to a specified database.
SearchFace Searches a database for faces that are similar to a face in an input image.
DeleteFace Deletes a face from a specified database.
DeleteFaceEntity Deletes a face entity from a face database.
DeleteFaceDb Deletes a specified face database.
BodyPosture Gets 18 key points of a human body.
HandPosture Gets 21 key points of a hand gesture.
DetectPedestrian Detects people in an image.
EnhanceFace Crops and aligns a face in an input image, enhances its details, and then merges the result back into the original image.
FaceBeauty Applies retouching effects to a face in an image. Effects include skin smoothing, skin whitening, and removal of dark circles and nasolabial folds.
FaceMakeup Simulates makeup application to enhance facial appearance. You can add lipstick, highlights, and full makeup looks.
FaceTidyup Adjusts the facial contour and features. You can also manually adjust the intensity for fine-grained control.
FaceFilter Changes the overall style of an image.
ocr

Activate the OCR service

RecognizeIdentityCard Automatically locates and recognizes information on an ID card in an image.
RecognizeBankCard Automatically locates a bank card in an image and recognizes information, such as the card number.
RecognizeBusinessCard Automatically locates and recognizes information on a business card in an image.
RecognizeAccountPage Automatically locates the personal information page of a household register in an image and recognizes the information on the page.
RecognizeDrivingLicense Automatically locates a vehicle registration certificate in an image and recognizes the information on it. It also supports certificates for new energy vehicles.
RecognizeDriverLicense Automatically locates a driver's license in an image and recognizes the information on it.
RecognizeLicensePlate Automatically locates a license plate in an image and recognizes its content. It also supports license plates for new energy vehicles.
RecognizeVINCode Automatically locates and recognizes a Vehicle Identification Number (VIN) in an image.
RecognizeTaxiInvoice Automatically locates a taxi receipt in an image and recognizes the information on it.
RecognizeTrainTicket Automatically locates a train ticket in an image and recognizes the information on it.
RecognizeBusinessLicense Automatically locates a business license in an image and recognizes the information on it.
RecognizeStamp Automatically locates a company or official seal in an image and recognizes the name of the organization, such as a government agency, enterprise, or public institution.
RecognizeVATInvoice Recognizes content on electronic and paper value-added tax (VAT) invoices.
RecognizeCharacter Recognizes text in images from various scenarios and returns coordinate information.
GetAsyncJobResult When you call an API operation asynchronously, the immediate response does not contain the final result. Save the RequestId from the response and call GetAsyncJobResult to retrieve the actual result.
TrimDocument Parses the content of an input document and outputs it in a structured format, such as HTML or JSON.
RecognizeChinapassport Recognizes key fields in a Chinese passport.
RecognizeTakeoutOrder Recognizes key fields on a food delivery order. It outputs information such as store name, phone number, packaging fee, delivery fee, subtotal, other fees, discounts, total items, online payment status, order number, and order time. Currently, it supports orders from Ele.me.
RecognizePassportMRZ Detects the Machine-Readable Zone (MRZ) of a passport from an image and outputs 11 pieces of information to help with subsequent information extraction and certificate verification.
goodstech

Activate the product understanding service

ClassifyCommodity Recognizes the product category in an image and returns information such as the product category and confidence level. It supports over 10,000 categories, including apparel, footwear, bags, 3C digital products, and household goods.
RecognizeFurnitureAttribute Recognizes the style of an input furniture image. It supports 16 styles.
RecognizeFurnitureSpu Classifies the furniture in an input image. It supports up to 70 categories.
imagerecog

Activate the image recognition service

RecognizeImageColor Analyzes the color information of an input image and provides color values (in RGB and HEX formats) and their corresponding percentages.
TaggingImage Recognizes the main content of an image and assigns it type labels. It supports thousands of content labels covering common object categories.
RecognizeScene Recognizes the scene or environment in an image. It supports dozens of common scenes, such as sky and grassland.
DetectImageElements Recognizes the elements in an input image, marks their positions with bounding boxes, and classifies them into basic types, such as person, decoration, and text.
RecognizeImageStyle Analyzes the style of an input image and identifies possible style and semantic labels.
ClassifyingRubbish Classifies the waste items in an image and provides the specific names of the items.
RecognizeVehicleType Recognizes the type of vehicle in an image (full or partial). It mainly supports categories such as sedan, multi-purpose vehicle, and SUV.
imageseg

Activate the image segmentation service

SegmentHead Recognizes the head outline in an input image, including the face, hair, ears, and hair ornaments, but not the neck. It returns a transparent 4-channel image where only the head region is visible. It is suitable for single-person and multi-person scenarios. The effect is better for images with clear portraits.
SegmentFace Recognizes the face outline in an input image, not including the neck, ears, or hair. It returns a transparent 4-channel image where only the face region is visible. It is suitable for single-person and multi-person scenarios. The effect is better for images with clear faces.
SegmentHair Identifies the hair outline in an input image, excluding the neck and ears, and returns a 4-channel transparent image showing only the hair area. This feature supports both single-person and multi-person scenarios. For best results, use input images with clearly visible faces.
ParseFace Detects the contours of facial features in an input image and performs pixel-level semantic segmentation on the eyes, nose, and mouth. The segmentation is more accurate for images where the face is prominent.
SegmentVehicle Recognizes the outline of a vehicle in an input image, performs pixel-level segmentation on the vehicle, and outputs a transparent image.
SegmentCommodity Recognizes the outline of a product in an input image, separates it from the background, and returns a segmented foreground product image (4-channel). It is suitable for single-product, multi-product, and complex background scenarios.
SegmentBody Recognizes the human body outline in an input image, separates it from the background, and returns a segmented foreground portrait image (4-channel). It is suitable for scenarios with single or multiple people, complex backgrounds, and various body postures.
SegmentCommonImage Recognizes the outline of the main visual object in an input image, separates it from the background, and returns a segmented foreground object image (4-channel).
SegmentFurniture Performs pixel-level matting for furniture in an input image.
RefineMask Refines a coarse mask for an input image and outputs a fine-grained mask.
imageenhan

Activate the image enhancement service

ChangeImageSize Changes the size of an image.
IntelligentComposition Takes an input image, performs an aesthetic assessment, and intelligently outputs several bounding boxes. You can use these bounding boxes to crop the original image for better composition.
ExtendImageStyle Transfers the style of a specified style image to an input image, transforming visual styles such as color and brushstrokes.
MakeSuperResolutionImage Enlarges an input image by four times while maintaining the clarity of the resulting image based on inferred details.
RecolorImage Converts the colors of an input image automatically or based on a specified color palette, while ensuring that visual hot spots are not abnormally colored.
RemoveImageSubtitles Erases standard captions from an image.
RemoveImageWatermark Erases common logos from an image, such as TV station logos or Internet platform logos.
ImageBlindCharacterWatermark Adds or parses a specified text watermark for an image.
ImageBlindPicWatermark Adds or parses an image watermark for an image.
objectdet

Activate the object detection service

ClassifyVehicleInsurance Classifies input vehicle insurance images.
RecognizeVehicleParts Detects the locations and names of vehicle parts in an image.
DetectVehicle Detects the main body of a motor vehicle in an image and returns its location and coordinate information.
DetectMainBody Detects the main body in a matted image and outputs its location information.
RecognizeVehicleDashboard Recognizes information on a vehicle dashboard, such as fault lights.
RecognizeVehicleDamage Detects the locations and types of vehicle damage in an image.
DetectTransparentImage Checks whether the background of an image is transparent.
DetectObject Detects objects in an input image.
DetectWhiteBaseImage Checks whether an image has a white background.
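
Before deploying a flow, it can help to check an action value against the service names in this list. The following Python sketch is a hypothetical pre-check, not a Serverless Workflow feature: parse_action and SUPPORTED_SERVICES are illustrative names, and only the service name is checked, not the API name.

```python
# Service names taken from the list above.
SUPPORTED_SERVICES = {
    "facebody", "ocr", "goodstech", "imagerecog",
    "imageseg", "imageenhan", "objectdet",
}


def parse_action(action):
    """Split an action of the form {serviceName}:{apiName} and check the service name."""
    service_name, separator, api_name = action.partition(":")
    if not separator or not service_name or not api_name:
        raise ValueError(f"expected {{serviceName}}:{{apiName}}, got {action!r}")
    if service_name not in SUPPORTED_SERVICES:
        raise ValueError(f"unsupported service name: {service_name!r}")
    return service_name, api_name


if __name__ == "__main__":
    print(parse_action("goodstech:ClassifyCommodity"))
```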