Integrating visual intelligence services

Serverless Workflow is integrated with Alibaba Cloud visual intelligence services. You can use Serverless Workflow to orchestrate the APIs of these services. This topic describes how to integrate visual intelligence services.

Background information

In a task step of a Serverless Workflow, you can specify a visual intelligence service API operation as the resource type and provide the required parameters. Serverless Workflow calls the specified API operation and uses the result of the call as the output of the step. You can use values from the output to complete subsequent tasks. If an error occurs during the call, you can catch the error in the flow and implement a policy, such as a retry or a redirect, based on the error type.

Prerequisites

Obtain the API semantics and required parameters for the service that you want to call. For more information, see the visual intelligence official website or the List of supported visual intelligence services and API operations section in this topic.
Ensure that the execution role of the flow has permission to call the API operation, for example through the AliyunVIAPIFullAccess policy. For more information, see Execution roles.

Procedure

Define the API operation to call in the flow.
You call the API operations of visual intelligence services in a task step. In the task step, specify the action parameter instead of the resourceArn parameter, in the following format:
```
action: {serviceName}:{apiName}
```
- {serviceName}: The service name. For more information, see the List of supported visual intelligence services and API operations.
- {apiName}: The name of the API operation to call. For more information, see the List of supported visual intelligence services and API operations.
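As a rough illustration of the format, the fragment below builds two action values from entries in the list of supported services at the end of this topic. The step names are hypothetical, and serviceParams is left out, so this is a sketch of the action line only, not a complete flow.

```yaml
steps:
  - type: task
    name: APIDetectFace             # hypothetical step name
    action: facebody:DetectFace     # {serviceName} = facebody, {apiName} = DetectFace
  - type: task
    name: APIRecognizeCharacter     # hypothetical step name
    action: ocr:RecognizeCharacter  # {serviceName} = ocr, {apiName} = RecognizeCharacter
```

The service name always comes first, followed by a colon and the API name exactly as it appears in the list.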
Specify API parameters.
Specify the API call parameters in the serviceParams section of the task step. Serverless Workflow validates the parameters against the API definition for the service.

For example, consider the ClassifyCommodity API of the visual intelligence service. The following code shows the corresponding task in Flow Definition Language (FDL). For more information about the API, see Product categorization.
```
...
  - type: task
    name: APIClassifyCommodity
    action: goodstech:ClassifyCommodity
    inputMappings:
      - target: image
        source: $input.imageURL
    serviceParams: # Describes the parameters required for ClassifyCommodity. For the parameter list, see the API description.
      ImageURL: $.image
```
In this example, action specifies the service name and API name for the call, inputMappings copies imageURL from the flow input to the local variable image, and serviceParams passes that variable to the API as the ImageURL parameter.
Handle errors.
If an API call fails, Serverless Workflow returns the error and cause keys in the output of the step. You can match on the error key to catch a specific error type and implement the corresponding logic, such as a retry or a redirect. To form the identifier used in retry and catch, Serverless Workflow adds the {serviceName}. prefix to the original error code returned by the service. The following example uses the error codes of goodstech:ClassifyCommodity. For more information, see Error codes.
```
...
steps:
  - type: task
    name: APIClassifyCommodity
    action: goodstech:ClassifyCommodity # Format: {serviceName}:{apiName}. See the API list at the end of this topic.
    ...
    retry: # Retry after catching an error.
      - errors:
        - goodstech.InternalError.Busy
        intervalSeconds: 10
        maxAttempts: 2
        multiplier: 2
    catch: # Redirect after catching an error.
      - errors:
        - goodstech.CommonError
        - goodstech.IllegalUrlParameter
        - goodstech.InvalidParameter
        goto: xxxxx
```
In this example, InternalError.Busy is the original error code returned by the service. To catch it in the flow, add the service prefix goodstech. (including the trailing period), which gives goodstech.InternalError.Busy.
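The same prefix rule should apply to any service in the list. The fragment below is a minimal sketch for ocr:RecognizeCharacter. The error code InvalidParameter and both step names are assumptions for illustration only; check the error codes of the API you call for the real values.

```yaml
steps:
  - type: task
    name: APIRecognizeCharacter     # hypothetical step name
    action: ocr:RecognizeCharacter  # catch identifiers for this step start with "ocr."
    # serviceParams omitted; see the API description for the required parameters.
    catch:
      - errors:
        - ocr.InvalidParameter      # assumed original code "InvalidParameter" plus the "ocr." prefix
        goto: RecognizeFailed       # hypothetical target step
    end: true                       # stop here on success so the flow does not fall through to the fail step
  - type: fail
    name: RecognizeFailed
```

Only the prefix changes between services. The part after the first period is the error code as the service returns it.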

Example: Orchestrate a visual intelligence image recognition API

This example uses a visual intelligence API for image recognition to categorize products in an image. For more information, see Product categorization.

```
version: v1
type: flow
steps:
  - type: task
    name: APIClassifyCommodity
    action: goodstech:ClassifyCommodity # Format: {serviceName}:{apiName}. See the API list at the end of this topic.
    inputMappings: # Map the flow input to a local variable for the ClassifyCommodity parameters.
      - target: image
        source: $input.imageURL
    outputMappings: # Map the API output to a local variable.
      - target: classifyCommodity_Categories
        source: $local.Data.Categories
    serviceParams: # The parameters required for ClassifyCommodity. For the parameter list, see the API description.
      ImageURL: $.image
    retry: # Retry after catching an error.
      - errors:
        - goodstech.InternalError.Busy
        intervalSeconds: 10
        maxAttempts: 2
        multiplier: 2
    catch: # Redirect after catching an error.
      - errors:
        - goodstech.CommonError
        - goodstech.IllegalUrlParameter
        - goodstech.InvalidParameter
        goto: ExecFailed
  - type: foreach # ClassifyCommodity returns an array of categories, so process it in a foreach step.
    name: ElemDealt
    iterationMapping:
      collection: $.classifyCommodity_Categories
      item: category
    steps:
      - type: pass
        name: pass1
        inputMappings:
          # The iteration index is available from the context.
          - target: index
            source: $context.step.iterationIndex
          - target: category
            source: $input.category
    end: true
  - type: fail
    name: ExecFailed
```

Create the flow in Serverless Workflow and start an execution with the following input. The imageURL field is the URL of the image to process.

```
{
    "imageURL": "https://viapi-demo.oss-cn-shanghai.aliyuncs.com/viapi-demo/images/DetectImageElements/detect-elements-src.png"
}
```

List of supported visual intelligence services and API operations

Service name	API name	Description
facebody (requires activating the face and body service)	DetectFace	Detects human faces in an image and returns the coordinates of the bounding boxes. It can detect up to thousands of faces at once. It supports faces with 360-degree planar rotation and up to 90-degree side profiles. It also extracts 105 key facial points for localization in milliseconds.
	RecognizeExpression	Recognizes facial expressions in an image.
	RecognizeFace	Performs high-performance facial recognition based on face detection. Our algorithm achieves a recognition accuracy of 99.58% on the LFW public test dataset.
	CompareFace	Detects faces in two input images, selects the largest face in each for comparison, and determines if they belong to the same person. It also returns the coordinates of the bounding boxes for both faces, the comparison confidence level, and confidence level thresholds for different false acceptance rates.
	DetectBodyCount	Detects and counts the number of human bodies in an image. This feature is mainly for indoor scenarios.
	CreateFaceDb	Creates a face database.
	ListFaceDbs	Lists the face databases.
	AddFaceEntity	Adds a face entity to a face database.
	GetFaceEntity	Queries a face entity in a face database.
	ListFaceEntities	Lists the face entities in a face database.
	UpdateFaceEntity	Updates a face entity in a face database.
	AddFace	Adds a face to a specified database.
	SearchFace	Searches a database for faces that are similar to a face in an input image.
	DeleteFace	Deletes a face from a specified database.
	DeleteFaceEntity	Deletes a face entity from a face database.
	DeleteFaceDb	Deletes a specified face database.
	BodyPosture	Gets 18 key points of a human body.
	HandPosture	Gets 21 key points of a hand gesture.
	DetectPedestrian	Detects people in an image.
	EnhanceFace	Crops and aligns a face in an input image, enhances its details, and then merges the result back into the original image.
	FaceBeauty	Applies retouching effects to a face in an image. Effects include skin smoothing, skin whitening, and removal of dark circles and nasolabial folds.
	FaceMakeup	Simulates makeup application to enhance facial appearance. You can add lipstick, highlights, and full makeup looks.
	FaceTidyup	Adjusts the facial contour and features. You can also manually adjust the intensity for fine-grained control.
	FaceFilter	Changes the overall style of an image.
ocr (requires activating the OCR service)	RecognizeIdentityCard	Automatically locates and recognizes information on an ID card in an image.
	RecognizeBankCard	Automatically locates a bank card in an image and recognizes information, such as the card number.
	RecognizeBusinessCard	Automatically locates and recognizes information on a business card in an image.
	RecognizeAccountPage	Automatically locates the personal information page of a household register in an image and recognizes the information on the page.
	RecognizeDrivingLicense	Automatically locates a vehicle registration certificate in an image and recognizes the information on it. It also supports certificates for new energy vehicles.
	RecognizeDriverLicense	Automatically locates a driver's license in an image and recognizes the information on it.
	RecognizeLicensePlate	Automatically locates a license plate in an image and recognizes its content. It also supports license plates for new energy vehicles.
	RecognizeVINCode	Automatically locates and recognizes a Vehicle Identification Number (VIN) in an image.
	RecognizeTaxiInvoice	Automatically locates a taxi receipt in an image and recognizes the information on it.
	RecognizeTrainTicket	Automatically locates a train ticket in an image and recognizes the information on it.
	RecognizeBusinessLicense	Automatically locates a business license in an image and recognizes the information on it.
	RecognizeStamp	Automatically locates a company or official seal in an image and recognizes the name of the organization, such as a government agency, enterprise, or public institution.
	RecognizeVATInvoice	Recognizes content on electronic and paper value-added tax (VAT) invoices.
	RecognizeCharacter	Recognizes text in images from various scenarios and returns coordinate information.
	GetAsyncJobResult	When you call an API operation asynchronously, the immediate response does not contain the final result. Save the RequestId from the response and call GetAsyncJobResult to retrieve the actual result.
	TrimDocument	Parses the content of an input document and outputs it in a structured format, such as HTML or JSON.
	RecognizeChinapassport	Recognizes key fields in a Chinese passport.
	RecognizeTakeoutOrder	Recognizes key fields on a food delivery order. It outputs information such as store name, phone number, packaging fee, delivery fee, subtotal, other fees, discounts, total items, online payment status, order number, and order time. Currently, it supports orders from Ele.me.
	RecognizePassportMRZ	Detects the Machine-Readable Zone (MRZ) of a passport from an image and outputs 11 pieces of information to help with subsequent information extraction and certificate verification.
goodstech (requires activating the product understanding service)	ClassifyCommodity	Recognizes the product category in an image and returns information such as the product category and confidence level. It supports over 10,000 categories, including apparel, footwear, bags, 3C digital products, and household goods.
	RecognizeFurnitureAttribute	Recognizes the style of an input furniture image. It supports 16 styles.
	RecognizeFurnitureSpu	Classifies the furniture in an input image. It supports up to 70 categories.
imagerecog (requires activating the image recognition service)	RecognizeImageColor	Analyzes the color information of an input image and provides color values (in RGB and HEX formats) and their corresponding percentages.
	TaggingImage	Recognizes the main content of an image and assigns it type labels. It supports thousands of content labels covering common object categories.
	RecognizeScene	Recognizes the scene or environment in an image. It supports dozens of common scenes, such as sky and grassland.
	DetectImageElements	Recognizes the elements in an input image, marks their positions with bounding boxes, and classifies them into basic types, such as person, decoration, and text.
	RecognizeImageStyle	Analyzes the style of an input image and identifies possible style and semantic labels.
	ClassifyingRubbish	Classifies the waste items in an image and provides the specific names of the items.
	RecognizeVehicleType	Recognizes the type of vehicle in an image (full or partial). It mainly supports categories such as sedan, multi-purpose vehicle, and SUV.
imageseg (requires activating the image segmentation service)	SegmentHead	Recognizes the head outline in an input image, including the face, hair, ears, and hair ornaments, but not the neck. It returns a transparent 4-channel image where only the head region is visible. It is suitable for single-person and multi-person scenarios. The effect is better for images with clear portraits.
	SegmentFace	Recognizes the face outline in an input image, not including the neck, ears, or hair. It returns a transparent 4-channel image where only the face region is visible. It is suitable for single-person and multi-person scenarios. The effect is better for images with clear faces.
	SegmentHair	Recognizes the hair outline in an input image, not including the neck or ears. It returns a transparent 4-channel image where only the hair region is visible. It is suitable for single-person and multi-person scenarios. The effect is better for images with clear faces.
	ParseFace	Detects the contours of facial features in an input image and performs pixel-level semantic segmentation on the eyes, nose, and mouth. The segmentation is more accurate for images where the face is prominent.
	SegmentVehicle	Recognizes the outline of a vehicle in an input image, performs pixel-level segmentation on the vehicle, and outputs a transparent image.
	SegmentCommodity	Recognizes the outline of a product in an input image, separates it from the background, and returns a segmented foreground product image (4-channel). It is suitable for single-product, multi-product, and complex background scenarios.
	SegmentBody	Recognizes the human body outline in an input image, separates it from the background, and returns a segmented foreground portrait image (4-channel). It is suitable for scenarios with single or multiple people, complex backgrounds, and various body postures.
	SegmentCommonImage	Recognizes the outline of the main visual object in an input image, separates it from the background, and returns a segmented foreground object image (4-channel).
	SegmentFurniture	Performs pixel-level matting for furniture in an input image.
	RefineMask	Refines a coarse mask for an input image and outputs a fine-grained mask.
imageenhan (requires activating the image enhancement service)	ChangeImageSize	Changes the size of an image.
	IntelligentComposition	Takes an input image, performs an aesthetic assessment, and intelligently outputs several bounding boxes. You can use these bounding boxes to crop the original image for better composition.
	ExtendImageStyle	Transfers the style of a specified style image to an input image, transforming visual styles such as color and brushstrokes.
	MakeSuperResolutionImage	Enlarges an input image by four times while maintaining the clarity of the resulting image based on inferred details.
	RecolorImage	Converts the colors of an input image automatically or based on a specified color palette, while ensuring that visual hot spots are not abnormally colored.
	RemoveImageSubtitles	Erases standard captions from an image.
	RemoveImageWatermark	Erases common logos from an image, such as TV station logos or Internet platform logos.
	ImageBlindCharacterWatermark	Adds or parses a specified text watermark for an image.
	ImageBlindPicWatermark	Adds or parses an image watermark for an image.
objectdet (requires activating the object detection service)	ClassifyVehicleInsurance	Classifies input vehicle insurance images.
	RecognizeVehicleParts	Detects the locations and names of vehicle parts in an image.
	DetectVehicle	Detects the main body of a motor vehicle in an image and returns its location and coordinate information.
	DetectMainBody	Detects the main body in a matted image and outputs its location information.
	RecognizeVehicleDashboard	Recognizes information on a vehicle dashboard, such as fault lights.
	RecognizeVehicleDamage	Detects the locations and types of vehicle damage in an image.
	DetectTransparentImage	Checks whether the background of an image is transparent.
	DetectObject	Detects objects in an input image.
	DetectWhiteBaseImage	Checks whether an image has a white background.