Serverless Workflow is integrated with Alibaba Cloud visual intelligence services. You can use Serverless Workflow to orchestrate the APIs of these services. This topic describes how to integrate visual intelligence services.
Background information
In a task step of a Serverless Workflow, you can specify a visual intelligence service API operation as the resource type and provide the required parameters. Serverless Workflow calls the specified API operation and uses the result of the call as the output of the step. You can use values from the output to complete subsequent tasks. If an error occurs during the call, you can catch the error in the flow and implement a policy, such as a retry or a redirect, based on the error type.
Prerequisites
- Obtain the API semantics and required parameters for the service that you want to call. For more information, see the visual intelligence official website or the List of supported visual intelligence services and API operations section in this topic.
- Ensure that the execution role for the flow has the required permissions to call the API operation, such as AliyunVIAPIFullAccess. For more information, see Execution roles.
Procedure
- Define the API operation to call in the flow.
You can call the API operations of visual intelligence services in a task step. In the task step, specify the action parameter instead of the resourceArn parameter, in the following format.
action: {serviceName}:{apiName}
{serviceName}: The service name. For more information, see the List of supported visual intelligence services and API operations.
{apiName}: The name of the API operation to call. For more information, see the List of supported visual intelligence services and API operations.
- Specify API parameters.
Specify the API call parameters in the serviceParams section. Serverless Workflow validates the parameters against the API definition for the service.
For example, consider the ClassifyCommodity API of the visual intelligence service. The following code shows the corresponding task in Flow Definition Language (FDL). For more information about the API, see Product categorization.
...
  - type: task
    name: APIClassifyCommodity
    action: goodstech:ClassifyCommodity
    inputMappings:
      - target: image
        source: $input.imageURL
    serviceParams: # Describes the parameters required for ClassifyCommodity. For the parameter list, see the API description.
      ImageURL: $.image
In this example, action specifies the service name and API name for the call. The serviceParams section contains the parameters for the corresponding API.
- Handle errors.
If an API call fails, Serverless Workflow returns the error and cause keys in the output of the step. You can use the error key to catch the error type and implement corresponding logic, such as a redirect. To form the catch identifier, Serverless Workflow adds the {serviceName}. prefix to the original error code that the service returns.
For example, InternalError.Busy is one of the original error codes of goodstech:ClassifyCommodity. For more information, see Error codes. To catch this error code in the flow, you must add the service prefix goodstech. so that the identifier becomes goodstech.InternalError.Busy.
...
steps:
  - type: task
    name: APIClassifyCommodity
    action: goodstech:ClassifyCommodity # Format: {serviceName}:{apiName}. See the API list at the end of this topic.
    ...
    retry: # Retry after catching an error.
      - errors:
          - goodstech.InternalError.Busy
        intervalSeconds: 10
        maxAttempts: 2
        multiplier: 2
    catch: # Redirect after catching an error.
      - errors:
          - goodstech.CommonError
          - goodstech.IllegalUrlParameter
          - goodstech.InvalidParameter
        goto: xxxxx
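The prefix rule and the retry settings above can be modeled in a few lines of Python. This is an illustrative sketch, not Serverless Workflow's implementation: `catch_identifier` and `retry_delays` are hypothetical helper names, and the sketch assumes that the first retry waits `intervalSeconds` and that each later retry waits `multiplier` times longer than the previous one.

```python
from typing import List


def catch_identifier(service_name: str, error_code: str) -> str:
    """Build the identifier used under retry/catch: {serviceName}.{originalErrorCode}."""
    return f"{service_name}.{error_code}"


def retry_delays(interval_seconds: float, max_attempts: int, multiplier: float) -> List[float]:
    """Return the assumed wait (in seconds) before each retry attempt.

    Assumption: exponential backoff starting at interval_seconds.
    """
    return [interval_seconds * multiplier ** attempt for attempt in range(max_attempts)]


if __name__ == "__main__":
    # Values taken from the retry block in the step definition above.
    print(catch_identifier("goodstech", "InternalError.Busy"))
    print(retry_delays(10, 2, 2))
```

Under these assumptions, the settings in the step (intervalSeconds: 10, maxAttempts: 2, multiplier: 2) would give two retries, the second waiting twice as long as the first.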
Example: Orchestrate a visual intelligence image recognition API
This example uses a visual intelligence API for image recognition to categorize products in an image. For more information, see Product categorization.
version: v1
type: flow
steps:
  - type: task
    name: APIClassifyCommodity
    action: goodstech:ClassifyCommodity # Format: {serviceName}:{apiName}. See the API list at the end of this topic.
    inputMappings: # Map the input variable for the ClassifyCommodity parameters.
      - target: image
        source: $input.imageURL
    outputMappings: # Map the output to a local variable.
      - target: classifyCommodity_Categories
        source: $local.Data.Categories
    serviceParams: # Describes the parameters required for ClassifyCommodity. For the parameter list, see the API description.
      ImageURL: $.image
    retry: # Retry after catching an error.
      - errors:
          - goodstech.InternalError.Busy
        intervalSeconds: 10
        maxAttempts: 2
        multiplier: 2
    catch: # Redirect after catching an error.
      - errors:
          - goodstech.CommonError
          - goodstech.IllegalUrlParameter
          - goodstech.InvalidParameter
        goto: ExecFailed
  # According to the ClassifyCommodity response parameters, the API returns an array.
  # Process this type of data in a foreach step in the subsequent flow.
  - type: foreach
    name: ElemDealt
    iterationMapping:
      collection: $.classifyCommodity_Categories
      item: category
    steps:
      - type: pass
        name: pass1
        inputMappings:
          # The index can be read from the context.
          - target: index
            source: $context.step.iterationIndex
          - target: category
            source: $input.category
    end: true
  - type: fail
    name: ExecFailed
The following JSON is an example input for the flow. The imageURL field is the URL of the image to process.
{
  "imageURL": "https://viapi-demo.oss-cn-shanghai.aliyuncs.com/viapi-demo/images/DetectImageElements/detect-elements-src.png"
}
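To make the data flow in this example concrete, the following Python sketch mimics how the mappings move values from the flow input to the API call and then into the foreach step. It is a model for illustration only: `run_flow` and `fake_classify_commodity` are hypothetical names, the stand-in response contains invented placeholder values rather than real service output, and the zero-based iteration index is an assumption.

```python
from typing import Any, Callable, Dict, List


def run_flow(flow_input: Dict[str, Any],
             call_api: Callable[[Dict[str, Any]], Dict[str, Any]]) -> List[Dict[str, Any]]:
    # inputMappings: target "image" <- $input.imageURL
    local: Dict[str, Any] = {"image": flow_input["imageURL"]}

    # serviceParams: ImageURL <- $.image
    service_params = {"ImageURL": local["image"]}
    response = call_api(service_params)

    # outputMappings: classifyCommodity_Categories <- $local.Data.Categories
    local["classifyCommodity_Categories"] = response["Data"]["Categories"]

    # foreach: visit each element of the collection. The pass step receives
    # the iteration index (assumed zero-based here) and the current item.
    results = []
    for index, item in enumerate(local["classifyCommodity_Categories"]):
        results.append({"index": index, "item": item})
    return results


def fake_classify_commodity(params: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for goodstech:ClassifyCommodity; placeholder data only.
    if "ImageURL" not in params:
        raise ValueError("ImageURL is required")
    return {"Data": {"Categories": ["placeholder-a", "placeholder-b"]}}


if __name__ == "__main__":
    demo_input = {"imageURL": "https://example.com/image.png"}  # hypothetical URL
    for entry in run_flow(demo_input, fake_classify_commodity):
        print(entry)
```

The sketch leaves out the retry and catch branches; it only follows the success path from the task step into the foreach step.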
List of supported visual intelligence services and API operations
| Service name | API name | Feature description |
| --- | --- | --- |
| facebody | DetectFace | Detects human faces in an image and returns the coordinates of the bounding boxes. It can detect up to thousands of faces at once. It supports faces with 360-degree planar rotation and up to 90-degree side profiles. It also extracts 105 key facial points for localization in milliseconds. |
| facebody | RecognizeExpression | Recognizes facial expressions in an image. |
| facebody | RecognizeFace | Performs high-performance facial recognition based on face detection. The algorithm achieves a recognition accuracy of 99.58% on the LFW public test dataset. |
| facebody | CompareFace | Detects faces in two input images, selects the largest face in each for comparison, and determines if they belong to the same person. It also returns the coordinates of the bounding boxes for both faces, the comparison confidence level, and confidence level thresholds for different false acceptance rates. |
| facebody | DetectBodyCount | Detects and counts the number of human bodies in an image. This feature is mainly for indoor scenarios. |
| facebody | CreateFaceDb | Creates a face database. |
| facebody | ListFaceDbs | Lists the face databases. |
| facebody | AddFaceEntity | Adds a face entity to a face database. |
| facebody | GetFaceEntity | Queries a face entity in a face database. |
| facebody | ListFaceEntities | Lists the face entities in a face database. |
| facebody | UpdateFaceEntity | Updates a face entity in a face database. |
| facebody | AddFace | Adds a face to a specified database. |
| facebody | SearchFace | Searches a database for faces that are similar to a face in an input image. |
| facebody | DeleteFace | Deletes a face from a specified database. |
| facebody | DeleteFaceEntity | Deletes a face entity from a face database. |
| facebody | DeleteFaceDb | Deletes a specified face database. |
| facebody | BodyPosture | Gets 18 key points of a human body. |
| facebody | HandPosture | Gets 21 key points of a hand gesture. |
| facebody | DetectPedestrian | Detects people in an image. |
| facebody | EnhanceFace | Crops and aligns a face in an input image, enhances its details, and then merges the result back into the original image. |
| facebody | FaceBeauty | Applies retouching effects to a face in an image. Effects include skin smoothing, skin whitening, and removal of dark circles and nasolabial folds. |
| facebody | FaceMakeup | Simulates makeup application to enhance facial appearance. You can add lipstick, highlights, and full makeup looks. |
| facebody | FaceTidyup | Adjusts the facial contour and features. You can also manually adjust the intensity for fine-grained control. |
| facebody | FaceFilter | Changes the overall style of an image. |
| ocr | RecognizeIdentityCard | Automatically locates and recognizes information on an ID card in an image. |
| ocr | RecognizeBankCard | Automatically locates a bank card in an image and recognizes information, such as the card number. |
| ocr | RecognizeBusinessCard | Automatically locates and recognizes information on a business card in an image. |
| ocr | RecognizeAccountPage | Automatically locates the personal information page of a household register in an image and recognizes the information on the page. |
| ocr | RecognizeDrivingLicense | Automatically locates a vehicle registration certificate in an image and recognizes the information on it. It also supports certificates for new energy vehicles. |
| ocr | RecognizeDriverLicense | Automatically locates a driver's license in an image and recognizes the information on it. |
| ocr | RecognizeLicensePlate | Automatically locates a license plate in an image and recognizes its content. It also supports license plates for new energy vehicles. |
| ocr | RecognizeVINCode | Automatically locates and recognizes a Vehicle Identification Number (VIN) in an image. |
| ocr | RecognizeTaxiInvoice | Automatically locates a taxi receipt in an image and recognizes the information on it. |
| ocr | RecognizeTrainTicket | Automatically locates a train ticket in an image and recognizes the information on it. |
| ocr | RecognizeBusinessLicense | Automatically locates a business license in an image and recognizes the information on it. |
| ocr | RecognizeStamp | Automatically locates a company or official seal in an image and recognizes the name of the organization, such as a government agency, enterprise, or public institution. |
| ocr | RecognizeVATInvoice | Recognizes content on electronic and paper value-added tax (VAT) invoices. |
| ocr | RecognizeCharacter | Recognizes text in images from various scenarios and returns coordinate information. |
| ocr | GetAsyncJobResult | When you call an API operation asynchronously, the immediate response does not contain the final result. Save the RequestId from the response and call GetAsyncJobResult to retrieve the actual result. |
| ocr | TrimDocument | Parses the content of an input document and outputs it in a structured format, such as HTML or JSON. |
| ocr | RecognizeChinapassport | Recognizes key fields in a Chinese passport. |
| ocr | RecognizeTakeoutOrder | Recognizes key fields on a food delivery order. It outputs information such as store name, phone number, packaging fee, delivery fee, subtotal, other fees, discounts, total items, online payment status, order number, and order time. Currently, it supports orders from Ele.me. |
| ocr | RecognizePassportMRZ | Detects the Machine-Readable Zone (MRZ) of a passport from an image and outputs 11 pieces of information to help with subsequent information extraction and certificate verification. |
| goodstech | ClassifyCommodity | Recognizes the product category in an image and returns information such as the product category and confidence level. It supports over 10,000 categories, including apparel, footwear, bags, 3C digital products, and household goods. |
| goodstech | RecognizeFurnitureAttribute | Recognizes the style of an input furniture image. It supports 16 styles. |
| goodstech | RecognizeFurnitureSpu | Classifies the furniture in an input image. It supports up to 70 categories. |
| imagerecog | RecognizeImageColor | Analyzes the color information of an input image and provides color values (in RGB and HEX formats) and their corresponding percentages. |
| imagerecog | TaggingImage | Recognizes the main content of an image and assigns it type labels. It supports thousands of content labels covering common object categories. |
| imagerecog | RecognizeScene | Recognizes the scene or environment in an image. It supports dozens of common scenes, such as sky and grassland. |
| imagerecog | DetectImageElements | Recognizes the elements in an input image, marks their positions with bounding boxes, and classifies them into basic types, such as person, decoration, and text. |
| imagerecog | RecognizeImageStyle | Analyzes the style of an input image and identifies possible style and semantic labels. |
| imagerecog | ClassifyingRubbish | Classifies the waste items in an image and provides the specific names of the items. |
| imagerecog | RecognizeVehicleType | Recognizes the type of vehicle in an image (full or partial). It mainly supports categories such as sedan, multi-purpose vehicle, and SUV. |
| imageseg | SegmentHead | Recognizes the head outline in an input image, including the face, hair, ears, and hair ornaments, but not the neck. It returns a transparent 4-channel image where only the head region is visible. It is suitable for single-person and multi-person scenarios. The effect is better for images with clear portraits. |
| imageseg | SegmentFace | Recognizes the face outline in an input image, not including the neck, ears, or hair. It returns a transparent 4-channel image where only the face region is visible. It is suitable for single-person and multi-person scenarios. The effect is better for images with clear faces. |
| imageseg | SegmentHair | Recognizes the hair outline in an input image, not including the neck or ears. It returns a transparent 4-channel image where only the hair region is visible. It is suitable for single-person and multi-person scenarios. The effect is better for images with clear faces. |
| imageseg | ParseFace | Detects the contours of facial features in an input image and performs pixel-level semantic segmentation on the eyes, nose, and mouth. The segmentation is more accurate for images where the face is prominent. |
| imageseg | SegmentVehicle | Recognizes the outline of a vehicle in an input image, performs pixel-level segmentation on the vehicle, and outputs a transparent image. |
| imageseg | SegmentCommodity | Recognizes the outline of a product in an input image, separates it from the background, and returns a segmented foreground product image (4-channel). It is suitable for single-product, multi-product, and complex background scenarios. |
| imageseg | SegmentBody | Recognizes the human body outline in an input image, separates it from the background, and returns a segmented foreground portrait image (4-channel). It is suitable for scenarios with single or multiple people, complex backgrounds, and various body postures. |
| imageseg | SegmentCommonImage | Recognizes the outline of the main visual object in an input image, separates it from the background, and returns a segmented foreground object image (4-channel). |
| imageseg | SegmentFurniture | Performs pixel-level matting for furniture in an input image. |
| imageseg | RefineMask | Refines a coarse mask for an input image and outputs a fine-grained mask. |
| imageenhan | ChangeImageSize | Changes the size of an image. |
| imageenhan | IntelligentComposition | Takes an input image, performs an aesthetic assessment, and intelligently outputs several bounding boxes. You can use these bounding boxes to crop the original image for better composition. |
| imageenhan | ExtendImageStyle | Transfers the style of a specified style image to an input image, transforming visual styles such as color and brushstrokes. |
| imageenhan | MakeSuperResolutionImage | Enlarges an input image by four times while maintaining the clarity of the resulting image based on inferred details. |
| imageenhan | RecolorImage | Converts the colors of an input image automatically or based on a specified color palette, while ensuring that visual hot spots are not abnormally colored. |
| imageenhan | RemoveImageSubtitles | Erases standard captions from an image. |
| imageenhan | RemoveImageWatermark | Erases common logos from an image, such as TV station logos or Internet platform logos. |
| imageenhan | ImageBlindCharacterWatermark | Adds or parses a specified text watermark for an image. |
| imageenhan | ImageBlindPicWatermark | Adds or parses an image watermark for an image. |
| objectdet | ClassifyVehicleInsurance | Classifies input vehicle insurance images. |
| objectdet | RecognizeVehicleParts | Detects the locations and names of vehicle parts in an image. |
| objectdet | DetectVehicle | Detects the main body of a motor vehicle in an image and returns its location and coordinate information. |
| objectdet | DetectMainBody | Detects the main body in a matted image and outputs its location information. |
| objectdet | RecognizeVehicleDashboard | Recognizes information on a vehicle dashboard, such as fault lights. |
| objectdet | RecognizeVehicleDamage | Detects the locations and types of vehicle damage in an image. |
| objectdet | DetectTransparentImage | Checks whether the background of an image is transparent. |
| objectdet | DetectObject | Detects objects in an input image. |
| objectdet | DetectWhiteBaseImage | Checks whether an image has a white background. |
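Before you deploy a flow, it can help to check each action string locally against the service names in the table above. The following Python sketch shows one way to do that. The helper name `parse_action` is hypothetical, and the check is a local convenience that covers service names only; it is not part of Serverless Workflow and does not replace the validation that the service performs.

```python
from typing import Tuple

# Service names copied from the table of supported services.
SUPPORTED_SERVICES = frozenset({
    "facebody", "ocr", "goodstech", "imagerecog",
    "imageseg", "imageenhan", "objectdet",
})


def parse_action(action: str) -> Tuple[str, str]:
    """Split '{serviceName}:{apiName}' and check the service name."""
    service_name, sep, api_name = action.partition(":")
    if not sep or not service_name or not api_name:
        raise ValueError(f"expected '{{serviceName}}:{{apiName}}', got {action!r}")
    if service_name not in SUPPORTED_SERVICES:
        raise ValueError(f"unsupported service name: {service_name!r}")
    return service_name, api_name


if __name__ == "__main__":
    print(parse_action("goodstech:ClassifyCommodity"))
```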