iTAG provides labeling templates for optical character recognition (OCR), object detection, and image classification. To create a labeling job, select a template based on your scenario. This document describes the scenarios and data structure for these templates.
Background information
This topic describes the data structure for the following image labeling templates: optical character recognition (OCR), object detection, and image classification.
Optical character recognition (OCR)
An optical character recognition (OCR) job extracts text from an image and then classifies the image based on the extracted text.
Scenarios
Common scenarios include recognizing text on ID cards, receipts, license plates, and bank cards.
Data structure
Input data
Each line in the manifest file represents a single data item and must include the source field.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"}}
...
Output data
Each line in the output manifest file combines the original data item with its labeling result. The JSON structure for each line is as follows.
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/demo_test/ocr_pic/img6.jpeg"
  },
  "label-144863699223676****": {
    "results": [
      {
        "questionId": "1",
        "data": [
          {
            "id": "ecdb7552-2a4e-4d0e-8abb-0f1a2dc0****",
            "type": "image/polygon",
            "value": [
              [368.1112214498511, 71.72740814299901],
              [444.34359483614696, 71.72740814299901],
              [444.34359483614696, 106.26762661370405],
              [368.1112214498511, 106.26762661370405]
            ],
            "labels": {
              "OCR Recognition Result": "Financial Advisor",
              "Single-choice": "Label 1"
            }
          }
        ],
        "rotation": 0,
        "markTitle": "OCR Label Configuration",
        "width": 1024,
        "type": "image",
        "height": 1024
      }
    ]
  }
}
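The following Python sketch shows one way to read a line of the OCR output manifest and pull out each labeled region. It is illustrative only: the function names are hypothetical, and it assumes that results always sit under a key starting with "label-" and that the transcribed text is stored under the label name "OCR Recognition Result", as in the example above. Both may differ depending on how a job's labels are configured.

```python
import json


def iter_label_results(record):
    """Yield every entry of "results" found under a "label-..." key."""
    for key, value in record.items():
        # Assumption: result keys always start with "label-", as in the example.
        if key.startswith("label-") and isinstance(value, dict):
            yield from value.get("results", [])


def extract_ocr_regions(manifest_line):
    """Return (source, regions) for one line of an OCR output manifest."""
    record = json.loads(manifest_line)
    regions = []
    for result in iter_label_results(record):
        for item in result.get("data", []):
            labels = dict(item.get("labels", {}))
            # Assumption: the transcription is stored under this label name.
            text = labels.pop("OCR Recognition Result", None)
            regions.append(
                {"polygon": item["value"], "text": text, "labels": labels}
            )
    return record["data"]["source"], regions


# Hypothetical sample line, shortened from the structure shown above.
sample = json.dumps({
    "data": {"source": "oss://example-bucket/img6.jpeg"},
    "label-1": {"results": [{
        "questionId": "1",
        "data": [{
            "id": "region-1",
            "type": "image/polygon",
            "value": [[368, 71], [444, 71], [444, 106], [368, 106]],
            "labels": {"OCR Recognition Result": "Financial Advisor",
                       "Single-choice": "Label 1"},
        }],
        "width": 1024, "height": 1024, "type": "image",
    }]},
})
source, regions = extract_ocr_regions(sample)
print(source, regions[0]["text"], regions[0]["labels"])
```

Matching on the "label-" prefix avoids hard-coding the job-specific key, which is masked in the example and presumably differs between jobs.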
Object detection
An object detection labeling job locates objects in an image by drawing bounding boxes with a rectangle selection tool.
Scenarios
Common scenarios include vehicle detection, pedestrian detection, and image search.
Data structure
Input data
Each line in the manifest file represents a single data item and must include the source field.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"}}
...
Output data
Each line in the output manifest file combines the original data item with its labeling result. The JSON structure for each line is as follows.
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic_ocr/img17.jpeg"
  },
  "label-144853549785619****": {
    "results": [
      {
        "questionId": "1",
        "data": [
          {
            "id": "e02a574b-9fd9-45e9-8c8a-9682567b****",
            "type": "image/polygon",
            "value": [
              [499.93454545454546, 255.0981818181818],
              [911.0109090909091, 255.0981818181818],
              [911.0109090909091, 338.6836363636363],
              [499.93454545454546, 338.6836363636363]
            ],
            "labels": {
              "Single-choice": "Label 1"
            }
          }
        ],
        "rotation": 0,
        "markTitle": "Object Detection Label Configuration",
        "width": 1024,
        "type": "image",
        "height": 1024
      }
    ]
  }
}
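In the example above, each rectangle is stored as four corner points under "value". Training code often expects an (x, y, width, height) box instead, so the sketch below converts the corner list into that form. The function names are hypothetical, the "label-" key prefix is an assumption taken from the example, and the conversion assumes an axis-aligned rectangle whose coordinates are in pixels of the reported image width and height.

```python
import json


def polygon_to_bbox(points):
    """Convert a list of [x, y] corners to (x_min, y_min, width, height)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    return (x_min, y_min, max(xs) - x_min, max(ys) - y_min)


def extract_boxes(manifest_line):
    """Return (source, boxes) for one line of an object detection manifest."""
    record = json.loads(manifest_line)
    boxes = []
    for key, value in record.items():
        # Assumption: result keys always start with "label-", as in the example.
        if not (key.startswith("label-") and isinstance(value, dict)):
            continue
        for result in value.get("results", []):
            image_size = (result.get("width"), result.get("height"))
            for item in result.get("data", []):
                boxes.append({
                    "bbox": polygon_to_bbox(item["value"]),
                    "labels": item.get("labels", {}),
                    "image_size": image_size,
                })
    return record["data"]["source"], boxes


# Hypothetical sample line with round coordinates for readability.
sample = json.dumps({
    "data": {"source": "oss://example-bucket/img17.jpeg"},
    "label-1": {"results": [{
        "questionId": "1",
        "data": [{
            "id": "box-1",
            "type": "image/polygon",
            "value": [[500, 255], [911, 255], [911, 338], [500, 338]],
            "labels": {"Single-choice": "Label 1"},
        }],
        "rotation": 0, "width": 1024, "height": 1024, "type": "image",
    }]},
})
source, boxes = extract_boxes(sample)
print(source, boxes[0]["bbox"], boxes[0]["labels"])
```

Taking the minimum and maximum of the coordinates rather than relying on corner order keeps the conversion independent of the order in which the points were recorded.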
Image classification
Image classification assigns one or more predefined classification labels to an image based on its content. This template supports both single-label and multi-label classification.
Scenarios
Common scenarios include image sorting, image recognition, image search, and content recommendation.
Data structure
Input data
Each line in the manifest file represents a single data item and must include the source field.
{"data":{"source":"oss://****.oss-cn-hangzhou.aliyuncs.com/iTAG/pic/1.jpg"}}
...
Output data
Each line in the output manifest file combines the original data item with its labeling result. The JSON structure for each line is as follows.
{
  "data": {
    "source": "oss://****.oss-cn-hangzhou.aliyuncs.com/pic/3.jpg"
  },
  "label-143082452899667****": {
    "results": [
      {
        "questionId": "2",
        "data": [
          "Label 1",
          "Label 2"
        ],
        "markTitle": "Multiple-choice",
        "type": "survey/multivalue"
      }
    ]
  }
}
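Unlike the two templates above, the classification result stores its labels directly as a list of strings under "data". The Python sketch below collects those labels per image and tallies them across a manifest. The function names are hypothetical, and the "label-" key prefix is an assumption taken from the example. The source shows only the multi-label form, so treating a bare string as a one-element list for single-label results is also an assumption.

```python
import json
from collections import Counter


def extract_class_labels(manifest_line):
    """Return (source, labels) for one line of a classification manifest."""
    record = json.loads(manifest_line)
    labels = []
    for key, value in record.items():
        # Assumption: result keys always start with "label-", as in the example.
        if not (key.startswith("label-") and isinstance(value, dict)):
            continue
        for result in value.get("results", []):
            data = result.get("data", [])
            # Assumption: a single-label result may be a bare string.
            if isinstance(data, str):
                data = [data]
            labels.extend(data)
    return record["data"]["source"], labels


def count_labels(manifest_lines):
    """Tally how often each label occurs across non-empty manifest lines."""
    counts = Counter()
    for line in manifest_lines:
        if line.strip():
            _, labels = extract_class_labels(line)
            counts.update(labels)
    return counts


# Hypothetical sample lines following the structure shown above.
lines = [
    json.dumps({
        "data": {"source": "oss://example-bucket/pic/3.jpg"},
        "label-1": {"results": [{
            "questionId": "2",
            "data": ["Label 1", "Label 2"],
            "markTitle": "Multiple-choice",
            "type": "survey/multivalue",
        }]},
    }),
    json.dumps({
        "data": {"source": "oss://example-bucket/pic/4.jpg"},
        "label-1": {"results": [{"questionId": "2", "data": ["Label 1"]}]},
    }),
]
label_counts = count_labels(lines)
print(dict(label_counts))
```

A tally like this gives a quick check of class balance before the labeled data is used for training.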