OCR - Robotic Process Automation (RPA) - Alibaba Cloud Help Center

element_text

element_text(element, element_index=1, engine='google', window=None, eliminate_spaces=False)

Description

Uses OCR to extract all text from a specified element.

Parameters

element<str> The name of the element.

element_index<int> The index of the element, used when multiple elements share the same name. The default value is 1.

engine<str> The OCR engine to use.

Options:

google: Google
aliyun: Alibaba Cloud
paddle: PaddlePaddle

eliminate_spaces<bool> Specifies whether to remove spaces from the recognition result.

window<object> The window object that contains the element.

Return Value

Returns the recognized text as a <str>.

Example: rpa.ai.ocr.element_text

# Notes:
# 1. Before using this method, you must capture the target element using the capture control feature.
# 2. Ensure that the page containing the element is open during execution.
page = rpa.app.chrome.create('www.taobao.com')
text = rpa.ai.ocr.element_text('Taobao-logo-chrome', engine='paddle')
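
Recognition quality can differ between engines, so a flow may want to try more than one. The following is a minimal sketch, not part of the SDK: `first_nonempty_text` is a hypothetical helper that accepts any callable taking an element name and an `engine` keyword (as `element_text` does) and returns the first non-empty result. The engine order shown is only an assumption.

```python
from typing import Callable, Iterable, Optional


def first_nonempty_text(
    ocr_func: Callable[..., str],
    element: str,
    engines: Iterable[str] = ("paddle", "google", "aliyun"),
    **kwargs,
) -> Optional[str]:
    """Call ocr_func once per engine and return the first non-empty result.

    ocr_func is expected to accept the element name plus an `engine`
    keyword, as rpa.ai.ocr.element_text does. This helper is hypothetical.
    """
    for engine in engines:
        text = ocr_func(element, engine=engine, **kwargs)
        if text and text.strip():
            return text
    return None  # no engine produced any text
```

Inside a flow, this might be called as `first_nonempty_text(rpa.ai.ocr.element_text, 'Taobao-logo-chrome')`.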

click

click(element, keyword, element_index=1, keyword_index=1, engine='google', button='left', offset_x=0, offset_y=0, window=None, timeout=15)

Description

Uses OCR to find a keyword within an element and then clicks its center.

Warning

For client versions 4.11.0.1068 and later, we recommend using click.
This method remains supported, and existing automation flows are unaffected.

Parameters

element<str> The name of the element.

keyword<str> The keyword to find within the element.

element_index<int> The index of the element to use when multiple elements share the same name. The default value is 1.

keyword_index<int> The index of the keyword to use when it appears multiple times within the element. The default value is 1.

engine<str> The OCR engine to use.

Options:

google: Google
aliyun: Alibaba Cloud
paddle: PaddlePaddle

button<str> The mouse button to use.

Options:

left: Left button
right: Right button

offset_x<int> The horizontal offset in pixels from the center of the keyword.

offset_y<int> The vertical offset in pixels from the center of the keyword.

window<object> The window object that contains the element.

timeout<int> The timeout in seconds to wait for the keyword.

Example: rpa.ai.ocr.click

# Notes:
# 1. Before using this method, you must capture the target element using the capture control feature.
# 2. Ensure that the page containing the element is open during execution.
# 3. This method recognizes the keyword within the element, moves the mouse to its location (adjusted by the offset), and simulates a click.
# The following example finds the keyword "Documents" within the 'Alibaba-Cloud-top-right-banner-chrome' element and clicks it.
page = rpa.app.chrome.create('www.aliyun.com')
rpa.ai.ocr.click('Alibaba-Cloud-top-right-banner-chrome', 'Documents', engine='paddle', offset_x=0, offset_y=0)
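
To make the role of `offset_x` and `offset_y` concrete, the sketch below computes the pixel that would be targeted for a recognized keyword. It is illustrative only: the bounding-box format `(left, top, width, height)` and the screen-coordinate convention (positive x to the right, positive y downward) are assumptions, and `click_point` is a hypothetical name, not an SDK function.

```python
from typing import Tuple


def click_point(
    left: int, top: int, width: int, height: int,
    offset_x: int = 0, offset_y: int = 0,
) -> Tuple[int, int]:
    """Return the pixel targeted for a keyword bounding box.

    The box format (left, top, width, height) is an assumption made for
    this sketch; the SDK does not document its internal representation.
    """
    center_x = left + width // 2   # horizontal center of the keyword
    center_y = top + height // 2   # vertical center of the keyword
    return center_x + offset_x, center_y + offset_y
```

With both offsets left at 0, the result is simply the center of the keyword, which matches the default behavior described above.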

double_click

double_click(element, keyword, element_index=1, keyword_index=1, engine='google', offset_x=0, offset_y=0, window=None, timeout=15)

Description

Uses OCR to find a keyword within an element and then double-clicks its center.

Warning

For client versions 4.11.0.1068 and later, we recommend using double_click.
This method remains supported, and existing automation flows are unaffected.

Parameters

element<str> The name of the element.

keyword<str> The keyword to find within the element.

element_index<int> The index of the element to use when multiple elements share the same name. The default value is 1.

keyword_index<int> The index of the keyword to use when it appears multiple times within the element. The default value is 1.

engine<str> The OCR engine to use.

Options:

google: Google
aliyun: Alibaba Cloud
paddle: PaddlePaddle

offset_x<int> The horizontal offset in pixels from the center of the keyword.

offset_y<int> The vertical offset in pixels from the center of the keyword.

window<object> The window object that contains the element.

timeout<int> The timeout in seconds to wait for the keyword.

Example: rpa.ai.ocr.double_click

# Notes:
# 1. Before using this method, you must capture the target element using the capture control feature.
# 2. Ensure that the page containing the element is open during execution.
# 3. This method recognizes the specified keyword on the target element, moves the mouse to the keyword's location (adjusted by the offset), and then performs a simulated double-click.
# The following example recognizes the keyword "Documents" on a page element and then simulates a double-click on it.
page = rpa.app.chrome.create('www.aliyun.com')
rpa.ai.ocr.double_click('Alibaba-Cloud-top-right-banner-chrome', 'Documents', engine='paddle', offset_x=0, offset_y=0)
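
When a keyword appears more than once in the element, `keyword_index` selects which occurrence to act on. The sketch below shows one plausible reading of that parameter. It assumes that indexes are 1-based (suggested by the default of 1), that occurrences are counted in the order the engine returns them, and that a substring match counts as a hit; none of these details are confirmed by the SDK, and `nth_occurrence` is a hypothetical helper.

```python
from typing import List, Optional


def nth_occurrence(
    words: List[str], keyword: str, keyword_index: int = 1
) -> Optional[int]:
    """Return the list position of the keyword_index-th word containing keyword.

    Returns None when there are fewer than keyword_index matches.
    """
    if keyword_index < 1:
        raise ValueError("keyword_index is 1-based")
    seen = 0
    for position, word in enumerate(words):
        if keyword in word:
            seen += 1
            if seen == keyword_index:
                return position
    return None
```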

input_text

input_text(element, keyword, value, element_index=1, keyword_index=1, engine='google', simulate=False, offset_x=0, offset_y=0, window=None, wait_mili_seconds=20, timeout=15)

Description

Uses OCR to find a keyword within an element, clicks its center to focus, and then simulates keyboard input.

Warning

For client versions 4.11.0.1068 and later, we recommend using input_text.
This method remains supported, and existing automation flows are unaffected.

Parameters

element<str> The name of the element.

keyword<str> The keyword to find within the element.

value<str> The text to input.

element_index<int> The index of the element to use when multiple elements share the same name. The default value is 1.

keyword_index<int> The index of the keyword to use when it appears multiple times within the element. The default value is 1.

engine<str> The OCR engine to use.

Options:

google: Google
aliyun: Alibaba Cloud
paddle: PaddlePaddle

simulate<bool> Specifies whether to use simulated input. The default value is False.

offset_x<int> The horizontal offset in pixels from the center of the keyword.

offset_y<int> The vertical offset in pixels from the center of the keyword.

window<object> The window object that contains the element.

wait_mili_seconds<int> The interval in milliseconds between keystrokes. This parameter applies only to simulated input. The default value is 20, and the maximum is 100. Setting this value too high might cause a timeout.

timeout<int> The timeout in seconds to wait for the keyword.

Example: rpa.ai.ocr.input_text

# Notes:
# 1. Before using this method, you must capture the target element using the capture control feature.
# 2. Ensure that the page containing the element is open during execution.
# 3. This method recognizes the specified keyword on the target element, moves the mouse to that location (adjusted by the offset), and then simulates typing the specified text.
# The following example recognizes the keyword "Baidu", moves the mouse 350 pixels to the left of it, and then simulates typing "Alibaba Cloud RPA".
page = rpa.app.chrome.create('www.baidu.com')
rpa.ai.ocr.input_text('Baidu-search-chrome', 'Baidu', 'Alibaba Cloud RPA', engine='paddle', offset_x=-350)
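
Because `wait_mili_seconds` is applied between keystrokes, long values combined with a large interval can take noticeable time. The sketch below gives a rough estimate. It assumes one keystroke per character and a lower bound of 0 for the interval; only the maximum of 100 comes from the parameter description. `estimated_typing_seconds` is a hypothetical helper, not an SDK function.

```python
MAX_WAIT_MS = 100  # documented maximum for wait_mili_seconds


def estimated_typing_seconds(value: str, wait_mili_seconds: int = 20) -> float:
    """Estimate how long simulated typing of value takes, in seconds.

    Assumes one keystroke per character and one wait per keystroke;
    real input (for example through an input method editor) may differ.
    """
    if not 0 <= wait_mili_seconds <= MAX_WAIT_MS:
        raise ValueError(
            f"wait_mili_seconds must be between 0 and {MAX_WAIT_MS}"
        )
    return len(value) * wait_mili_seconds / 1000
```

Comparing the estimate with the `timeout` value before calling `input_text` is one way to spot settings that are likely to be too slow.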

mouse_move

mouse_move(element, keyword, element_index=1, keyword_index=1, engine='google', offset_x=0, offset_y=0, window=None, timeout=15)

Description

Uses OCR to find a keyword within an element and moves the mouse cursor to its center.

Parameters

element<str> The name of the element.

keyword<str> The keyword to find within the element.

element_index<int> The index of the element to use when multiple elements share the same name. The default value is 1.

keyword_index<int> The index of the keyword to use when it appears multiple times within the element. The default value is 1.

engine<str> The OCR engine to use.

Options:

google: Google
aliyun: Alibaba Cloud
paddle: PaddlePaddle

offset_x<int> The horizontal offset in pixels from the center of the keyword.

offset_y<int> The vertical offset in pixels from the center of the keyword.

window<object> The window object that contains the element.

timeout<int> The timeout in seconds to wait for the keyword.

Example: rpa.ai.ocr.mouse_move

# Notes:
# 1. Before using this method, you must capture the target element using the capture control feature.
# 2. Ensure that the page containing the element is open during execution.
# 3. This method recognizes the specified keyword on the target element and moves the mouse to that location, adjusted by the specified offset.
# The following example recognizes the keyword "Documents" on a page element and then moves the mouse to it.
page = rpa.app.chrome.create('www.aliyun.com')
rpa.ai.ocr.mouse_move('Alibaba-Cloud-top-right-banner-chrome', 'Documents', engine='paddle', offset_x=0, offset_y=0)

text

text(image_path, engine='aliyun', app_code='', detail=False, eliminate_spaces=False)

Note

Using the aliyun engine requires a subscription to the specified Alibaba Cloud OCR service.

Description

Performs text recognition on an image file.

Parameters

image_path<str> The file path of the image.

engine<str> The OCR engine to use.

Options:

google: Google
aliyun: Alibaba Cloud
paddle: PaddlePaddle

app_code<str> The AppCode for the text recognition service, required only when the engine is aliyun.

detail<bool> Specifies whether to return detailed information about the recognized text, such as coordinates.

eliminate_spaces<bool> Specifies whether to remove spaces from the recognition result. This parameter applies only when detail is False.

Return Value

Returns the recognized text as a <str>.

Example: rpa.ai.ocr.text

image_path = r'D:\test_files\OCR_text_recognition.jpg'
text = rpa.ai.ocr.text(image_path, engine='paddle')
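
The effect of `eliminate_spaces` can be approximated in plain Python, which may help when post-processing results that were recognized without it. This is a sketch under assumptions: the SDK does not state exactly which characters it removes, so the hypothetical `strip_spaces` below removes only the ASCII space and the full-width space (U+3000) and leaves line breaks untouched.

```python
def strip_spaces(text: str) -> str:
    """Remove ASCII and full-width space characters from OCR output."""
    return text.replace(" ", "").replace("\u3000", "")
```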