Configure the Easysearch input component
The Easysearch input component reads data from an Easysearch data source. To sync data from Easysearch to another data source, you must first configure the Easysearch input component, and then configure the component for the target data source. This topic describes how to configure the Easysearch input component.
Prerequisites
An Easysearch data source has been created. For more information, see Create an Easysearch data source.
The account used to configure the Easysearch input component must have read-through permission for the data source. If the account does not have this permission, you must request it. For more information, see Request permissions for a data source.
Procedure
On the Dataphin home page, choose Develop > Data Integration from the top menu bar.
In the top menu bar of the integration page, select a Project. If you are in Dev-Prod mode, you must also select an environment.
In the left navigation pane, click Offline Integration. On the Offline Integration page, click the offline pipeline that you want to develop to open its configuration page.
In the upper-right corner of the page, click Component Library to open the Component Library panel.
In the navigation pane on the left of the Component Library panel, select Input. In the list of input components on the right, find the Easysearch component and drag it to the canvas.
Click the icon on the Easysearch input component card to open the Easysearch Input Configuration dialog box.
In the Easysearch Input Configuration dialog box, configure the parameters.
Parameter
Description
Basic Configuration
Step Name
The name of the Easysearch input component. Dataphin automatically generates a step name, which you can change as needed. The naming conventions are as follows:
Can contain only Chinese characters, letters, underscores (_), and digits.
Cannot exceed 64 characters.
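The naming conventions above can be checked before a name is typed into the dialog box. The following is a minimal Python sketch, not Dataphin's own validation logic. The function name `is_valid_step_name` is hypothetical, and treating "Chinese characters" as the Unicode range U+4E00 to U+9FFF is an assumption.

```python
import re

# Assumption: "Chinese characters" means the CJK Unified Ideographs block
# (U+4E00 to U+9FFF). Dataphin may accept a wider or narrower range.
# Assumption: an empty name is not allowed, so the minimum length is 1.
_STEP_NAME = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if name uses only allowed characters and has 1 to 64 characters."""
    return _STEP_NAME.fullmatch(name) is not None


print(is_valid_step_name("easysearch_input_1"))  # allowed characters only
print(is_valid_step_name("easysearch-input"))    # contains a hyphen
```

The hyphen in the second name is outside the allowed character set, so the check rejects it.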
Datasource
The data source dropdown list contains all Easysearch data sources and project levels in Dataphin, including those with and without read-through permission. You can click the icon to copy the current data source name.
For a data source for which you do not have read-through permission, click Request next to the data source to request the permission. For more information, see Request permissions for a data source.
If you do not have an Easysearch data source, click New to create one. For more information, see Create an Easysearch data source.
Index Document
The name of the index in Easysearch. Click the icon to copy the name of the selected index document.
Retrieval Query Condition
The `query` parameter for Easysearch. Use this parameter for full or incremental queries. For example, `{ "match_all": {}}` performs a full query.
Cursor Time
The duration for which the cursor is stored. This is the paging parameter for Easysearch.
If this value is set too low and the idle time between fetching two pages of data exceeds the scroll time, the cursor expires. This can cause data loss.
If this value is set too high and the number of concurrent queries exceeds the server-side `max_open_scroll_context` setting, a query error occurs.
For example, 5m specifies a cursor time of 5 minutes.
Unit: day (-d), hour (-h), minute (-m), second (-s), millisecond (-ms), microsecond (-micros), nanosecond (-nanos).
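To illustrate how the Retrieval Query Condition and Cursor Time values relate, the following Python sketch builds the parameters of a scroll-style search. It is only an illustration under stated assumptions: the helper `build_scroll_search` is hypothetical, the field name `update_time` and its timestamp are made up, and the assumption that the cursor time is passed as a `scroll` URL parameter follows the Elasticsearch-compatible scroll convention rather than anything this topic confirms about the component's internals.

```python
import json
import re
from typing import Tuple

# Units listed for Cursor Time: d, h, m, s, ms, micros, nanos.
_CURSOR_TIME = re.compile(r"\d+(?:nanos|micros|ms|d|h|m|s)")


def build_scroll_search(query: dict, cursor_time: str) -> Tuple[dict, str]:
    """Return (URL parameters, JSON body) for a hypothetical scroll search."""
    if _CURSOR_TIME.fullmatch(cursor_time) is None:
        raise ValueError(f"invalid cursor time: {cursor_time!r}")
    return {"scroll": cursor_time}, json.dumps({"query": query})


# Full query: matches every document in the index.
full_query = {"match_all": {}}

# Incremental query: only documents changed since a given time.
# The field name update_time and the timestamp are hypothetical;
# replace them with a date field from your own index.
incremental_query = {
    "range": {"update_time": {"gte": "2024-01-01T00:00:00+0800"}}
}

params, body = build_scroll_search(incremental_query, "5m")
print(params)
print(body)
```

In the dialog box, only the query object itself (for example, the `range` clause) goes into Retrieval Query Condition, and the time value (for example, 5m) goes into Cursor Time.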
Advanced Configuration
Batch Read Size
The number of data records to read in a single batch. The default value is 1024. Configuring a batch size reduces interactions with the data source, which improves I/O efficiency and lowers network latency.
Connection Timeout
The client connection timeout period. The default value is 60000 ms.
Read Timeout
The client read timeout period. The default value is 60000 ms.
Date Format
If a synced field is of the date type and its `mapping` does not have a `format` configuration, configure the `dateFormat` parameter. The default format in Easysearch is `yyyy-MM-dd'T'HH:mm:ssZ`.
Output Fields
Displays the output fields.
Add fields in batches.
Click Batch Add.
To configure in JSON format, use the following example:
[{"name":"col_integer","type":"integer"}, {"name":"col_long","type":"long"}, {"name":"col_double","type":"double"}]
Note: name specifies the name of the field to import, and type specifies the data type of the field. For example, "name":"user_id","type":"String" imports the field named user_id and sets its data type to String.
To configure in TEXT format, use the following example:
col_long,long
col_double,double
The row delimiter separates the information for each field. The default delimiter is a line feed (\n). Semicolons (;) and periods (.) are also supported.
The column delimiter separates the field name from the field type. The default delimiter is a comma (,).
Click OK.
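The two batch formats carry the same information, so one can be derived from the other. The following Python sketch converts TEXT-format field definitions into the JSON format. It assumes the default delimiters described above; the function name `text_fields_to_json` is hypothetical and not part of Dataphin.

```python
import json


def text_fields_to_json(text: str, row_delim: str = "\n", col_delim: str = ",") -> str:
    """Convert TEXT-format field definitions to the JSON batch format."""
    fields = []
    for row in text.split(row_delim):
        row = row.strip()
        if not row:
            continue  # skip blank rows, such as one left by a trailing delimiter
        # Split each row into the field name and the field type.
        name, _, field_type = row.partition(col_delim)
        name, field_type = name.strip(), field_type.strip()
        if not name or not field_type:
            raise ValueError(f"expected name{col_delim}type, got: {row!r}")
        fields.append({"name": name, "type": field_type})
    return json.dumps(fields)


print(text_fields_to_json("col_long,long\ncol_double,double"))
# [{"name": "col_long", "type": "long"}, {"name": "col_double", "type": "double"}]
```

Passing `row_delim=";"` handles input that uses the semicolon as the row delimiter.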
Create an output field.
Click New Output Field, then enter a Column name and select a Type as prompted.
Manage output fields.
You can perform the following operations on added fields:
Click and drag the move icon next to the Field to change its position.
Click the Edit icon in the Actions column to edit existing fields.
Click the delete icon in the Actions column to delete an existing field.
Click Confirm to complete the property configuration for the Easysearch input component.