How to configure the PolarDB-X input component

The PolarDB-X input component reads data from a PolarDB-X data source. To synchronize data from PolarDB-X to other data sources, configure the PolarDB-X input component first, and then configure the target data source.

Prerequisites

A PolarDB-X data source has been created. For more information, see Create PolarDB-X data source.
To configure the PolarDB-X input component, your account must have read-through permission for the data source. If you do not have permission, request it first. For more information, see Request, renew, and return data source permissions.

Procedure

Select Development > Data Integration from the menu bar at the top of the Dataphin home page.
In the menu bar at the top of the integration page, select a project. In Dev-Prod mode, you must also select an environment.
In the left-side navigation pane, click Batch Pipeline. In the Batch Pipeline list, select the batch pipeline that you want to develop to open its configuration page.
In the upper-right corner of the page, click Component Library to open the Component Library panel.
In the left-side navigation pane of the Component Library panel, select Input. Find the PolarDB-X (formerly DRDS) component in the list of input components on the right, and then drag it onto the canvas.
Click the icon on the PolarDB-X (formerly DRDS) input component card to open the PolarDB-X (formerly DRDS) Input Configuration dialog box.

In the PolarDB-X (formerly DRDS) Input Configuration dialog box, configure the following parameters.

Parameter	Description
Step Name	The name of the PolarDB-X input component. Dataphin automatically generates the step name, and you can modify it as needed. The name can contain only Chinese characters, letters, underscores (_), and digits, and cannot exceed 64 characters.
Datasource	The drop-down list displays all PolarDB-X data sources in Dataphin, both those for which you have read-through permission and those for which you do not. Click the icon to copy the data source name. For a data source without read-through permission, click Request next to the data source to request permission. For more information, see Request, renew, and return data source permissions. If no PolarDB-X data source exists, click Create to create one. For more information, see Create PolarDB-X data source.
Table	Select the source table for data synchronization. You can search by keyword or enter the exact table name and click Exact Search. After you select a table, the system automatically detects the table status. Click the icon to copy the selected table name.
Batch Read Count (optional)	The number of records read per batch. Setting a batch read count (such as 1024 records) reduces the number of round trips to the data source, which improves I/O efficiency and lowers network overhead.
Input Filter (optional)	The filter condition for data extraction. You can configure it in either of the following ways: Static value: extracts the data that matches a fixed condition, such as `ds=20211111`. Variable parameter: extracts the data that matches a condition whose value is substituted at run time, such as `ds=${bizdate}`.
Output Fields	Displays all fields that match the selected table and filter conditions. You can perform the following operations:
Field Management: Remove fields that you do not need to pass to downstream components.
Delete a single field: To delete a small number of fields, click the icon in the Operation column of each field.
Delete fields in batches: To delete multiple fields at once, click Field Management, select the fields in the Field Management dialog box, click the shift left icon to move them to the unselected list, and then click Confirm.
Batch Add: Click Batch Add to configure fields in JSON, TEXT, or DDL format.
Note: After a batch addition, clicking Confirm overwrites the existing field information.
JSON format, for example: `[{ "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" }, { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }]`
Note: index indicates the column number, name indicates the field name, and type indicates the field type. For example, `"index":3,"name":"user_id","type":"String"` means that the fourth column is imported with the field name user_id and the field type String.
TEXT format, for example, with one field per row:
`1,id,int(10),Long,comment1`
`2,user_name,varchar(255),String,comment2`
The row delimiter separates the information of each field. The default is a line feed (\n). Line feeds (\n), semicolons (;), and periods (.) are supported.
The column delimiter separates the field name from the field type. The default is a comma (,), supporting `','`. The field type can be omitted, defaulting to `','`.
DDL format, for example: `CREATE TABLE tablename (user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR(255), created_on TIMESTAMP);`
Create New Output Field: Click + Create New Output Field, specify Column, Type, and Comment, and select a Mapping Type. After you configure the row, click the icon to save it.
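
The JSON and TEXT batch formats described for Output Fields carry the same five pieces of information per field. As a rough illustration only, the following Python sketch converts TEXT rows into the JSON-style field list. The function name `parse_text_fields` and its parameters are hypothetical and are not part of Dataphin. The sketch assumes the column order index, name, type, mapping type, comment, and a field type that contains no comma.

```python
import json


def parse_text_fields(text, row_delim="\n", col_delim=","):
    """Convert TEXT-format field rows into a JSON-style list of field dicts.

    Hypothetical helper for illustration; it does not call any Dataphin API.
    Limitation: a type that itself contains the column delimiter,
    such as decimal(10,2), is not handled.
    """
    fields = []
    for row in text.split(row_delim):
        row = row.strip()
        if not row:
            continue  # skip blank rows, for example after a trailing delimiter
        # Split into at most five parts so that delimiters in the comment survive.
        parts = [p.strip() for p in row.split(col_delim, 4)]
        # Pad missing trailing columns with empty strings.
        index, name, col_type, map_type, comment = parts + [""] * (5 - len(parts))
        fields.append({
            "index": int(index),
            "name": name,
            "type": col_type,
            "mapType": map_type,
            "comment": comment,
        })
    return fields


text = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
fields = parse_text_fields(text)
print(json.dumps(fields, indent=2))
```

Passing `row_delim=";"` handles input in which the fields are separated by semicolons instead of line feeds.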

Click Confirm to finalize the PolarDB-X input component properties.
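
To picture how the Input Filter and Batch Read Count parameters interact, the following Python sketch substitutes a `${bizdate}` value into a filter and then reads the matching rows in fixed-size batches. It is a conceptual sketch, not Dataphin's implementation: an in-memory SQLite table stands in for a PolarDB-X table, and the names `render_filter`, `read_in_batches`, and `orders` are made up for the example.

```python
import sqlite3
from string import Template


def render_filter(filter_expr, params):
    """Replace ${name} placeholders in a filter with concrete values."""
    return Template(filter_expr).substitute(params)


def read_in_batches(conn, table, filter_expr, batch_size=1024):
    """Yield lists of rows, fetching at most batch_size rows per call."""
    # Table and filter are interpolated directly; acceptable for a demo only.
    sql = f"SELECT * FROM {table}"
    if filter_expr:
        sql += f" WHERE {filter_expr}"
    cursor = conn.execute(sql)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Demo data: an in-memory SQLite table standing in for a PolarDB-X table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, ds INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(i, 20211111 if i < 5 else 20211112) for i in range(8)],
)

where = render_filter("ds=${bizdate}", {"bizdate": "20211111"})
batches = list(read_in_batches(conn, "orders", where, batch_size=2))
print(where)                      # ds=20211111
print([len(b) for b in batches])  # [2, 2, 1]
```

Five of the eight demo rows match the filter, so a batch size of 2 yields three batches. A larger batch size means fewer fetch calls for the same data, which is the trade-off the Batch Read Count parameter controls.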
