Configure the Amazon RDS for DB2 input component

The Amazon RDS for DB2 input component reads data from an Amazon RDS for DB2 data source. To synchronize data from an Amazon RDS for DB2 data source to a target data source, you must first configure the Amazon RDS for DB2 input component as the source. Then, you must configure the target data source. This topic describes how to configure the Amazon RDS for DB2 input component.

Prerequisites

Before you begin, confirm that the following operations are complete:

You have created an Amazon RDS for DB2 data source. For more information, see Create an Amazon RDS for DB2 data source.
The account that you use to configure the Amazon RDS for DB2 input component has the read permission for data synchronization on the data source. If your account does not have this permission, you must request it. For more information, see Request, renew, and return data source permissions.

Procedure

From the top menu bar on the Dataphin home page, choose Developer > Data Integration.
In the top menu bar of the Data Integration page, select a Project. If you are in Dev-Prod mode, you must also select an environment.
In the navigation pane on the left, click Batch Pipeline. In the Batch Pipeline list, click the target offline pipeline to open its configuration page.
Click Component Library in the upper-right corner of the page to open the Component Library panel.
In the left navigation pane of the Component Library panel, select Input. Find the Amazon RDS for DB2 component and drag it to the canvas.
Click the icon on the Amazon RDS for DB2 input component card to open the Amazon RDS for DB2 Input Configuration dialog box.

Configure the parameters in the Amazon RDS for DB2 Input Configuration dialog box.

Parameter	Description
Step Name	The name of the Amazon RDS for DB2 input component. Dataphin automatically generates a step name. You can also change the name as needed. The naming convention is as follows: Only letters, numbers, underscores (_), and Chinese characters are allowed. The name cannot exceed 64 characters in length.
Datasource	The drop-down list displays all Amazon RDS for DB2 data sources in the current Dataphin project, including those for which you do not have the read permission for synchronization. Click the icon to copy the current data source name. For a data source for which you do not have the read permission for synchronization, click Request next to the data source to request the permission. For more information, see Request, renew, and return data source permissions. If you do not have an Amazon RDS for DB2 data source, click New Data Source to create one. For more information, see Create an Amazon RDS for DB2 data source.
Table	Enter a keyword to search for a table, or enter the exact table name and click Search. After you select a table, the system automatically checks its status. Click the icon to copy the name of the selected table.
Split key (optional)	The system partitions data based on the specified split key. Use this parameter together with the concurrency parameter to enable concurrent reads. You can use any column from the source table as the split key. A primary key or an indexed column is recommended to ensure transfer performance. Important: If you select a date and time column, the system identifies the minimum and maximum values and divides the total time range based on the concurrency level. The resulting data chunks are not guaranteed to be of equal size.
Number of records per batch (optional)	The number of records to read in a single batch. Instead of reading data one record at a time, you can configure a batch size, such as 1024 records. This reduces the number of round trips to the data source, improves I/O efficiency, and lowers network overhead.
Input filter (optional)	The filter condition to extract data. The configuration is as follows: Use a static value to extract specific data. Example: `ds=20210101`. Use a variable to extract a subset of data. Example: `ds=${bizdate}`.
Output fields	The Output fields section displays all fields from the selected table that match the filter conditions. The following operations are supported:
Manage Fields: If you do not need to output certain fields to downstream components, you can delete them. To delete a few fields, click the icon in the Actions column of each unwanted field. To delete many fields in a batch, click Manage Fields. In the Manage Fields dialog box, select the fields, click the left arrow icon to move the selected input fields to the unselected input fields list, and then click OK.
Batch Add: Click Batch Add to configure fields in a batch using JSON, TEXT, or DDL format. Note: After you add fields in a batch and click OK, the existing field configuration is overwritten.
To configure in JSON format, for example: `[{ "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" }, { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }]` Note: index specifies the column number of the object, name specifies the name of the imported field, and type specifies the data type of the imported field. For example, `"index":3,"name":"user_id","type":"String"` imports the fourth column from the file, names the field user_id, and sets its data type to String.
To configure in TEXT format, for example, enter one field per row:
`1,id,int(10),Long,comment1`
`2,user_name,varchar(255),String,comment2`
The row delimiter separates the information for each field. The default delimiter is a line feed (\n). Semicolons (;) and periods (.) are also supported. The column delimiter separates the field name and field type. The default delimiter is a comma (,). The field type is optional.
To configure in DDL format, for example: `CREATE TABLE tablename ( user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR(255), created_on TIMESTAMP );`
Add Output Field: Click +Add Output Field. Follow the on-screen instructions to enter the Column, Type, and Remarks, and select a Mapping Type. After you configure the current row, click the icon to save.
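
The behavior described for a date and time split key can be illustrated with a small sketch. This is not Dataphin's implementation. It assumes that the range between the minimum and maximum values is divided into equal-width intervals, one per concurrent reader, and the function name `split_time_range` is hypothetical.

```python
from datetime import datetime


def split_time_range(min_value, max_value, concurrency):
    """Divide [min_value, max_value] into `concurrency` equal-width intervals.

    Hypothetical helper: equal-width splitting is an assumption made for
    illustration, not Dataphin's documented algorithm.
    """
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if max_value < min_value:
        raise ValueError("max_value must not be earlier than min_value")
    # Width of each interval as a timedelta.
    step = (max_value - min_value) / concurrency
    # Use max_value itself as the last boundary so the full range is covered.
    bounds = [min_value + step * i for i in range(concurrency)] + [max_value]
    # Pair consecutive boundaries into (start, end) intervals.
    return list(zip(bounds[:-1], bounds[1:]))


ranges = split_time_range(datetime(2021, 1, 1), datetime(2021, 1, 5), 4)
for start, end in ranges:
    print(start.isoformat(), "->", end.isoformat())
```

Each interval covers the same span of time, but the number of rows whose split key falls into an interval depends on how the data is distributed. This is why the chunks can differ in size.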

Click Confirm to finish configuring the properties for the Amazon RDS for DB2 input component.
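
To relate the TEXT and JSON formats of Batch Add, the following sketch converts a TEXT definition into the equivalent JSON field list. It assumes that the TEXT columns appear in the order index, name, type, mapType, comment, as the examples in this topic suggest, and that the default delimiters are used. The helper `text_to_fields` is hypothetical and is not part of Dataphin.

```python
import json

# Assumed column order of a TEXT row, inferred from the examples in this topic.
FIELD_KEYS = ("index", "name", "type", "mapType", "comment")


def text_to_fields(text, row_delimiter="\n", column_delimiter=","):
    """Convert a TEXT-format field definition into a list of field dicts.

    Limitation: a type that itself contains the column delimiter,
    such as decimal(10,2), is not handled by this simple split.
    """
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # Skip blank rows.
        values = [value.strip() for value in row.split(column_delimiter)]
        # zip stops at the shorter sequence, so omitted trailing columns are skipped.
        field = dict(zip(FIELD_KEYS, values))
        field["index"] = int(field["index"])
        fields.append(field)
    return fields


text = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
fields = text_to_fields(text)
print(json.dumps(fields, indent=2))
```

The printed list has the same shape as the JSON example for Batch Add, which makes it easier to check a TEXT definition before pasting it into the dialog box.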
