How to configure the IBM DB2 input component to read data from a data source-Dataphin(Dataphin)-阿里云帮助中心

The IBM DB2 input component reads data from an IBM DB2 data source. To synchronize IBM DB2 data to other destinations, configure the IBM DB2 input component as the source before setting up the target data source.

Prerequisites

Before you begin, ensure the following prerequisites are met:

An IBM DB2 data source has been created. For more information, see Create an IBM DB2 data source.
The account used to configure the IBM DB2 input component must have read-through permissions for the data source. If the account lacks these permissions, request them before proceeding. For more information, see Request, renew, and return data source permissions.

Procedure

On the Dataphin home page, navigate to the top menu bar and select Development > Data Integration.
At the top of the integration page, select Project (choose an environment in Dev-Prod mode).
In the left-side navigation pane, click Batch Pipeline. From the Batch Pipeline list, select the offline pipeline you want to develop to access its configuration page.
Click Component Library in the upper-right corner to open the Component Library panel.
In the Component Library panel's left-side navigation, select Input. Then, locate the IBM DB2 component in the list on the right and drag it onto the canvas.
Click the icon on the IBM DB2 input component card to open the IBM DB2 Input Configuration dialog box.

In the IBM DB2 Input Configuration dialog box, set the parameters.

Parameter	Description
Step Name	The name of the IBM DB2 input component. Dataphin automatically generates a default name, which you can modify. The naming conventions are as follows: Only Chinese characters, letters, underscores (_), and digits are allowed. The name cannot exceed 64 characters.
Datasource	The drop-down list displays all IBM DB2 data sources in Dataphin, including those you have read-through permissions for and those you do not. Click the icon to copy the name of the current data source. For data sources you lack read-through permissions for, click Request next to the data source to request permissions. For more information, see Request, renew, and return data source permissions. If you do not have an IBM DB2 data source, click Create Data Source to create one. For more information, see Create an IBM DB2 data source.
Table	You can search for a table by entering a keyword or enter the exact table name and click Exact Search. After you select a table, the system automatically detects the table status. Click the icon to copy the name of the selected table.
Shard Key (Optional)	Specifies a column in the source table used to shard data for concurrent reading. Combine this with the concurrency setting to enable parallel reads. We recommend using a primary key or indexed column as the shard key for optimal performance. Important If you select a date-time type, the system identifies the maximum and minimum values and performs brute-force sharding based on the total time range and concurrency. Even distribution is not guaranteed.
Batch Read Count (Optional)	The number of records to read per batch. Setting a batch size (such as 1,024 records) instead of reading one record at a time reduces round trips to the data source, improves I/O efficiency, and lowers network latency.
Input Filter (Optional)	Filter conditions for extracting data: Configure a static field to extract the corresponding data. For example, `ds=20210101`. Configure a variable parameter to extract a specific portion of data. For example, `ds=${bizdate}`.
Output Fields	Displays all fields in the selected table that match the filter conditions. You can perform the following operations: Field Management: To exclude certain fields from downstream components, delete them: Delete a single field: To delete individual fields, click the icon in the operation column. Delete multiple fields: To delete multiple fields at once, click Field Management. In the Field Management dialog box, select the fields to remove, click the left arrow icon to move them to the unselected list, and then click Confirm to apply. Batch addition: Click Batch Addition to configure fields in JSON, TEXT, or DDL format. Note Clicking Confirm after a batch addition will overwrite the existing field information. Batch configure fields in JSON format. For example: `// Example: [{ "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" }, { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }]` Note The index specifies the column number of the object. The name specifies the name of the field after it is imported. The type specifies the type of the field after it is imported. For example, `"index":3,"name":"user_id","type":"String"` indicates that the fourth column in the file is imported. The field name is user_id, and the field type is String. Batch configure fields in TEXT format. For example: `// Example: 1,id,int(10),Long,comment1 2,user_name,varchar(255),Long,comment2` The row delimiter is used to separate the information of each field. The default is a line feed (\n). The supported delimiters include line feed (\n), semicolon (;), and period (.). The column delimiter is used to separate field names from field types, with the default being a half-width comma (,). It can support`','`. The field type can be omitted, with the default being`','`. Batch configure fields in DDL format. For example: `CREATE TABLE tablename ( user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR (255), created_on TIMESTAMP, );` Create a new output field: Click +create Output Field. Follow the instructions on the page to fill in Column, Type, and Comment, and select Mapping Type. After configuring each row, click the icon to save.

Click Confirm to save the IBM DB2 input component configuration.