Configure the Amazon RDS for Oracle Input Component

The Amazon RDS for Oracle input component reads data from an Amazon RDS for Oracle data source. When you sync data from an Amazon RDS for Oracle data source to another data source, first configure the Amazon RDS for Oracle input component with the source data source information. Then configure the destination data source for the sync task. This topic explains how to configure the Amazon RDS for Oracle input component.

Prerequisites

You have created an Amazon RDS for Oracle data source. For more information, see Create an Amazon RDS for Oracle Data Source.
The account used to configure the Amazon RDS for Oracle input component must have read-through permission on the data source. If the account does not have this permission, request it. For more information, see Request Data Source Permissions.

Procedure

In the top menu bar on the Dataphin homepage, choose Develop > Data Integration.
In the top menu bar of the Integration page, select a Project. In Dev-Prod mode, also select an environment.
In the navigation pane on the left, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop. The configuration page for the offline pipeline opens.
In the upper-right corner of the page, click Component Library to open the Component Library panel.
In the navigation pane on the left of the Component Library panel, click Input. In the input component list on the right, find the Amazon RDS for Oracle component and drag it onto the canvas.
Click the icon in the Amazon RDS for Oracle input component card to open the Amazon RDS for Oracle Input Configuration dialog box.

In the Amazon RDS for Oracle Input Configuration dialog box, configure the following parameters.

Parameter	Description
Step Name	The name of the Amazon RDS for Oracle input component. Dataphin generates a step name automatically. You can change it based on your business scenario. Naming rules: Use only Chinese characters, letters, underscores (_), and digits. Keep the name no longer than 64 characters.
Datasource	The drop-down list shows all Amazon RDS for Oracle data sources in Dataphin. This includes data sources for which you have read-through permission and those for which you do not. Click the icon to copy the current data source name. If you do not have read-through permission for a data source, click Request next to the data source to request permission. For more information, see Request Data Source Permissions. If you do not have an Amazon RDS for Oracle data source, click Create Data Source to create one. For more information, see Create an Amazon RDS for Oracle Data Source.
Schema (Optional)	Select the schema where the table resides. This supports cross-schema table selection. If you do not specify a schema, the default is the schema configured in the data source.
Source Table Count	Select the number of source tables. Options are Single Table and Multiple Tables: Single Table: Use this option when syncing business data from one source table to one destination table. Multiple Tables: Use this option when syncing business data from multiple source tables to one destination table. When writing data from multiple tables into one destination table, the system uses the union algorithm. For more information about union, see INTERSECT, UNION, and EXCEPT.
Table Matching Method	Select Generic Rule or Database Regex. Note: This setting is available only when you select Multiple Tables for Source Table Count.
Table	Select the source table. If you selected Single Table for Source Table Count, enter a keyword in the table name field to search, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks the table status. Click the icon to copy the name of the selected table. If you selected Multiple Tables for Source Table Count, enter an expression based on the table matching method. For Generic Rule, enter an expression that filters tables with the same structure. Supported formats include enumeration, regex-like patterns, and combinations of the two. Example: `table_[001-100];table_102;`. For Database Regex, enter a regular expression supported by the database. The system uses this regex to match tables in the database of the selected data source and, at runtime, dynamically matches new tables against it. After you enter the expression, click Exact Match to open the Confirm Match Details dialog box and view the list of matched tables.
Split Key (Optional)	The system partitions data based on the split key column that you specify. Use it together with the concurrency settings to enable concurrent reads. You can use any column of the source table as the split key. For best performance, use a primary key or an indexed column. Important: If you select a date-time column, the system performs a brute-force split across the full time range based on the minimum and maximum values. The resulting splits are not guaranteed to be even.
Batch Read Size (Optional)	The number of records to read at a time. Setting a batch size, such as 1024 records, reduces the number of round trips to the source database, which improves I/O efficiency and reduces the time spent waiting on the network.
Codec (Optional)	Select the codec for reading data. Dataphin supports the following codecs: UTF-8, GBK, and ISO-8859-1.
Input Filter (Optional)	Set conditions to filter the data to extract. Details: Static Value: Extract specific data. Example: `ds=20211111`. Variable Parameter: Extract part of the data. Example: `ds=${bizdate}`.
Output Fields	This section lists all fields from the table that you selected and filtered. You can perform the following actions. Field Management: Remove fields that you do not need to pass to downstream components. Remove One Field: Click the icon in the Actions column to remove a single field. Remove Multiple Fields: Click Field Management. In the Field Management dialog box, select multiple fields, click the left-arrow icon to move the selected input fields to the unselected input fields, and click OK. Bulk Add: Click Bulk Add to add fields in JSON, TEXT, or DDL format. Note: After you click OK, the bulk-added fields overwrite the existing field configuration. JSON format example: `// Example: [{ "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" }, { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }]` Note: index specifies the column index, name specifies the field name, and type specifies the field type after import. For example, `"index":3,"name":"user_id","type":"String"` means importing the fourth column from the file, with the field name user_id and field type String. TEXT format example: `// Example: 1,id,int(10),Long,comment1 2,user_name,varchar(255),String,comment2` The row delimiter separates the information for each field. The default is a line feed (\n). You can also use a semicolon (;) or period (.). The column delimiter separates field names from field types. The default is a comma (,). You can also use `','`. The field type is optional and defaults to `','`. DDL format example: `CREATE TABLE tablename ( user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR (255), created_on TIMESTAMP );` Add Output Field: Click +Add Output Field. Enter the Column, Type, and Comment, select a Mapping Type, and then click the icon to save the row.
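
The Generic Rule example for the Table parameter, `table_[001-100];table_102;`, can be read as a semicolon-separated list in which `[001-100]` stands for a zero-padded numeric range. The following Python sketch expands an expression under that assumed reading, which can help you predict what the Confirm Match Details dialog box will list. The function name `expand_generic_rule` is hypothetical, and the actual matching rules in Dataphin may differ.

```python
import re

# ASSUMPTION: "[a-b]" is an inclusive numeric range that keeps the zero padding of "a".
# Only the first range in each item is expanded; this is an illustration,
# not Dataphin's implementation.
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_generic_rule(expression: str) -> list[str]:
    """Expand an expression such as 'table_[001-100];table_102;' into table names."""
    names: list[str] = []
    for item in expression.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after a trailing semicolon
        match = _RANGE.search(item)
        if match is None:
            names.append(item)  # plain enumeration entry
            continue
        start, end = match.group(1), match.group(2)
        width = len(start)  # preserve zero padding, e.g. 001
        prefix, suffix = item[:match.start()], item[match.end():]
        for n in range(int(start), int(end) + 1):
            names.append(f"{prefix}{str(n).zfill(width)}{suffix}")
    return names


tables = expand_generic_rule("table_[001-100];table_102;")
print(len(tables), tables[0], tables[99], tables[-1])
```

Under this reading, the example expression covers `table_001` through `table_100` plus `table_102`, and skips `table_101`.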

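The Split Key description says that the system splits data between the minimum and maximum values of the chosen column. The following Python sketch illustrates one way such a split could work, assuming equal-width ranges over an integer key. The function `split_ranges` is hypothetical and is not Dataphin's implementation. It also shows why a split by value range can be uneven when the values are skewed.

```python
def split_ranges(min_value: int, max_value: int, parts: int) -> list[tuple[int, int]]:
    """Split [min_value, max_value] into up to `parts` inclusive ranges of near-equal width."""
    if parts < 1 or min_value > max_value:
        raise ValueError("need parts >= 1 and min_value <= max_value")
    total = max_value - min_value + 1
    parts = min(parts, total)  # never create empty ranges
    base, extra = divmod(total, parts)
    ranges = []
    start = min_value
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder over the first ranges
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# Each range could back one concurrent query, e.g. WHERE id BETWEEN lo AND hi.
ranges = split_ranges(1, 1000, 4)
print(ranges)

# Skewed keys: the ranges have equal width but hold very different row counts.
keys = [1, 2, 3, 4, 5, 1000]
counts = [sum(lo <= k <= hi for k in keys) for lo, hi in ranges]
print(counts)
```

In the skewed case, one range holds almost all the rows. Uneven time ranges in a date-time split key have the same effect.
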
Click OK to finish configuring the Amazon RDS for Oracle input component.
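
The JSON and TEXT formats accepted by Bulk Add in the Output Fields parameter carry the same attributes for each field. The following Python sketch converts TEXT rows into the JSON structure, assuming the column order index, name, type, mapType, comment shown in the examples and the default delimiters (line feed and comma). The function `parse_text_fields` is hypothetical and only illustrates how the two formats relate.

```python
import json

# ASSUMPTION: each TEXT row lists these attributes in this order,
# mirroring the keys of the JSON example.
KEYS = ("index", "name", "type", "mapType", "comment")


def parse_text_fields(text: str, row_delimiter: str = "\n", column_delimiter: str = ",") -> list[dict]:
    """Convert bulk-add TEXT rows into a list of field dictionaries.

    Limitation: a type that itself contains the column delimiter,
    such as number(10,2), is not handled by this simple split.
    """
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # ignore blank rows
        values = [value.strip() for value in row.split(column_delimiter)]
        field = dict(zip(KEYS, values))
        field["index"] = int(field["index"])  # JSON example uses a numeric index
        fields.append(field)
    return fields


text = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
fields = parse_text_fields(text)
print(json.dumps(fields, indent=2))
```

Passing `row_delimiter=";"` parses the same fields when they are written on one line and separated by semicolons.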