Configure the Oracle Input Component

The Oracle input component reads data from an Oracle data source. To sync Oracle data to another data source, first configure the Oracle input component as the source, and then configure the destination component.

Prerequisites

You have created an Oracle data source. For more information, see Create an Oracle Data Source.
The account used to configure the Oracle input component must have sync-read permission on the data source. If the account does not have this permission, request it. For more information, see Request Data Source Permissions.

Procedure

On the Dataphin homepage, in the top menu bar, click Develop, and then click Data Integration.
On the Data Integration page, in the top menu bar, select a Project. If you are using Dev-Prod mode, also select an environment.
In the left navigation pane, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop. The offline pipeline configuration page opens.
In the upper-right corner of the page, click Component Library to open the Component Library panel.
In the left navigation pane of the Component Library panel, click Input. In the input component list on the right, locate the Oracle component and drag it onto the canvas.
On the Oracle input component card, click the icon to open the Oracle Input Configuration dialog box.

In the Oracle Input Configuration dialog box, configure the following parameters.

Parameter	Description
Step Name	The name of the Oracle input component. Dataphin generates a default name that you can change. Naming rules: The name can contain only Chinese characters, letters, digits, and underscores (_), and can be up to 64 characters long.
Datasource	Lists all Oracle data sources in Dataphin, including those you have sync-read permission for and those you do not. Click the icon to copy the data source name. If you do not have sync-read permission for a data source, click Request next to the data source to request sync-read permission. For more information, see Request Data Source Permissions. If you do not have an Oracle data source, click Create Data Source to create one. For more information, see Create an Oracle Data Source.
Time Zone	Dataphin processes time-formatted data based on the time zone configured for the selected data source. You cannot change this setting. Note: For tasks created before V5.1.2, you can select Data Source Default Configuration or Channel Configuration Time Zone. The default is Channel Configuration Time Zone. Data Source Default Configuration: The default time zone of the selected data source. Channel Configuration Time Zone: The time zone configured for the current integration task in Properties > Channel Configuration.
Schema (optional)	Select the schema where the table resides. Cross-schema table selection is supported. If not specified, Dataphin uses the schema configured for the data source.
Source Table Count	Select the number of source tables. Options are Single Table and Multiple Tables: Single Table: Use this option when you sync business data from one source table to one destination table. Multiple Tables: Use this option when you sync business data from multiple source tables to one destination table. When writing data from multiple tables into one destination table, Dataphin uses the union algorithm. For more information about union, see INTERSECT, UNION, and EXCEPT.
Table Matching Method	Select Generic Rule or Database Regex. Note: This parameter is available only when you select Multiple Tables for Source Table Count.
Table	Select the source table. Single Table: Enter a keyword to search for the table, or enter the exact table name and click Exact Match. After you select a table, Dataphin automatically checks its status. Click the icon to copy the selected table name. Multiple Tables: Enter an expression based on the table matching method. Generic Rule: Enter an expression to filter tables that have the same structure. Dataphin supports enumeration, regex-like patterns, and a mix of both. For example: `table_[001-100];table_102;`. Database Regex: Enter a regular expression supported by your database. Dataphin uses this pattern to match tables in the source database, and while the task runs it dynamically matches any new tables that fit the pattern. After you enter the expression, click Exact Match. In the Confirm Match Details dialog box, view the list of matched tables.
Split Key (optional)	Specify the column that Dataphin uses to split the data for concurrent reads. The split key works together with the concurrency setting. Any column from the source table can serve as the split key. For best performance, use a primary key or an indexed column. Important: If you select a date-time column, Dataphin splits the full time range between the minimum and maximum values by brute force. This method does not guarantee an even distribution.
Batch Read Size (optional)	The number of records to read per batch. Setting a batch size (for example, 1024) reduces database interactions, improves I/O efficiency, and lowers network latency.
Codec (optional)	Select the codec for reading data: UTF-8, GBK, or ISO-8859-1.
Input Filter (optional)	Filter the data to extract. Configuration details: Static value: Extract specific data. For example: `ds=20211111`. Variable parameter: Extract part of the data. For example: `ds=${bizdate}`.
Output Fields	Lists all fields from the selected table, filtered by your input filter. You can perform the following actions: Field management: Remove fields that you do not need to pass to downstream components. To remove a single field, click the icon in the Actions column. To remove many fields at once, click Field Management, select the fields in the Field Management dialog box, click the left-shift icon to move them to the unselected list, and then click OK. Batch add: Click Batch Add to configure output fields in JSON, TEXT, or DDL format. Note: After you click OK, the batch configuration overwrites the existing field configuration. JSON format example: `[{ "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" }, { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }]` Note: index is the column number, name is the field name after import, and type is the field type after import. For example, `"index":3,"name":"user_id","type":"String"` imports column 4 from the file as user_id with type String. TEXT format example (one field per row): `1,id,int(10),Long,comment1` `2,user_name,varchar(255),String,comment2` The row delimiter separates field entries. The default is a line feed (\n). You can also use a semicolon (;) or a period (.). The column delimiter separates field names and types. The default is a comma (,). You can use `','`. Field type is optional and defaults to `','`. DDL format example: `CREATE TABLE tablename ( user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR(255), created_on TIMESTAMP );` Create a new output field: Click + Create Output Field, enter the Column, Type, and Comment, select a Mapping Type, and then click the icon to save the row.
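
The TEXT and JSON batch formats in the Output Fields row carry the same five values per field. As an illustration only, the following Python sketch converts TEXT-format rows into the JSON structure shown in the example, which can help when you prepare a batch configuration outside Dataphin. It assumes the default delimiters (line feed between rows, comma between values). The helper name `text_to_json_fields` is hypothetical and is not part of Dataphin.

```python
import json

# Hypothetical helper, not a Dataphin API: converts TEXT-format field
# definitions into the JSON structure shown in the Batch Add example.
# Assumes the default delimiters: line feed between rows, comma between values.
KEYS = ["index", "name", "type", "mapType", "comment"]

def text_to_json_fields(text, row_delim="\n", col_delim=","):
    fields = []
    for row in text.split(row_delim):
        row = row.strip()
        if not row:
            continue  # skip blank rows
        values = [v.strip() for v in row.split(col_delim)]
        field = dict(zip(KEYS, values))
        field["index"] = int(field["index"])  # index is the column number
        fields.append(field)
    return fields

sample = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
print(json.dumps(text_to_json_fields(sample), indent=2))
```

This sketch splits on every column delimiter, so a type that itself contains a comma, such as `NUMBER(10,2)`, would need extra handling.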

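The Split Key row warns that splitting between the minimum and maximum values does not guarantee an even distribution. The following Python sketch shows why, under the assumption that the key range is cut into equal-width sub-ranges, one per concurrent reader. Dataphin's actual splitting logic is not described in this topic, so the functions `split_ranges` and `rows_per_range` are hypothetical and serve only to illustrate the effect.

```python
# Hypothetical illustration, not Dataphin's actual splitting code.
def split_ranges(min_val, max_val, parts):
    """Divide the integer range [min_val, max_val] into up to `parts`
    contiguous sub-ranges of near-equal width."""
    total = max_val - min_val + 1
    base, extra = divmod(total, parts)
    ranges = []
    start = min_val
    for i in range(parts):
        width = base + (1 if i < extra else 0)
        if width == 0:
            break  # fewer distinct values than requested parts
        ranges.append((start, start + width - 1))
        start += width
    return ranges

def rows_per_range(keys, ranges):
    """Count how many key values fall into each sub-range."""
    return [sum(lo <= k <= hi for k in keys) for lo, hi in ranges]

# Skewed keys: nine small values and one large outlier.
keys = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
ranges = split_ranges(min(keys), max(keys), 4)
print(ranges)                        # four equal-width sub-ranges
print(rows_per_range(keys, ranges))  # most rows land in the first one
```

With these sample keys, one reader receives nine of the ten rows while two readers receive none. A densely populated primary key or indexed column tends to avoid this kind of skew.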
Click OK to complete the configuration of the Oracle input component.
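
To preview which table names a Generic Rule expression such as `table_[001-100];table_102;` might cover, a sketch like the following can help. It rests on assumptions that this topic does not confirm: semicolons separate entries, and `[start-end]` expands to a zero-padded numeric range whose width matches the start value. The function `expand_generic_rule` is hypothetical. Use the Confirm Match Details dialog box in Dataphin to see the tables that are actually matched.

```python
import re

# Assumed syntax: one bracketed numeric range per entry, e.g. table_[001-100].
RANGE = re.compile(r"\[(\d+)-(\d+)\]")

def expand_generic_rule(expr):
    """Expand an expression into a list of candidate table names (illustrative only)."""
    names = []
    for item in expr.split(";"):
        item = item.strip()
        if not item:
            continue  # ignore the empty entry after a trailing semicolon
        m = RANGE.search(item)
        if m is None:
            names.append(item)  # plain enumeration entry
            continue
        start, end = m.group(1), m.group(2)
        width = len(start)  # keep the zero padding of the start value
        for n in range(int(start), int(end) + 1):
            names.append(item[:m.start()] + str(n).zfill(width) + item[m.end():])
    return names

tables = expand_generic_rule("table_[001-100];table_102;")
print(len(tables), tables[0], tables[99], tables[-1])
```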