Configure the GBase 8a Input Component
The GBase 8a input component reads data from a GBase 8a data source for syncing to a target. Configure the input component before configuring the target data source.
Prerequisites
-
A GBase 8a data source is created. For more information, see Create a GBase 8a Data Source.
-
Your account has sync-read permission on the data source. If you do not have this permission, request it. For more information, see Request Data Source Permissions.
Procedure
-
On the Dataphin homepage, in the top menu bar, click Develop > Data Integration.
-
On the Integration page, select a Project. In Dev-Prod mode, also select an environment.
-
In the left navigation pane, click Offline Integration. In the Offline Integration list, click the target pipeline to open its configuration page.
-
Click Component Library in the upper-right corner to open the Component Library panel.
-
In the Component Library panel, select Input, locate GBase 8a, and drag it onto the canvas.
-
Click the icon on the GBase 8a Input card to open the GBase 8a Input Configuration dialog box.
-
Configure the following parameters in the GBase 8a Input Configuration dialog box.
Parameter
Description
Step Name
Auto-generated name for this component. You can rename it. Naming rules:
-
Use only Chinese characters, letters, underscores (_), and digits.
-
Keep the name under 64 characters.
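The naming rules above can be checked before you type a name into the dialog box. The following is a minimal sketch, not Dataphin's own validation: the function name `is_valid_step_name` is hypothetical, "Chinese characters" is approximated by the CJK Unified Ideographs block, and "under 64 characters" is read literally as fewer than 64.

```python
import re

# Assumption: "Chinese characters" is approximated by the CJK Unified
# Ideographs block (U+4E00 to U+9FFF); Dataphin's own check may differ.
_ALLOWED = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]+")


def is_valid_step_name(name: str) -> bool:
    """Return True if the name uses only allowed characters and is under 64 characters."""
    # Assumption: "under 64" is interpreted strictly as fewer than 64 characters.
    return len(name) < 64 and _ALLOWED.fullmatch(name) is not None
```

For example, `is_valid_step_name("gbase8a_input_1")` passes, while a name containing a hyphen or a 64-character name is rejected.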
Datasource
Lists all GBase 8a data sources in Dataphin, including those you lack sync-read permission for. Click the icon to copy the data source name.
-
To request sync-read permission, see Request Data Source Permissions.
-
To create a data source, click Create Data Source. For more information, see Create a GBase 8a Data Source.
Source Table Count
Select Single Table or Multiple Tables:
-
Single Table: Sync data from one source table to one target table.
-
Multiple Tables: Sync data from multiple source tables into one target table using the union algorithm.
Table Matching Method
Only Generic Rule is supported.
Note: Available only when Source Table Count is set to Multiple Tables.
Table
Select the source table:
-
For Single Table: search by keyword or enter the exact name and click Exact Match. The system validates the table after selection. Click the icon to copy the table name.
-
For Multiple Tables, add tables as follows:
-
In the input box, enter an expression to filter tables with the same structure.
Supports enumerated, regex-like, and mixed formats. Example: table_[001-100];table_102.
-
Click Exact Match. In the Confirm Match Details dialog box, review the matched tables.
-
Click Confirm.
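To preview which table names an expression such as table_[001-100];table_102 is likely to cover, the expansion can be sketched offline. This sketch rests on assumptions that the page does not confirm: a bracketed `[start-end]` part is treated as an inclusive, zero-padded numeric range, and the semicolon separates independent entries. The function name `expand_table_expression` is hypothetical; the Confirm Match Details dialog box remains the authoritative result.

```python
import re


def expand_table_expression(expr: str) -> list[str]:
    """Expand an expression like 'table_[001-100];table_102' into table names."""
    names = []
    for part in expr.split(";"):  # assumption: ';' separates entries
        part = part.strip()
        if not part:
            continue
        match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", part)
        if match is None:
            # Plain enumerated name, kept as-is.
            names.append(part)
            continue
        prefix, start, end, suffix = match.groups()
        width = len(start)  # assumption: keep the zero padding of the start value
        for i in range(int(start), int(end) + 1):
            names.append(f"{prefix}{i:0{width}d}{suffix}")
    return names
```

Under these assumptions, the example expression expands to 101 names: table_001 through table_100, plus table_102.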
Shard Key (Optional)
Splits data by this field for concurrent reads. Use any source table column; primary keys or indexed columns perform best.
Important: For a date-time shard key, the full time range between the minimum and maximum values is split by brute force according to the concurrency. The resulting splits are not guaranteed to be even.
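The uneven split for date-time shard keys can be illustrated with a small sketch. It assumes that the range between the minimum and maximum values is cut into equal-width time intervals, one per concurrent reader; the actual splitting logic in Dataphin is not documented here, and the function names are hypothetical.

```python
from datetime import datetime


def split_time_range(lo: datetime, hi: datetime, concurrency: int):
    """Cut [lo, hi] into `concurrency` equal-width time intervals."""
    step = (hi - lo) / concurrency
    bounds = [lo + step * i for i in range(concurrency)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


def rows_per_range(timestamps, ranges):
    """Count how many timestamps fall into each interval."""
    counts = [0] * len(ranges)
    last = len(ranges) - 1
    for ts in timestamps:
        for i, (start, end) in enumerate(ranges):
            # Intervals are half-open, except the last one, which includes its end.
            if start <= ts < end or (i == last and ts == end):
                counts[i] += 1
                break
    return counts


# Eight rows clustered on the first day, one row four days later.
timestamps = [datetime(2021, 1, 1, hour) for hour in range(8)] + [datetime(2021, 1, 5)]
ranges = split_time_range(min(timestamps), max(timestamps), 4)
counts = rows_per_range(timestamps, ranges)
```

Each interval covers exactly one day, yet the first reader receives eight rows and two readers receive none. This is why a primary key or an evenly distributed indexed column tends to be a better shard key.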
Batch Read Size (Optional)
Records read per batch. Set a value (for example, 1024) to reduce database round trips and lower network latency.
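As a rough illustration of the effect of the batch size, the number of fetches can be estimated as follows. The sketch assumes one database round trip per batch, which is a simplification, and the function name `round_trips` is hypothetical.

```python
import math


def round_trips(total_rows: int, batch_size: int) -> int:
    """Estimate the number of fetches, assuming one round trip per batch."""
    return math.ceil(total_rows / batch_size)
```

Under this assumption, reading 1,000,000 rows takes 977 round trips with a batch size of 1024, compared with 1,000,000 round trips when rows are read one at a time.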
Input Filter (Optional)
Filter conditions for data extraction. Examples:
-
Static value: ds=20210101.
-
Variable parameter: ds=${bizdate}.
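The relationship between the two filter forms can be sketched with Python's standard `string.Template`, which uses the same `${name}` placeholder syntax. This is only an illustration of the substitution idea: Dataphin resolves variables such as bizdate itself at run time, and the function name `render_filter` is hypothetical.

```python
from string import Template


def render_filter(filter_expr: str, params: dict) -> str:
    """Replace ${name} placeholders in a filter expression with concrete values."""
    return Template(filter_expr).substitute(params)


rendered = render_filter("ds=${bizdate}", {"bizdate": "20210101"})
```

With bizdate set to 20210101, the variable filter renders to the same text as the static example, ds=20210101.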
Output Fields
Lists all fields from the selected table that match filter conditions. Options:
-
Manage fields: Remove unneeded fields:
-
Remove one field: Click the icon in the Actions column.
-
Remove multiple fields: Click Field Management. In the Field Management dialog box, select fields and click the left-shift icon to move them to the Unselected list. Click OK.
-
Batch add: Click Batch Add to add output fields in JSON, TEXT, or DDL format.
Note: After you click OK, batch-added fields overwrite existing field configurations.
-
JSON format example:
// Example:
[
  { "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" },
  { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }
]
Note: index indicates the column number, name is the field name after import, and type is the field type after import. For example, "index":3,"name":"user_id","type":"String" imports the fourth column with field name user_id and type String.
-
TEXT format example:
// Example:
1,id,int(10),Long,comment1
2,user_name,varchar(255),String,comment2
-
Row delimiter: line feed (\n) by default. Semicolon (;) and period (.) are also supported.
-
Column delimiter: comma (,) by default. You can also use ','. The field type is optional and defaults to ','.
-
DDL format example:
CREATE TABLE tablename (
    user_id serial,
    username VARCHAR(50),
    password VARCHAR(50),
    email VARCHAR(255),
    created_on TIMESTAMP
);
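The JSON and TEXT batch-add formats appear to carry the same five attributes per field. The following sketch converts TEXT rows into the JSON structure, assuming the TEXT columns follow the order index, name, type, mapType, comment, as the examples suggest. The function name `text_to_fields` is hypothetical, and the sketch does not handle field types that contain the column delimiter, such as decimal(10,2).

```python
import json

# Assumption: TEXT columns follow the same order as the JSON keys.
FIELD_KEYS = ("index", "name", "type", "mapType", "comment")


def text_to_fields(text: str, row_delim: str = "\n", col_delim: str = ",") -> list[dict]:
    """Convert TEXT-format rows into a list of JSON-style field definitions."""
    fields = []
    for row in text.split(row_delim):
        row = row.strip()
        if not row:
            continue
        values = [value.strip() for value in row.split(col_delim)]
        field = dict(zip(FIELD_KEYS, values))
        field["index"] = int(field["index"])  # index is numeric in the JSON format
        fields.append(field)
    return fields


sample = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
fields = text_to_fields(sample)
as_json = json.dumps(fields, indent=2)
```

The resulting `as_json` string has the same shape as the JSON format example and can be compared against it before pasting into the Batch Add dialog box.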
-
Add a new output field: Click + Add Output Field, fill in Column, Type, and Comment, select a Mapping Type, and click the icon to save.
-
Click OK to complete the GBase 8a Input Component configuration.