Configure GBase 8c Output Component

The GBase 8c output component writes data to a GBase 8c data source. When synchronizing data from other data sources to a GBase 8c data source, configure the source data first and then configure the target data source for the GBase 8c output component. This topic describes how to configure the GBase 8c output component.

Prerequisites

The GBase 8c data source is created. For more information, see .
The account used to configure the GBase 8c output component properties must have write-through permission for the data source. If you do not have this permission, request data source permission. For more information, see Request, renew, and return data source permissions.

Procedure

On the Dataphin homepage, in the top menu bar, select Development > Data Integration.
On the Integration page, in the top menu bar, select a project. In Dev-Prod mode, also select an environment.
In the navigation pane on the left, click Offline Integration, and then in the Offline Integration list, click the offline pipeline that you want to develop to open its configuration page.
Click Component Library in the upper-right corner of the page to open the Component Library panel.
In the left navigation pane of the Component Library panel, select Outputs. In the output component list on the right, locate the GBase 8c component and drag it to the canvas.
Click and drag the icon of the upstream component to connect it to the GBase 8c output component.
Click the icon in the GBase 8c output component card to open the GBase 8c Output Configuration dialog box.

In the GBase 8c Output Configuration dialog box, configure the following parameters.

Parameter		Description
Basic Settings	Step Name	The name of the GBase 8c output component. Dataphin automatically generates the step name, and you can modify it as needed. The name can contain only Chinese characters, letters, underscores (_), and digits, and cannot exceed 64 characters.
	Datasource	The data source drop-down list displays all GBase 8c data sources, including those for which you have write-through permission and those for which you do not. Click the icon to copy the current data source name. For data sources for which you do not have write-through permission, click Request after the data source to request write-through permission for the data source. For more information, see Request data source permissions. If you do not have a GBase 8c data source, click the New icon to create a data source. For more information, see .
	Schema (Optional)	Supports cross-schema table selection. Select the schema where the table is located. If not specified, the schema configured in the data source is used by default.
	Table	Select the target table for the output data. Enter a keyword to search for a table, or enter the exact table name and click Exact Search. After you select a table, the system automatically checks the table status. Click the icon to copy the name of the selected table. If no matching table is found, you can also enter a table name manually and integrate based on it. If the GBase 8c data source does not contain a target table for data synchronization, use the one-click table creation feature to generate one: 1. Click One-click Table Creation. Dataphin automatically generates the table-creation code, including the target table name (the source table name by default) and the field types (initially converted from the Dataphin field types). 2. Modify the table-creation SQL script as needed, and then click Create. After the table is created, Dataphin automatically uses it as the target table for the output data. Note: If a table with the same name already exists in the development environment, Dataphin reports an error indicating that the table already exists after you click Create.
	Production Table Missing Policy	The policy applied when the production table does not exist. Select Do Not Process or Automatic Creation. The default is Automatic Creation. Do Not Process: The production table is not created when the task is published. If the target table does not exist, a prompt is displayed when you submit the task, but you can still publish it. In this case, create the target table in the production environment before you run the task. Automatic Creation: A table with the same name is created in the target environment when the task is published. You must edit the table-creation statement (Edit The Table-creation Statement). By default, the statement is pre-filled with the table-creation statement of the selected table, and you can modify it. The table name in the statement uses the placeholder `${table_name}`, and only this placeholder is supported. During execution, the placeholder is replaced with the actual table name. If the target table does not exist, it is created based on the statement. If table creation fails, the publishing check fails. Modify the statement based on the error message, and then publish again. If the target table already exists, no table is created. Note: This option is supported only in Dev-Prod mode projects.
	Loading Policy	Select the policy for writing data to the target table. The following policies are available: Append data (insert into): If a primary key or constraint conflict occurs, a dirty data error is reported. Update on primary key conflict (on conflict do update set): If a primary key or constraint conflict occurs, the mapped fields of the existing record are updated.
	Write-through	Primary key update syntax is not an atomic operation. If the written data contains duplicate primary keys, enable write-through. Otherwise, use parallel write. Write-through performance is lower than parallel write. Note This option is supported only when the loading policy is set to Update on primary key conflict.
	Batch Write Data Volume (Optional)	The maximum amount of data written in one batch. The default is 32 MB. You can also set Number of Batch Write Records. A batch is written as soon as either of the two limits is reached.
	Number of Batch Write Records (Optional)	The maximum number of records written in one batch. The default is 2048 records. Data is synchronized in batches that are controlled by two parameters: Number of Batch Write Records and Batch Write Data Volume. When the accumulated data reaches either limit, the system considers the batch full and immediately writes it to the target. Set Batch Write Data Volume to 32 MB, and adjust the record limit based on the actual size of a single record. Typically, set a larger record limit to take full advantage of batch writing. For example, if a single record is approximately 1 KB and Batch Write Data Volume is set to 16 MB, set Number of Batch Write Records to a value greater than 16 MB divided by 1 KB (that is, greater than 16384 records), such as 20000. With this configuration, the data volume limit is always reached first, so a write is triggered each time the accumulated data reaches 16 MB.
	Preparation Statement (Optional)	The SQL script executed on the database before data import. For example, to ensure continuous service availability, create target table Target_A before the current step writes data, then write to Target_A. After the current step finishes writing data, rename Service_B (the table continuously providing service in the database) to Temp_C, then rename Target_A to Service_B, and finally delete Temp_C.
	End Statement (Optional)	The SQL script executed on the database after data import.
Field Mapping	Input Fields	Displays input fields based on the upstream output.
	Output Fields	Displays output fields. The following operations are supported: Field Management: Click Field Management to select output fields. Click the icon to move Selected Input Fields to Unselected Input Fields. Click the icon to move Unselected Input Fields to Selected Input Fields. Batch Add: Click Batch Add to configure fields in batches in JSON, TEXT, or DDL format. JSON format example: `// Example: [{ "name": "user_id", "type": "String" }, { "name": "user_name", "type": "String" }]` Note: `name` is the name of the field to import, and `type` is the field type after import. For example, `"name":"user_id","type":"String"` imports the field named user_id and sets its type to String. TEXT format example: `// Example: user_id,String user_name,String` The row delimiter separates the entries for individual fields. The default is a line feed (\n). Line feed (\n), semicolon (;), and period (.) are supported. The column delimiter separates the field name from the field type. The default is a comma (,). DDL format example: `CREATE TABLE tablename ( id INT PRIMARY KEY, name VARCHAR(50), age INT );` Create Output Field: Click + Create Output Field. Fill in Column and select Type as prompted on the page. After you configure the current row, click the icon to save.
	Mapping	Manually select field mappings based on the upstream input fields and the target table fields. Quick Mapping includes Row Mapping and Name Mapping. Name Mapping: Maps fields that have the same name. Row Mapping: Maps only fields that are in the same row. Use it when the field names of the source table and target table differ but the fields in corresponding rows need to be mapped.
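
The interaction between Batch Write Data Volume and Number of Batch Write Records described in the table can be checked with a short calculation. The following Python sketch is illustrative only and is not Dataphin's implementation: the function name `records_per_batch` is hypothetical, and it assumes that all records have the same size, that 1 KB equals 1024 bytes, and that a batch is written as soon as either limit is reached.

```python
from typing import Tuple

KB = 1024          # assumption: 1 KB = 1024 bytes
MB = 1024 * KB


def records_per_batch(record_bytes: int,
                      max_batch_bytes: int,
                      max_batch_records: int) -> Tuple[int, str]:
    """Return (records written per batch, name of the limit reached first).

    Hypothetical helper: assumes uniformly sized records.
    """
    # Records needed before the accumulated size reaches the volume limit
    # (ceiling division, because a partial record still counts as a record).
    records_to_reach_volume = -(-max_batch_bytes // record_bytes)
    if records_to_reach_volume <= max_batch_records:
        return records_to_reach_volume, "data volume"
    return max_batch_records, "record count"


# Example from the table: ~1 KB records, 16 MB volume limit, 20000-record limit.
print(records_per_batch(1 * KB, 16 * MB, 20000))  # (16384, 'data volume')

# Default settings (32 MB, 2048 records) with the same ~1 KB records.
print(records_per_batch(1 * KB, 32 * MB, 2048))   # (2048, 'record count')
```

Under these assumptions, the default record limit of 2048 is reached long before 32 MB accumulates for 1 KB records, which is why the table suggests raising the record limit when records are small.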

Click Confirm to complete the configuration of the GBase 8c output component.
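
To preview what a table-creation statement for the Automatic Creation policy looks like once the `${table_name}` placeholder is replaced, the substitution can be imitated locally. This is a minimal sketch under stated assumptions: how Dataphin performs the replacement internally is not described in this topic, the column definitions and the table name `orders_target` are made up for illustration, and Python's `string.Template` is used only because it happens to share the `${name}` syntax.

```python
from string import Template

# Hypothetical table-creation statement. Only ${table_name} is a placeholder;
# the columns are illustrative and not taken from a real table.
ddl_template = Template(
    "CREATE TABLE ${table_name} (\n"
    "    id INT PRIMARY KEY,\n"
    "    name VARCHAR(50),\n"
    "    age INT\n"
    ");"
)


def render_ddl(table_name: str) -> str:
    """Replace the ${table_name} placeholder with an actual table name."""
    return ddl_template.substitute(table_name=table_name)


# Preview the statement for a hypothetical target table.
print(render_ddl("orders_target"))
```

Rendering the statement this way before publishing makes it easier to spot syntax problems that would otherwise surface only as a failed publishing check.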