Configure the GaussDB (DWS) Output Component

The GaussDB (DWS) output component writes data to a GaussDB (DWS) data source. When syncing data from another data source to GaussDB (DWS), configure the target data source in the GaussDB (DWS) output component after you complete the source data configuration. This topic describes how to configure the GaussDB (DWS) output component.

Prerequisites

You have created a GaussDB (DWS) data source. For more information, see Create a GaussDB (DWS) data source.
The account used to configure the GaussDB (DWS) output component must have write-through permission for the data source. If you lack this permission, request it. For more information, see Request data source permissions.

Procedure

On the Dataphin homepage, in the top menu bar, choose R&D > Data Integration.
On the integration page, in the top menu bar, select a project. In Dev-Prod mode, also select an environment.
In the left navigation pane, click Offline Integration. In the Offline Integration list, click the target offline pipeline to open its configuration page.
In the upper-right corner, click Component Library to open the Component Library panel.
In the left navigation pane of the Component Library panel, select Output. In the output component list on the right, locate the GaussDB (DWS) component and drag it onto the canvas.
Click and drag the icon from the upstream input, transform, or flow component and connect it to the GaussDB (DWS) output component.
On the GaussDB (DWS) output component card, click the icon to open the GaussDB (DWS) output configuration dialog box.

In the GaussDB (DWS) output configuration dialog box, configure the following parameters.

Section	Parameter	Description
Basic Information	Step Name	The name of the GaussDB (DWS) output component. Dataphin auto-generates a step name, but you can modify it based on your business scenario. Follow these naming rules: Use only letters, digits, underscores (_), and Chinese characters. Do not exceed 64 characters.
	Datasource	The data source drop-down list shows all GaussDB (DWS) data sources, including those with and without write-through permission. Click the icon to copy the current data source name. For data sources without write-through permission, click Request next to the data source to apply for permission. For more information, see Request data source permissions. If you do not have a GaussDB (DWS) data source, click Create Data Source to create one. For more information, see Create a GaussDB (DWS) data source.
	Schema (optional)	Select a schema if the target table is in a schema other than the default one. If you do not specify a schema, Dataphin uses the schema configured in the data source.
	Table	Select the target table for output data. Enter a keyword to search for tables, or enter the exact table name and click Exact Search. After you select a table, Dataphin automatically checks its status. Click the icon to copy the selected table name. If the target table does not exist in the GaussDB (DWS) data source, use the one-click table creation feature to generate it: (1) Click One-Click Table Creation. Dataphin auto-generates the SQL code that creates the target table, including the table name (the source table name by default) and the field types (preliminarily converted from the Dataphin field types). (2) Modify the SQL script as needed, then click Create. After the table is created, Dataphin automatically sets it as the output target. Note: If a table with the same name already exists in the development environment, Dataphin returns an error when you click Create.
	Production Table Missing Policy	How Dataphin handles a target table that is missing in the production environment: Do Nothing or Automatic Creation. The default is Automatic Creation. Do Nothing: Dataphin skips table creation during task publishing. If the target table does not exist, Dataphin shows an error during submission but still allows publishing. You must manually create the table in the production environment before running the task. Automatic Creation: Dataphin creates a table with the same name in the target environment during publishing. Click Edit Table Creation Statement to adjust the auto-filled SQL. Use the placeholder `${table_name}` for the table name. This is the only supported format, and Dataphin replaces it with the actual table name at runtime. If the target table does not exist, Dataphin runs the table creation statement first. If creation fails, publishing fails, and you must fix the SQL based on the error message before republishing. If the table already exists, Dataphin skips creation. Note: This setting is available only for projects in Dev-Prod mode.
	Loading Policy	Choose the insert or copy strategy. Insert strategy: Uses the GaussDB (DWS) `insert into...values...` statement to write data. If a primary key or unique index conflict occurs, the conflicting row is treated as dirty data and is not written. Use this strategy by default. Copy strategy: Uses the GaussDB (DWS) `copy from` command to load data from standard input into a table. Conflicts are handled according to the Conflict Resolution Policy, which you must also configure: Error on Conflict or Overwrite on Conflict. Use this strategy only if you encounter performance issues. Important: The conflict resolution policy applies only to the copy strategy and only when the AnalyticDB for PostgreSQL kernel version is greater than 4.3. If the kernel version is less than 4.3 or unknown, use this policy with caution to avoid task failure.
	Batch Write Data Volume (optional)	The maximum data volume written in a single batch. You can also set Batch Write Record Count. Dataphin writes data as soon as either limit is reached. The default is 32 MB.
	Batch Write Record Count (optional)	The maximum number of records written in a single batch. The default is 2,048 records. During data sync, Dataphin accumulates records and writes the batch as soon as either limit is reached: Batch Write Record Count or Batch Write Data Volume. We recommend setting the batch write data volume to 32 MB and adjusting the record count to your record size so that each batch is as full as possible. For example, if each record is about 1 KB and the batch write data volume is 16 MB, one batch holds about 16,384 records (16 MB ÷ 1 KB). Setting the record count to a larger value, such as 20,000, ensures that Dataphin triggers writes based on the 16 MB volume limit.
	Preparation Statement (optional)	An SQL script executed before data import. For example, to maintain service availability, you might create a temporary table Target_A before writing data, write to Target_A, rename the live table Service_B to Temp_C, rename Target_A to Service_B, and finally delete Temp_C.
	Completion Statement (optional)	An SQL script executed after data import.
Field Mapping	Input Fields	Shows input fields based on the upstream component's output.
	Output Fields	Shows output fields. Click Field Management to select output fields. Click the icon to move a selected input field to the unselected input fields list. Click the icon to move an unselected input field to the selected input fields list.
	Mapping	Manually map upstream input fields to target table fields. Two shortcuts are available: Row-Based Mapping and Name-Based Mapping. Name-Based Mapping: Maps fields that have identical names. Row-Based Mapping: Maps fields by row position, which is useful when source and target field names differ. Only fields in the same row are mapped.
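
The difference between the two mapping modes in the Mapping row can be illustrated with a minimal Python sketch. It is not Dataphin code. The function names and field lists are hypothetical, and exact, case-sensitive name matching is an assumption, because this topic does not state how names are compared.

```python
from typing import Dict, List


def name_based_mapping(input_fields: List[str], output_fields: List[str]) -> Dict[str, str]:
    """Pair each input field with the output field of the identical name.

    Assumption: names must match exactly (case-sensitive).
    """
    targets = set(output_fields)
    return {name: name for name in input_fields if name in targets}


def row_based_mapping(input_fields: List[str], output_fields: List[str]) -> Dict[str, str]:
    """Pair fields that sit in the same row; surplus rows on either side stay unmapped."""
    return dict(zip(input_fields, output_fields))


# Hypothetical field lists, used only for illustration.
upstream = ["id", "user_name", "created"]
target = ["id", "name", "created", "updated"]

print(name_based_mapping(upstream, target))
# {'id': 'id', 'created': 'created'}
print(row_based_mapping(upstream, target))
# {'id': 'id', 'user_name': 'name', 'created': 'created'}
```

In this example, name-based mapping leaves `user_name` unmapped because the target has no field with that name, while row-based mapping pairs it with `name` because both are in the second row.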

Click Confirm to complete the configuration of the GaussDB (DWS) output component.
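
To see how Batch Write Data Volume and Batch Write Record Count interact, the following Python sketch simulates the "write when either limit is reached" rule described in the parameter table. It is only an illustration under stated assumptions. The function names are hypothetical, and flushing when a limit is reached or exceeded (`>=`) is an assumption, because this topic does not describe Dataphin's exact comparison.

```python
from typing import Iterable, Iterator, List

MB = 1024 * 1024


def batches(
    records: Iterable[bytes],
    max_bytes: int = 32 * MB,   # Batch Write Data Volume default from this topic
    max_records: int = 2048,    # Batch Write Record Count default from this topic
) -> Iterator[List[bytes]]:
    """Yield a batch as soon as either the volume limit or the record limit is reached."""
    batch: List[bytes] = []
    size = 0
    for record in records:
        batch.append(record)
        size += len(record)
        # Assumption: a batch is flushed when a limit is reached or exceeded.
        if size >= max_bytes or len(batch) >= max_records:
            yield batch
            batch, size = [], 0
    if batch:  # flush whatever is left at the end of the input
        yield batch


def records_per_volume(max_bytes: int, record_bytes: int) -> int:
    """Number of fixed-size records that fit into one volume-limited batch."""
    return max_bytes // record_bytes


# The example from the parameter table: 1 KB records, 16 MB volume, 20,000 records.
record = b"x" * 1024
print(records_per_volume(16 * MB, 1024))
# 16384
sizes = [len(b) for b in batches((record for _ in range(40000)), 16 * MB, 20000)]
print(sizes)
# [16384, 16384, 7232]
```

Because 16,384 records of 1 KB fill the 16 MB volume before the 20,000-record limit is reached, every full batch in this run is triggered by the volume limit. With the default values and the same 1 KB records, the 2,048-record limit is reached first, at roughly 2 MB per batch.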