Configure a PolarDB output component

Last updated: 2026-06-23 11:39:51

The PolarDB output component writes data to a PolarDB data source. When synchronizing data from another source to PolarDB, configure this component as the destination.

Prerequisites

Procedure

  1. In the top menu bar of the Dataphin home page, choose Develop > Data Integration.

  2. On the Data Integration page, select a project. If you are in Dev-Prod mode, you must also select an environment.

  3. In the navigation pane on the left, click Batch Pipeline. In the Batch Pipeline list, click the batch pipeline that you want to edit to open its configuration page.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the navigation pane on the left of the Component Library panel, select Outputs. Find the PolarDB component in the list on the right and drag it to the canvas.

  6. Drag from the connection icon on a source, transform, or flow component to the PolarDB output component to connect them.

  7. Click the configuration icon on the PolarDB output component card to open the PolarDB Output Configuration dialog box.

  8. In the PolarDB Output Configuration dialog box, configure the parameters.

    Parameter

    Description

    Basic Settings

    Step Name

    The name of the PolarDB output component. Dataphin automatically generates a step name, which you can modify. The naming rules are:

    • Can contain only letters, underscores (_), and digits.

    • Cannot exceed 64 characters.
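The two naming rules above can be checked before a name is entered. The following is a minimal sketch, not Dataphin's own validation: `is_valid_step_name` is a hypothetical helper, and it assumes that "letters" means ASCII letters and that an empty name is rejected.

```python
import re

# Letters, digits, and underscores only; 1 to 64 characters.
# Assumption: "letters" means ASCII letters, and an empty name is invalid.
STEP_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,64}")


def is_valid_step_name(name: str) -> bool:
    """Return True if the name satisfies both naming rules."""
    return STEP_NAME_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_step_name("polardb_output_1")` returns `True`, while a name that contains a hyphen or is longer than 64 characters returns `False`.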

    Datasource

    The drop-down list shows all PolarDB data sources, both those for which you have the write-through permission and those for which you do not. Click the copy icon to copy the data source name.

    Time Zone

    Determines how the component processes time-formatted data. By default, the time zone is inherited from the selected data source and cannot be changed.

    Note

    For nodes created before V5.1.2, you can select Data Source Default Configurations or Channel Configuration Time Zone. By default, Channel Configuration Time Zone is selected.

    • Data Source Default Configurations: The default time zone of the selected data source.

    • Channel Configuration Time Zone: The time zone configured in Properties > Channel Configuration for the current integration node.

    Table

    The target table for the output data. Enter a keyword to search for a table, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks the table status. Click the copy icon to copy the table name.

    Loading Policy

    The policy for writing data to the target table. The Loading Policy options are:

    • Append Data: Appends data to the target table without modifying existing data. A dirty data error is reported if a primary key or constraint violation occurs.

    • Overwrite Data: If a primary key or constraint violation occurs, the system deletes the old row with the duplicate primary key and then inserts the new row.

    Note

    The loading policy does not take effect for the PostgreSQL protocol.
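The difference between the two loading policies can be reproduced locally. The sketch below uses an in-memory SQLite database as a stand-in for PolarDB, because SQLite's `REPLACE` has the same delete-then-insert behavior that Overwrite Data describes. The SQL that Dataphin actually issues is not stated in this topic, and the `load` helper and the `target` table are hypothetical.

```python
import sqlite3


def load(rows, policy):
    """Write rows into a fresh table and return (table contents, dirty rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO target VALUES (1, 'old')")  # pre-existing row
    # REPLACE deletes the conflicting row, then inserts the new one.
    verb = "REPLACE" if policy == "overwrite" else "INSERT"
    dirty = []
    for row in rows:
        try:
            conn.execute(f"{verb} INTO target VALUES (?, ?)", row)
        except sqlite3.IntegrityError:
            dirty.append(row)  # constraint violation: count the row as dirty data
    result = conn.execute("SELECT id, name FROM target ORDER BY id").fetchall()
    conn.close()
    return result, dirty
```

With the input `[(1, "new"), (2, "b")]`, the `"append"` policy keeps the existing row `(1, "old")` and reports `(1, "new")` as dirty, whereas `"overwrite"` leaves `(1, "new")` in the table and reports no dirty rows.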

    Batch Write Size (optional)

    The maximum data size per batch. Default: 32 MB. Works together with Batch Write Records: the system writes a batch when either limit is reached.

    Batch Write Records (optional)

    The maximum number of records per batch. Default: 2048. Works together with Batch Write Size to control batch boundaries.

    • A batch is written to the destination when the accumulated data reaches either the record limit or the size limit.

    • For best performance, set a large batch size. For example, if a single record is about 1 KB, you can set the batch size to 16 MB. Then, set the batch record count to a value greater than 16,384 (16 MB / 1 KB), such as 20,000. With this configuration, the system writes a batch each time the accumulated data reaches 16 MB.
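The interaction of the two limits can be sketched as a small generator. This is an illustration of the rule described above, not the component's implementation. `batches` is a hypothetical function, and it assumes that a batch is flushed as soon as a limit is reached rather than just before it would be exceeded.

```python
from typing import Iterable, Iterator, List


def batches(records: Iterable[bytes], max_records: int = 2048,
            max_bytes: int = 32 * 1024 * 1024) -> Iterator[List[bytes]]:
    """Yield a batch as soon as the record limit or the size limit is reached."""
    batch: List[bytes] = []
    size = 0
    for record in records:
        batch.append(record)
        size += len(record)
        if len(batch) >= max_records or size >= max_bytes:
            yield batch
            batch, size = [], 0
    if batch:  # flush whatever remains at the end of the input
        yield batch
```

With 1 KB records, `max_bytes` set to 16 MB, and `max_records` set to 20,000, the size limit is reached first, so each full batch holds 16,384 records. With the defaults, the 2048-record limit is reached first and each full batch is about 2 MB.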

    Prepare Statement (optional)

    An SQL script to run on the database before data import.

    For example, to ensure continuous service availability, you can create a temporary target table `Target_A` before the write step. The data is written to `Target_A`. After the write step is complete, you can rename the production table `Service_B` to `Temp_C`, rename `Target_A` to `Service_B`, and then delete `Temp_C`.
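The table swap described above can be tried end to end against an in-memory SQLite database. This is a sketch under assumptions: the column definitions are invented for illustration, the renames are placed in the Post Statement because they run after the write step, and `ALTER TABLE ... RENAME TO` is used because it is accepted by SQLite as well as by MySQL-compatible and PostgreSQL-compatible engines. Verify the statements against your own PolarDB edition before use.

```python
import sqlite3

# Assumed Prepare Statement: create the temporary target table.
PREPARE_SQL = "CREATE TABLE Target_A (id INTEGER PRIMARY KEY, name TEXT)"

# Assumed Post Statement: swap the tables, then drop the old data.
POST_SQL = [
    "ALTER TABLE Service_B RENAME TO Temp_C",
    "ALTER TABLE Target_A RENAME TO Service_B",
    "DROP TABLE Temp_C",
]


def run_swap(new_rows):
    """Simulate prepare, write, and post steps; return (rows, table names)."""
    conn = sqlite3.connect(":memory:")
    # Existing production table that keeps serving reads during the write.
    conn.execute("CREATE TABLE Service_B (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO Service_B VALUES (1, 'stale')")
    conn.execute(PREPARE_SQL)
    conn.executemany("INSERT INTO Target_A VALUES (?, ?)", new_rows)  # write step
    for statement in POST_SQL:
        conn.execute(statement)
    rows = conn.execute("SELECT id, name FROM Service_B ORDER BY id").fetchall()
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    conn.close()
    return rows, tables
```

After `run_swap([(1, "fresh"), (2, "data")])`, `Service_B` contains only the newly written rows and is the only table left.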

    Post Statement (optional)

    An SQL script to run on the database after data import.

    Field Mapping

    Input Fields

    The input fields from the upstream source.

    Output Fields

    The output fields. You can manage them using the following options:

    • Field Management: Click Field Management to select output fields.

      • To deselect fields, select them and click the icon that moves them from Selected Input Fields to Unselected Input Fields.

      • To select fields, select them and click the icon that moves them from Unselected Input Fields to Selected Input Fields.

    • Batch Add: Click Batch Add to configure fields in bulk using JSON, TEXT, or DDL format.

      • To configure in JSON format, use the following example:

        // Example:
        [
          {
            "name": "user_id",
            "type": "String"
          },
          {
            "name": "user_name",
            "type": "String"
          }
        ]

        Note

        `name` is the name of the field to import. `type` is the data type of the field after import. For example, "name":"user_id","type":"String" imports the field named `user_id` and sets its data type to String.

      • To configure in TEXT format, use the following example:

        // Example:
        user_id,String
        user_name,String

        • The row delimiter separates the information for each field. The default is a line feed (\n). Semicolons (;) and periods (.) are also supported.

        • The column delimiter separates the field name from the field type. The default is a comma (,).

      • To configure in DDL format, use the following example:

        CREATE TABLE tablename (
            id INT PRIMARY KEY,
            name VARCHAR(50),
            age INT
        );

    • Create an output field: Click +Create Output Field. Enter a Column name and select a Type as prompted. After you configure the row, click the save icon to save.
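The TEXT format accepted by Batch Add carries the same information as the JSON format, so one can be converted into the other. The following sketch shows how the row and column delimiters split the input. `parse_text_fields` is a hypothetical helper, and it assumes that blank rows and surrounding whitespace are ignored, which this topic does not specify.

```python
def parse_text_fields(text, row_delimiter="\n", column_delimiter=","):
    """Convert TEXT-format field definitions into the JSON-format structure."""
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # skip blank rows, such as one left by a trailing delimiter
        # Split once: the field name comes first, the field type second.
        # A row without a column delimiter raises ValueError.
        name, field_type = (part.strip() for part in row.split(column_delimiter, 1))
        fields.append({"name": name, "type": field_type})
    return fields
```

Calling `parse_text_fields("user_id,String\nuser_name,String")` returns the two field definitions shown in the JSON example. Passing `row_delimiter=";"` parses the same fields from `"user_id,String;user_name,String;"`.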

    Mapping

    Maps fields between the upstream source and the target table. The Mapping options are Same Row Mapping and Same Name Mapping.

    • Same Name Mapping: Maps fields that have the same name.

    • Same Row Mapping: Maps fields based on their row position. Use this when source and target field names differ, but their order is the same.
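The two mapping options differ only in how input and output fields are paired. The sketch below illustrates this with two hypothetical functions. It assumes that name matching is case-sensitive and that fields without a counterpart stay unmapped, neither of which is confirmed in this topic.

```python
def same_name_mapping(input_fields, output_fields):
    """Pair each input field with the output field that has the same name."""
    outputs = set(output_fields)
    return [(name, name) for name in input_fields if name in outputs]


def same_row_mapping(input_fields, output_fields):
    """Pair fields by row position; pairing stops at the shorter list."""
    return list(zip(input_fields, output_fields))
```

For input fields `["uid", "uname", "age"]` and output fields `["user_id", "user_name", "age"]`, Same Name Mapping pairs only `age`, while Same Row Mapping pairs all three fields in order.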

  9. Click Confirm to save the configuration of the PolarDB output component.
