configure dolphindb output component

更新时间:
复制 MD 格式

The DolphinDB output component writes data to a DolphinDB data source. When you synchronize data from other data sources to a DolphinDB data source, you must configure the DolphinDB output component as the target after you configure the source data. This topic describes how to configure the DolphinDB output component.

Prerequisites

Procedure

  1. On the Dataphin home page, in the top menu bar, select Development > Data Integration.

  2. On the Integration page, in the top menu bar, select Project. If you are in Dev-Prod mode, select an environment.

  3. In the left navigation pane, click Offline Integration. Then, in the Offline Integration list, click the offline pipeline that you want to develop to open its configuration page.

  4. Click Component Library in the upper-right corner of the page to open the Component Library panel.

  5. In the navigation pane on the left of the Component Library panel, select Outputs. Find the DolphinDB component in the list of output components on the right and drag it to the canvas.

  6. Click and drag the image icon of the upstream component to connect it to the DolphinDB output component.

  7. Click the image icon on the DolphinDB output component to open the DolphinDB Output Configuration dialog box.

    image

  8. In the DolphinDB Output Configuration dialog box, configure the following parameters.

    Parameter

    Description

    Basic Settings

    Step Name

    The name of the DolphinDB output component. Dataphin automatically generates the step name. You can modify it as needed. Naming conventions are as follows:

    • Can only contain Chinese characters, letters, underscores (_), and numbers.

    • Length cannot exceed 64 characters.

    Datasource

    The data source drop-down list displays all DolphinDB data sources, including those with and without write-through permissions. Click the image icon to copy the current data source name.

    • For data sources without write-through permission, you can click Apply next to the data source to request write-through permission for the data source. For detailed steps, see Request Data Source Permission.

    • If you do not have a DolphinDB data source, click the dfag New icon to create one. For detailed steps, see Creating a DolphinDB Data Source.

    Database (Optional)

    Select the database where the table resides. If you leave this blank, the system uses the database specified during data source registration.

    Table

    Select the target table for output data. Search by entering a table name keyword, or enter the exact table name and click Exact Search. After you select a table, the system automatically checks its status. Click the image icon to copy the name of the selected table.

    If the DolphinDB data source does not have a target table for data synchronization, use the one-click table creation feature to quickly generate a target table. The detailed procedure is as follows:

    1. Click One-Click Table Creation. Dataphin automatically matches the code for creating the target table, including the target table name (defaults to the source table name), field types (initially converted based on Dataphin fields), and other information.

    2. Modify the SQL script for creating the target table as needed, then click Create. After the target table is created, Dataphin automatically uses the new table as the target table for output data.

      Note
      • If a table with the same name exists in the development environment, Dataphin reports an error after you click Create.

      • If no matching items are found, you can still integrate data based on a manually entered table name.

    Production Table Missing Policy

    The policy for handling a missing production table. Select Do Not Process or Automatic Creation. The default is Automatic Creation. If you select Do Not Process, the production table is not created when the node is published. If you select Automatic Creation, a table with the same name is created in the target environment when the node is published.

    • Do Not Process: If the target table does not exist, the system prompts that the target table is missing upon submission, but the node can still be published. You must manually create the target table in the production environment before running the node.

    • Automatic Creation: You must Edit Table Creation Statement. The system populates the table creation statement for the selected table by default, which you can adjust. Use the placeholder ${table_name} for the table name in the creation statement. Only this placeholder is supported; it is replaced with the actual table name during execution.

      If the target table does not exist, the system first creates it based on the creation statement. If table creation fails, the publishing check fails. Modify the creation statement based on the error message, then publish again. If the target table already exists, the system does not create it.

    Note

    This option is only supported in Dev-Prod mode projects.

    Loading Policy

    Select the policy for writing data to the target table. Loading Policy includes:

    • Append Data: If a primary key or constraint violation occurs, the system prompts a dirty data error.

    • Overwrite Data: If a primary key or constraint violation occurs, the system first deletes the original data, then inserts the entire new row.

    Matching Key

    Select one or more fields from the target table. The system overwrites existing data based on the selected fields.

    Note

    This option is only supported when the loading policy is set to Overwrite Data.

    Batch Write Data Volume (Optional)

    The amount of data written in a single operation. You can also set Batch Write Record Count. The system writes data based on whichever limit is reached first. The default is 32 MB.

    Batch Write Record Count (Optional)

    The default is 2048 records. Data synchronization uses a batch write strategy. Parameters include Batch Write Record Count and Batch Write Data Volume.

    • When the accumulated data reaches either of the set limits (data volume or record count), the system considers a batch full and immediately writes this batch of data to the target.

    • Set the batch write data volume to 32 MB. For the batch insert record count limit, adjust it flexibly based on the actual size of a single record. Typically, set a large value to fully leverage the benefits of batch writing. For example, if a single record is about 1 KB, set the batch insert byte size to 16 MB. Considering this, set the batch insert record count to a value greater than 16 MB divided by 1 KB (i.e., greater than 16384 records). Here, we assume it is set to 20000 records. With this configuration, the system triggers batch write operations based on the batch insert byte size. Each time the accumulated data reaches 16 MB, a write operation is performed.

    Preparation Statement (Optional)

    The SQL script executed on the database before data import.

    For example, to ensure continuous service availability, create target table Target_A before this step writes data. Write data to Target_A. After this step finishes writing data, rename Service_B (the continuously serving table in the database) to Temp_C, then rename Target_A to Service_B, and finally delete Temp_C.

    Completion Statement (Optional)

    The SQL script executed on the database after data import.

    Field Mapping

    Input Fields

    Displays input fields based on the upstream output.

    Output Fields

    Displays output fields. You can perform the following operations:

    • Field Management: Click Field Management to select output fields.

      image

      • Click the gaagag icon to move Selected Input Fields to Unselected Input Fields.

      • Click the agfag icon to move Unselected Input Fields to Selected Input Fields.

    • Batch Add: Click Batch Add to configure fields in JSON, TEXT, or DDL format.

      • Batch configure in JSON format. For example:

        // Example:
        [{
          "name": "user_id",
          "type": "String"
         },
         {
          "name": "user_name",
          "type": "String"
         }]
        Note

        name indicates the imported field name, and type indicates the field type after import. For example, "name":"user_id","type":"String" means to import the field named user_id and set its field type to String.

      • Batch configure in TEXT format. For example:

        // Example:
        user_id,String
        user_name,String
        • The row delimiter separates information for each field. The default is a line feed (\n). It supports line feed (\n), semicolon (;), and period (.).

        • The column delimiter separates the field name and field type. The default is a comma (,).

      • Batch configure in DDL format. For example:

        CREATE TABLE tablename (
            id INT PRIMARY KEY,
            name VARCHAR(50),
            age INT
        );
    • New Output Field: Click +New Output Field. Fill in Column and select Type as prompted on the page. After configuring the current row, click the image icon to save.

    Mapping

    Manually select field mappings based on upstream input and target table fields. Quick Mapping includes Row-based Mapping and Name-based Mapping.

    • Name-based Mapping: Maps fields with the same name.

    • Row-based Mapping: Maps fields where source and target table field names differ, but the data in corresponding rows needs mapping. Only maps fields in the same row.

  9. Click Confirm to complete the property configuration of the DolphinDB output component.