Configure the TDSQL for MySQL output component

更新时间:
复制 MD 格式

The TDSQL for MySQL output component writes data to a TDSQL for MySQL data source. When you sync data from other data sources to a TDSQL for MySQL data source, you must configure the target data source for the TDSQL for MySQL output component after you configure the source data source information. This topic describes how to configure the TDSQL for MySQL output component.

Prerequisites

  • You have created a TDSQL for MySQL data source. For more information, see .

  • The account that you use to configure the properties of the TDSQL for MySQL output component has write-through permission for the data source. If the account does not have this permission, request it. For more information, see Request, renew, and return data source permissions.

Procedure

  1. In the top menu bar of the Dataphin home page, select Develop > Data Integration.

  2. In the top menu bar of the integration page, select a Project. If you are in Dev-Prod mode, also select an Environment.

  3. In the navigation pane on the left, click Offline Integration. Then, from the Offline Integration list, click the target offline pipeline to open its configuration page.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the left navigation pane of the Component Library panel, select Outputs. In the output component list on the right, locate the TDSQL for MySQL component and drag it to the canvas.

  6. Click and drag the image icon of the target input, transform, or flow component and connect it to the TDSQL for MySQL output component.

  7. Click the image icon on the TDSQL for MySQL output component card to open the TDSQL for MySQL Output Configuration dialog box.image

  8. In the TDSQL for MySQL Output Configuration dialog box, configure the parameters.

    Parameter

    Description

    Basic Settings

    Step Name

    The name of the TDSQL for MySQL output component. Dataphin automatically generates a step name. You can also change it as needed. The naming convention is as follows:

    • The name can contain only Chinese characters, letters, underscores (_), and digits.

    • The name cannot exceed 64 characters in length.

    Datasource

    The drop-down list displays all TDSQL for MySQL data sources, including those for which you have write-through permission and those for which you do not. Click the image icon to copy the current data source name.

    • For a data source for which you do not have write-through permission, click Request after the data source to request the permission. For more information, see Request, renew, and return data source permissions.

    • If you do not have a TDSQL for MySQL data source, click Create Data Source to create one. For more information, see .

    Library (Optional)

    Select the database where the table is located. If you leave this blank, the database specified during data source registration is used.

    Table

    Select the target table for data output. You can enter a keyword to search for a table, or enter the exact table name and click Exact Search. After you select a table, the system automatically checks its status. Click the image icon to copy the name of the selected table.

    If the target table for data synchronization does not exist in the TDSQL for MySQL data source, you can use the one-click table creation feature to quickly generate it. The procedure is as follows:

    1. Click One-click Table Creation. Dataphin automatically generates the code to create the target table, including the target table name (which defaults to the source table name) and field types (which are initially converted based on Dataphin fields).

    2. Modify the SQL script for creating the target table as needed, and then click Create. After the target table is created, Dataphin automatically uses it as the target table for data output.

      Note
      • If a table with the same name exists in the development environment, Dataphin reports an error that the table already exists when you click Create.

      • If there are no matching items, you can still perform integration based on a manually entered table name.

    Policy for missing production table

    The policy for handling a non-existent production table. You can select Do not handle or Automatic creation. The default value is Automatic creation. If you select Do not handle, the production table is not created when the node is published. If you select Automatic creation, a table with the same name is created in the target environment when the node is published.

    • Do not handle: If the target table does not exist, a message is displayed during submission, but the node can still be published. In this case, you must manually create the target table in the production environment before you can execute the node.

    • Automatic creation: You must edit the table creation statement. The statement for the selected table is filled in by default, and you can modify it. The table name in the statement uses the placeholder ${table_name}. Only this placeholder is supported. It will be replaced with the actual table name during execution.

      If the target table does not exist, it is first created according to the table creation statement. If the creation fails, the check result upon publishing is failed. You can modify the statement based on the error message and publish again. If the target table already exists, the creation statement is not executed.

    Note

    This parameter is supported only in projects that use the Dev-Prod mode.

    Loading Policy

    Select a policy for writing data to the target table. Loading Policy includes:

    • Append data (insert into): Appends data to the existing data in the target table without modifying historical data. A dirty data error is reported in case of a primary key or constraint violation.

    • Overwrite on primary key conflict (replace into): In case of a primary key or constraint violation, the old data in the entire row with the duplicate primary key is deleted before the new data is inserted.

    • Update on primary key conflict (on duplicate key update): In case of a primary key or constraint violation, the data in the mapped fields of the existing record is updated.

    Batch write data volume (optional)

    The amount of data written at a time. You can also set Batch write record count. The system writes data when either of the two configured limits is reached. The default value is 32 MB.

    Batch write record count (optional)

    The default value is 2,048. Data synchronization uses a batch write strategy. The parameters include Batch write record count and Batch write data volume.

    • When the accumulated data reaches either of the set limits (the data volume or record count for a batch write), the system considers a batch full and immediately writes it to the destination.

    • Set the batch write data volume to 32 MB. You can flexibly adjust the maximum number of records for a bulk insert based on the actual size of a single record. This is usually set to a large value to fully leverage the benefits of batch writing. For example, if a single record is about 1 KB, you can set the bulk insert size to 16 MB. Then, set the bulk insert record count to a value greater than 16 MB divided by 1 KB (which is 16,384). For example, set it to 20,000. With this configuration, the system triggers a batch write operation based on the bulk insert size. A write operation is performed each time the accumulated data volume reaches 16 MB.

    Pre-job statement (optional)

    An SQL script that is executed on the database before data import.

    For example, to ensure continuous service availability, you can create a target table Target_A before the current step writes data. Then, write data to Target_A. After the write is complete, rename the service table Service_B to Temp_C. Then, rename Target_A to Service_B. Finally, delete Temp_C.

    Post-job statement (optional)

    An SQL script that is executed on the database after data import.

    Field Mapping

    Input Fields

    The input fields are displayed based on the output of the upstream component.

    Output Fields

    The output fields are displayed. You can perform the following operations:

    • Field Management: Click Field Management to select output fields.

      image

      • Click the gaagag icon to move fields from Selected Input Fields to Unselected Input Fields.

      • Click the agfag icon to move fields from Unselected Input Fields to Selected Input Fields.

    • Batch Add: Click Batch Add to configure fields in JSON, TEXT, or DDL format.

      • Configure in JSON format. Example:

        // Example:
        [{
          "name": "user_id",
          "type": "String"
         },
         {
          "name": "user_name",
          "type": "String"
         }]
        Note

        name specifies the name of the field to import, and type specifies the data type of the field after it is imported. For example, "name":"user_id","type":"String" imports the field named user_id and sets its data type to String.

      • Configure in TEXT format. Example:

        // Example:
        user_id,String
        user_name,String
        • The row delimiter separates the information for each field. The default is a line feed (\n). Semicolons (;) and periods (.) are also supported.

        • The column delimiter separates the field name and field type. The default is a comma (,).

      • Configure in DDL format. Example:

        CREATE TABLE tablename (
            id INT PRIMARY KEY,
            name VARCHAR(50),
            age INT
        );
    • Create Output Field: Click + Create Output Field, enter a Column name, and select a Type. After you configure the current row, click the image icon to save.

    Mapping

    You can manually map fields based on the upstream input and the target table fields. Mapping includes Same-row mapping and Same-name mapping.

    • Same-name mapping: Maps fields that have the same name.

    • Same-row mapping: Maps data between fields in the same row, even if the field names in the source and target tables are different.

  9. Click Confirm to complete the property configuration for the TDSQL for MySQL output component.