Configure the TDSQL for MySQL Input Component

更新时间:
复制 MD 格式

The TDSQL for MySQL input component reads data from a TDSQL for MySQL data source. To sync data from TDSQL for MySQL to another data source, configure the input component first, and then configure the destination data source.

Prerequisites

  • You have created a TDSQL for MySQL data source. For more information, see .

  • The account used to configure the TDSQL for MySQL input component must have read-through permission on the data source. If the account does not have this permission, request it. For more information, see Request Data Source Permissions.

Procedure

  1. On the Dataphin homepage, in the top menu bar, click Develop, and then click Data Integration.

  2. On the Integration page, in the top menu bar, select a Project. In Dev-Prod mode, also select an environment.

  3. In the left navigation pane, click Batch Pipeline. In the Batch Pipeline list, click the offline pipeline that you want to develop. The configuration page for the offline pipeline opens.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the left navigation pane of the Component Library panel, click Input. In the list of input components on the right, locate the TDSQL for MySQL component and drag it onto the canvas.

  6. Click the image icon in the TDSQL for MySQL input component card to open the TDSQL for MySQL Input Configuration dialog box.

  7. In the TDSQL for MySQL Input Configuration dialog box, configure the parameters.

    Parameter

    Description

    Step Name

    The name of the TDSQL for MySQL input component. Dataphin generates a default name that you can change. Naming rules:

    • Use only Chinese characters, letters, underscores (_), and digits.

    • Do not exceed 64 characters.

    Source Table Count

    Select the number of source tables. Options are Single Table and Multiple Tables:

    • Single Table: Syncs data from one source table to one destination table.

    • Multiple Tables: Syncs data from multiple source tables to one destination table. The system uses the UNION algorithm to merge the data.

      For more information about UNION, see .

    Datasource

    Lists all TDSQL for MySQL data sources in the current Dataphin instance, regardless of whether you have read-through permission. If the source table count is set to single table, you can select only one data source. You can click the image icon to copy the data source name. If the source table count is set to multiple tables, you can select multiple data sources that share the same time zone.

    Multiple Tables

    • For data sources where you lack read-through permission, click Request next to the data source name to request read-through permission. For more information, see Request Data Source Permissions.

    • If you do not have a TDSQL for MySQL data source, click Create Data Source in the drop-down list. For more information, see .

    Database (Optional)

    Select the database that contains the source table. If you leave this field blank, the system uses the database specified when registering the data source.

    If you select Multiple Tables, you can select multiple databases. Click the image icon to open the Database List dialog box and view all selected databases.

    Table Matching Method

    Select Generic Rule or Database Regex.

    Note

    This parameter is available only when you select Multiple Tables for Source Table Count.

    Table

    Select the source table:

    • If you select Single Table for Source Table Count, enter a keyword to search for the table name, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks the table status. Click the image icon to copy the name of the selected table.

    • If you select Multiple Tables for Source Table Count, enter an expression based on the table matching method:

      • Generic Rule: Enter a table expression in the input box to filter tables with the same structure. The system supports enumeration, regex-like syntax, and mixed formats. Example: table_[001-100];table_102;.

      • Database Regex: Enter a regex supported by the current database. The system matches tables in the target database using this regex. At runtime, the system dynamically matches new tables based on the regex.

      After entering the expression, click Exact Match to open the Confirm Match Details dialog box and view the list of matched tables.

    Shard Key (Optional)

    Partitions data based on the specified shard key column. Combine this with concurrency settings to enable concurrent reads. Any column from the source table can serve as the shard key. For best performance, use a primary key or indexed column.

    Important

    If you select a date-time type, the system partitions the full time range based on the concurrency setting. This method does not guarantee even distribution.

    Input Filter (Optional)

    Set filter conditions for the data to extract:

    • Static Value: Extracts data that matches the specified value. Example: ds=20211111.

    • Variable Parameter: Extracts a subset of data using a variable. Example: ds=${bizdate}.

    Output Fields

    Lists all fields from the selected table and filtered results. You can:

    • Manage Fields: Remove fields you do not need to pass to downstream components:

      • Remove One Field: Click the sgaga icon in the Actions column.

      • Remove Multiple Fields: Click Field Management. In the Field Management dialog box, select the fields to remove. Click the image left arrow icon to move them from Selected to Not Selected. Click OK.

        image..png

    • Batch Add: Click Batch Add to add fields in JSON, TEXT, or DDL format.

      Note

      After you click OK, the batch-added fields overwrite existing field configurations.

      • JSON format example:

        // Example:
          [{
             "index": 1,
             "name": "id",
             "type": "int(10)",
             "mapType": "Long",
             "comment": "comment1"
           },
           {
             "index": 2,
             "name": "user_name",
             "type": "varchar(255)",
             "mapType": "String",
             "comment": "comment2"
         }]
        Note

        The index specifies the column number, name specifies the field name after import, and type specifies the field type after import. For example, "index":3,"name":"user_id","type":"String" imports the fourth column with the field name user_id and the field type String.

      • TEXT format example:

        // Example:
        1,id,int(10),Long,comment1
        2,user_name,varchar(255),Long,comment2
        • The row delimiter separates field entries. Default: line feed (\n). Supported delimiters: \n, semicolon (;), and period (.).

        • The column delimiter separates field names and types. Default: comma (,). Supported delimiters: ','. The field type is optional and defaults to ','.

      • DDL format example:

        CREATE TABLE tablename (
        	user_id serial,
        	username VARCHAR(50),
        	password VARCHAR(50),
        	email VARCHAR (255),
        	created_on TIMESTAMP,
        );
    • Add Output Field: Click + Add Output Field. Enter values for Column, Type, and Comment. Select a Mapping Type. Click the image icon to save the row.

  8. Click OK to finish configuring the TDSQL for MySQL input component.