Configure the DM input component

Last Updated: 2026-06-23 15:20:41

Configure the DM input component to read data from a DM data source into Dataphin for data integration and development.

Prerequisites

Procedure

  1. On the menu bar at the top of the Dataphin home page, choose Development > Data Integration.

  2. On the menu bar at the top of the Data Integration page, select a project. In Dev-Prod mode, also select an environment.

  3. In the navigation pane on the left, click Batch Pipeline. In the Batch Pipeline list, click the target batch pipeline to open its configuration page.

  4. In the upper-right corner of the page, click Component Library to open the Component Library panel.

  5. In the navigation pane of the Component Library panel, choose Input. Find the DM component in the list and drag it to the canvas.

  6. Click the icon on the DM input component card to open the DM Input Configuration dialog box.

  7. In the DM Input Configuration dialog box, configure the parameters.

    Parameter

    Description

    Step Name

    The name of the DM input component. Dataphin automatically generates a step name, which you can change. The naming conventions are as follows:

    • The name can contain only Chinese characters, letters, underscores (_), and digits.

    • The name cannot exceed 64 characters in length.

    Datasource

    The drop-down list displays all DM data sources in the current Dataphin project, including those for which you have read-through permission and those for which you do not. Click the copy icon to copy the data source name.

    Number of Source Tables

    Select whether to use a single table or multiple tables with the same schema as input. Valid values: Single Table and Multiple Tables.

    • Single Table: Syncs data from one source table to one destination table.

    • Multiple Tables: Syncs data from multiple source tables to the same destination table. The union algorithm is used to merge data from multiple tables into a single table.

      For more information about union, see INTERSECT, UNION, and EXCEPT.

    Table Match Method

    Select General Rule or Database Regex.

    Note

    This parameter is available only when you set Number of Source Tables to Multiple Tables.

    Table

    Select the source table or tables:

    • If you set Number of Source Tables to Single Table, enter a keyword to search for the table, or enter the exact table name and click Exact Match. After you select a table, the system automatically checks its status. Click the copy icon to copy the name of the selected table.

    • If you set Number of Source Tables to Multiple Tables, enter an expression to add tables based on the selected table match method.

      • If you select General Rule for Table Match Method: In the input box, enter a table expression to filter for tables with the same structure. The system supports enumerations, regular expression-like patterns, and a mix of both. For example, table_[001-100];table_102;.

      • If you select Database Regex for Table Match Method: In the input box, enter a regular expression that the current database supports. The system matches tables in the source database based on this expression. At runtime, the node re-evaluates the regex, so tables added since the last run that match the expression are included in the synchronization.

      After you enter the expression, click Exact Match to view the list of matched tables in the Confirm Match Details dialog box.
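To make the General Rule format concrete, the following Python sketch expands an expression such as table_[001-100];table_102; into explicit table names. It is only an illustration: it assumes that items are separated by semicolons, that a bracketed range is inclusive, and that numbers keep the zero-padding width of the lower bound. The helper name expand_table_expression is hypothetical and is not part of Dataphin, whose actual matching logic may differ.

```python
import re

# Hypothetical helper for illustration only; not a Dataphin API.
RANGE_PATTERN = re.compile(r"\[(\d+)-(\d+)\]")


def expand_table_expression(expression):
    """Expand 'table_[001-003];table_102;' into explicit table names.

    Assumptions: items are separated by semicolons, a [start-end] range is
    inclusive, and numbers are zero-padded to the width of 'start'.
    """
    tables = []
    for item in expression.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after a trailing semicolon
        match = RANGE_PATTERN.search(item)
        if match is None:
            tables.append(item)  # plain enumeration entry
            continue
        start, end = match.group(1), match.group(2)
        width = len(start)
        prefix, suffix = item[:match.start()], item[match.end():]
        for number in range(int(start), int(end) + 1):
            tables.append(prefix + str(number).zfill(width) + suffix)
    return tables
```

Under these assumptions, the example expression from the General Rule description yields 101 names: table_001 through table_100, followed by table_102.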

    Split Key (Optional)

    The system partitions data based on the configured split key. Use this parameter together with the concurrency parameter to enable concurrent reads. Select a column from the source table as the split key. For best performance, use a primary key or an indexed column.

    Important

    If you select a date and time type, the system identifies the maximum and minimum values and performs a rough split based on the total time range and concurrency. The splits are not guaranteed to be even.
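The idea of range-based splitting can be sketched in Python as follows. This is a simplified illustration, not Dataphin's actual algorithm: it assumes an integer split key, divides the span between the minimum and maximum values into contiguous sub-ranges, and turns each sub-range into a filter condition that one concurrent reader could use. The function names and the column name id are hypothetical.

```python
def split_ranges(min_value, max_value, concurrency):
    """Divide the inclusive range [min_value, max_value] into at most
    'concurrency' contiguous, non-overlapping inclusive sub-ranges."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    total = max_value - min_value + 1
    parts = min(concurrency, total)  # never create empty sub-ranges
    base, extra = divmod(total, parts)
    ranges = []
    low = min_value
    for i in range(parts):
        size = base + (1 if i < extra else 0)  # spread the remainder
        high = low + size - 1
        ranges.append((low, high))
        low = high + 1
    return ranges


def to_where_clauses(column, ranges):
    """Render one filter condition per sub-range."""
    return [f"{column} >= {low} AND {column} <= {high}" for low, high in ranges]


# Example: key values 1..10 read by 3 concurrent readers.
clauses = to_where_clauses("id", split_ranges(1, 10, 3))
```

Sub-ranges of equal width hold equal numbers of rows only when the key values are evenly distributed, which is one reason splits are not guaranteed to be even.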

    Batch Read Size (Optional)

    The number of records to read per batch. Reading in batches (for example, 1024 records at a time) instead of one record at a time reduces round trips to the data source, improves I/O efficiency, and lowers network overhead.
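The effect of a batch read size can be illustrated with the standard Python DB-API method fetchmany. The sketch below uses an in-memory SQLite table as a stand-in for a DM source table; the table name source_table and the helper read_in_batches are hypothetical, and this is not a description of how Dataphin reads data internally.

```python
import sqlite3


def read_in_batches(cursor, batch_size):
    """Yield lists of rows, fetching up to batch_size rows per call."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


# Stand-in source: an in-memory SQLite table with 2500 rows.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE source_table (id INTEGER PRIMARY KEY)")
connection.executemany(
    "INSERT INTO source_table (id) VALUES (?)",
    [(i,) for i in range(2500)],
)

cursor = connection.execute("SELECT id FROM source_table ORDER BY id")
# With a batch size of 1024, 2500 rows arrive in three fetches.
batch_sizes = [len(batch) for batch in read_in_batches(cursor, 1024)]
connection.close()
```

Here 2500 rows are retrieved in three fetch calls instead of 2500 single-row fetches.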

    Input Filter (Optional)

    A filter condition for input fields. For example, ds=${bizdate}. The Input Filter applies to the following scenarios:

    • Filtering a fixed portion of data.

    • Parameter-based filtering.
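As a rough illustration of parameter-based filtering, the Python sketch below replaces ${name} placeholders in a filter template and appends the result to a query as a WHERE condition. It assumes that the filter is applied as a WHERE condition and that ${bizdate} resolves to a date string such as 20260622; the helper names and the table name orders are hypothetical.

```python
import re

# Matches placeholders of the form ${name}.
PARAM_PATTERN = re.compile(r"\$\{(\w+)\}")


def render_filter(filter_template, params):
    """Replace every ${name} placeholder with its value from params."""
    def replace(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing value for parameter: {name}")
        return str(params[name])
    return PARAM_PATTERN.sub(replace, filter_template)


def build_query(table, filter_template, params):
    """Assumption: the rendered filter is used as a WHERE condition."""
    condition = render_filter(filter_template, params)
    return f"SELECT * FROM {table} WHERE {condition}"


# Parameter-based filter: the value changes with each scheduled run.
query = build_query("orders", "ds=${bizdate}", {"bizdate": "20260622"})
```

A fixed filter such as status = 1 contains no placeholders and passes through render_filter unchanged.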

    Output Fields

    Displays all fields from the selected tables that match the filter criteria. You can perform the following operations:

    • Field Management: To exclude certain fields from downstream components, delete them:

      • Single field deletion: To delete a small number of fields, click the delete icon in the Actions column of each field that you want to remove.

      • Batch field deletion: To delete many fields, click Field Management. In the Field Management dialog box, select multiple fields, click the left arrow icon to move the selected fields to the unselected list, and then click Confirm.

    • Batch Add: Click Batch Add to configure fields in a batch using JSON, TEXT, or DDL format.

      Note

      After you add fields in a batch and click Confirm, the existing field configuration is overwritten.

      • To configure in a batch using JSON format, for example:

        [
          {
            "index": 1,
            "name": "id",
            "type": "int(10)",
            "mapType": "Long",
            "comment": "comment1"
          },
          {
            "index": 2,
            "name": "user_name",
            "type": "varchar(255)",
            "mapType": "String",
            "comment": "comment2"
          }
        ]

        Note

        The `index` parameter specifies the column number of the object. The `name` parameter defines the field name, and the `type` parameter defines the field type after import. For example, "index":3,"name":"user_id","type":"String" indicates that the fourth column from the file is imported, with 'user_id' as the field name and 'String' as the field type.

      • To configure in a batch using TEXT format, for example:

        // Example:
        1,id,int(10),Long,comment1
        2,user_name,varchar(255),String,comment2

        • The row delimiter separates the information for each field. The default delimiter is a line feed (\n). You can also use a semicolon (;) or a period (.).

        • The column delimiter separates the field name from the field type. The default delimiter is a comma (,). The field type is optional.

      • To configure in a batch using DDL format, for example:

        CREATE TABLE tablename (
            user_id serial,
            username VARCHAR(50),
            password VARCHAR(50),
            email VARCHAR(255),
            created_on TIMESTAMP
        );

    • Add Output Field: Click +Add Output Field, enter the Column, Type, and Comment, and select the Mapping Type. After you configure the row, click the save icon to save it.
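The TEXT and JSON batch formats describe the same field attributes, so one can be derived from the other. The Python sketch below converts TEXT rows into the JSON structure. It assumes the column order index, name, type, mapType, comment, which is inferred from the examples rather than stated in this document, and the helper name text_to_fields is hypothetical.

```python
import json

# Assumed column order, inferred from the TEXT and JSON examples.
FIELD_KEYS = ["index", "name", "type", "mapType", "comment"]


def text_to_fields(text, row_delimiter="\n", column_delimiter=","):
    """Convert TEXT-format rows into a list of JSON-style field objects."""
    fields = []
    for row in text.split(row_delimiter):
        row = row.strip()
        if not row:
            continue  # ignore blank rows
        values = [value.strip() for value in row.split(column_delimiter)]
        field = dict(zip(FIELD_KEYS, values))
        field["index"] = int(field["index"])
        fields.append(field)
    return fields


text = "1,id,int(10),Long,comment1\n2,user_name,varchar(255),String,comment2"
fields = text_to_fields(text)
as_json = json.dumps(fields, indent=2)
```

Because this sketch splits on every column delimiter, a type that itself contains a comma, such as decimal(10,2), would be split incorrectly and would need special handling.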

  8. Click Confirm to complete the configuration of the DM input component properties.
