Configure Kudu input widget

Configure the Kudu input widget to read data from a Kudu data source into Dataphin for data integration and development.

Prerequisites

  • The Kudu data source is created. For more information, see .

  • The account that configures the Kudu input widget must have read-through permission on the data source. If you do not have this permission, request it. For more information, see Request data source permission.

Procedure

  1. On the top menu bar of the Dataphin home page, select Development > Data Integration.

  2. On the top menu bar of the integration page, select a project. In Dev-Prod mode, you must also select the environment.

  3. In the left-side navigation pane, click Batch Pipeline. In the Batch Pipeline list, click the batch pipeline that you want to develop to open its configuration page.

  4. Click Component Library in the upper right corner of the page to open the Component Library panel.

  5. In the left-side navigation pane of the Component Library panel, select Input, find the Kudu component in the input widget list on the right, and drag the component to the canvas.

  6. Click the image icon in the Kudu input widget card to open the Kudu Input Configuration dialog box.

  7. In the Kudu Input Configuration dialog box, configure the parameters.

    Parameter

    Description

    Basic configuration

    Step Name

    The name of the Kudu input widget. Dataphin automatically generates a step name, which you can modify as needed. The name must meet the following requirements:

    • Can only contain Chinese characters, letters, underscores (_), and numbers.

    • Cannot exceed 64 characters.

    Datasource

    The drop-down list shows all Kudu-type data sources, including those you have read-through permission on and those you do not. Click the copy icon to copy the current data source name.

    • For data sources that you do not have read-through permission on, click Request after the data source to request permission. For more information, see Request, renew, and return data source permission.

    • If no Kudu-type data source exists, click Create to create a data source. For more information, see .

    Table

    Select the table to read. Click the copy icon to copy the name of the currently selected table.

    Advanced configuration

    Batch Read Data Volume

    The amount of data read per batch. Default: 1 MB. The value must be greater than 0 and can have at most one decimal place.

    Input Filter

    Filter conditions for the input data. Supported operators: =, >, <, >=, <=, is not null, is null. Enclose each expression in double quotation marks ("") and separate expressions with spaces. Example: "id > 10" "name = dataphin".

    Connection Timeout

    Sets the AdminOperationTimeout value, which controls the timeout for operations such as createTable and deleteTable. Default: 30 s. Set to 0 to disable the timeout.

    Read Timeout

    Sets the defaultOperationTimeout value, which controls the timeout for sessions and scans. Default: 30 s. Set to 0 to disable the timeout.

    Output Fields

    Displays all fields that match the selected table and filter conditions. To exclude specific fields from downstream widgets, delete them:

    • Delete a single field: To delete a small number of fields, click the delete icon in the Operation column of each field.

    • Delete fields in batches: To delete many fields at once, click Field Management. In the Field Management dialog box, select multiple fields, click the left-arrow icon to move the selected input fields to the unselected input fields list, and then click Confirm.

  8. Click Confirm to complete the configuration of the Kudu input widget.
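
The value rules described in the parameter table (Step Name, Batch Read Data Volume, and Input Filter) can be checked before you enter them in the dialog box. The following Python sketch is illustrative only and is not part of Dataphin: the function names are hypothetical, the Unicode range used for "Chinese characters" is an assumption, and treating an empty step name as invalid is also an assumption. Dataphin's own validation may differ.

```python
import re

# Assumption: "Chinese characters" is approximated by the CJK Unified
# Ideographs range. An empty name is treated as invalid (also an assumption).
STEP_NAME_RE = re.compile(r"[\u4e00-\u9fffA-Za-z0-9_]{1,64}")

COMPARISON_OPERATORS = ("=", ">", "<", ">=", "<=")
NULL_CHECKS = ("is null", "is not null")


def is_valid_step_name(name):
    """Chinese characters, letters, underscores, digits; at most 64 characters."""
    return STEP_NAME_RE.fullmatch(name) is not None


def is_valid_batch_size_mb(value):
    """Greater than 0, with at most one decimal place (value given as text)."""
    if re.fullmatch(r"\d+(\.\d)?", value) is None:
        return False
    return float(value) > 0


def format_condition(field, operator, value=None):
    """Wrap one filter expression in double quotation marks."""
    if operator in NULL_CHECKS:
        return f'"{field} {operator}"'
    if operator in COMPARISON_OPERATORS and value is not None:
        return f'"{field} {operator} {value}"'
    raise ValueError(f"unsupported filter expression: {field} {operator}")


def build_input_filter(conditions):
    """Join quoted expressions with spaces, as the Input Filter field expects."""
    return " ".join(format_condition(*condition) for condition in conditions)


if __name__ == "__main__":
    print(is_valid_step_name("kudu_input_1"))
    print(is_valid_batch_size_mb("0.5"))
    print(build_input_filter([("id", ">", 10), ("name", "is not null")]))
```

The string returned by build_input_filter is what you would paste into the Input Filter field. The sketch only formats the expressions and does not verify that the referenced fields exist in the selected table.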