Create Flink SQL tasks

Create Flink SQL tasks based on the Ververica Flink engine to process real-time and batch data.

Prerequisites

Before you begin, make sure that the project has enabled the real-time engine and configured the Ververica Flink compute source. For more information, see Create a general project.

Permissions

Only super administrators, project administrators, and developers can create Flink SQL compute tasks.

Step 1: Create a Flink SQL task

  1. In the top menu bar of the Dataphin home page, select Development > Data Development.

  2. In the top menu bar, select a project. If the project uses Dev-Prod mode, also select an environment.

  3. In the navigation pane on the left, select Data Processing > Compute Task. In the list of compute tasks on the right, click the image icon and select Flink SQL.

  4. In the Create Flink SQL Task dialog box, configure the parameters.

    Parameter

      Description

    Task Name

      The name must follow these rules:

      • It can contain only lowercase English letters, digits, and underscores (_).

      • It must start with an English letter.

      • It must be 4 to 63 characters in length.

      • It must be unique within the project.

    Production Environment Cluster

      Select the cluster for the Flink SQL task.

    Production Engine Version

      Select the engine version for running tasks in production.

      Note

      If the project is in Basic mode, this parameter is named Engine Version.

    Development Environment Cluster And Engine Version

      Select System Default Configuration or Custom Configuration.

      • System Default Configuration: The default option. Uses the same cluster and engine version as the production environment.

      • Custom Configuration: Manually select the cluster and engine version for running tasks in development.

      Note

      If the project is in Basic mode, this parameter does not apply.

    Storage Directory

      Select the directory for the task.

      If no directory exists, create a folder:

      1. Above the compute task list on the left side of the page, click the image icon to open the Create Folder dialog box.

      2. In the Create Folder dialog box, enter the folder Name and select the Directory location as needed.

      3. Click Confirm.

    Creation Method

      The following methods are supported: Create Blank, Reference Sample Code, and Use Template.

      • Create Blank: Creates an empty Flink SQL task.

      • Reference Sample Code: Creates a task from built-in sample code.

      • Use Template: Creates a task from a real-time computing task template.

    Description

      Enter a description of the Flink SQL task, up to 1,000 characters.

  5. Click OK.
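
The Task Name rules listed in the parameters can be checked before you open the dialog box, for example when task names are generated by a script. The following is a minimal sketch in Python that only restates those rules. The function name `check_task_name` and the `existing_names` argument are hypothetical and are not part of Dataphin; the dialog box remains the authority on whether a name is accepted.

```python
import re


def check_task_name(name, existing_names=()):
    """Return the list of naming rules that `name` violates (empty if none).

    Hypothetical helper: it mirrors the documented Task Name rules and
    does not call any Dataphin API.
    """
    problems = []
    # Rule: the name must be 4 to 63 characters in length.
    if not 4 <= len(name) <= 63:
        problems.append("must be 4 to 63 characters long")
    # Rule: only lowercase English letters, digits, and underscores.
    if re.fullmatch(r"[a-z0-9_]*", name) is None:
        problems.append("only lowercase letters, digits, and underscores are allowed")
    # Rule: the name must start with an English letter.
    if re.match(r"[a-z]", name) is None:
        problems.append("must start with a letter")
    # Rule: the name must be unique within the project. The caller supplies
    # the names already in use, because this sketch cannot look them up.
    if name in existing_names:
        problems.append("already used in this project")
    return problems


print(check_task_name("ods_orders_stream"))
print(check_task_name("1abc"))
```

An empty list means the name satisfies all four rules. A name such as `1abc` fails only the leading-letter rule.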

Step 2: Develop and precompile the Flink SQL node code

  1. On the Flink SQL node code page, write the node code.

    After writing the code, click Format in the menu bar to auto-format the SQL code.

  2. Click Precompile to check for syntax and permission issues.

    If precompilation is successful, a Precompilation Successful message appears. If it fails, a Precompilation Failed message appears. Click Console at the bottom of the page to view the failure log.

Step 3: Configure the Flink SQL job

  1. Click Configuration in the right sidebar of the current compute task.

  2. On the configuration panel, configure the settings for the Flink SQL node in Real-time mode and Offline mode.

    Note

    Dataphin real-time computing supports stream-batch integrated tasks on a unified compute engine. You can configure both stream and batch processing for the same code to generate instances in each mode. To enable batch processing, turn on offline mode on the task configuration page, and then configure resources, schedule dependencies, and other settings.

    • Real-time mode

    • Offline mode (Beta)

      • Schedule Configuration (Required): Defines the recurring schedule of the node in the production environment. Use the schedule properties to set the scheduling cycle and effective date. For configuration instructions, see Offline Mode Schedule Configuration.

      • Resource Configuration (Required): Configure the cluster, engine version, degree of parallelism, number of Task Managers, Job Manager Memory, and Task Manager Memory for the production and development environments. For configuration instructions, see Configure resources for Ververica Flink offline mode.

      • Runtime parameters: Control the execution behavior and performance of Flink applications by configuring runtime parameters. For configuration instructions, see Offline mode runtime parameter configuration.

      • Dependency files: Configure the resource files required by the Flink SQL task. For configuration instructions, see Offline mode dependency file configuration.

      • Dependency Relationships (Required): Configuring dependency relationships helps you quickly identify upstream and downstream tasks during troubleshooting. For more information, see Offline Mode Dependency Relationship Configuration.

  3. Click OK.
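
Offline mode has three sections marked as required (Schedule Configuration, Resource Configuration, and Dependency Relationships) and two optional ones. As a minimal sketch, assuming you keep a checklist of the sections you have filled in as a plain Python dictionary, the following shows one way to spot the required sections that are still missing. The dictionary layout and the function name are hypothetical; Dataphin does not expose the configuration in this form.

```python
# Section names are copied from the offline mode list above.
REQUIRED_SECTIONS = (
    "Schedule Configuration",
    "Resource Configuration",
    "Dependency Relationships",
)
OPTIONAL_SECTIONS = ("Runtime parameters", "Dependency files")


def missing_required_sections(config):
    """Return the required sections that are absent or empty in `config`.

    `config` is a hypothetical dict that maps a section name to whatever
    notes you keep for it. Optional sections are never reported.
    """
    return [name for name in REQUIRED_SECTIONS if not config.get(name)]


# Hypothetical draft: only the schedule has been filled in so far.
draft = {
    "Schedule Configuration": {"cycle": "daily"},
    "Runtime parameters": {},
}
print(missing_required_sections(draft))
```

For the draft above, the function reports Resource Configuration and Dependency Relationships, which are the two required sections still to be completed before you click OK.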

Step 4: Test the Flink SQL node code

  1. Test your Flink SQL code in Dataphin. Click Test in the top menu bar to sample data from the code node and run local tests to verify correctness.

  2. In the test configuration dialog box, select Real-time Pattern - FLINK Stream Node to test in real-time mode, or Offline Pattern - FLINK Batch Node to test in offline mode.

    • Real-time Pattern Testing: Samples data from the corresponding real-time physical tables and runs a local test in Flink Stream mode. For more information, see real-time pattern testing.

    • Offline Pattern Testing: Uses data from the corresponding offline physical tables and runs a local test in Flink Batch mode. For more information, see documentation.

Note

Only one mode can be tested at a time. After you select a mode, you can sample data from the tables of that mode for testing.

Step 5: Submit the Flink SQL job

  1. Click Submit in the top menu bar.

  2. In the Submit dialog box, review the Submission Content and Pre-check information, and fill in the Submission Remarks.

  3. Click Confirm And Submit.

    Note

    If the project uses Dev-Prod mode, you must also publish the Flink SQL node to the production environment. For detailed instructions, see here.

What to do next

In the Operation Center, view and manage Flink SQL nodes to ensure they run as expected. For more information, see View and manage real-time instances or View and manage real-time nodes.