Create a Flink DataStream task
This topic describes how to create a Flink DataStream task that runs on the Ververica Flink real-time engine.
Prerequisites
-
Ensure that the project has enabled the real-time engine and configured the Ververica Flink computing source. For more information, see Create a general project.
-
Ensure that you have uploaded the JAR package of the completed DataStream job to the Dataphin platform. For more information, see Upload resources and references.
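Before uploading, it can help to confirm the fully qualified name of the job's entry class, because the Class Name parameter in Step 1 asks for it. A JAR is a ZIP archive, so its class entries can be listed with the Python standard library. The following is a minimal sketch, not a Dataphin feature: the function name `list_class_names` and the file name in the usage note are hypothetical, and the sketch assumes a conventional JAR layout (it does not handle multi-release JARs specially).

```python
import zipfile


def list_class_names(jar_path):
    """Return the fully qualified names of top-level classes in a JAR.

    jar_path may be a file path or a binary file-like object.
    """
    names = []
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            # Keep only compiled classes; skip nested and anonymous
            # classes, whose file names contain '$'.
            if not entry.endswith(".class") or "$" in entry:
                continue
            stem = entry[: -len(".class")]
            # Skip descriptor files that are not real classes.
            if stem.rsplit("/", 1)[-1] in ("module-info", "package-info"):
                continue
            # Convert the archive path to a dotted class name.
            names.append(stem.replace("/", "."))
    return sorted(names)
```

For example, `list_class_names("my-datastream-job.jar")` (a hypothetical file name) returns the candidate class names; the one that defines the job's `main` method is the value to enter as Class Name.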
Permissions required
Only super administrators, project administrators, and developers can create Flink DataStream tasks.
Step 1: Create a Flink DataStream task
-
On the Dataphin home page, choose Development > Data Development from the top menu bar.
-
In the top menu bar, select a Project. In Dev-Prod mode, also select an Environment.
-
In the navigation pane on the left, select Data Processing > Compute Jobs. In the list of compute jobs, click the icon and select Flink DataStream.
-
In the Create Flink DataStream Task dialog box, configure the parameters.
Parameter
Description
Task Name
The task name must meet the following requirements:
-
It can contain only lowercase English letters, digits, and underscores (_).
-
It must be 4 to 63 characters long.
-
It must be unique within the project.
-
It must start with an English letter.
Production Environment Cluster
Select the cluster for the Flink DataStream task production environment.
Production Environment Engine Version
Select the engine version.
Development Environment Cluster And Engine Version
Select the resource queue and engine version for the Flink DataStream task development environment. Options: System Default Configuration and Custom Configuration. If you select Custom Configuration, configure the cluster and engine version for the development environment.
-
Development Environment Resource Cluster: Select the resource cluster on which the Flink DataStream task runs in the development environment.
-
Development Environment Engine Version: Select the version of the Ververica Flink engine.
Storage Directory
Select the directory where the task is stored.
If no directory exists, create a folder as follows:
-
Click the icon above the computing task list on the left side of the page to open the Create Folder dialog box.
-
In the Create Folder dialog box, enter a Name for the folder and choose its location in Select Directory.
-
Click Confirm.
Select Resource
Select the resource (the uploaded JAR package) that the Flink DataStream task depends on.
Class Name
Enter the fully qualified class name of the resource.
Description
Provide a brief description of the Flink DataStream task.
-
Click Confirm.
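The Task Name rules in the parameter table above can be checked locally before you open the dialog box. The following Python sketch encodes them as a regular expression. It is an illustration only: the function name `is_valid_task_name` and the `existing_names` argument are hypothetical, and Dataphin performs the authoritative check, including uniqueness within the project.

```python
import re

# Rules from the Task Name row:
#   - starts with an English letter (lowercase, since only lowercase is allowed)
#   - contains only lowercase letters, digits, and underscores
#   - is 4 to 63 characters long (1 leading letter + 3 to 62 more characters)
_TASK_NAME = re.compile(r"[a-z][a-z0-9_]{3,62}")


def is_valid_task_name(name, existing_names=()):
    """Return True if name satisfies the documented Task Name rules.

    existing_names is an optional collection of names already used in the
    project; it only approximates the platform's uniqueness check.
    """
    if _TASK_NAME.fullmatch(name) is None:
        return False
    return name not in existing_names
```

For example, `order_stream_v1` passes, while `abc` (too short), `1abc` (starts with a digit), and `Order_stream` (contains an uppercase letter) do not.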
Step 2: Precompile the Flink DataStream task code
Click Precompile in the top menu bar to check the task code for syntax and permission issues.
On success, a Precompilation Successful message appears. On failure, a Precompilation Failed message appears. To view the failure log, click Console at the bottom of the page.
Step 4: Configure the Flink DataStream task
-
Click Configure in the right sidebar of the current computing task.
-
In the configuration panel, configure the Real-time Mode for the Flink DataStream task.
Important: Flink DataStream tasks cannot run in offline mode.
-
Real-time Mode
-
Resource Configuration (Required): You must configure the resource queue, engine version, and both the Job Manager CPUs and Job Manager Memory for the task's production and development environments. For configuration instructions, see Configure resources for Ververica Flink real-time mode.
-
Variable Configuration: Variables can be defined dynamically in the code without prior declaration. The system automatically extracts them into the parameter list, where you can adjust types and set values. For configuration instructions, see Real-time mode variable configuration.
-
Checkpoint Configuration: Configure checkpoints for the Flink DataStream task to enable state recovery after unexpected failures. For configuration instructions, see Real-time mode Checkpoint configuration.
-
State Configuration: Set the interval for automatic data cleanup in the State. For configuration instructions, see Real-time mode State configuration.
-
Runtime Parameters: Configure runtime parameters to control the execution behavior and performance of Flink applications. For configuration instructions, see Real-time mode runtime parameter configuration.
-
Dependency Files: Configure the resource files that the task depends on. For configuration instructions, see Real-time mode dependency file configuration.
-
Dependencies: Configure dependencies to identify upstream and downstream tasks during debugging. For configuration instructions, see Real-time mode dependency configuration.
-
Click Confirm.
Step 5: Submit the Flink DataStream task
-
Click Submit in the top menu bar.
-
In the Submit dialog box, review the Submission Content and Pre-check details, and enter Submission Remarks.
-
Click Confirm And Submit.
If your project is in Dev-Prod mode, you must publish the Flink DataStream task to the production environment. For more information, see Manage release packages.
What to do next
View and manage the Flink DataStream task in the Operation Center to ensure that it runs as expected. For more information, see Manage real-time tasks.