This topic guides you through creating, deploying, and starting a Flink SQL job, outlining the basic development and operational workflow.
Prerequisites
-
The RAM user or RAM role you use has the required permissions for the Realtime Compute console. For more information, see permissions.
-
A Flink workspace is created. For more information, see Activate Realtime Compute for Apache Flink.
Step 1: Create an SQL draft
-
Go to the SQL draft creation page.
-
Log on to the Realtime Compute console.
-
Find the target Flink workspace and click Console in the Actions column.
-
In the navigation pane, click .
-
-
Click the
icon, and then click New Blank Stream Draft. Enter a File Name and select an Engine Version.Realtime Compute for Apache Flink provides a variety of code templates and data synchronization templates. Each template includes use case descriptions, code examples, and instructions. You can click a template to quickly learn about product features and syntax to implement your business logic. For more information, see Code templates and Data synchronization templates.
Parameter
Description
Example
File Name
The name of the SQL draft.
NoteThe name must be unique within the current project.
flink-test
Engine Version
The Flink engine version for the SQL draft.
We recommend using versions with the Recommend or Stable tag for better reliability and performance. For more information about engine versions, see Release notes and Engine versions.
vvr-8.0.8-flink-1.17
-
Click Create.
Step 2: Write SQL and view draft configurations
-
Write SQL code.
Copy the following SQL into the editor. This example uses a Datagen connector to generate a random data stream and a Print connector to write the output to the console logs. For more information about supported connectors, see Supported connectors.
-- Create a temporary source table named datagen_source. CREATE TEMPORARY TABLE datagen_source( randstr VARCHAR ) WITH ( 'connector' = 'datagen' -- Use the Datagen connector. ); -- Create a temporary sink table named print_table. CREATE TEMPORARY TABLE print_table( randstr VARCHAR ) WITH ( 'connector' = 'print', -- Use the Print connector. 'logger' = 'true' -- Write the output to logs. ); -- Select a substring from the randstr field and insert it into the sink table. INSERT INTO print_table SELECT SUBSTRING(randstr,0,8) from datagen_source;Note-
This example uses an
INSERT INTOstatement to write data to a single sink table. You can also use theINSERT INTOstatement to write data to multiple sink tables. For more information, see INSERT INTO statement. -
In production, use tables registered in Catalogs instead of temporary tables. For more information, see Catalogs.
-
-
View draft configurations.
On the right of the SQL editor, you can view or configure settings on several tabs.
Tab
Description
More configurations
-
Engine version: For more information, see Engine versions and Lifecycle policies. We recommend that you use a recommended or stable version. The version tags are as follows:
-
Recommend: The latest minor version of the current major version.
-
Stable: The latest minor version of a major version within its service period, with known defects fixed.
-
Normal: Other minor versions that are still within their service periods.
-
Deprecated: Versions that have passed their end-of-life (EOL) date.
-
-
Additional dependencies: Additional dependencies required for the job, such as temporary functions.
-
Kerberos Authentication: Enable Kerberos authentication and configure a registered Kerberos cluster and principal information. If you have not registered a Kerberos cluster, see Register a Hive Kerberos cluster.
Code structure
-
Data flow: Use the data flow diagram to quickly view the data lineage.
-
Tree structure: Use the tree structure diagram to quickly view the data sources.
Version information
You can view the version history of the SQL draft here. For more information about the functions in the Actions column, see Manage draft versions.
-
(Optional) Step 3: Validate and debug the SQL draft
-
Validate the SQL draft.
Validation checks the SQL semantics, network connectivity, and metadata of the tables used in your draft. After validation, you can click SQL Optimization in the results area to view potential SQL risks and optimization suggestions.
-
In the upper-right corner of the SQL editor, click Deep Check.
-
In the Deep Check dialog box, click Confirm.
NoteIf a timeout error occurs, you might see the following message:
The RPC times out maybe because the SQL parsing is too complicated. Please consider enlarging theflink.sqlserver.rpc.execution.timeoutoption in flink-configuration, which by default is120 s.Solution: Add the following configuration parameter at the top of your SQL editor.
SET 'flink.sqlserver.rpc.execution.timeout' = '600s'; -
-
Debug the SQL draft.
The debug feature lets you simulate a job run to check its output and verify your SELECT or INSERT logic. This feature improves development efficiency and reduces data quality risks.
NoteThe debug feature does not write data to the sink table.
-
In the upper-right corner of the SQL editor, click Debug.
-
In the Debug dialog box, select a debug cluster and click Next.
If no session cluster is available, you must create one. The session cluster must use the same engine version as the SQL draft and be running. For more information, see Step 1: Create a session cluster.
-
Configure the debug data and click OK.
For more information about the configuration, see Step 2: Debug a job.
-
Step 4: Deploy the SQL draft
In the upper-right corner of the SQL editor, click Deploy. In the Deploy New Version dialog box, configure the parameters as needed and click OK.
When deploying the draft, you can select a queue or a session cluster as the Deployment target. The following table compares these two options.
|
Deployment target |
Environment |
Key features |
|
Queue |
Production |
|
|
Session cluster |
Development and testing |
Important
Logs are not available for jobs that run on a session cluster. |
Step 5: Start job and view results
-
In the navigation pane, click .
-
Find the target job and click Start in the Actions column.
Select stateless start, and then click Start. The job is running when its status changes to Running. For more information about start parameters, see Start a job.
-
On the Deployment details page, view the Flink job results.
-
On the page, click the name of the target job.
-
On the Logs tab, click the Running TaskManagers subtab. In the Path, ID column, click a TaskManager.
-
Click the Logs tab and search for logs related to PrintSinkOutputWriter.
If you find log entries containing the chain
Source: datagen_source → Calc → Sink: print_table, the data stream is being processed correctly.
-
(Optional) Step 6: Stop the job
To apply changes to a job (such as code modifications, WITH parameter updates, or version changes), you must redeploy, stop, and then restart it. A restart is also required for a stateless start or to apply non-dynamic configuration changes. For more information about stopping a job, see Stop a job.
-
On the page, find the target job and click Cancel in the Actions column.
-
Click OK.
Related documents
-
FAQs for job development and O&M
-
Configure job information
-
You can configure resources before starting a job or modify them for a running deployment. Two resource configuration modes are supported: basic (coarse-grained) and expert (fine-grained). For more information, see Configure job resources.
-
You can configure log levels and specify different outputs for each level. For more information, see Configure job log outputs.
-
-
Development workflows for other job types
-
Best practices for Flink