DataWorks provides a unified, end-to-end process for data development and governance. You can add process management checks at key stages of this process to meet your business requirements.
Limitations
This feature is available only in DataWorks Enterprise Edition and later versions.
How it works
DataWorks provides workspaces in two modes: standard mode and basic mode. Each mode has a different node development process. The following diagrams illustrate the data development workflows for each mode.
Node development workflow in a standard mode workspace
Node development workflow in a basic mode workspace
You can implement process management at key stages of the workflow: before a node is debugged, before a node is deployed to the development environment, and before a node is deployed to the production environment.
Stage | Check example |
Before a node is run | In the Debugging Configurations panel on the right, under DataWorks Configuration, set the resource group to Serverless_Resource_Group and the computing quota to Default post-paid Quota. In the editor, enter the SQL statement |
Before a node is deployed to the development environment | When you deploy a node, the system runs the Development Checker to review the SQL code. Click the Development Checker step in the deployment workflow. The Operation Checks panel on the left displays the check record and its status. Click the record to open the Check Details panel, where you can view the results of each check item, such as Full Table Scan Check and High-Cost SQL Check. For check items with a Warning status, you can click Mark as Passed to manually approve them. |
Before a node is deployed to the production environment | In the Production Checker step of the deployment pipeline, you can click View Details or Initiate Code Review. The left panel lists the SQL records to be deployed, such as testSQL. Click a record to open the corresponding Check Details panel on the right. After you confirm that all checks have passed, you can proceed with the deployment. |
Features like Open Platform and Data Asset Governance let you implement validation checks at key stages of the data development process.
Feature module | Pre-run check | Dev pre-deployment check | Prod pre-deployment check | Process management |
Data Asset Governance | Supported | Supported | Supported | Data Asset Governance in DataWorks provides multiple built-in check items. You can enable these check items based on your business needs. Once a check item is enabled, the corresponding operation automatically triggers the built-in validation logic in DataWorks. The process proceeds only after the check is complete. |
Open Platform | Supported | Supported | Supported | If the built-in check items do not meet your process management requirements, you can use Open Platform to develop your own validation programs for specific events and integrate them into the data development workflow. |
The following sections use the workflow in a standard mode workspace as an example to describe these process management capabilities.
Built-in checks: Data Asset Governance
Data Asset Governance in DataWorks provides multiple built-in check items. You can enable them as needed. When you perform a relevant operation, it triggers the built-in validation logic. The process proceeds only after the operation passes all checks.
In the left-side navigation pane of Data Asset Governance, choose Governance Settings > Check Items to go to the Check Item Configuration page. Click the R&D tab. In the Effective Checkpoint column, identify the trigger point for a check item. In the Enable column, toggle the switch to enable or disable the corresponding check item.
For node debugging: Enable check items where the Effective Checkpoint is Pre-event for Code Running.
For node deployment to the development environment: Enable check items where the Effective Checkpoint is Pre-event for Node Commit.
For node deployment to the production environment: Enable check items where the Effective Checkpoint is Pre-event for Node Deployment.
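The stage-to-checkpoint mapping above can be written down as a small lookup, for example in a team runbook or a script that documents which check items to enable. The following Python sketch is illustrative only: the stage keys and the `checkpoint_for` helper are hypothetical names, and only the checkpoint labels are taken from the console text described above.

```python
# Hypothetical lookup from workflow stage to the Effective Checkpoint label
# shown on the R&D tab. The stage keys are illustrative, not DataWorks identifiers.
EFFECTIVE_CHECKPOINTS = {
    "debug": "Pre-event for Code Running",
    "deploy_to_dev": "Pre-event for Node Commit",
    "deploy_to_prod": "Pre-event for Node Deployment",
}


def checkpoint_for(stage: str) -> str:
    """Return the Effective Checkpoint label to look for at a given stage."""
    try:
        return EFFECTIVE_CHECKPOINTS[stage]
    except KeyError:
        # Fail loudly on stages this sketch does not know about.
        raise ValueError(f"unknown stage: {stage!r}") from None
```

For example, `checkpoint_for("deploy_to_dev")` returns the label to filter on when you want checks that run before a node is committed to the development environment.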
Configuration
You need to enable check items in Data Asset Governance and specify the workspace in which they take effect. For general instructions, see Configure governance items.
Customize validation logic: Open Platform
If the built-in check items do not meet your requirements, you can use Open Platform to develop custom validation programs for specific events and integrate them into the data development workflow.
OpenEvent lets you subscribe to event messages for user operations in Data Studio. Extensions lets you build a custom validation and approval program that processes the event messages you receive. The program then sends the approval result back to DataWorks through an OpenAPI callback. For more information about OpenEvent and Extensions, see OpenEvent overview and Extensions overview.
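The receive, validate, and call back sequence might look roughly like the following Python sketch. It is not the DataWorks SDK: the event keys (`messageId`, `sql`), the `CallbackPayload` fields, the result values `OK` and `FAIL`, and the two sample rules are all assumptions made for illustration. Check the OpenEvent message format and the OpenAPI reference for the real field names and accepted values. The actual OpenAPI call is deliberately omitted.

```python
from dataclasses import dataclass

# Assumed result values; confirm the accepted values in the OpenAPI reference.
PASS, FAIL = "OK", "FAIL"


@dataclass
class CallbackPayload:
    """Hypothetical shape of the result sent back through the OpenAPI callback."""
    message_id: str
    check_result: str
    check_message: str


def validate_sql(sql: str) -> list[str]:
    """Example custom rules: return a list of human-readable violations."""
    problems = []
    # Collapse whitespace and lowercase so the rules match across line breaks.
    normalized = " ".join(sql.lower().split())
    if "select *" in normalized:
        problems.append("SELECT * is not allowed; list columns explicitly.")
    if normalized.startswith("drop table"):
        problems.append("DROP TABLE requires manual approval.")
    return problems


def handle_event(event: dict) -> CallbackPayload:
    """Validate one event message and build the callback payload.

    The keys "messageId" and "sql" are assumptions about the event shape.
    """
    problems = validate_sql(event.get("sql", ""))
    return CallbackPayload(
        message_id=event["messageId"],
        check_result=FAIL if problems else PASS,
        check_message="; ".join(problems) or "All custom checks passed.",
    )
```

Keeping the rule evaluation (`validate_sql`) separate from the payload construction makes the rules easy to unit test without a DataWorks connection. Only the final step, sending the payload, needs real credentials.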
When you use Open Platform to subscribe to events and validate key operations in Data Studio, performing an operation triggers the validation process. The following flowchart shows the validation process for a pre-run check.
Configuration
You need to subscribe to Data Studio events in Open Platform, develop an extension to handle these events, publish the extension to DataWorks, and enable it in the desired workspace.
For node debugging: Subscribe to and process run-related events, such as pre-event for file running.
For node deployment to the development environment: Subscribe to and process commit-related events, such as Pre-event for Node Commit and pre-event for table commit.
For node deployment to the production environment: Subscribe to and process deployment-related events, such as Pre-event for Node Deployment and pre-event for table deployment.
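An extension that subscribes to several events usually needs to decide which stage an incoming event belongs to before it applies stage-specific rules. The following Python sketch shows one way to do that, using the event display names listed above. The stage keys and the `stage_of` helper are hypothetical, and real event messages are likely to carry machine-readable type codes rather than these display names, so treat the table as a placeholder to fill in from the Extensions overview.

```python
from typing import Optional

# Hypothetical routing table: stage -> event display names from the list above,
# stored in lowercase so that matching is case-insensitive.
STAGE_EVENTS = {
    "debug": {"pre-event for file running"},
    "deploy_to_dev": {
        "pre-event for node commit",
        "pre-event for table commit",
    },
    "deploy_to_prod": {
        "pre-event for node deployment",
        "pre-event for table deployment",
    },
}


def stage_of(event_name: str) -> Optional[str]:
    """Return the workflow stage for an event name, or None if it is not handled."""
    key = event_name.strip().lower()
    for stage, names in STAGE_EVENTS.items():
        if key in names:
            return stage
    return None
```

Returning `None` for unknown events lets the extension ignore messages it did not intend to handle instead of blocking the operation.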
For a list of event types supported by Open Platform, see Extensions overview.
For general instructions on using Open Platform, see Develop and deploy an extension using a self-managed service.
For best practices in typical process management scenarios, see the following topics:





