Risk identification rule response use case

DataWorks' OpenEvent capability lets you subscribe to event messages. You can register your services as a DataWorks extension to capture and respond to these events, which allows you to implement custom notifications and process controls. This topic shows you how to develop and verify a risk identification rule by using an example scenario: blocking or requiring approval for data downloads that exceed 1,000 rows.

Background

Controlling data downloads is a critical part of enterprise risk management. Typically, data developers and analysts can browse and use data within the platform, but they are not allowed to download detailed data to local machines for analysis. Once data is exported, its usage cannot be audited or controlled. This creates a risk of data misuse, leaks, or malicious attacks that can cause security incidents. This guide demonstrates how to block data export operations in real time.

Objective

Automatically block or trigger an approval process for any data download that exceeds 1,000 rows.

Prerequisites

  • You have activated DataWorks Enterprise Edition. This solution relies on its Open Platform capabilities.

  • You have activated EventBridge. This service receives event messages about user operations, and the risk detection extension consumes them.

  • You have an ECS instance or an on-premises server to deploy the risk detection extension.

Step 1: Configure message subscription

  1. Enable and configure message subscription.

    Because downloading query results is not a workspace-specific operation, this example uses the default bus to receive event messages.

  2. Query for events of type dataworks:ResourcesDownload:DownloadResources.

  3. In the Actions column, click Details to view the event message body. The following is an example:

    Important
    • The message body provides context for risk assessment. You can use the key fields in the following table as contextual information for risk assessment in other use cases.

    • If you need to use a RAM user or a RAM role to read events from the default bus, you must grant the required RAM permissions.

    {
      "datacontenttype": "application/json;charset=utf-8",
      "aliyunaccountid": "110755000425****",
      "aliyunpublishtime": "2023-12-05T07:25:31.708Z",
      "data": {
        "eventCode": "download-resources",
        "extensionBizId": "audit_4d7ebb42b805428483148295a97a****",
        "extensionBizName": "DataWorks_IDE_Query_20231205152530.csv",
        "requestId": "77cac0c2fc12cecbf1d289128897****@@ac15054317017611303051804e****",
        "appId": ****,
        "tenantId": 52425742456****,
        "blockBusiness": true,
        "eventBody": {
          "sqlText": "SELECT * FROM table_1",
          "queryDwProjectId": "****",
          "moduleType": "develop_query",
          "operatorBaseId": "110755000425****",
          "datasourceId": "1****",
          "queryDwProjectName": "yongxunQA_emr_chen****",
          "dataRowSize": 4577,
          "datasourceName": "odps_source",
          "operatorUid": "110755000425****"
        },
        "operator": "110755000425****"
      },
      "aliyunoriginalaccountid": "110755000425****",
      "specversion": "1.0",
      "aliyuneventbusname": "default",
      "id": "169d171c-d523-4370-a874-bb0fa083****",
      "source": "acs.dataworks",
      "time": "2023-12-05T15:25:31.588Z",
      "aliyunregionid": "cn-chengdu",
      "type": "dataworks:ResourcesDownload:DownloadResources"
    }

    Key parameter descriptions:

    | Parameter | Description |
    | --- | --- |
    | sqlText | The SQL query. |
    | queryDwProjectId | The ID of the data source's workspace. |
    | moduleType | The download source. Valid values: develop_query (a query from DataStudio), sqlx_query (a query from DataAnalysis), dw_excel (a Workbook in DataAnalysis). |
    | operatorBaseId | The user ID (UID) of the operator. |
    | datasourceId | The ID of the queried data source. |
    | queryDwProjectName | The name of the data source's workspace. |
    | dataRowSize | The number of data rows to download. |
    | datasourceName | The name of the queried data source. |
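
As a rough illustration of how an extension might read these fields, the following sketch copies them from an already-parsed eventBody map into a small value object. The class name DownloadEventFields, its helper methods, and the choice of -1 for a missing row count are assumptions made for this example. They are not part of the DataWorks SDK.

```java
import java.util.Map;

/** Hypothetical holder for the key fields of a download event's eventBody. */
final class DownloadEventFields {
    final String sqlText;
    final String moduleType;
    final String operatorBaseId;
    final String queryDwProjectName;
    final String datasourceName;
    final long dataRowSize;

    private DownloadEventFields(Map<String, Object> body) {
        this.sqlText = text(body, "sqlText");
        this.moduleType = text(body, "moduleType");
        this.operatorBaseId = text(body, "operatorBaseId");
        this.queryDwProjectName = text(body, "queryDwProjectName");
        this.datasourceName = text(body, "datasourceName");
        this.dataRowSize = number(body, "dataRowSize");
    }

    /** Builds the holder from a parsed eventBody object. */
    static DownloadEventFields fromEventBody(Map<String, Object> eventBody) {
        if (eventBody == null) {
            throw new IllegalArgumentException("eventBody is required");
        }
        return new DownloadEventFields(eventBody);
    }

    /** Returns the value as text, or an empty string if the key is absent. */
    private static String text(Map<String, Object> body, String key) {
        Object value = body.get(key);
        return value == null ? "" : value.toString();
    }

    /** Returns the value as a long, or -1 (an assumed marker) if it is absent or not numeric. */
    private static long number(Map<String, Object> body, String key) {
        Object value = body.get(key);
        if (value instanceof Number) {
            return ((Number) value).longValue();
        }
        if (value != null) {
            try {
                return Long.parseLong(value.toString().trim());
            } catch (NumberFormatException ignored) {
                // Falls through to the marker value below.
            }
        }
        return -1L;
    }
}
```

Keeping the field access in one place means that the risk checks only deal with typed values, and a missing dataRowSize can be handled explicitly instead of surfacing as a NullPointerException.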

Step 2: Develop and deploy the extension

  1. Prepare for development:

    Enable message subscription, register an extension, and obtain the information required for development. For more information, see Develop and deploy an extension: Self-managed service.

  2. Develop and deploy the extension.

    Use the information you obtained to develop and deploy the extension as an application service. For more information, see Develop and deploy an extension: Function Compute. The following section describes key parameters and provides sample code:

    • When you register the extension, for the Processed Extension Points parameter, select Pre-event for Resources Download.

    • The following sample code shows how to develop the extension:

      Important
      • This sample extension uses the dataRowSize field from the event message body in Step 1 to assess risk based on the number of rows to be downloaded.

      • Set the check result according to the response that you want. To require approval, pass WARN to callbackExtensionRequest.setCheckResult() when the extension identifies risky user behavior. To block the operation, pass FAIL.

      • The sample code in this topic uses a limit of 1,000 rows as an example. If you want to trigger different approval workflows based on different download sizes, you can configure multiple extensions. For more information, see Step 3: Configure a risk identification rule. For example:

        • The first extension is triggered for downloads of 0 to 2,000 rows and is mapped to approval workflow 1.

        • The second extension is triggered for downloads of 2,001 or more rows and is mapped to approval workflow 2.

      package com.aliyun.dataworks.demo;
      import com.alibaba.fastjson.JSON;
      import com.alibaba.fastjson.JSONObject;
      import com.aliyun.dataworks.config.Constants;
      import com.aliyun.dataworks.config.EventCheckEnum;
      import com.aliyun.dataworks.config.ExtensionParamProperties;
      import com.aliyun.dataworks.services.DataWorksOpenApiClient;
      import com.aliyun.dataworks_public20200518.Client;
      import com.aliyun.dataworks_public20200518.models.CallbackExtensionRequest;
      import com.aliyun.dataworks_public20200518.models.CallbackExtensionResponse;
      import com.aliyun.dataworks_public20200518.models.GetOptionValueForProjectRequest;
      import com.aliyun.dataworks_public20200518.models.GetOptionValueForProjectResponse;
      import org.springframework.beans.factory.annotation.Autowired;
      import org.springframework.web.bind.annotation.PostMapping;
      import org.springframework.web.bind.annotation.RequestBody;
      import org.springframework.web.bind.annotation.RequestMapping;
      import org.springframework.web.bind.annotation.RestController;
      /**
       * @author DataWorks Demo
       */
      @RestController
      @RequestMapping("/extensions")
      public class ExtensionsController {
          @Autowired(required = false)
          private DataWorksOpenApiClient dataWorksOpenApiClient;
          @Autowired
          private ExtensionParamProperties extensionParamProperties;
          /**
           * Receives messages pushed from EventBridge.
           *
           * @param jsonParam
           */
          @PostMapping("/consumer")
          public void consumerEventBridge(@RequestBody String jsonParam) {
              JSONObject jsonObj = JSON.parseObject(jsonParam);
              JSONObject data = jsonObj.getJSONObject("data");
              // In the sample message in Step 1, eventCode is nested under "data".
              // Falls back to the top-level field in case the event target reshapes the message.
              String eventCode = data == null ? null : data.getString("eventCode");
              if (eventCode == null) {
                  eventCode = jsonObj.getString(Constants.EVENT_CODE_FILED);
              }
              // This extension handles only the download event ("download-resources" in Step 1).
              if ("download-resources".equals(eventCode)) {
                  // Initializes the client.
                  Client client = dataWorksOpenApiClient.createClient();
                  try {
                      // Retrieves the parameter information of the current event.
                      String messageId = jsonObj.getString("id");
                      // Initializes the event callback.
                      CallbackExtensionRequest callbackExtensionRequest = new CallbackExtensionRequest();
                      callbackExtensionRequest.setMessageId(messageId);
                      callbackExtensionRequest.setExtensionCode(extensionParamProperties.getExtensionCode());
                      JSONObject eventBody = data.getJSONObject("eventBody");
                      Long dataRowSize = eventBody.getLong("dataRowSize");
                      // Obtains the configurations of the extension option in the workspace.
                      GetOptionValueForProjectRequest getOptionValueForProjectRequest = new GetOptionValueForProjectRequest();
                      // The project ID for the configuration of a global extension point event is -1 by default.
                      getOptionValueForProjectRequest.setProjectId("-1");
                      getOptionValueForProjectRequest.setExtensionCode(extensionParamProperties.getExtensionCode());
                      GetOptionValueForProjectResponse getOptionValueForProjectResponse = client.getOptionValueForProject(getOptionValueForProjectRequest);
                      JSONObject jsonObject = JSON.parseObject(getOptionValueForProjectResponse.getBody().getOptionValue());
                      // Note: You must set this parameter based on the format configured in DataWorks.
                      Long maxDataRowSize = jsonObject == null ? null : jsonObject.getLong("dataRowSize");
                      // Uses the configured limit if present. Otherwise, uses the 1,000-row limit of this example.
                      long limit = maxDataRowSize != null ? maxDataRowSize : 1000L;
                      // Checks if the number of rows to download exceeds the limit.
                      if (dataRowSize != null && dataRowSize > limit) {
                          // FAIL blocks the download. Use WARN instead to trigger an approval process.
                          callbackExtensionRequest.setCheckResult(EventCheckEnum.FAIL.getCode());
                          callbackExtensionRequest.setCheckMessage("The number of rows to download exceeds the limit.");
                      } else { // Successful callback.
                          callbackExtensionRequest.setCheckResult(EventCheckEnum.OK.getCode());
                      }
                      // Sends the callback to DataWorks.
                      CallbackExtensionResponse acsResponse = client.callbackExtension(callbackExtensionRequest);
                      // The unique ID of the request, which can be used for troubleshooting.
                      System.out.println("acsResponse:" + acsResponse.getBody().getRequestId());
                  } catch (Exception e) {
                      // Error description.
                      System.out.println("ErrMsg:" + e.getMessage());
                  }
              } else {
                  System.out.println("Received an event that this extension does not handle. Check your event filter configuration.");
              }
          }
      }
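
The check at the core of the sample above can be isolated into a small pure function, which makes the mapping to OK, WARN, and FAIL easier to unit test without a DataWorks client. The following sketch is an illustration only: the RowLimitCheck class and its CheckResult enum are hypothetical stand-ins for EventCheckEnum, and the blockOnViolation flag is an assumed way to switch between blocking and approval.

```java
/** Hypothetical, dependency-free version of the row-limit check. */
final class RowLimitCheck {
    /** Stand-in for the OK, WARN, and FAIL results that the extension reports. */
    enum CheckResult { OK, WARN, FAIL }

    private final long maxRows;
    private final boolean blockOnViolation;

    /**
     * @param maxRows          the largest download size that passes without a response
     * @param blockOnViolation true to block (FAIL), false to require approval (WARN)
     */
    RowLimitCheck(long maxRows, boolean blockOnViolation) {
        if (maxRows < 0) {
            throw new IllegalArgumentException("maxRows must not be negative");
        }
        this.maxRows = maxRows;
        this.blockOnViolation = blockOnViolation;
    }

    /** Returns OK up to and including the limit, and FAIL or WARN above it. */
    CheckResult evaluate(long dataRowSize) {
        if (dataRowSize <= maxRows) {
            return CheckResult.OK;
        }
        return blockOnViolation ? CheckResult.FAIL : CheckResult.WARN;
    }
}
```

With this split, the controller only translates the returned value into the code that it passes to setCheckResult(), and the threshold behavior at exactly 1,000 rows can be verified in isolation.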

Step 3: Configure a risk identification rule

  1. Log on to the DataWorks console. In the target region, click Data Governance > Security Center in the left-side navigation pane. On the page that appears, click Go to Security Center.

  2. In the left-side navigation pane, click Security policy > Risk identification rules.

  3. Configure an approval process for the published extension. For more information, see Configure a risk response. At the top of the page, you can use the Operation Event and Extension Name drop-down lists to filter rules. The table lists each extension's name, code, owner, source, status, and response. In the Actions column, you can click Configure Response to set the response action, or click Create Extension to add a new extension.

Step 4: Enable the risk identification rule

On the Risk identification rules page, filter by Operation Event and find the target rule and the corresponding extension (for example, Approval-triggering Extension). Turn on the switch in the Enable column and follow the prompts to enable the rule. To change the response method, click Configure Response.

Step 5: Verify the results

  1. Go to the Data Download page.

  2. For the desired file, click Download in the Operation column.

    • If the check passes, the download proceeds.

    • If the check fails, the result depends on the configured response: the download is blocked, or you are prompted to submit an application for approval.
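
Before you verify the rule in the console, it can help to confirm that the deployed service accepts event messages at all. The following sketch builds a POST request for the /extensions/consumer path that the sample controller exposes. The ConsumerSmokeTest class name and the base URL used in the note below are assumptions for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for a local smoke test of the extension endpoint. */
final class ConsumerSmokeTest {
    /** Builds a POST request that carries an event message to the consumer endpoint. */
    static HttpRequest buildRequest(String baseUrl, String eventJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/extensions/consumer"))
                .header("Content-Type", "application/json;charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(eventJson, StandardCharsets.UTF_8))
                .build();
    }
}
```

To send the request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) with a base URL such as http://localhost:8080 (an assumed host and port). A replayed sample message does not correspond to a live download, so the callback that the extension sends to DataWorks may be rejected. This test only confirms that the endpoint is reachable and that the message is parsed.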

Other use cases

You can use other fields in the download event payload, such as the workspace name, SQL details, data source name, or user ID (UID), to implement other real-time risk control scenarios that fit your business needs. Examples include:

  • Allowing or denying data downloads based on the user's department, which is represented by the workspace.

  • Blocking downloads if the SQL query contains sensitive fields.

  • Implementing tiered risk control. For example, requiring approval for downloads of more than 20,000 rows and blocking downloads of more than 50,000 rows.

  • Defining download limits based on workspace roles. For example, you can allow a user with the Developer role to download N rows and a user with the Analyst role to download M rows before the action is blocked. This requires calling the ListProjectMembers API to query workspace members.

  • Setting different download limit policies for DataStudio and DataAnalysis scenarios.
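
Several of these ideas can be combined in one decision function. The following sketch is a minimal illustration under stated assumptions: the DownloadPolicy class, the list of sensitive field names, and the choice to halve the thresholds for DataAnalysis sources are all hypothetical. The 20,000-row and 50,000-row tiers come from the tiered example above.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical policy that combines sensitive-field, tiered, and per-source checks. */
final class DownloadPolicy {
    enum Decision { OK, WARN, FAIL }

    // Tiers from the example above: approval above 20,000 rows, block above 50,000 rows.
    static final long APPROVAL_ROWS = 20_000L;
    static final long BLOCK_ROWS = 50_000L;

    // Assumed sensitive column names. Replace them with your own catalog.
    static final List<String> SENSITIVE_FIELDS = List.of("id_card", "phone", "bank_account");

    static Decision decide(String moduleType, String sqlText, long dataRowSize) {
        if (mentionsSensitiveField(sqlText)) {
            return Decision.FAIL;
        }
        // Assumed stricter policy for DataAnalysis sources: both thresholds are halved.
        boolean analysis = "sqlx_query".equals(moduleType) || "dw_excel".equals(moduleType);
        long approval = analysis ? APPROVAL_ROWS / 2 : APPROVAL_ROWS;
        long block = analysis ? BLOCK_ROWS / 2 : BLOCK_ROWS;
        if (dataRowSize > block) {
            return Decision.FAIL;
        }
        if (dataRowSize > approval) {
            return Decision.WARN;
        }
        return Decision.OK;
    }

    /** Naive check: case-insensitive, whole-word match of any sensitive name in the SQL text. */
    static boolean mentionsSensitiveField(String sqlText) {
        if (sqlText == null) {
            return false;
        }
        for (String field : SENSITIVE_FIELDS) {
            Pattern word = Pattern.compile("\\b" + Pattern.quote(field) + "\\b", Pattern.CASE_INSENSITIVE);
            if (word.matcher(sqlText).find()) {
                return true;
            }
        }
        return false;
    }
}
```

The text match is deliberately simple and does not detect sensitive columns selected through SELECT * or through aliases. A production rule would more likely resolve the referenced columns against table metadata.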