This topic uses an OSS data source as an example to explain how to configure a data source using the RAM role-based authorization mode to enhance cloud data security.
Prerequisites
To use a RAM user to perform the operations in this topic, you must first grant the AliyunDataWorksFullAccess and AliyunRAMFullAccess policies to the RAM user. For more information, see Manage RAM user permissions.
In the Add Permissions panel, set Resource scope to Account level, search for and select the two required system policies in the Policies section, and then click Add Permissions.
If you use an Alibaba Cloud account, you can skip this prerequisite.
Background
Data sources are fundamental to data synchronization tasks and crucial for securing your enterprise data. DataWorks lets you use the more secure RAM role authorization mode to configure and access certain data sources, such as OSS, AnalyticDB for MySQL 2.0, LogHub, Tablestore, and Hologres. This mode enhances the security of your cloud data and helps prevent risks such as data source misuse and AccessKey pair leaks.
You can configure a data source using either the RAM role-based authorization mode or the AccessKey pair-based authorization mode. This topic describes how to configure a data source by using the RAM role-based authorization mode. You can select a mode based on your business requirements. The following sections describe how each mode works:
-
AccessKey pair-based authorization mode
In the less secure AccessKey pair-based authorization mode, you only need to enter the AccessKey pair of an Alibaba Cloud account or a RAM user to complete the configuration.
For example, to configure an OSS data source, you enter the AccessKey pair of an account with the required permissions on the Configure Data Source page.
Select AccessKey pair-based authorization mode for Access mode, fill in connection details such as region, endpoint, and bucket, and then click Connection Configuration to select a resource group and run a connectivity test. After the test passes, click Complete Creation.
When a synchronization task runs, DataWorks uses this AccessKey pair to access OSS to read or write data.
Note: In the AccessKey pair-based authorization mode, if the AccessKey pair of an Alibaba Cloud account is leaked, the associated OSS data may be compromised.
-
RAM role authorization mode
The RAM role authorization mode is a more secure method for accessing data sources because it does not require an AccessKey pair, eliminating the risk of key leaks.
In the RAM role authorization mode, you authorize the DataWorks service account to assume a role with OSS access permissions, which enables access to the OSS data source without an AccessKey pair.

This mode also supports enterprise needs by letting you assign roles with different permission scopes to different data sources, enabling more granular permission control.
Workflow
The following workflow outlines the steps for using the RAM role-based authorization mode. It also specifies the requirements for RAM users who perform these operations.
-
An Alibaba Cloud account or a RAM user with the AliyunRAMFullAccess policy logs on to the RAM console to define a role to be assumed and a policy for authorization.
-
Assumable role: You need to create a custom role for the DataWorks service account to assume. After assuming the role, the DataWorks service account can access the OSS data source within the permission scope of the role.
-
Authorization policy: A policy that includes the ram:PassRole permission. This policy allows a user to use a specific role to create a data source or run a synchronization task.
-
An Alibaba Cloud account or a RAM user with the AliyunRAMFullAccess policy logs on to the RAM console to grant the RAM users who will create data sources (Step 3) and run synchronization tasks (Step 5) the permission to use the role.
Note: If an unauthorized RAM user creates a data source by using the RAM role authorization mode, synchronization tasks that use this data source will fail.
-
The data source creator creates the data source in DataWorks Data Integration by using the RAM role authorization mode. This allows the DataWorks service account to assume a specific role to access the OSS data source when a synchronization task runs.
Note: The data source creator can perform this step only after being granted permissions in Step 2.
-
The synchronization task creator goes to DataStudio and creates a synchronization task based on the configured data source.
-
The task executor runs the data synchronization task in DataStudio or Operation Center.
Note: The task executor can perform this step only after being granted permissions in Step 2.
Procedure
-
Create an assumable role.
You need to create different custom roles for different data sources based on your security requirements. This topic uses the following scenario as an example to illustrate how to create an assumable role.
Note: Only an Alibaba Cloud account or a RAM user with the AliyunRAMFullAccess policy can perform this step.
Consider an enterprise with 100 buckets that store all of its data, but the big data team only needs to use data from one specific bucket. If the predefined AliyunDataWorksAccessingOSSRole role were used, the big data team might access the other 99 buckets, which poses a management risk.
To prevent this, the account administrator can create a custom role named BigDataOssRole for the big data team and restrict its use to relevant members of that team. This helps implement permission control between teams.
-
Create a custom role.
In this example, create a custom role named BigDataOssRole and set the trusted entity to Alibaba Cloud Account. For detailed steps, see Create a RAM role for a trusted Alibaba Cloud account.
-
Create a custom permission policy.
Create a policy that grants permissions to read data from and write data to only the specified bucket. For detailed steps, see Create a custom policy. The following code shows the policy document:
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "oss:GetObject",
        "oss:ListObjects",
        "oss:GetObjectMetadata",
        "oss:GetObjectMeta",
        "oss:GetBucketAcl",
        "oss:GetBucketInfo",
        "oss:PutObject",
        "oss:DeleteObject",
        "oss:PutBucket"
      ],
      "Resource": [
        "acs:oss:*:*:bucket_name_1",
        "acs:oss:*:*:bucket_name_1/*"
      ]
    }
  ]
}
-
Grant the role the required permissions.
Modify the trust policy of the BigDataOssRole role and attach the custom policy created in the previous substep to the role. This allows the DataWorks service account, after it assumes the BigDataOssRole role, to read data from and write data to only the specified bucket.
Important: You must perform this step. Otherwise, the role cannot be used.
For more information about how to modify the trust policy of a role, see Modify the trust policy of a RAM role. The following code shows the policy document:
{
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "di.dataworks.aliyuncs.com"
        ]
      }
    }
  ],
  "Version": "1"
}
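The two policy documents in this step can also be generated and checked locally before you paste them into the RAM console. The following Python sketch is illustrative only: the function names (build_bucket_policy, build_trust_policy, trusts_service) are hypothetical helpers, not part of any Alibaba Cloud SDK, and the check only inspects the JSON structure shown in this topic.

```python
import json

# Service principal taken from the trust policy in this topic.
DI_SERVICE = "di.dataworks.aliyuncs.com"

# Actions copied from the permission policy in this topic.
BUCKET_ACTIONS = [
    "oss:GetObject", "oss:ListObjects", "oss:GetObjectMetadata",
    "oss:GetObjectMeta", "oss:GetBucketAcl", "oss:GetBucketInfo",
    "oss:PutObject", "oss:DeleteObject", "oss:PutBucket",
]


def build_bucket_policy(bucket_name):
    """Return a policy document limited to one bucket and its objects."""
    return {
        "Version": "1",
        "Statement": [{
            "Effect": "Allow",
            "Action": list(BUCKET_ACTIONS),
            "Resource": [
                f"acs:oss:*:*:{bucket_name}",
                f"acs:oss:*:*:{bucket_name}/*",
            ],
        }],
    }


def build_trust_policy(service=DI_SERVICE):
    """Return a trust policy that lets `service` assume the role."""
    return {
        "Statement": [{
            "Action": "sts:AssumeRole",
            "Effect": "Allow",
            "Principal": {"Service": [service]},
        }],
        "Version": "1",
    }


def trusts_service(trust_policy, service=DI_SERVICE):
    """Structural check: does an Allow statement name `service` as principal?"""
    for stmt in trust_policy.get("Statement", []):
        if stmt.get("Effect") != "Allow" or stmt.get("Action") != "sts:AssumeRole":
            continue
        services = stmt.get("Principal", {}).get("Service", [])
        if isinstance(services, str):
            services = [services]
        if service in services:
            return True
    return False


if __name__ == "__main__":
    # bucket_name_1 is the placeholder bucket name used in this topic.
    print(json.dumps(build_bucket_policy("bucket_name_1"), indent=2))
    print(json.dumps(build_trust_policy(), indent=2))
```

Generating the Resource entries from a single bucket name keeps the bucket entry and the object entry (the one that ends in /*) in sync, which is easy to get wrong when editing the JSON by hand.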
-
Grant users permission to use the role.
After you create the roles to be assumed, you must grant a policy that contains the ram:PassRole permission to the relevant users. This allows them to use the role to create a data source and run synchronization tasks. You can also configure mappings between users and roles.
-
Policy template 1: You can create a policy based on the following template. This template allows the authorized user to use all roles related to DataWorks Data Integration. Grant this policy with caution.
{
  "Version": "1",
  "Statement": [
    {
      "Action": "ram:PassRole",
      "Resource": "*",
      "Effect": "Allow",
      "Condition": {
        "StringEquals": {
          "acs:Service": "di.dataworks.aliyuncs.com"
        }
      }
    }
  ]
}
-
Policy template 2: You can create a custom policy that contains the ram:PassRole permission and configure mappings between users and roles based on your security requirements.
Note: Only an Alibaba Cloud account or a RAM user with the AliyunRAMFullAccess policy can perform this step.
Scenario example: As described in the preceding scenario, after the account administrator defines the BigDataOssRole role for the big data team, the administrator must specify which users can use this role. You can create a custom policy named BigDataOssRoleAllowUse and grant it to the relevant users.
Create a policy named BigDataOssRoleAllowUse. For more information, see Create a custom policy. The following code shows the policy document:
{
  "Version": "1",
  "Statement": [
    {
      "Action": "ram:PassRole",
      "Resource": "acs:ram::19122324****:role/BigDataOssRole",
      "Effect": "Allow",
      "Condition": {
        "StringEquals": {
          "acs:Service": [
            "oss.aliyuncs.com",
            "di.dataworks.aliyuncs.com"
          ]
        }
      }
    }
  ]
}
Note: Replace the UID (19122324****) in the policy with the UID of your Alibaba Cloud account.
An administrator grants the BigDataOssRoleAllowUse policy to each RAM user who is allowed to use the BigDataOssRole role. A RAM user who is granted this policy can then select the BigDataOssRole role as the access identity when creating a data source and can run synchronization tasks that use that data source.
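The mapping between users and roles in policy template 2 comes down to one ram:PassRole statement per role. The following Python sketch shows one way to generate such a policy and to sanity-check it locally. The helper names (role_arn, build_pass_role_policy, allows_pass_role) are hypothetical, and allows_pass_role is a simplified structural check written for this example, not a reimplementation of how RAM evaluates policies.

```python
import json


def role_arn(account_uid, role_name):
    """Build a role ARN in the format used by the policy in this topic."""
    return f"acs:ram::{account_uid}:role/{role_name}"


def build_pass_role_policy(account_uid, role_name, services):
    """Return a policy that lets a user pass one role to the given services."""
    return {
        "Version": "1",
        "Statement": [{
            "Action": "ram:PassRole",
            "Resource": role_arn(account_uid, role_name),
            "Effect": "Allow",
            "Condition": {"StringEquals": {"acs:Service": list(services)}},
        }],
    }


def allows_pass_role(policy, arn, service):
    """Simplified check: is there an Allow statement for this role and service?"""
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow" or stmt.get("Action") != "ram:PassRole":
            continue
        # Template 1 uses "*"; template 2 names a single role ARN.
        if stmt.get("Resource") not in ("*", arn):
            continue
        allowed = stmt.get("Condition", {}).get("StringEquals", {}).get("acs:Service", [])
        if isinstance(allowed, str):
            allowed = [allowed]
        if service in allowed:
            return True
    return False


if __name__ == "__main__":
    # 19122324**** is the masked placeholder UID from this topic; replace it
    # with the UID of your own Alibaba Cloud account.
    policy = build_pass_role_policy(
        "19122324****",
        "BigDataOssRole",
        ["oss.aliyuncs.com", "di.dataworks.aliyuncs.com"],
    )
    print(json.dumps(policy, indent=2))
```

Scoping Resource to a single role ARN is what distinguishes template 2 from template 1, where "Resource": "*" lets the user pass any role to DataWorks Data Integration.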
-
Create a data source.
After the account administrator grants permissions to the data source creator, the creator can add the data source.
-
An Alibaba Cloud account or a RAM user with the AliyunDataWorksFullAccess policy creates and configures an OSS data source.
During configuration, select RAM role authorization mode for Access mode and configure the other parameters. If you use a workspace in standard mode, you can add the data source to the Development environment or the Production environment.
Note: This topic uses an OSS data source as an example. The parameters may vary for other data source types. For more information about how to add an OSS data source, see Configure an OSS data source.
| Parameter | Description |
| --- | --- |
| Data Source Name | The name can contain letters, digits, and underscores (_) but cannot start with a digit or an underscore. |
| Description | A brief description of the data source. The description can be up to 80 characters in length. |
| Endpoint | The endpoint of the OSS service. The format is http://oss.aliyuncs.com. Endpoints vary by region. You must use the correct endpoint for the region you are accessing. Note: Do not prepend the bucket name and a dot to the endpoint. For example, with an endpoint such as http://xxx.oss.aliyuncs.com, the connectivity test may pass, but the synchronization will fail. |
| Bucket | The name of the OSS bucket. A bucket is a container that stores objects. You can create one or more buckets and add one or more files to each bucket. DataWorks can find objects only in the bucket that you specify here. |
| Access mode | Select RAM role authorization mode. This mode uses Security Token Service (STS) to authorize the DataWorks service account to assume a role to access the data source, which provides higher security. |
| Select Role | Select a RAM role from the Select Role drop-down list. |
| Region | Select a region from the Region drop-down list. |
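Several of the rules above (the name format, the description length, and the endpoint without a bucket prefix) can be checked before you open the console. The following Python sketch is a hypothetical pre-check, not something DataWorks provides. It assumes that "letters" means ASCII letters, and the function name check_data_source_fields is made up for this example.

```python
import re
from urllib.parse import urlparse

# Assumption: "letters" means ASCII letters. The name must start with a
# letter, because it cannot start with a digit or an underscore.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
MAX_DESCRIPTION_LENGTH = 80


def check_data_source_fields(name, description, endpoint, bucket):
    """Return a list of problems found; an empty list means no problem found."""
    problems = []
    if not NAME_PATTERN.fullmatch(name):
        problems.append(
            "name must contain only letters, digits, and underscores, "
            "and must start with a letter"
        )
    if len(description) > MAX_DESCRIPTION_LENGTH:
        problems.append("description must be 80 characters or fewer")
    # The endpoint must not have the bucket name prepended, for example
    # http://xxx.oss.aliyuncs.com for a bucket named xxx.
    host = urlparse(endpoint).hostname or ""
    if host.startswith(f"{bucket}."):
        problems.append("endpoint must not start with the bucket name")
    return problems


if __name__ == "__main__":
    # The values below are placeholders based on the examples in this topic.
    for problem in check_data_source_fields(
        "oss_source", "Demo data source", "http://xxx.oss.aliyuncs.com", "xxx"
    ):
        print(problem)
```

The endpoint check is the most useful of the three, because a bucket-prefixed endpoint may pass the connectivity test and only fail later, when the synchronization task runs.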
-
Test the network connectivity.
In the Connection Configuration section, find the target resource group and click Test Connectivity in its row.
Each synchronization task uses a single resource group. You must test the connectivity of the intended resource group to ensure it can connect to the data source; otherwise, the task will fail. To test multiple resource groups at the same time, select the resource groups and click Batch test connectivity. For more information, see Network connectivity solutions.
-
After the connectivity test is successful, click Complete.
-
Create a synchronization task.
After the data source is created, developers can create a synchronization task in DataStudio that uses this data source. For more information, see Configure a synchronization task.
-
Run the synchronization task.
The task executor runs the data synchronization task in DataStudio or Operation Center.
Note: When running a task in DataStudio, ensure the task executor has the permissions granted in Step 2 (Grant users permission to use the role) to prevent task failures.