To migrate objects like scheduled tasks and tables with Migration Assistant, first export them from a source workspace using the DataWorks export feature, and then import them to a destination workspace using the DataWorks import feature. This topic describes how to create an import task.
Limitations
The following table lists the DataWorks migration features supported by each Migration Assistant edition.
DataWorks migration feature | Basic Edition | Standard Edition | Professional Edition | Enterprise Edition
Number of DataWorks export packages supported per tenant | Up to 10 (cumulative) | Up to 30 (cumulative) | Up to 100 (cumulative) | Unlimited
DataWorks import package size limit (local file upload) | 30 MB | 30 MB | 30 MB | 30 MB
DataWorks import package size limit (OSS file upload) | Not supported | Not supported | Unlimited | Unlimited
Auto-commit and auto-deploy during DataWorks import | Not supported | Not supported | Supported | Supported
Note: When you create an export task and the export succeeds, an export package is generated. If the number of export packages exceeds the limit of your edition, you can only upgrade your edition to obtain more export packages. You cannot purchase additional export packages separately.
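The edition limits above can be expressed as a small lookup, which may help when you script a pre-flight check before an import. This is a minimal sketch, not a DataWorks API: the names EditionLimits, LIMITS, and choose_upload_method are hypothetical, and treating 30 MB as 30 × 1024 × 1024 bytes is an assumption. The sketch models only the edition limits in the table and does not account for any other restriction on the OSS upload method.

```python
from dataclasses import dataclass
from typing import Optional

MB = 1024 * 1024  # assumption: the 30 MB limit is counted in binary megabytes


@dataclass(frozen=True)
class EditionLimits:
    max_export_packages: Optional[int]  # None means unlimited
    local_upload_limit_bytes: int
    oss_upload_supported: bool
    auto_commit_deploy: bool


# Values transcribed from the edition table; the structure itself is hypothetical.
LIMITS = {
    "Basic": EditionLimits(10, 30 * MB, False, False),
    "Standard": EditionLimits(30, 30 * MB, False, False),
    "Professional": EditionLimits(100, 30 * MB, True, True),
    "Enterprise": EditionLimits(None, 30 * MB, True, True),
}


def choose_upload_method(edition: str, package_size_bytes: int) -> str:
    """Return "local" or "oss" for a package, or raise if neither is allowed."""
    limits = LIMITS[edition]
    if package_size_bytes <= limits.local_upload_limit_bytes:
        return "local"
    if limits.oss_upload_supported:
        return "oss"
    raise ValueError(
        f"{edition} Edition cannot import a package of {package_size_bytes} bytes: "
        "it exceeds the local upload limit and OSS upload is not supported"
    )
```

For example, choose_upload_method("Professional", 50 * MB) selects OSS upload, while the same package on Standard Edition raises an error because the only remedy is an edition upgrade or a smaller package.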
Only Alibaba Cloud accounts and workspace administrators can perform import and export operations. Members with other roles can only view the import and export task lists and do not have operation permissions.
When you export and import Data Quality rules (supported only in legacy Data Studio), take note of the following points:
When you export Data Quality rules, subscription management (alert-related configurations) is not exported.
If the table associated with an imported Data Quality rule does not exist in the destination workspace, the imported rule is not displayed on the Data Quality rules page. If you create the destination table after importing the Data Quality rule, the imported rule is displayed on the rule details page of the table.
The Data Quality import step depends on the successful completion of the scheduled task import step. If an import package contains both Data Quality rules and associated scheduling node information, you must first import the associated nodes to the destination workspace and successfully commit and deploy them before the scheduling nodes can be correctly associated with the Data Quality rules.
If you import only Data Quality rules without importing the associated scheduling nodes, the Data Quality rules are still imported successfully, but without the associated scheduling node information. After you import and commit and deploy the associated scheduling node information, you can re-import the Data Quality rules, and DataWorks updates the rules and associates them with the scheduling nodes.
Data Studio and legacy Data Studio differ in the Migration Assistant features that they support. For details, see the notes in each operation section below.
Export packages from legacy Data Studio workspaces cannot be imported to new Data Studio workspaces.
Prerequisites
Before you create a DataWorks import task, complete the following preparations:
An export task has been created.
You must use DataWorks Migration Assistant to create an export task for the objects that you want to migrate, such as scheduled tasks and tables. For more information, see Create and view DataWorks export tasks.
The import file is ready.
DataWorks lets you upload an import file from your local computer or Alibaba Cloud Object Storage Service (OSS). Prepare the file based on the upload method you choose:
Local upload: Download the export package that is generated in the preceding step to your local computer. For more information, see Download an export package.
OSS upload (not supported in new Data Studio): Download the export package that is generated in the preceding step to your local computer and then upload the package to OSS. For more information, see Simple upload.
Go to Migration Assistant
Log on to the DataWorks console, switch to the target region, and then click More > Migration Assistant in the left-side navigation pane to go to the Migration Assistant Homepage.
Create an import task
New Data Studio
In the left-side navigation pane of Migration Assistant, click .
On the import task list page, click Create Import Task in the upper-right corner.
In the Create Import Task dialog box, configure the parameters.
Parameter
Description
Import Package Name
The import name can contain only letters, Chinese characters, digits, underscores (_), and periods (.).
Upload Method
Upload the import file from your local computer. OSS upload is not supported in new Data Studio. The following option is available:
Local Upload: Click Upload File and follow the on-screen instructions to upload and verify the local file.
Note: The maximum size of a local file that can be uploaded is 30 MB.
After the message "The resource package passes verification." is displayed, you can click Preview File to view the details of the file to be imported.
Remarks
Enter a brief description of the import task.
Click OK to go to the Import Task Settings page.
Before you import the task, you must verify the format and content of the import file. You can click OK only after the file passes verification.
Configure the import task.
When you configure the import task, you must configure Computing Resource Mapping (the following figure uses the MaxCompute compute engine as an example). Other configurations are optional. You can configure them based on your business requirements.
Note: If you import and export between different workspaces under the same Alibaba Cloud account in the same region, you only need to configure engine instance mapping.
In the Computing Resource Mapping section, configure the computing resource mapping between the source workspace and the destination workspace.
The Computing Resources in Destination Workspace field displays the display name of the computing resource bound to Data Studio in the destination workspace, not the project name used to create the computing resource. To view the display name, go to Data Studio, click the icon in the left-side navigation pane, and go to the Computing Resources list page.
If Data Studio in the source workspace is bound to multiple types of computing resources but the destination workspace is bound to only one type of computing resource, the import task fails because the destination workspace does not have the permissions to create other types of nodes.
Optional: In the Resource Group Mapping section, modify the resource group mapping between the source workspace and the destination workspace so that tasks do not fail at run time because a resource group cannot be found.
Optional: In the Dependency Mapping section, configure project mapping for related nodes.
When you import a task, if the task uses the source workspace name (for example, the task code, the input name of the current node, or the output name of the current node contains the source workspace name), you can modify the New Project Name to quickly replace the related names with the new workspace name and ensure that dependencies are correct after the task is imported.
Optional: In the Configure Import Policy section, configure the data source conflict policy.
If the name of an imported data source conflicts with an existing data source in the destination workspace, you can select Skip (default; retains the existing data source in the destination workspace) or Replace (overwrites the existing data source with the data source in the import package). You can preview conflicts to confirm the scope of impact.
Optional: In the Committing Rule section, you can use Change Owner to change the owner of the imported tasks.
Note: If an object with the same name already exists in the destination workspace, the commit fails.
If you choose not to change the owner and the source task has no owner, the submitter is set as the task owner.
Click Start Import in the lower-left corner.
In the Prompt dialog box, click OK.
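The effect of setting New Project Name in the Dependency Mapping section can be pictured as a name substitution across the task code and the node's input and output names. The following is a minimal sketch under stated assumptions, not the logic that DataWorks itself applies: the functions apply_project_mapping and remap_task, the task dictionary layout, and the whole-word matching rule are all hypothetical, and the workspace names src_ws and dst_ws are placeholders.

```python
import re


def apply_project_mapping(text: str, source_project: str, new_project: str) -> str:
    """Replace whole-word occurrences of source_project with new_project.

    Whole-word matching is an assumption of this sketch: it avoids rewriting
    longer identifiers that merely contain the source workspace name.
    """
    pattern = re.compile(
        rf"(?<![A-Za-z0-9_]){re.escape(source_project)}(?![A-Za-z0-9_])"
    )
    # A function replacement avoids backslash interpretation in new_project.
    return pattern.sub(lambda _match: new_project, text)


def remap_task(task: dict, source_project: str, new_project: str) -> dict:
    """Return a copy of a (hypothetical) task record with remapped names."""
    return {
        "code": apply_project_mapping(task["code"], source_project, new_project),
        "inputs": [
            apply_project_mapping(name, source_project, new_project)
            for name in task["inputs"]
        ],
        "outputs": [
            apply_project_mapping(name, source_project, new_project)
            for name in task["outputs"]
        ],
    }


example_task = {
    "code": "INSERT OVERWRITE TABLE src_ws.orders_daily SELECT * FROM src_ws.orders;",
    "inputs": ["src_ws.orders_node_out"],
    "outputs": ["src_ws.orders_daily_out"],
}
remapped = remap_task(example_task, "src_ws", "dst_ws")
```

After the mapping, the task code and the input and output names refer to dst_ws, so dependencies that were expressed through the source workspace name still resolve in the destination workspace.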
Legacy Data Studio
In the left-side navigation pane of Migration Assistant, click .
On the import task list page, click Create Import Task in the upper-right corner.
In the Create Import Task dialog box, configure the parameters.
Parameter
Description
Import Package Name
The import name can contain only letters, Chinese characters, digits, underscores (_), and periods (.).
Upload Method
You can upload the import file from your local computer or OSS. The following options are available:
Local Upload: Click Upload File and follow the on-screen instructions to upload and verify the local file.
Note: The maximum size of a local file that can be uploaded is 30 MB.
OSS Object: In the OSS Endpoint field, enter the URL of the file and click Check. To obtain the URL, log on to the OSS console and perform the following steps:
In the left-side navigation pane, click Bucket List. Go to the target bucket, and then choose File Management > File List in the left-side menu. Click the target file. In the details panel on the right side, find the URL field and click Copy File URL to obtain the URL.
Note: Only DataWorks Professional Edition and higher support OSS upload.
After the message "The resource package passes verification." is displayed, you can click Preview File to view the details of the file to be imported.
Remarks
Enter a brief description of the import task.
Click OK to go to the Import Task Settings page.
Before you import the task, you must verify the format and content of the import file. You can click OK only after the file passes verification.
Configure the import task.
When you configure the import task, you must configure Computing Resource Mapping (the following figure uses the MaxCompute compute engine as an example). Other configurations are optional. You can configure them based on your business requirements.
Note: If you import and export between different workspaces under the same Alibaba Cloud account in the same region, you only need to configure engine instance mapping.
In the Computing Resource Mapping section, configure the computing resource mapping between the source workspace and the destination workspace.
The Computing Resources in Destination Workspace field displays the display name of the computing resource bound to Data Studio in the destination workspace, not the project name used to create the computing resource. To view the display name, go to Data Studio, click the icon in the left-side navigation pane, and go to the Computing Resources list page.
If Data Studio in the source workspace is bound to multiple types of computing resources but the destination workspace is bound to only one type of computing resource, the import task fails because the destination workspace does not have the permissions to create other types of nodes.
Optional: In the Resource Group Mapping section, modify the resource group mapping between the source workspace and the destination workspace so that tasks do not fail at run time because a resource group cannot be found.
Optional: In the Dependency Mapping section, configure project mapping for related nodes.
When you import a task, if the task uses the source workspace name (for example, the task code, the input name of the current node, or the output name of the current node contains the source workspace name), you can modify the New Project Name to quickly replace the related names with the new workspace name and ensure that dependencies are correct after the task is imported.
Optional: In the Conflict Rules section, configure the conflict policies for data sources and workflow parameters.
If the name of an imported data source or workflow parameter conflicts with an existing one in the destination workspace, you can select Skip (default; retains the existing object in the destination workspace) or Replace (overwrites the existing object with the one in the import package). You can preview conflicts to confirm the scope of impact.
Optional: In the Dry-run Property section, click Set to Dry-run next to the corresponding node.
You can also select multiple nodes that you want to set to dry-run and click Batch Set to Dry-run.
This configuration is used to set the time property in the scheduling parameters for scheduled tasks. After a node is set to dry-run, it runs successfully without generating data.
Optional: In the DataService Studio Settings section, use Specify Mappings for DataService Studio Business Processes to set the mappings for the corresponding DataService Studio instances.
Optional: In the Committing Rule section, you can configure the committing rules for Resource, Function, and Table, and use Change Owner to change the owner of the imported tasks.
Note: If an object with the same name already exists in the destination workspace, the commit fails.
If you choose not to change the owner and the source task has no owner, the submitter is set as the task owner.
Click Start Import in the lower-left corner.
In the Prompt dialog box, click OK.
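The Skip and Replace conflict policies described in the steps above behave like two ways of merging a name-keyed collection. The sketch below illustrates that behavior for data sources only; it is an illustration under assumptions, not DataWorks code. ConflictPolicy, preview_conflicts, and merge_data_sources are hypothetical names, and the data source names and definitions are placeholder values.

```python
from enum import Enum


class ConflictPolicy(Enum):
    SKIP = "Skip"        # default: keep the object already in the destination
    REPLACE = "Replace"  # overwrite it with the object from the import package


def preview_conflicts(existing: dict, incoming: dict) -> list:
    """List the names present in both the destination and the import package."""
    return sorted(set(existing) & set(incoming))


def merge_data_sources(
    existing: dict, incoming: dict, policy: ConflictPolicy = ConflictPolicy.SKIP
) -> dict:
    """Return the destination's data sources after the import, without mutating inputs."""
    merged = dict(existing)
    for name, definition in incoming.items():
        if name in merged and policy is ConflictPolicy.SKIP:
            continue  # conflict: retain the destination's definition
        merged[name] = definition  # new name, or conflict resolved by Replace
    return merged


# Placeholder data: one conflicting name and one new name.
destination = {"mysql_orders": "destination definition", "odps_main": "destination definition"}
package = {"mysql_orders": "package definition", "oss_logs": "package definition"}

conflicts = preview_conflicts(destination, package)
after_skip = merge_data_sources(destination, package)
after_replace = merge_data_sources(destination, package, ConflictPolicy.REPLACE)
```

In both cases non-conflicting data sources from the package are added; the policies differ only in which definition survives for a name that exists on both sides, which is why previewing conflicts first shows the full scope of a Replace.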
View import tasks
On the Import Task List page, different operations are displayed for tasks in different states:
After a task is imported, you can click View Import Report next to the task on the Import Task List page to view the Basic Information, Import Results, Details, and Import Settings of the import task.
For a task in the Editing state, you can perform the following operations:
Click Continue to Edit next to the task to modify the task configuration on the Import Task Settings page.
Click Preview next to the task to view the Basic Information, Overview, and Details of the import file.
Click Delete next to the task, and then click OK in the confirmation dialog box to delete the import task.
For a task in the Import Failed state, you can click Import Again next to the task (supported only in legacy Data Studio). In the Import Progress dialog box, after the import is completed, click Return to Import Task List. New Data Studio (Data Studio) does not support re-importing. If an import fails, you must create a new import task.
New Data Studio (Data Studio) supports terminating an import task that is in progress.
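The state-dependent operations listed above can be summarized as a lookup table. This is only a sketch for orientation: the function available_operations is hypothetical, the state labels "Imported" and "In Progress" and the operation label "Terminate" are placeholders that do not come from the console, and treating termination as unavailable in legacy Data Studio is an assumption, because the text above mentions it only for new Data Studio.

```python
# Operations described for both Data Studio versions.
# "Imported" and "In Progress" are placeholder state labels (assumption).
COMMON_OPERATIONS = {
    "Imported": ["View Import Report"],
    "Editing": ["Continue to Edit", "Preview", "Delete"],
    "Import Failed": [],
    "In Progress": [],
}

# Operations described for only one Data Studio version.
LEGACY_ONLY = {"Import Failed": ["Import Again"]}
NEW_ONLY = {"In Progress": ["Terminate"]}  # "Terminate" is a placeholder label


def available_operations(state: str, new_data_studio: bool) -> list:
    """Return the operations offered for an import task in the given state."""
    if state not in COMMON_OPERATIONS:
        raise KeyError(f"unknown import task state: {state}")
    operations = list(COMMON_OPERATIONS[state])
    version_specific = NEW_ONLY if new_data_studio else LEGACY_ONLY
    operations.extend(version_specific.get(state, []))
    return operations
```

For a failed import, the lookup returns Import Again in legacy Data Studio and an empty list in new Data Studio, which reflects that a new import task must be created there.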