This solution helps DataWorks users improve development efficiency. It uses a Secure Shell (SSH) connection to link your local Visual Studio Code editor with a personal development environment in the cloud. This method lets you use your favorite local tools while accessing cloud computing and data resources, allowing you to code and debug more efficiently.
Process overview
Environment preparation: Generate an SSH key pair on your local computer.
Environment configuration: Configure the SSH public key and access method for your personal development environment in DataWorks.
Service installation: Install and start the SSH service in your personal development environment.
Remote connection: Connect to your personal development environment using Visual Studio Code.
Step 1: Generate an SSH key pair
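First, generate an SSH key pair on your local development machine, which is the computer you will connect from. The steps below use the file names id_rsa and id_rsa.pub. Recent OpenSSH releases create Ed25519 keys by default, so your files may instead be named id_ed25519 and id_ed25519.pub. If so, substitute those names in the paths and commands in this step.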
macOS
Run the generation command: Open the Terminal application. In the terminal, enter ssh-keygen and press Enter.
ssh-keygen
Use default settings: The system asks for a save path and a passphrase. To ensure a seamless connection with other tools, press the Enter key three times to accept all default options. This uses the default path and sets no passphrase.
The system then generates two files in the default path:
id_rsa: The private key. Keep it safe and do not share it.
id_rsa.pub: The public key. This key is used to configure the DataWorks environment.
Copy the public key: The key is saved by default in the /Users/<your_username>/.ssh/ directory. The fastest way to copy the key is to run the following command in the terminal, which copies the public key content to the clipboard:
pbcopy < ~/.ssh/id_rsa.pub
Windows
Run the generation command: Search for and open PowerShell from the Start menu, or use Git Bash. In the terminal, enter ssh-keygen and press Enter.
ssh-keygen
Use default settings: The system asks for a save path and a passphrase. To ensure a seamless connection with tools such as VS Code, press the Enter key three times to accept all default options. This uses the default path and sets no passphrase.
The system then generates two files in the default path:
id_rsa: The private key. Keep it safe and do not share it.
id_rsa.pub: The public key. This key is used to configure the DataWorks environment.
Copy the public key: The key is saved by default in the C:\Users\<your_username>\.ssh\ directory. Open the public key file id_rsa.pub with a text editor such as Notepad and copy its entire content.
Linux (not recommended)
Visual Studio Code requires a graphical user interface (GUI). Use a system with a GUI.
Run the generation command: Open the Terminal application. In the terminal, enter ssh-keygen and press Enter.
ssh-keygen
Use default settings: The system asks for a save path and a passphrase. To ensure a seamless connection with other tools, press the Enter key three times to accept all default options. This uses the default path and sets no passphrase.
The system then generates two files in the default path:
id_rsa: The private key. Keep it safe and do not share it.
id_rsa.pub: The public key. This key is used to configure the DataWorks environment.
Copy the public key:
The key for the root user is saved in /root/.ssh/.
The key for a regular user is saved in /home/<your_username>/.ssh/.
Run the following command in the terminal to display the public key content, and then manually copy all the output text:
cat ~/.ssh/id_rsa.pub
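Before you paste the key into DataWorks, it can help to confirm that the file you are copying is the public key and that it is a single unbroken line. The following is a minimal sketch, assuming a POSIX shell (Terminal on macOS or Linux, or Git Bash on Windows). The function name check_pubkey is hypothetical, and the accepted key types are an illustrative subset rather than a complete list.

```shell
# check_pubkey FILE
# Succeeds only if FILE looks like a single-line OpenSSH public key.
# Hypothetical helper for illustration; not part of DataWorks or OpenSSH.
check_pubkey() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "not found: $file" >&2
        return 1
    fi
    # A public key file holds exactly one non-empty line.
    lines=$(grep -c . "$file" || true)
    if [ "$lines" -ne 1 ]; then
        echo "expected 1 non-empty line, found $lines" >&2
        return 1
    fi
    # Field 1 is the key type, field 2 is the base64-encoded key data.
    if ! grep -Eq '^(ssh-rsa|ssh-ed25519|ecdsa-sha2-nistp(256|384|521)) [A-Za-z0-9+/]+=*( |$)' "$file"; then
        echo "not an OpenSSH public key: $file" >&2
        return 1
    fi
    echo "ok: $file"
}
```

For example, run check_pubkey ~/.ssh/id_rsa.pub. If the check fails on a private key file such as id_rsa, that is the intended behavior: only the .pub file should ever be pasted into the console.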
Step 2: Configure the personal development environment
Go to the Workspaces page in the DataWorks console. In the top navigation bar, select a desired region. Find the desired workspace and choose in the Actions column.
Open the personal development environment.
In the top navigation bar, click the icon next to Personal Development Environment · Please Select, and then click the personal development environment that you created.
In the Personal Development Environment - Instance Configuration dialog box that appears, select a configuration method.
For more information, see Create a personal development environment instance.
Scenario A: Access over the public network (Flexible, but with extra costs)
This scenario applies when you connect directly from any network with public network access.
On the configuration page, configure the following settings:
Enable SSH: Turn on this switch.
SSH Public Key: Paste the public key that you copied in Step 1 into this field.
SSH Access Method: Select both Log on within VPC and Log on over Public Network.
NAT Gateway: Select an existing NAT Gateway, or create a new one as prompted on the page.
Elastic IP Address: Select an existing EIP, or create a new one as prompted on the page.
Important: If a virtual private cloud (VPC) is already configured in the network settings, configure the NAT Gateway for that VPC. Otherwise, configure the NAT Gateway for the default VPC of the resource group.
Click Submit Changes and wait for the configuration to take effect.
Important: Billing reminder. The NAT Gateway and Elastic IP Address (EIP) used for public network access are cloud products that are billed separately. These products continue to incur charges even when your development environment instance is stopped. If you no longer need them, go to the Alibaba Cloud Management Console and delete them manually to avoid unnecessary costs.
Scenario B: Access over the internal VPC network (Secure, no extra network costs)
This scenario applies in the following two situations:
Your on-premises network is connected to the Alibaba Cloud VPC through a VPN or a leased line.
You are connecting from another ECS instance within the same VPC.
On the configuration page, configure the following settings:
Enable SSH: Turn on this switch.
SSH Public Key: Paste the public key that you copied in Step 1 into this field.
NAT Gateway: No configuration is required.
Elastic IP Address: No configuration is required.
Click Submit Changes and wait for the configuration to take effect.
Step 3: Install the SSH service
Go to the Data Development page and open your configured personal development environment.
In the Terminal at the bottom, run the following commands:
# Update the software list and install the SSH server
sudo apt-get update
sudo apt-get install openssh-server -y
# Start the SSH service
sudo service ssh start
# (Optional) Check the SSH service status. "active (running)" indicates success
sudo service ssh status
To pass the environment variables from the personal development environment during a local connection, run the following commands:
sed -i 's/^#\?PermitUserEnvironment.*$/PermitUserEnvironment yes/' /etc/ssh/sshd_config
cat /proc/1/environ | tr '\0' '\n' > ~/.ssh/environment
service ssh restart
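The environment step above works because /proc/1/environ stores variables separated by NUL bytes, whereas sshd reads ~/.ssh/environment as one NAME=value entry per line once PermitUserEnvironment is enabled. The following is a minimal sketch of the same conversion with two extra safeguards: it drops entries that are not in NAME=value form, and it restricts the file to its owner because the dump may contain credentials. The function name write_env_file is hypothetical.

```shell
# write_env_file SRC DEST
# Converts a NUL-separated environment dump (such as /proc/1/environ)
# into the NAME=value-per-line format used by ~/.ssh/environment.
# Hypothetical helper for illustration only.
write_env_file() {
    src="$1"
    dest="$2"
    mkdir -p "$(dirname "$dest")"
    # Turn each NUL separator into a newline, then keep only lines that
    # start with a valid variable name followed by "=".
    tr '\0' '\n' < "$src" | grep -E '^[A-Za-z_][A-Za-z0-9_]*=' > "$dest" || true
    # The dump can include secrets, so make it readable by the owner only.
    chmod 600 "$dest"
}
```

In the personal development environment, this would be called as write_env_file /proc/1/environ ~/.ssh/environment, followed by the same service ssh restart. Values that themselves contain newlines are not preserved intact by this line-based format.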
Step 4: Connect to the development environment using Visual Studio Code
Obtain the connection command
In the top navigation bar, click the icon next to Personal Development Environment · Select and then click Manage Development Environment. View the Access Configuration for the target instance.
Public network configuration: Copy the command from Public Network Access Method.
VPC-only configuration: Copy the command from VPC Access Method.
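Before configuring Visual Studio Code, you may want to confirm from a plain terminal that the copied address and port accept your key. The following is a rough sketch, assuming an OpenSSH client version 7.6 or later. The function name probe_ssh is hypothetical, and the default key path is an assumption based on Step 1.

```shell
# probe_ssh TARGET PORT [KEY]
# Attempts a non-interactive login and reports whether key authentication works.
# Hypothetical helper for illustration only.
probe_ssh() {
    target="$1"
    port="$2"
    key="${3:-$HOME/.ssh/id_rsa}"   # assumed default key path from Step 1
    # BatchMode=yes: fail instead of prompting for a password or passphrase.
    # ConnectTimeout=10: give up after 10 seconds if the host is unreachable.
    # StrictHostKeyChecking=accept-new: record an unknown host key on first
    #   contact (OpenSSH 7.6+); remove this option to confirm the key manually.
    if ssh -i "$key" -p "$port" \
           -o BatchMode=yes \
           -o ConnectTimeout=10 \
           -o StrictHostKeyChecking=accept-new \
           "$target" true; then
        echo "SSH login succeeded: $target port $port"
    else
        echo "SSH login failed: $target port $port" >&2
        return 1
    fi
}
```

For a copied command such as ssh root@xx.xx.xx.xx -p 1024, the call would be probe_ssh root@xx.xx.xx.xx 1024. A failure here usually points to the network path, the SSH service from Step 3, or the public key from Step 2 rather than to Visual Studio Code.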
Install the Visual Studio Code extension
Important: This applies only to graphical systems, such as Windows and macOS. If you use a Linux system without a GUI, connect through the terminal.
In the Visual Studio Code Marketplace, search for and install the Remote - SSH extension.
Add and connect to the host
Click the remote connection icon in the lower-left corner of Visual Studio Code.
In the menu that appears, select Connect to Host... > Add New SSH Host....
Paste the full SSH command that you copied in the Obtain the connection command section of this step into the input box, and then press Enter.
Important: If your private key is not in the default path, you must specify its location by adding the `-i` flag to the command. For example:
ssh -i /your/path/rsa root@xx.xx.xx.xx -p 1024
Select the default SSH configuration file when prompted.
A notification appears in the lower-right corner. Click Connect.
A new Visual Studio Code window opens and attempts to connect. After the connection is successful, the address of the SSH-connected host is displayed in the lower-left corner of the window.
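When you add a host this way, the connection details end up as a Host entry in the SSH configuration file you selected. The following sketch shows roughly what such an entry looks like by converting a command of the shape used in this guide (ssh [-i KEY] USER@HOST -p PORT) into the standard ssh_config keywords. The function name ssh_cmd_to_config and the alias dataworks-dev are hypothetical, and the exact entry that the extension writes may differ.

```shell
# ssh_cmd_to_config ALIAS CMD...
# Prints an ssh_config Host block for a command shaped like
# "ssh [-i KEY] USER@HOST -p PORT". Hypothetical helper for illustration only.
ssh_cmd_to_config() {
    alias_name="$1"
    shift
    user=""; host=""; port="22"; key=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            ssh) ;;                                   # skip the program name
            -p) if [ "$#" -ge 2 ]; then port="$2"; shift; fi ;;
            -i) if [ "$#" -ge 2 ]; then key="$2"; shift; fi ;;
            *@*) user="${1%%@*}"; host="${1#*@}" ;;  # split USER@HOST
            *) host="$1" ;;                           # bare host name
        esac
        shift
    done
    printf 'Host %s\n' "$alias_name"
    printf '    HostName %s\n' "$host"
    if [ -n "$user" ]; then printf '    User %s\n' "$user"; fi
    printf '    Port %s\n' "$port"
    if [ -n "$key" ]; then printf '    IdentityFile %s\n' "$key"; fi
}
```

For example, ssh_cmd_to_config dataworks-dev ssh -i /your/path/rsa root@xx.xx.xx.xx -p 1024 prints a block with HostName, User, Port, and IdentityFile lines. Reviewing the entry in your SSH configuration file is a quick way to correct a wrong port or key path without adding the host again.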
Open the working directory
After connecting, Visual Studio Code might open to the /root directory by default. Your code files are usually located in /mnt/workspace.
Click the Open Folder button in the Explorer view on the left.
In the path box that appears, enter /mnt/workspace and click OK.
You can now view and edit all the files in that directory of your DataWorks environment from within Visual Studio Code.
Appendix: Differences between local and DataStudio development
EMR Serverless Spark development
Unlike development in DataStudio, development in a local Visual Studio Code editor requires you to add the following parameters: --project {EMR Serverless Spark cluster ID} --endpoint {EMR Serverless Spark endpoint}.

MaxFrame development
Unlike development in DataStudio, development in a local Visual Studio Code editor requires you to add the following parameters: --project {MaxCompute project name} --endpoint {MaxCompute endpoint}.