This topic describes how to submit a data import task using Dolphin (public preview).
Add data sources
Before you use Dolphin, you must add a compute engine data source (a Spark data source), a Lindorm wide table data source, and a source data source (an ODPS data source or an HDFS data source).
Log on to Dolphin
Create a project
In the top navigation bar, click Project Management.
On the Project Management page, click Create Project and specify the required parameters.
Click OK, and then click the project name to open its management page.
Create a workflow and submit the task
In the left-side navigation pane, select Workflow Definition.
Click Create Workflow.
From the Common Components section, drag the SHELL component to the canvas on the right and then edit it.
Enter a node name. In the Script text box, enter the task submission script. The following code is an example:
#!/bin/bash # The LTS URL for task submission. Replace the value with your LTS instance ID. MASTER_URL="http://ld-bp1i13ko5e222****-master3-001.lindorm.rds.aliyuncs.com:12311/pro/proc/" # Define the API endpoint for job submission. SUBMIT_URL="${MASTER_URL}bulkload/create" echo "Submit URL: $SUBMIT_URL" # Define the payload for the job submission. # The reader configuration. This example uses MaxCompute as the source. READERCONFIG='{ "partition":[ "pt=20250212" ], "numPartitions":20, "column":[ "rowkey", "stringcol1", "longcol1", "doublecol1" ], "table":"test_table_10c" }' # The writer configuration. WRITERCONFIG='{ "columns":[ "rowkey", "stringcol1", "longcol1", "doublecol1" ], "namespace":"default", "lindormTable":"test_table_10c_mig", "compression":"zstd" }' # The number of compute engine executors for the task. INSTANCE='5' # SRC specifies the name of the ODPS data source. You must first add the data source on the LTS page. DST specifies the name of the Lindorm wide table data source. SRC='odps' DST='ld-bp1i13ko5e222****' # The specifications for the compute engine executor and driver. DRIVERSPEC='large' EXECUTORSPEC='xlarge' # The source file type. This parameter is optional if the source is MaxCompute. FILETTYPE='Parquet' create_payload() { local result="" local first=true for arg in "$@"; do if [[ "$first" == true ]]; then result="$arg" first=false else result="$result&$arg" fi done echo "$result" } PAYLOAD=$(create_payload "src=$SRC" "dst=$DST" "readerConfig=$READERCONFIG" "writerConfig=$WRITERCONFIG" "instance=$INSTANCE" "driverSpec=$DRIVERSPEC" "executorSpec=$EXECUTORSPEC" "fileType=$FILETTYPE") echo "PAYLOAD: $PAYLOAD" # Submit the job. submit_response=$(curl -d "$PAYLOAD" -H "Content-Type: application/x-www-form-urlencoded" -X POST "$SUBMIT_URL") echo $submit_response # Parse the submission response to extract the success status and message. success=$(echo $submit_response | grep -o '"success":"[^"]*' | cut -d'"' -f4) message=$(echo $submit_response | grep -o '"message":"[^"]*' | cut -d'"' -f4) # Check if the job was submitted successfully. if [[ "$success" != "true" ]]; then echo "Job submission failed: $message" exit 1 fi # Parse the submission response to extract the job ID. job_id=$message echo "JobId: $job_id" STATUS_URL="${MASTER_URL}${job_id}/info" echo $STATUS_URL # Check the job status. while true; do # Get the job status. status_response=$(curl --silent --request GET "$STATUS_URL") # Parse the status response to extract the state. state=$(echo $status_response | sed 's/.*"state":"\([^"]*\)".*/\1/') # Print the current state. echo "Current job state: $state" # Check if the job is complete. if [[ $state == "SUCCESS" ]]; then echo "Job completed successfully." exit 0 elif [[ $state == "FAILED" ]]; then echo "Job failed." exit 1 elif [[ $state == "KILLED" ]]; then echo "Job killed." exit 1 fi # Check the status again after 60 seconds. sleep 60 done