Submit a data import task using an API operation

更新时间:
复制 MD 格式

Submits Spark-based bulk load tasks to Lindorm and manages them through their lifecycle — check status, stop, retry, or delete.

Prerequisites

Before you begin, ensure that you have:

  • A Lindorm instance with the bulk load service enabled

  • The master hostname of your Lindorm instance (find it in the Basic Information section of the Cluster Information page in the Lindorm console)

  • Source and destination data sources configured

Submit a task

Endpoint: POST http://{BDSMaster}:12311/pro/proc/bulkload/create

Replace {BDSMaster} with the master hostname of your Lindorm instance.

Obtain hostname

Request skeleton:

curl -d "src=<source> \
  &dst=<destination> \
  &readerConfig=<json> \
  &writerConfig=<json> \
  &driverSpec=large \
  &instances=<count> \
  &fileType=<CSV|Parquet> \
  &sparkAdditionalParams=<params>" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -X POST http://{BDSMaster}:12311/pro/proc/bulkload/create

Parameters:

ParameterRequiredDescription
srcRequiredThe name of the source data source.
dstRequiredThe name of the destination data source.
readerConfigRequiredThe read plugin configuration, in JSON format. For a configuration example, see General-purpose batch import service.
writerConfigRequiredThe write plugin configuration, in JSON format. For a configuration example, see General-purpose batch import service.
driverSpecRequiredThe driver size. Valid values: small, medium, large, xlarge. Set this parameter to large.
instancesRequiredThe number of executors.
fileTypeConditionalThe file format of the source data. Required when the source data source is HDFS. Valid values: CSV, Parquet.
sparkAdditionalParamsOptionalExtension parameters for the Spark job.

Example:

curl -d "src=hdfs&dst=ld&readerConfig={\"filePath\":\"parquet/\",\"column\":[\"id\",\"intcol\",\"doublecol\",\"stringcol\",\"string1col\",\"decimalcol\"]}&writerConfig={\"columns\":[\"ROW||String\",\"f:intcol||Int\",\"f:doublecol||Double\",\"f:stringcol||String\",\"f:string1col||String\",\"f:decimalcol||Decimal\"],\"namespace\":\"default\",\"lindormTable\":\"bulkload_test\",\"compression\":\"zstd\"}&driverSpec=large&instances=5&fileType=Parquet" -H "Content-Type: application/x-www-form-urlencoded" -X POST http://{LTSMaster}:12311/pro/proc/bulkload/create

Response: The message field contains the task ID. Save this ID — you need it for all subsequent operations.

{"success":"true","message":"proc-91-ff383c616e5242888b398e51359c****"}

Get task information

Endpoint: GET http://{LTSMaster}:12311/pro/proc/{procId}/info

Replace {procId} with the task ID returned when you submitted the task.

Example:

curl http://{LTSMaster}:12311/pro/proc/proc-91-ff383c616e5242888b398e51359c****/info

Response:

{
    "data": {
        "checkJobs": Array,
        "procId": "proc-91-ff383c616e5242888b398e51359c****",
        "incrJobs": Array,
        "procConfig": Object,
        "stage": "WAIT_FOR_SUCCESS",
        "fullJobs": Array,
        "mergeJobs": Array,
        "srcDS": "hdfs",
        "sinkDS": "ld-uf6el41jkba96****",
        "state": "RUNNING",
        "schemaJob": Object,
        "procType": "SPARK_BULKLOAD"
    },
    "success": "true"
}

Key response fields:

FieldDescription
stateThe overall task status. RUNNING indicates the task is active.
stageThe current execution phase within the task. WAIT_FOR_SUCCESS indicates the task is waiting for final success confirmation.
procTypeThe task type. SPARK_BULKLOAD indicates a Spark-based bulk load task.
srcDSThe name of the source data source.
sinkDSThe name of the destination data source.
procIdThe task ID.
state and stage track different things. state reflects whether the task is active or has ended. stage reflects the current phase within the execution pipeline. Poll state to check whether a task is still running; use stage to monitor progress within a running task.

Stop a task

Endpoint: GET http://{LTSMaster}:12311/pro/proc/{procId}/abort

Example:

curl http://{LTSMaster}:12311/pro/proc/proc-91-ff383c616e5242888b398e51359c****/abort

Response:

{"success":"true","message":"ok"}

Retry a task

Endpoint: GET http://{LTSMaster}:12311/pro/proc/{procId}/retry

Example:

curl http://{LTSMaster}:12311/pro/proc/proc-91-ff383c616e5242888b398e51359c****/retry

Response:

{"success":"true","message":"ok"}

Delete a task

Endpoint: GET http://{LTSMaster}:12311/pro/proc/{procId}/delete

Example:

curl http://{LTSMaster}:12311/pro/proc/proc-91-ff383c616e5242888b398e51359c****/delete

Response:

{"success":"true","message":"ok"}