Command line interface

The console CLI lets you access DataHub projects and run commands to manage projects, topics, connectors, shards, and subscriptions.

Prerequisites

Make sure that the following requirement is met:

  • Java 8 or later is installed on the target device.
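The prerequisite can also be checked from a script by parsing the banner that `java -version` prints. The following is a minimal Python sketch, not part of the DataHub tooling: the helper names `parse_java_major` and `installed_java_major` are hypothetical, and the sketch assumes the banner contains a quoted version string such as "1.8.0_292" or "17.0.1".

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_java_major(banner: str) -> int:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        raise ValueError("no version string found in banner")
    first = int(match.group(1))
    # Legacy numbering: "1.8.0_292" means Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major() -> Optional[int]:
    """Return the major version of the `java` found on PATH, or None if absent."""
    if shutil.which("java") is None:
        return None
    # `java -version` normally writes its banner to stderr.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr or result.stdout)
```

A result of 8 or higher from `installed_java_major()` satisfies the requirement above; `None` means no `java` executable was found on the PATH.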

Install and configure the console client

  1. Download the datahub_console.tar.gz package and extract it.

  2. The extracted package contains the bin, conf, and lib folders.

  3. Open the conf folder and enter your AccessKey and endpoint in the datahub.properties file:

datahub.accessid=
datahub.accesskey=
datahub.endpoint=

The following table describes the parameters:

| Parameter | Required | Description | Example |
| --- | --- | --- | --- |
| datahub.accessid | Yes | The AccessKey ID of your Alibaba Cloud account or RAM user. | N/A |
| datahub.accesskey | Yes | The AccessKey secret that corresponds to the AccessKey ID. | N/A |
| datahub.endpoint | Yes | The endpoint of the DataHub service. Configure the endpoint based on the region and network type of your DataHub project. For details, see DataHub Domain Names. | https://dh-cn-hangzhou.aliyuncs.com |
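If you prefer to generate the configuration file rather than edit it by hand, the three keys above can be written from a script. The following Python sketch is one possible approach; the function names `render_properties` and `write_properties` are hypothetical, and the sketch assumes the values contain no backslashes or line breaks, which would need escaping in a Java properties file.

```python
from pathlib import Path


def render_properties(access_id: str, access_key: str, endpoint: str) -> str:
    """Build the text of datahub.properties from the three required values."""
    entries = [
        ("datahub.accessid", access_id),
        ("datahub.accesskey", access_key),
        ("datahub.endpoint", endpoint),
    ]
    for key, value in entries:
        if not value:
            raise ValueError(f"{key} is required")
    return "".join(f"{key}={value}\n" for key, value in entries)


def write_properties(conf_dir: str, access_id: str, access_key: str, endpoint: str) -> Path:
    """Write datahub.properties into the extracted conf folder and return its path."""
    target = Path(conf_dir) / "datahub.properties"
    target.write_text(render_properties(access_id, access_key, endpoint), encoding="utf-8")
    return target
```

For example, `write_properties("conf", my_id, my_secret, "https://dh-cn-hangzhou.aliyuncs.com")` overwrites the file in the `conf` folder. Reading the AccessKey pair from environment variables instead of hard-coding it keeps the secret out of scripts.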

Run the console client

Start the console client using one of the following methods:

  • Method 1: In the bin folder, double-click datahubcmd.bat (Windows). The following output indicates a successful start.

  • Method 2: Open a terminal, navigate to the bin folder, and run datahubcmd (Windows) or sh datahubcmd.sh (Linux/macOS). The following output shows a successful connection to DataHub.
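When the client is started from automation, the platform-specific script choice in Method 2 can be wrapped in a small helper. This Python sketch only selects the command; the function name `console_command` is hypothetical, and the script names are the ones mentioned above.

```python
import platform
from pathlib import Path
from typing import List


def console_command(bin_dir: str, system: str = "") -> List[str]:
    """Return the argument list that starts the console client.

    `system` defaults to the current platform; pass "Windows" or "Linux"
    explicitly to see what would be run elsewhere.
    """
    system = system or platform.system()
    bin_path = Path(bin_dir)
    if system == "Windows":
        return [str(bin_path / "datahubcmd.bat")]
    # Linux and macOS use the shell script.
    return ["sh", str(bin_path / "datahubcmd.sh")]
```

The returned list can be passed to `subprocess.run(...)`. Because the console is interactive, run it attached to a terminal rather than capturing its output.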

Command help

You can get command help in two ways:

Method 1: View help in the console client

  • View help for all commands:

help
  • View help for a specific command by keyword:

Example: list topics:

DataHub=>help lt
NAME
        lt - List topic

SYNOPSYS
        lt [-p] string

OPTIONS
        -p  string
                projectName
                [Mandatory]

Method 2: Run the following command from the bin folder in your terminal to view all commands:

...\bin>datahubcmd help

Usage guide

Project operations

Create a project

  • -p: The project name.

  • -c: The project description.

cp -p test_project -c test_comment

Delete a project

  • -p: The project name.

dp -p test_project

Note: Delete all resources in the project (topics, subscriptions, and sync tasks) before you delete the project.

List projects

lp

Topic operations

Create a topic

  • -p: The project name.

  • -t: The topic name.

  • -m: The topic type. BLOB for BLOB topics, TUPLE for TUPLE topics.

  • -f: The field format for TUPLE topics: [(fieldName,fieldType,isNull)]. Separate multiple fields with commas (,).

  • -s: The number of shards.

  • -l: The data TTL in days. Valid values: 1 to 7.

  • -c: The topic description.

ct -p test_project -t test_topic -m TUPLE -f [(name,string,true)] -s 3 -l 3 -c test_comment
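The `-f` value is easy to mistype when a topic has many fields. The following Python sketch assembles it from a list of tuples and builds the full `ct` line for a TUPLE topic. The helper names `field_spec` and `create_topic_command` are hypothetical, and the sketch only produces text to paste into the console; it does not call DataHub.

```python
from typing import Iterable, Tuple

Field = Tuple[str, str, bool]  # (fieldName, fieldType, isNull)


def field_spec(fields: Iterable[Field]) -> str:
    """Render fields in the [(fieldName,fieldType,isNull)] format used by -f."""
    parts = [
        f"({name},{field_type},{str(nullable).lower()})"
        for name, field_type, nullable in fields
    ]
    return "[" + ",".join(parts) + "]"


def create_topic_command(project: str, topic: str, fields: Iterable[Field],
                         shards: int, ttl_days: int, comment: str) -> str:
    """Build a `ct` command line for a TUPLE topic."""
    if not 1 <= ttl_days <= 7:
        raise ValueError("ttl_days must be between 1 and 7")
    return (
        f"ct -p {project} -t {topic} -m TUPLE -f {field_spec(fields)} "
        f"-s {shards} -l {ttl_days} -c {comment}"
    )
```

Calling `create_topic_command("test_project", "test_topic", [("name", "string", True)], 3, 3, "test_comment")` reproduces the example command above.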

Delete a topic

  • -p: The project name.

  • -t: The topic name.

dt -p test_project -t test_topic

Get topic details

  • -p: The project name.

  • -t: The topic name.

gt -p test_project -t test_topic

Export a topic schema to a JSON file

  • -f: The path where the file is saved.

  • -p: The project name.

  • -t: The topic name.

gts -f filepath -p test_project -t test_topic

List topics

  • -p: The project name.

lt -p test_project

Create a topic from a JSON file

  • -s: The number of shards.

  • -l: The data TTL in days. Valid values: 1 to 7.

  • -c: The topic description.

  • -f: The file path.

  • -p: The project name.

  • -t: The topic name.

rtt -s 3 -l 3 -c test_comment -f filepath -p test_project -t test_topic

Modify the lifecycle of a topic

  • -p: The project name.

  • -t: The topic name.

  • -l: The topic lifecycle in days.

  • -c: The topic description.

utl -p test_project -t test_topic -l 3 -c test_comment

Connector operations

Create an ODPS connector

  • -p: The project name.

  • -t: The topic name.

  • -m: The sync type. Supported types for ODPS: SYSTEM_TIME, USER_DEFINE, EVENT_TIME, and META_TIME.

  • -e: The ODPS endpoint. Use the classic network endpoint.

  • -op: The ODPS project name.

  • -ot: The ODPS table name.

  • -oa: The AccessKey ID used to access ODPS.

  • -ok: The AccessKey secret used to access ODPS.

  • -tr: The partition interval in minutes. Default: 60.

  • -c: The fields to sync, for example (field1,field2).

  • -tf: The partition format. `ds` indicates partitioning by day, `ds hh` indicates partitioning by hour, and `ds hh mm` indicates partitioning by minute.

coc -p test_project -t test_topic -m SYSTEM_TIME -e odpsEndpoint -op odpsProject -ot odpsTable -oa odpsAccessId -ok odpsAccessKey -tr 60 -c (field1,field2) -tf ds hh mm

Add a field for ODPS sync

  • -p: The project name.

  • -t: The topic name.

  • -c: The connector ID. Find it on the Data Synchronization tab.

  • -f: The name of the new field.

acf -p test_project -t test_topic -c connectorId -f fieldName

Create a connector to sync data to MySQL or RDS

  • -p: The project name.

  • -t: The topic name.

  • -h: The host. Use the classic network address.

  • -po: The port.

  • -ty: The sync type:

      • SINK_MYSQL: Sync data to MySQL.

      • SINK_ADS: Sync data to ADS.

  • -d: The database name.

  • -ta: The table name.

  • -u: The username.

  • -pa: The password.

  • -ht: The insert mode:

      • IGNORE

      • OVERWRITE

  • -n: The fields to sync, for example (field1,field2).

cdc -p test_project -t test_topic -h host -po 3306 -ty mysql -d mysql_database -ta mysql_table -u username -pa password -ht IGNORE -n (field1,field2)

Create a DataHub connector

  • -p: The project name.

  • -t: The topic name.

  • -sp: The sink project where data is imported.

  • -st: The sink topic where data is imported.

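  • -m: The authentication type:

      • AK: AccessKey authentication. Requires the AccessKey ID and AccessKey secret.

      • STS: STS authentication.

  • -i: The AccessKey ID. Required for AK authentication.

  • -k: The AccessKey secret. Required for AK authentication.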

cdhc -p test_project -t test_topic -sp sinkProject -st sinkTopic -m AK -i accessId -k accessKey

Create an FC connector

  • -p: The project name.

  • -t: The topic name.

  • -e: The FC endpoint. Use the classic network endpoint.

  • -s: The FC service name.

  • -f: The FC function name.

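  • -au: The authentication method:

      • AK: AccessKey authentication. Requires the AccessKey ID and AccessKey secret.

      • STS: STS authentication.

  • -i: The AccessKey ID. Required for AK authentication.

  • -k: The AccessKey secret. Required for AK authentication.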

  • -n: The fields to sync, for example (field1,field2).

cfc -p test_project -t test_topic -e endpoint -s service -f function -au AK -i accessId -k accessKey -n (field1,field2)

Create a Hologres connector

  • -p: The project name.

  • -t: The topic name.

  • -e: The endpoint.

  • -cl: The fields to sync to Hologres.

  • -au: The authentication method. Only AccessKey authentication is supported for Hologres sync.

  • -m: The parsing type:

      • Delimiter: Requires lineDelimiter, parseData, and columnDelimiter.

      • InformaticaJson: Requires parseData.

chc -p test_project -t test_topic -e endpoint -cl (field1,field2) -au AK -hp holoProject -ht holoTopic -i accessId -k accessKey -m Delimiter -l 1 -b false -n (field1,field2)

Create an OTS connector

  • -p: The project name.

  • -t: The topic name.

  • -it: The OTS instance name.

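  • -m: The authentication type. Default: STS.

      • AK: AccessKey authentication. Requires the AccessKey ID and AccessKey secret.

      • STS: STS authentication.

  • -i: The AccessKey ID. Required for AK authentication.

  • -k: The AccessKey secret. Required for AK authentication.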

  • -t: The OTS table name.

  • -wm: The write mode:

      • PUT

      • UPDATE

  • -c: The fields to sync, for example (field1,field2).

cotsc -p test_project -t test_topic -i accessId -k accessKey -it instanceId -m AK -t table -wm PUT -c (field1,field2)

Create an OSS connector

  • -p: The project name.

  • -t: The topic name.

  • -b: The OSS bucket name.

  • -e: The OSS endpoint.

  • -pr: The directory prefix for syncing data to OSS.

  • -tf: The synchronization time format. For example, %Y%m%d%H%M indicates partitioning by minute.

  • -tr: The partition interval.

  • -c: The fields to sync.

csc -p test_project -t test_topic -b bucket -e endpoint -pr ossPrefix -tf ossTimeFormat -tr timeRange -c (f1,f2)
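The `-tf` example value %Y%m%d%H%M uses strftime-style directives. As a rough illustration, the Python sketch below shows the string such a format produces for a given time. The function name `oss_time_partition` is hypothetical, and it is an assumption that DataHub interprets these directives the same way Python's `strftime` does.

```python
from datetime import datetime


def oss_time_partition(moment: datetime, time_format: str = "%Y%m%d%H%M") -> str:
    """Render a timestamp with a strftime-style format such as the -tf value."""
    return moment.strftime(time_format)


# 2024-01-02 03:04 rendered at minute and at day granularity.
example = datetime(2024, 1, 2, 3, 4)
print(oss_time_partition(example))            # minute granularity
print(oss_time_partition(example, "%Y%m%d"))  # day granularity
```

Dropping the trailing directives (for example, using %Y%m%d) yields coarser partitions under the same assumption.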

Delete a connector

  • -p: The project name.

  • -t: The topic name.

  • -c: The connector ID. Find it on the Data Synchronization tab.

dc -p test_project -t test_topic -c connectorId

Get connector details

  • -p: The project name.

  • -t: The topic name.

  • -c: The connector ID. Find it on the Data Synchronization tab.

gc -p test_project -t test_topic -c connectorId

List connectors in a topic

  • -p: The project name.

  • -t: The topic name.

lc -p test_project -t test_topic

Restart a connector

  • -p: The project name.

  • -t: The topic name.

  • -c: The connector ID. Find it on the Data Synchronization tab.

rc -p test_project -t test_topic -c connectorId

Update connector AccessKey

  • -p: The project name.

  • -t: The topic name.

  • -ty: The sync type, for example SINK_ODPS.

  • -a: The new AccessKey ID.

  • -k: The new AccessKey secret.

uca -p test_project -t test_topic -ty SINK_ODPS -a accessId -k accessKey

Shard operations

Merge shards

  • -p: The project name.

  • -t: The topic name.

  • -s: The ID of the shard to merge.

  • -a: The ID of the other shard to merge.

ms -p test_project -t test_topic -s shardId -a adjacentShardId

Split a shard

  • -p: The project name.

  • -t: The topic name.

  • -s: The ID of the shard to split.

ss -p test_project -t test_topic -s shardId

List shards in a topic

  • -p: The project name.

  • -t: The topic name.

ls -p test_project -t test_topic

Get shard sync status

  • -p: The project name.

  • -t: The topic name.

  • -s: The shard ID.

  • -c: The connector ID. Find it on the Data Synchronization tab.

gcs -p test_project -t test_topic -s shardId -c connectorId

Get consumer offset per shard for a subscription

  • -p: The project name.

  • -t: The topic name.

  • -s: The subscription ID.

  • -i: The shard ID.

gso -p test_project -t test_topic -s subid -i shardId

Subscription operations

Create a subscription

  • -p: The project name.

  • -t: The topic name.

  • -c: The subscription description.

css -p test_project -t test_topic -c comment

Delete a subscription

  • -p: The project name.

  • -t: The topic name.

  • -s: The subscription ID.

dsc -p test_project -t test_topic -s subId

List subscriptions

  • -p: The project name.

  • -t: The topic name.

lss -p test_project -t test_topic

Upload and download data

Upload data

  • -f: The path of the file to upload. On Windows, escape each backslash, for example D:\\test\\test.txt.

  • -p: The project name.

  • -t: The topic name.

  • -m: The text separator. Commas (,) and spaces are supported.

  • -n: The batch size per upload. Default: 1000.

uf -f filepath -p test_project -t test_topic -m "," -n 1000
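The Windows escaping rule for `-f` can be applied programmatically when the command line is generated by a script. This Python sketch doubles every backslash in a path; the function name `escape_windows_path` is hypothetical.

```python
def escape_windows_path(path: str) -> str:
    """Double each backslash so the console receives an escaped Windows path."""
    return path.replace("\\", "\\\\")


# A raw string keeps the backslashes of the original path literal.
print(escape_windows_path(r"D:\test\test.txt"))
```

The helper is not idempotent: applying it to a path that is already escaped doubles the backslashes again, so call it once on the raw path.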

Example: Upload a CSV file

The following example shows how to upload a CSV file to DataHub. The CSV format is:

0,qe614c760fuk8judu01tn5x055rpt1,true,100.1,14321111111
1,znv1py74o8ynn87k66o32ao4x875wi,true,100.1,14321111111
2,7nm0mtpgo1q0ubuljjjx9b000ybltl,true,100.1,14321111111
3,10t0n6pvonnan16279w848ukko5f6l,true,100.1,14321111111
4,0ub584kw88s6dczd0mta7itmta10jo,true,100.1,14321111111
5,1ltfpf0jt7fhvf0oy4lo8m3z62c940,true,100.1,14321111111
6,zpqsfxqy9379lmcehd7q8kftntrozb,true,100.1,14321111111
7,ce1ga9aln346xcj761c3iytshyzuxg,true,100.1,14321111111
8,k5j2id9a0ko90cykl40s6ojq6gruyi,true,100.1,14321111111
9,ns2zcx9bdip5y0aqd1tdicf7bkdmsm,true,100.1,14321111111
10,54rs9cm1xau2fk66pzyz62tf9tsse4,true,100.1,14321111111

Each line is a record with comma-separated fields. The file is saved at /temp/test.csv. The DataHub topic schema is:

| Field name | Field type |
| --- | --- |
| id | BIGINT |
| name | STRING |
| gender | BOOLEAN |
| salary | DOUBLE |
| my_time | TIMESTAMP |

Upload command:

uf -f /temp/test.csv -p test_project -t test_topic -m "," -n 1000
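Before uploading, it can help to confirm that every line of the file matches the topic schema. The following Python sketch checks the five columns of this example. The names `SCHEMA`, `parse_bool`, `parse_record`, and `check_file` are hypothetical, and mapping TIMESTAMP to an integer is an assumption based on the sample values; the console's own validation may differ.

```python
import csv


def parse_bool(text: str) -> bool:
    """Accept only the literals true/false (case-insensitive)."""
    lowered = text.strip().lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"not a boolean: {text!r}")
    return lowered == "true"


# Column order and converters for the example topic schema.
SCHEMA = [
    ("id", int),           # BIGINT
    ("name", str),         # STRING
    ("gender", parse_bool),  # BOOLEAN
    ("salary", float),     # DOUBLE
    ("my_time", int),      # TIMESTAMP (assumed to be an integer value)
]


def parse_record(line: str) -> tuple:
    """Convert one comma-separated line into typed values, or raise ValueError."""
    row = next(csv.reader([line]))
    if len(row) != len(SCHEMA):
        raise ValueError(f"expected {len(SCHEMA)} fields, got {len(row)}")
    return tuple(convert(value) for (_, convert), value in zip(SCHEMA, row))


def check_file(path: str) -> int:
    """Validate every non-empty line of the file and return the record count."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if not line:
                continue
            try:
                parse_record(line)
            except ValueError as error:
                raise ValueError(f"line {number}: {error}") from error
            count += 1
    return count
```

Running `check_file("/temp/test.csv")` on the sample file should report 11 records; a malformed line raises a `ValueError` that names the line number.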

Download data

  • -f: The path of the file to save the downloaded data to. On Windows, escape each backslash, for example D:\\test\\test.txt.

  • -p: The project name.

  • -t: The topic name.

  • -s: The shard ID.

  • -d: The subscription ID.

  • -ti: The start time for reading data. Format: yyyy-mm-dd hh:mm:ss.

  • -l: The number of records to read per batch.

  • -g: Continuous reading mode:

      • 0: Read once and stop.

      • 1: Read continuously.

down -p test_project -t test_topic -s shardId -d subId -f filePath -ti "1970-01-01 00:00:00" -l 100 -g 0
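The `-ti` value must follow the yyyy-mm-dd hh:mm:ss pattern and be quoted because it contains a space. The Python sketch below formats a `datetime` accordingly and assembles a `down` line from the options listed above. The function name `download_command` is hypothetical, and the sketch only builds text for the console.

```python
from datetime import datetime


def download_command(project: str, topic: str, shard_id: str, sub_id: str,
                     path: str, start: datetime, limit: int = 100,
                     continuous: bool = False) -> str:
    """Build a `down` command line with a correctly formatted -ti value."""
    start_text = start.strftime("%Y-%m-%d %H:%M:%S")
    mode = 1 if continuous else 0  # -g: 0 reads once, 1 reads continuously
    return (
        f"down -p {project} -t {topic} -s {shard_id} -d {sub_id} "
        f'-f {path} -ti "{start_text}" -l {limit} -g {mode}'
    )
```

With `start=datetime(1970, 1, 1)` and the placeholder names used in this guide, the function reproduces the example command above.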

FAQ

  • Startup failure: If the script fails to start on Windows, check whether the script path contains parentheses. If it does, move the client to a path without parentheses and try again.