The Add ID Column component is an algorithm that processes data tables. It inserts a unique ID column as the first column of a table and assigns a serial number to each row to help with data identification and management.
Algorithm description
The algorithm supports input tables of up to 1,000,000,000 rows × 1,023 columns.
Configure the component
Method 1: Use the GUI
Add the Add ID Column component to the Designer workflow. Configure the component parameters in the pane on the right.
Parameter type | Parameter | Description |
Parameters Setting | All Selected by Default | By default, all columns are selected. Extra columns do not affect the prediction result. |
Parameters Setting | Serial number | The default value is append_id. |
Execution Tuning | Number of computing cores | The number of cores. |
Execution Tuning | Memory per core | The memory size of each core, in MB. The value must be in the range of (1, 65536). |
Method 2: Use PAI commands
Configure the Add ID Column component parameters using PAI commands. You can run PAI commands in the SQL script component. For more information, see SQL Script.
PAI -name AppendId
-project algo_public
-DinputTableName=maple_test_appendid_basic_input
-DoutputTableName=maple_test_appendid_basic_output;
Parameter | Required | Default value | Description |
inputTableName | Yes | None | The name of the input table. |
selectedColNames | No | All columns | The columns in the input table that are used for training. Separate column names with commas (,). Columns of the INT and DOUBLE types are supported. If the input is in sparse format, STRING columns are also supported. |
inputTablePartitions | No | All partitions | The partitions in the input table that are used for training. If you specify multiple partitions, separate them with commas (,). |
outputTableName | Yes | None | The name of the output table. |
IDColName | No | append_id | The name of the ID column. |
lifecycle | No | None | The output table lifecycle. |
coreNum | No | System allocated | The number of cores. |
memSizePerCore | No | System allocated | The memory size of each core, in MB. The value must be in the range of (1, 65536). |
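The parameters in the table map onto `-D<name>=<value>` flags in the same way as in the command shown earlier. The following minimal sketch assembles such a command string in Python. The helper name `build_append_id_command` and its argument names are hypothetical and are not part of PAI. The sketch also assumes that the documented range (1, 65536) is an open interval.

```python
def build_append_id_command(input_table, output_table, id_col_name=None,
                            selected_col_names=None, lifecycle=None,
                            core_num=None, mem_size_per_core=None):
    """Assemble a PAI AppendId command string (hypothetical helper)."""
    # Assumption: (1, 65536) is read as an open interval, in MB.
    if mem_size_per_core is not None and not (1 < mem_size_per_core < 65536):
        raise ValueError("memSizePerCore must be in the range (1, 65536)")

    # Required parameters, in the same order as the documented command.
    parts = [
        "PAI -name AppendId",
        "-project algo_public",
        f"-DinputTableName={input_table}",
        f"-DoutputTableName={output_table}",
    ]

    # Optional parameters; names are taken from the parameter table.
    optional = {
        "selectedColNames": ",".join(selected_col_names) if selected_col_names else None,
        "IDColName": id_col_name,
        "lifecycle": lifecycle,
        "coreNum": core_num,
        "memSizePerCore": mem_size_per_core,
    }
    # Options left as None are omitted so that the documented defaults apply.
    parts += [f"-D{name}={value}" for name, value in optional.items() if value is not None]
    return "\n".join(parts) + ";"


print(build_append_id_command(
    "maple_test_appendid_basic_input",
    "maple_test_appendid_basic_output",
    id_col_name="row_id",
))
```

You can paste the resulting string into the SQL script component. Whether a given combination of optional parameters is accepted still depends on the component itself.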
Example
PAI -name AppendId
-project algo_public
-DinputTableName=maple_test_appendid_basic_input
-DoutputTableName=maple_test_appendid_basic_output;
Data generation
col0 | col1 | col2 | col3 | col4 |
10 | 0.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | true |
11 | 1.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | false |
12 | 2.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | true |
13 | 3.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | true |
14 | 4.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | true |
Output table
append_id | col0 | col1 | col2 | col3 | col4 |
0 | 10 | 0.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | true |
1 | 11 | 1.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | false |
2 | 12 | 2.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | true |
3 | 13 | 3.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | true |
4 | 14 | 4.0 | aaaa | Thu Oct 01 00:00:00 CST 2015 | true |
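The following Python sketch reproduces the transformation from the input table to the output table in this example. It illustrates the table-level effect only and does not show how the component is implemented. The function name `append_id` is hypothetical, and the serial numbers start at 0 because the example output does.

```python
def append_id(columns, rows, id_col_name="append_id"):
    """Prepend a serial ID column to a table held as plain Python lists."""
    # The ID column becomes the first column of the table.
    new_columns = [id_col_name] + list(columns)
    # Assumption: IDs start at 0 and follow row order, as in the example output.
    new_rows = [[i] + list(row) for i, row in enumerate(rows)]
    return new_columns, new_rows


columns = ["col0", "col1", "col2", "col3", "col4"]
rows = [
    [10, 0.0, "aaaa", "Thu Oct 01 00:00:00 CST 2015", True],
    [11, 1.0, "aaaa", "Thu Oct 01 00:00:00 CST 2015", False],
    [12, 2.0, "aaaa", "Thu Oct 01 00:00:00 CST 2015", True],
    [13, 3.0, "aaaa", "Thu Oct 01 00:00:00 CST 2015", True],
    [14, 4.0, "aaaa", "Thu Oct 01 00:00:00 CST 2015", True],
]

out_columns, out_rows = append_id(columns, rows)
for row in out_rows:
    print(row)
```

Passing a different `id_col_name` corresponds to setting the IDColName parameter in the PAI command.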