This topic describes how to create tables in DataWorks and upload data to them, using the bank_data and result_table tables as examples.
Prerequisites
The MaxCompute directory appears in DataStudio only after you add a MaxCompute data source to your workspace and bind the corresponding compute resource.
-
You have added a MaxCompute data source to your workspace. For more information, see Bind a MaxCompute compute resource.
-
You have bound a MaxCompute compute resource in DataStudio. To do this, in the left-side navigation pane of the DataStudio console, click Computing Resources and follow the on-screen instructions.
-
You have configured a resource group for the engine in System Management. For more information, see System Management.
Important: If you do not configure a resource group for the engine in System Management, the following error message appears:
The source or destination engine requires a resource group for data upload. Please contact the workspace administrator to configure a resource group.
Background information
The bank_data table stores business data, and the result_table table stores analysis results.
Create the bank_data table
-
Go to the DataStudio page.
Log on to the DataWorks console. In the target region, click in the left-side navigation pane. Select a workspace from the drop-down list and click Go to Data Development.
-
On the DataStudio page, hover over the icon and click . Alternatively, you can open a workflow, right-click MaxCompute, and select Create Table.
-
In the Create Table dialog box, select a Path, enter bank_data for Name, and click Create.
-
On the table editing page, click DDL.
-
In the DDL dialog box, enter the following statement and click Generate Table Schema.
CREATE TABLE IF NOT EXISTS bank_data
(
    age             BIGINT COMMENT 'Age',
    job             STRING COMMENT 'Job type',
    marital         STRING COMMENT 'Marital status',
    education       STRING COMMENT 'Education level',
    default         STRING COMMENT 'Whether the user has a credit card',
    housing         STRING COMMENT 'Housing loan',
    loan            STRING COMMENT 'Personal loan',
    contact         STRING COMMENT 'Contact method',
    month           STRING COMMENT 'Month of the year',
    day_of_week     STRING COMMENT 'Day of the week',
    duration        STRING COMMENT 'Last contact duration, in seconds',
    campaign        BIGINT COMMENT 'Number of contacts performed during this campaign',
    pdays           DOUBLE COMMENT 'Days since the last contact from a previous campaign',
    previous        DOUBLE COMMENT 'Number of contacts performed before this campaign',
    poutcome        STRING COMMENT 'Outcome of the previous marketing campaign',
    emp_var_rate    DOUBLE COMMENT 'Employment variation rate',
    cons_price_idx  DOUBLE COMMENT 'Consumer price index',
    cons_conf_idx   DOUBLE COMMENT 'Consumer confidence index',
    euribor3m       DOUBLE COMMENT 'Euribor 3-month rate',
    nr_employed     DOUBLE COMMENT 'Number of employees',
    y               BIGINT COMMENT 'Whether the client has subscribed to a term deposit'
);
For more information about the SQL syntax for creating tables, see CREATE TABLE.
-
In the Confirm Operation dialog box, click Confirm.
-
After the table schema is generated, enter a Display Name for the table in the General section. Then, click Commit to Development Environment and Commit to Production Environment.
Note: This example uses a workspace in standard mode. If you use a workspace in basic mode, click only Commit to Production Environment.
-
In the navigation pane on the left, click Table Management.
-
On the Table Management page, double-click the table name to view its details.
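Before uploading data, it can help to confirm locally that a row of text matches the columns declared in the bank_data DDL statement. The following is a minimal sketch, not part of DataWorks: the helper name find_schema_problems is hypothetical, and the conversion rules (Python int for BIGINT, Python float for DOUBLE) are an assumption about how text values should parse, not a description of how DataWorks converts them.

```python
# Column names and types copied from the bank_data DDL statement.
BANK_DATA_SCHEMA = [
    ("age", "BIGINT"), ("job", "STRING"), ("marital", "STRING"),
    ("education", "STRING"), ("default", "STRING"), ("housing", "STRING"),
    ("loan", "STRING"), ("contact", "STRING"), ("month", "STRING"),
    ("day_of_week", "STRING"), ("duration", "STRING"), ("campaign", "BIGINT"),
    ("pdays", "DOUBLE"), ("previous", "DOUBLE"), ("poutcome", "STRING"),
    ("emp_var_rate", "DOUBLE"), ("cons_price_idx", "DOUBLE"),
    ("cons_conf_idx", "DOUBLE"), ("euribor3m", "DOUBLE"),
    ("nr_employed", "DOUBLE"), ("y", "BIGINT"),
]


def find_schema_problems(fields):
    """Return a list of problems for one row of text fields (hypothetical helper).

    An empty list means the row has the expected number of fields and every
    BIGINT or DOUBLE field parses as a number. STRING fields are not checked.
    """
    if len(fields) != len(BANK_DATA_SCHEMA):
        return [f"expected {len(BANK_DATA_SCHEMA)} fields, got {len(fields)}"]

    problems = []
    for (name, col_type), value in zip(BANK_DATA_SCHEMA, fields):
        try:
            # Assumption: BIGINT text must parse as int, DOUBLE text as float.
            if col_type == "BIGINT":
                int(value)
            elif col_type == "DOUBLE":
                float(value)
        except ValueError:
            problems.append(f"{name}: {value!r} is not a valid {col_type}")
    return problems
```

A row that returns an empty list has 21 fields with numeric values where the DDL expects them. Note that this sketch also flags empty fields in numeric columns, which may or may not be how the import treats them.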
Create the result_table table
-
On the DataStudio page, hover over the icon and click . Alternatively, you can open a workflow, right-click MaxCompute, and select Create Table.
-
In the Create Table dialog box, select a Path, enter result_table for Table Name, and click Create.
-
In the DDL Statement dialog box, enter the following statement and click Generate Table Schema.
CREATE TABLE IF NOT EXISTS result_table
(
    education  STRING COMMENT 'Education level',
    num        BIGINT COMMENT 'Number of people'
);
-
In the Confirm Operation dialog box, click Confirm.
-
After the table schema is generated, enter a Display Name for the table in the General section. Then, click Commit to Development Environment and Commit to Production Environment.
-
In the navigation pane on the left, click Table Management.
-
On the Table Management page, double-click the table name to view its details.
Upload local data to bank_data
DataWorks lets you:
-
Upload local text files to tables in a workspace.
-
Import business data from various data sources to a workspace using Data Integration.
The following limits apply when you upload local text files:
-
File type: Only .txt, .csv, and .log files are supported.
-
File size: A file cannot exceed 30 MB.
To upload files larger than 30 MB, use one of the following methods:
-
Upload the data file to OSS and access the data by creating a MaxCompute external table. For more information about how to upload data to OSS, see Upload objects. For more information about MaxCompute external tables, see External tables.
-
Upload the data file to OSS and use Data Integration to synchronize the OSS data to a MaxCompute table. For more information about how to upload data to OSS, see Upload objects. For more information about how to synchronize data from OSS to a MaxCompute table, see Configure a synchronization task in wizard mode.
-
Use the feature.
-
Target: You can import data to partitioned tables and non-partitioned tables. However, partition key values cannot contain Chinese characters, ampersands (&), asterisks (*), or other special characters.
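The limits above can be checked locally before you open the import wizard. The following is a minimal sketch under stated assumptions: the function names are hypothetical, 30 MB is taken to mean 30 × 1024 × 1024 bytes, file extensions are compared case-insensitively, and the partition check covers only the characters named explicitly above (the "other special characters" are not enumerated here, so they are not checked).

```python
import os

ALLOWED_EXTENSIONS = {".txt", ".csv", ".log"}
# Assumption: 1 MB is treated as 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 30 * 1024 * 1024


def check_upload_limits(path, size_bytes=None):
    """Return a list of limit violations for a local file (hypothetical helper).

    If size_bytes is omitted, the size is read from the file on disk.
    """
    problems = []
    # Assumption: the extension check is case-insensitive.
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes is None:
        size_bytes = os.path.getsize(path)
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(
            f"file is {size_bytes} bytes; the limit is {MAX_UPLOAD_BYTES}"
        )
    return problems


def partition_value_problems(value):
    """Flag only the characters named explicitly in the limits above."""
    problems = []
    if "&" in value:
        problems.append("contains an ampersand (&)")
    if "*" in value:
        problems.append("contains an asterisk (*)")
    # Approximation: only the basic CJK Unified Ideographs block is checked.
    if any("\u4e00" <= ch <= "\u9fff" for ch in value):
        problems.append("contains Chinese characters")
    return problems
```

For example, check_upload_limits("banking.txt") returns an empty list when the file exists and is within the size limit. A file that fails the size check needs one of the OSS-based methods listed above.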
This example demonstrates how to import the local file banking.txt to DataWorks.
-
On the DataStudio page, click the icon.
-
In the Data Import Wizard dialog box, search for the bank_data table by entering at least three characters of its name, and then click Next.
Note: If you cannot find the table after creating it, you can manually synchronize it in Data Map and then search for it again. For more information about how to manually synchronize a table, see My Data.
-
Set Select Data Import Method to Upload Local File and click Browse next to Select File. Select the local data file and configure the import parameters.
Select Data Import Method: The default value is Upload Local File.
File Format: You can select CSV or Custom Text File.
Select File: Click Browse to select the local file that you want to upload.
Select Delimiter: Supported delimiters include Comma (,), Tab, Semicolon (;), Space, |, #, and &. In this example, select Comma (,).
Original Character Set: Supported character sets include GBK, UTF-8, CP936, and ISO-8859. In this example, select GBK.
Import First Row: Select the row from which to start the import. In this example, select 1.
First Row as Field Names: Specify whether the first row contains column names. For this example, do not select First Row as Field Names.
Preview Data: This section displays a preview of the data.
-
Click Next.
-
Select a method to map the source fields to the destination fields. In this example, select By Location.
-
Click Import Data.
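To see what the import settings in this example mean for a local file, the following sketch reproduces them in plain Python: GBK decoding, a comma delimiter, import starting at row 1, no header row, and mapping by location. It illustrates the settings only and is not how DataWorks implements the import. The function names read_rows and map_by_location are hypothetical.

```python
import csv
import io

# Destination columns in the order declared in the bank_data DDL statement.
BANK_DATA_COLUMNS = [
    "age", "job", "marital", "education", "default", "housing", "loan",
    "contact", "month", "day_of_week", "duration", "campaign", "pdays",
    "previous", "poutcome", "emp_var_rate", "cons_price_idx",
    "cons_conf_idx", "euribor3m", "nr_employed", "y",
]


def read_rows(raw_bytes, encoding="gbk", delimiter=",", first_row=1,
              first_row_as_field_names=False):
    """Decode and split file content using settings that mirror the wizard.

    first_row is 1-based, matching the Import First Row parameter.
    """
    text = raw_bytes.decode(encoding)
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    rows = list(reader)[first_row - 1:]
    if first_row_as_field_names:
        rows = rows[1:]  # The first imported row holds names, not data.
    return [row for row in rows if row]  # Drop blank lines.


def map_by_location(row, columns=BANK_DATA_COLUMNS):
    """Pair the Nth source field with the Nth destination column."""
    if len(row) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(row)}")
    return dict(zip(columns, row))
```

With mapping by location, the order of fields in the file must match the column order in the table, because field names in the file are not consulted. For example, `[map_by_location(r) for r in read_rows(open("banking.txt", "rb").read())]` would pair each line of the local file with the bank_data columns, assuming the file has 21 comma-separated fields per line.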
Next steps
You have now learned how to create a table and upload data. In the next tutorial, you will learn how to create, configure, and run a workflow to analyze and process data in your workspace. For more information, see Create a workflow.