Export data to OSS
AnalyticDB for MySQL lets you export data from AnalyticDB for MySQL Data Warehouse Edition (V3.0) to Object Storage Service (OSS) using external tables and the INSERT INTO statement. This export feature supports only the CSV and Parquet file formats.
Prerequisites
-
Create a directory in Object Storage Service (OSS) for data from AnalyticDB for MySQL:
-
Note
The OSS bucket and your AnalyticDB for MySQL Data Warehouse Edition (V3.0) cluster must be in the same region.
-
For example, create a directory named
adb_data/in OSS. Data exported from AnalyticDB for MySQL Data Warehouse Edition (V3.0) is stored in this directory.
-
-
You have created a Data Warehouse Edition (V3.0) cluster, configured its whitelist, and created an account and a database. For more information, see Quick start for AnalyticDB for MySQL.
If the AnalyticDB for MySQL Data Warehouse Edition cluster is in elastic mode, log on to the AnalyticDB for MySQL console to enable the ENI (Elastic Network Interface) network. On the Network Information panel of the instance, turn on the ENI Network switch. The ENI network must be enabled for instances in elastic mode to import data by using external tables.
Procedure
本示例将AnalyticDB for MySQL数仓版的adb_demo库中的source_table表数据导出至OSS的adb_data文件夹下。
-
Connect to the target cluster and go to the source database.
-
In the
adb_demodatabase, create an external table. For more information, see Non-partitioned OSS external tables. -
Execute a write statement to copy the source data to the previously created external table. For information about the supported syntax for different external table types, see Syntax for non-partitioned external tables and Syntax for partitioned external tables.
-
After the write job in Step 3 completes, log on to the OSS console to view the exported data files in the destination folder. You can also directly query the data exported to the external table from AnalyticDB for MySQL.
Syntax for non-partitioned external tables
When you use INSERT INTO or INSERT OVERWRITE SELECT to write data, the job runs synchronously by default. If you write a large amount of data, such as several hundred gigabytes, the connection between the client and the AnalyticDB for MySQL server might time out, causing the job to fail. Therefore, if you write a large amount of data, we recommend running the INSERT OVERWRITE SELECT statement asynchronously by adding SUBMIT JOB before the statement. Example: SUBMIT JOB INSERT OVERWRITE adb_table SELECT * FROM oss_table;. For more information about submitting jobs asynchronously, see asynchronous import job submission.
By default, files exported to OSS contain only data and do not include column headers. To export files with column headers, add the /*+sql_output_oss_file_head_enable=true*/ hint before the statement.
-
Insert select from
-
Description: If your data already exists in another table, you can use
INSERT SELECT FROMto copy it to an external table. This statement writes data from the source table to the external table's OSS location. Each write operation generates new OSS data files.ImportantYou must write all columns to the external table. You cannot specify only a subset of columns.
Syntax:
INSERT INTO table_name SELECT select_statement FROM from_statement;Example:
insert into oss_table select col1, col2, col3 from source_table;
-
-
Insert overwrite select
-
Description: Use
INSERT OVERWRITE SELECTto perform a bulk insert into a table. If the destination external table already contains data, this statement first deletes all existing data files in the external table's path and then generates new OSS data files.ImportantYou must write all columns to the external table. You cannot specify only a subset of columns.
Syntax:
INSERT OVERWRITE table_name SELECT select_statement FROM from_statement;Example:
INSERT OVERWRITE oss_table SELECT col1, col2, col3 FROM source_table;
-
-
Replace select from
-
Description: Because OSS external tables do not support primary keys,
REPLACE SELECT FROMbehaves identically toINSERT SELECT FROM. It copies data to another table. Existing data in the destination table remains unchanged, and new data is appended as new OSS data files.
-
-
Export to a single OSS file (CSV format only; not supported for Parquet format)
-
Description: You can use a hint to export data to a single specified OSS file. If the
OVERWRITEkeyword is included, any existing file with the same name in the directory defined inTABLE_PROPERTIESof the external table is overwritten. Other files in the directory are not affected.ImportantYou must write all columns to the external table. You cannot specify only a subset of columns.
Version notes:
-
For AnalyticDB for MySQL clusters earlier than V3.1.2: This feature is not supported. The system automatically generates the file names and exports the data into multiple files. The number of destination files is dynamically determined by the job concurrency.
-
For AnalyticDB for MySQL clusters of version 3.1.2 and later, you can use a hint to export regular external tables in the AnalyticDB for MySQL CSV format to a single file in OSS. You can also customize the filename during the export. Without a hint, the export behavior is consistent with that of versions earlier than 3.1.2: the data is exported to multiple files.
Syntax:
/*+output_filename=adb.txt*/INSERT [OVERWRITE] table_name SELECT select_statement FROM from_statement;Example:
/*+output_filename=adb.txt*/INSERT [OVERWRITE] oss_table SELECT * FROM source_table;If you export data without a hint, file names are automatically generated by the system, and multiple files are exported.
When you export to a single file, the exported file, such as adb.txt, is stored in the OSS directory defined for the external table, such as
adb_data/. -
-
Syntax for partitioned external tables
When you use INSERT INTO or INSERT OVERWRITE SELECT to write data, the job runs synchronously by default. If you write a large amount of data, such as several hundred gigabytes, the connection between the client and the AnalyticDB for MySQL server might time out, causing the job to fail. Therefore, if you write a large amount of data, we recommend running the INSERT OVERWRITE SELECT statement asynchronously by adding SUBMIT JOB before the statement. Example: SUBMIT JOB INSERT OVERWRITE adb_table SELECT * FROM oss_table;. For more information about submitting jobs asynchronously, see asynchronous import job submission.
By default, files exported to OSS contain only data and do not include column headers. To export files with column headers, add the /*+sql_output_oss_file_head_enable=true*/ hint before the statement.
When you write data from a partitioned table, the data files do not include the data of partition key columns. The partition information is stored in the OSS directory structure.
Exporting data from a partitioned external table to a single OSS file is not supported.
For example, consider a partitioned table that has two partition key columns and three regular columns. The first-level partition key is pcol1 with a value of 1, and the second-level partition key is pcol2 with a value of a. The data from the partitioned table is exported to the OSS path adb_data/. When you write data to the partition where pcol1=1 and pcol2=a, the relative path of the data file is adb_data/pcol1=1/pcol2=a/. The resulting CSV or Parquet data files do not contain the values for pcol1 and pcol2. They contain only the values of the three regular columns.
-
Insert into partition select from
Function: The INSERT INTO PARTITION SELECT FROM command bulk inserts data into a partitioned external table. When you write data, you can use the PARTITION clause to specify all partition columns and values. Alternatively, you can specify only the high-level partition values and allow the low-level partitions to be generated dynamically. You can also omit the PARTITION clause entirely, which dynamically generates all partition levels.
When data is written, it is appended to the corresponding partition. Each write operation generates new OSS data files.
You must write all columns to the external table. You cannot specify only a subset of columns.
Fully static partitions
Syntax
INSERT INTO table_name PARTITION(par1=val1,par2=val2,...)
SELECT select_statement FROM from_statement;
Example
INSERT INTO oss_table_par PARTITION(par1=val1,par2=val2)
select col1, col2, col3 FROM source_table;
Partially static and partially dynamic partitions
Syntax
INSERT INTO table_name PARTITION(par1=val1,par2,...)
SELECT select_statement FROM from_statement;
Example
INSERT INTO oss_table_par PARTITION(par1=val1,par2)
SELECT col1, col2, col3, par2col FROM source_table;
Fully dynamic partitions
Syntax
INSERT INTO table_name
SELECT select_statement FROM from_statement;
Example
INSERT INTO oss_table_par
SELECT col1, col2, col3, par1col, par2col FROM source_table;
-
Insert overwrite partition select
Function: The usage is the same as INSERT INTO PARTITION SELECT. However, this command also deletes the existing data files from the target partitions involved in the execution. Data files in partitions that do not receive new data are not deleted.
Syntax:
INSERT OVERWRITE table_name PARTITION(par1=val1,par2=val2,...)[IF NOT EXISTS]
SELECT select_statement FROM from_statement;
Example:
INSERT OVERWRITE oss_table_par PARTITION(par1=val1,par2=val2) IF NOT EXISTS
SELECT col1, col2, col3 FROM source_table;
-
Replace into partition select from
Description: Because external tables do not support primary keys, REPLACE INTO PARTITION SELECT FROM behaves identically to INSERT INTO PARTITION SELECT FROM.
Unsupported syntax
AnalyticDB for MySQL external tables do not support row-level write syntax. Specifically, the INSERT INTO VALUES and REPLACE INTO VALUES statements are not supported.