Supported data sources in Dataphin
Before using Dataphin, select a database or data warehouse as a data source for reading raw data and writing processed data during development. Dataphin supports various data engines, including data warehouses like MaxCompute and Hive, and databases such as MySQL and Oracle.
Background
Dataphin connects to various data sources, including big data storage, file, message queue, relational, and NoSQL data sources. The supported data source types vary by module, as shown in the tables in this topic.
To connect to a data source in Dataphin, you must first create it in data source management.
Dataphin supports both production and development data sources. Basic projects and the Prod environment of Dev-Prod projects read from and write to production data sources, while the Dev environment of Dev-Prod projects reads from and writes to development data sources. The same rule applies to data service. However, synchronization tasks do not support this dual-environment model and always read from and write to production data sources.
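The routing rule above can be sketched as a small lookup. This is only an illustration of the rule as stated in this topic, not Dataphin code: the names `ProjectMode`, `Env`, and `resolve_data_source` are hypothetical and do not correspond to any Dataphin API.

```python
from enum import Enum


class ProjectMode(Enum):
    """Hypothetical labels for the two project types named in this topic."""
    BASIC = "basic"
    DEV_PROD = "dev-prod"


class Env(Enum):
    """Hypothetical labels for the two environments of a Dev-Prod project."""
    DEV = "dev"
    PROD = "prod"


def resolve_data_source(project_mode, env, is_sync_task=False):
    """Return "production" or "development" for the given context."""
    # Synchronization tasks always read from and write to production.
    if is_sync_task:
        return "production"
    # Only the Dev environment of a Dev-Prod project uses development sources.
    if project_mode is ProjectMode.DEV_PROD and env is Env.DEV:
        return "development"
    # Basic projects and the Prod environment use production sources.
    return "production"


print(resolve_data_source(ProjectMode.DEV_PROD, Env.DEV))
print(resolve_data_source(ProjectMode.DEV_PROD, Env.DEV, is_sync_task=True))
```

The synchronization-task check comes first because that exception overrides the environment of the project in which the task runs.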
Note: If the built-in data source types do not meet your requirements, you can create a custom offline or real-time source type to connect a new data source. For instructions, see the following topics:
Data source overview
Use cases | Description | References |
Batch integration | Batch integration supports various components, such as input, output, and transform components. You can build a single batch integration pipeline by dragging, configuring, and assembling these components on a canvas. Batch integration also supports a script mode for more flexible configuration. When you create a custom RDBMS data source, Dataphin automatically creates its input and output components in the component library to support diverse data synchronization. | |
Real-time integration | Dataphin uses real-time integration to synchronize data from an entire source database or all of its tables to a destination, ensuring that the source and destination remain synchronized. | |
Offline development - Database SQL | After you connect a data source to Dataphin, you can create database SQL tasks for offline development. | |
Metadata collection | The metadata center extracts, processes, centrally stores, and manages metadata from business systems. It supports data governance and improves data organization, search, and analysis. | |
Real-time development | You can use connected data sources for real-time development, which includes creating real-time metatables and developing real-time tasks. | |
Data quality | Asset quality is Dataphin's data quality solution for data development and consumption. You can create global table quality rules or data source quality rules based on your data sources. Data source quality rules can be created for any data source in Dataphin, and all supported data sources can be tested for connectivity. However, only some data sources support rules for monitoring table structure changes. For details, see the Data quality - Table structure change monitoring column in the tables below. | |
Data service | OneService (data service) is the final step in building a data middle platform with Dataphin. Acting as a unified gateway for data services, it enables centralized, marketplace-style data management. This approach simplifies data access while ensuring its security. | |
Label Factory | Label Factory is a one-stop platform for the entire label lifecycle, from creation to service delivery. Designed for data development teams and engineers, it is suited for scenarios like risk control and marketing. It provides tools to develop, manage, explore, and serve offline, real-time, and service-based labels. Label Factory empowers business applications and helps you build a valuable label asset library, ensuring that labels are easy to develop, find, use, and manage. | |
This topic provides an overview of the data sources supported by Dataphin and their use cases. For details on the specific features supported for each data source, see the following tables.
Data sources for big data storage
Data source type | Batch integration | Real-time integration | Offline development - Database SQL | Metadata collection | Real-time development | Data quality - Global table quality | Data quality - Table structure change monitoring | Data service | Label Factory | Guide |
MaxCompute | Supported | Supported | Not supported | Not supported | Supported | Supported | Supported | Supported | Supported | |
Hive | Supported | Supported | Not supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
Hologres | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
Impala | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Supported | Not supported | |
TDH Inceptor | Supported | Not supported | Not supported | Not supported | Supported | Supported | Supported | Supported | Not supported | |
Kudu | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | |
StarRocks | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
Hudi | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Doris | Supported | Not supported | Supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
Greenplum | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | Supported | |
TDengine | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | |
ArgoDB | Supported | Not supported | Not supported | Not supported | Not supported | Supported | Supported | Not supported | Not supported | |
SelectDB | Supported | Not supported | Supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Databricks | Supported | Supported | Not supported | Not supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon Redshift | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
DolphinDB | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Supported | Not supported | |
Snowflake | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Data Lake Formation | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
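When choosing a data source for a particular module, it can help to treat a support table as a lookup structure. The following sketch parses three rows copied from the table above (Hologres, Kudu, and Hudi) and lists the sources that support a given feature. The column keys are shorthand labels chosen for this sketch, and the helper names `parse_matrix` and `sources_supporting` are hypothetical.

```python
# Shorthand column labels for this sketch, in the same order as the table.
COLUMNS = [
    "Batch integration",
    "Real-time integration",
    "Offline development - Database SQL",
    "Metadata collection",
    "Real-time development",
    "Global table quality",
    "Table structure change monitoring",
    "Data service",
    "Label Factory",
]

# Three rows copied from the big data storage table (Guide column omitted).
ROWS = """
Hologres | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported
Kudu | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported
Hudi | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported
"""


def parse_matrix(text):
    """Turn pipe-delimited rows into {source: {feature: bool}}."""
    matrix = {}
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        name, flags = cells[0], cells[1:]
        matrix[name] = {
            column: flag == "Supported" for column, flag in zip(COLUMNS, flags)
        }
    return matrix


def sources_supporting(matrix, feature):
    """Return the sorted names of sources that support the feature."""
    return sorted(name for name, features in matrix.items() if features[feature])


MATRIX = parse_matrix(ROWS)
print(sources_supporting(MATRIX, "Batch integration"))      # ['Hologres', 'Kudu']
print(sources_supporting(MATRIX, "Real-time development"))  # ['Hologres', 'Hudi']
```

The same parsing works for the other tables in this topic, provided that cells with qualified values (such as the Lindorm row) are handled separately.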
File data sources
Data source type | Batch integration | Real-time integration | Offline development - Database SQL | Metadata collection | Real-time development | Data quality - Global table quality | Data quality - Table structure change monitoring | Data service | Label Factory | Guide |
HDFS | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
FTP | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
OSS | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Amazon S3 | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Message queue data sources
Data source type | Batch integration | Real-time integration | Offline development - Database SQL | Metadata collection | Real-time development | Data quality - Global table quality | Data quality - Table structure change monitoring | Data service | Label Factory | Guide |
Log Service | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Kafka | Supported | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Supported | |
DataHub | Supported | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Supported | |
RabbitMQ | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Relational data sources
Data source type | Batch integration | Real-time integration | Offline development - Database SQL | Metadata collection | Real-time development | Data quality - Global table quality | Data quality - Table structure change monitoring | Data service | Label Factory | Guide |
PolarDB | Supported | Not supported | Not supported | Not supported | Supported | Supported | Supported | Not supported | Not supported | |
PolarDB-X | Supported | Not supported | Not supported | Supported | Supported | Supported | Supported | Not supported | Not supported | |
PolarDB-X 2.0 | Supported | Not supported | Supported | Not supported | Not supported | Supported | Supported | Supported | Not supported | |
MySQL | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
SAP HANA | Supported | Not supported | Not supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
Microsoft SQL Server | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
PostgreSQL | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
AnalyticDB for MySQL 2.0 | Supported | Not supported | Supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | |
AnalyticDB for MySQL 3.0 | Supported | Not supported | Supported | Supported | Supported | Not supported | Supported | Supported | Not supported | |
AnalyticDB for PostgreSQL | Supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Supported | Supported | |
OceanBase | Supported | Not supported | Supported | Supported | Supported | Not supported | Supported | Supported | Not supported | |
Oracle | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | Supported | |
Vertica | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | |
IBM Db2 | Supported | Supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | Not supported | |
Teradata | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | |
ClickHouse | Supported | Not supported | Supported | Supported | Supported | Supported | Supported | Supported | Not supported | |
DM | Supported | Not supported | Supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
GBase 8a | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
KingbaseES | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
TiDB | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
GoldenDB | Supported | Not supported | Not supported | Not supported | Not supported | Supported | Supported | Not supported | Not supported | |
openGauss | Supported | Not supported | Supported | Supported | Not supported | Not supported | Not supported | Not supported | Supported | |
GaussDB(DWS) | Supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for MySQL | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for PostgreSQL | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for SQL Server | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for Oracle | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Supported | Not supported | |
Amazon RDS for Db2 | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | Not supported | |
TDSQL for MySQL | Supported | Not supported | Not supported | Supported | Not supported | Supported | Supported | Not supported | Not supported | |
GBase 8c | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | |
TDSQL for PostgreSQL | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
OushuDB | Supported | Supported | Supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | |
NoSQL data sources
Data source type | Batch integration | Real-time integration | Offline development - Database SQL | Metadata collection | Real-time development | Data quality - Global table quality | Data quality - Table structure change monitoring | Data service | Label Factory | Guide |
HBase 0.9.4 | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Supported | Supported | |
HBase 1.1.x | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Supported | Supported | |
HBase 2.0 | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Supported | Supported | |
Elasticsearch | Supported | Not supported | Not supported | Supported | Supported | Not supported | Not supported | Supported | Supported | |
MongoDB | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Supported | Not supported | |
Tablestore | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Supported | |
ApsaraDB for HBase | Not supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Redis | Supported | Not supported | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | |
Presto | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Easysearch | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Trino | Not supported | Not supported | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
OpenSearch | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
InfluxDB | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Lindorm | Supported | Not supported | Supported (excluding the wide-column engine - HBase connection type) | Not supported | Not supported | Not supported | Not supported | Supported | Supported (only for the wide-column engine - HBase connection type) | |
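The Lindorm row is the only one in this table whose support depends on the connection type. As a rough sketch of how that condition could be expressed, the function below encodes the Lindorm row as written. The constant `HBASE_CONNECTION`, the feature labels, and the function name `lindorm_supports` are assumptions made for illustration; the identifiers that Dataphin uses internally may differ.

```python
# Assumed label for the connection type named in the table; the actual
# identifier used by Dataphin may differ.
HBASE_CONNECTION = "wide-column engine - HBase"


def lindorm_supports(feature, connection_type):
    """Return True if the Lindorm row marks the feature as supported."""
    is_hbase = connection_type == HBASE_CONNECTION
    # Database SQL development: every connection type except HBase.
    if feature == "Offline development - Database SQL":
        return not is_hbase
    # Label Factory: only the wide-column engine - HBase connection type.
    if feature == "Label Factory":
        return is_hbase
    # Unconditional cells: only these two features are marked Supported.
    return feature in {"Batch integration", "Data service"}


print(lindorm_supports("Label Factory", HBASE_CONNECTION))
print(lindorm_supports("Offline development - Database SQL", HBASE_CONNECTION))
```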
Semi-structured data sources
Data source type | Batch integration | Real-time integration | Offline development - Database SQL | Metadata collection | Real-time development | Data quality - Global table quality | Data quality - Table structure change monitoring | Data service | Label Factory | Guide |
API | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Supported | |
SAP Table | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Salesforce | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |
Feishu Bitable | Supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | Not supported | |