Metadata collection overview
Metadata collection supports a wide range of source types, including traditional databases such as MySQL and Oracle, big data storage systems such as Hologres, and application systems. The collection overview page shows the number of collection tasks and collection object types for each data source or application system.
Prerequisites
To use an application system as a collection source, first create it in Management Hub > Datasource Management > Application System.
Limits
By default, only metadata collection for relational databases is supported. To collect metadata from other data source types, purchase the corresponding features.
Metadata collection workflow description
If the network of the data source cannot reach the Dataphin cluster, you must use the registered scheduling cluster feature. In this case, collected data is first staged in the object storage service (such as OSS) that the Dataphin deployment depends on, and is then written to the Dataphin system. This staging step incurs additional storage costs.
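The two paths described above can be sketched as a small routing function. This is an illustrative model only, not Dataphin code: the function name `plan_collection_path` and the stage labels are hypothetical, and the direct path for a reachable data source is an assumption inferred from the text.

```python
from typing import List


def plan_collection_path(reachable_from_dataphin: bool) -> List[str]:
    """Return the ordered stages that collected metadata passes through.

    Hypothetical model of the workflow described above; the stage names
    are illustrative labels, not Dataphin identifiers.
    """
    if reachable_from_dataphin:
        # Assumption: when the data source network is connected to the
        # Dataphin cluster, metadata is written to Dataphin directly.
        return ["data_source", "dataphin"]
    # Otherwise the registered scheduling cluster feature is required, and
    # the data is staged in object storage (for example OSS) before it is
    # written to Dataphin. The staging step is what adds storage cost.
    return [
        "data_source",
        "registered_scheduling_cluster",
        "object_storage",
        "dataphin",
    ]


print(plan_collection_path(reachable_from_dataphin=False))
```

Modeling the path as an ordered list makes the extra staging hop, and therefore the source of the additional storage cost, explicit.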
Procedure
- In the top menu bar of the Dataphin homepage, select Administration > Metadata.
- In the navigation pane on the left, select Metadata Collection > Collection Overview.
- On the Welcome To Metadata Collection And Management page, Dataphin displays cards that show, for each data source or application system, the number of configured collection tasks and the supported collection object types.
  - Data Source: Various data source types are supported, such as relational databases and big data storage systems. For more information, see Supported data sources in Dataphin.
    The supported versions of MySQL and Oracle are as follows:
    - MySQL: MySQL 5.1.43, MySQL 5.6/5.7, MySQL 8, and RDS MySQL.
    - Oracle: Oracle 11g, Oracle 12c, Oracle 18c, Oracle 19c, Oracle 21c, and Oracle 23c.
  - Application System: Supports Quick BI.
- You can quickly create collection tasks for the target data source or application system.
  Create Collection Task: Hover over a card to quickly create a collection task. For more information, see Create and manage metadata collection tasks.
Note: Only one collection task can be configured per data source. The development environment and production environment of the same data source can each have a separate collection task.
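The one-task-per-data-source rule can be pictured as a uniqueness constraint on the pair (data source, environment). The sketch below is a hypothetical in-memory model, not a Dataphin API: the class `CollectionTaskRegistry`, its method, and the data source and task names are invented for illustration.

```python
from typing import Dict, Tuple


class CollectionTaskRegistry:
    """Hypothetical model of the one-task-per-data-source rule."""

    def __init__(self) -> None:
        # Key: (data source name, environment); value: task name.
        self._tasks: Dict[Tuple[str, str], str] = {}

    def create_task(self, data_source: str, environment: str, task_name: str) -> None:
        key = (data_source, environment)
        if key in self._tasks:
            # A task already exists for this data source in this environment.
            raise ValueError(
                f"{data_source} ({environment}) already has collection task "
                f"{self._tasks[key]!r}"
            )
        self._tasks[key] = task_name


registry = CollectionTaskRegistry()
registry.create_task("mysql_orders", "dev", "collect_orders_dev")
# Allowed: the production environment may have its own task.
registry.create_task("mysql_orders", "prod", "collect_orders_prod")

try:
    # Rejected: production already has a collection task.
    registry.create_task("mysql_orders", "prod", "collect_orders_again")
    second_prod_task_rejected = False
except ValueError:
    second_prod_task_rejected = True
```

Keying on the environment as well as the data source is what lets the development and production environments each hold one task while still blocking a second task in either.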