Use a DLF catalog
EMR Serverless Spark lets you view databases and tables in bound data catalogs and add existing catalogs.
Limitations
This feature is supported only in EMR versions esr-4.3.0, esr-3.3.0, esr-2.7.0, and later.
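The version rule above can be expressed as a small check. The following is a minimal sketch, not an official utility. It assumes that each listed version is the minimum for its own major series (2.x, 3.x, 4.x), that later major series are also supported, and that the version is given as a bare string such as esr-4.3.0. The function names are hypothetical and not part of any EMR SDK.

```python
import re

# Minimum release per major series, taken from the limitation above.
# Reading the list as "one minimum per major series" is an assumption.
MIN_SUPPORTED = {2: (2, 7, 0), 3: (3, 3, 0), 4: (4, 3, 0)}


def parse_esr(version: str) -> tuple:
    """Parse a bare 'esr-X.Y.Z' string into a tuple of integers."""
    match = re.fullmatch(r"esr-(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"not an esr version: {version!r}")
    return tuple(int(part) for part in match.groups())


def supports_dlf_catalog(version: str) -> bool:
    """Return True if the version meets the minimum for its major series."""
    parsed = parse_esr(version)
    major = parsed[0]
    if major > max(MIN_SUPPORTED):
        # Assumption: major series newer than those listed are supported.
        return True
    minimum = MIN_SUPPORTED.get(major)
    return minimum is not None and parsed >= minimum
```

For example, under these assumptions `supports_dlf_catalog("esr-3.2.9")` is false because 3.2.9 is older than the 3.x minimum of 3.3.0.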
Add a DLF catalog
After you bind a DLF catalog to a Serverless Spark workspace, jobs submitted to the workspace can access the catalog by default. Both Livy Gateway and Kyuubi Gateway also natively use it as the default data catalog.
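Because the bound catalog is the default, a job can browse it with ordinary Spark SQL and no catalog-specific configuration. The following is a minimal PySpark sketch under that assumption. The helper names `quote_ident` and `list_catalog` are hypothetical, and the code relies only on the standard `SHOW DATABASES` and `SHOW TABLES` statements.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote an identifier for Spark SQL, escaping inner backticks."""
    return "`" + name.replace("`", "``") + "`"


def list_catalog(spark) -> dict:
    """Return {database: [table, ...]} for the session's default catalog.

    `spark` is expected to be a SparkSession. Columns are read by position:
    SHOW DATABASES puts the database name first, and SHOW TABLES puts the
    table name second.
    """
    result = {}
    for db_row in spark.sql("SHOW DATABASES").collect():
        database = db_row[0]
        tables = spark.sql(f"SHOW TABLES IN {quote_ident(database)}").collect()
        result[database] = [row[1] for row in tables]
    return result
```

In a job submitted to the workspace, this could be called as `list_catalog(SparkSession.builder.getOrCreate())` after importing `SparkSession` from `pyspark.sql`. The result should match what the Data Catalog page shows, provided the execution role has access to the catalog.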
During workspace creation
To create a Serverless Spark workspace, see Create a workspace.
When you create the workspace, enable Use DLF as Metadata Service. In the DLF Data Catalog section, select the data catalog to bind. The Execution Role defaults to AliyunEMRSparkJobRunDefaultRole.
To an existing workspace
- Go to the Data Catalog page.
  - Log on to the EMR console.
  - In the left-side navigation pane, choose .
  - On the Spark page, click the name of the target workspace.
  - On the EMR Serverless Spark page, click Catalog in the left-side navigation pane.
    Note: The Data Catalog page shows the databases and tables in the DLF data catalog you selected when creating the workspace.
- Click Add Catalog.
- In the Add Catalog dialog box, configure the settings and click Add.
  - DLF Catalog: A metadata management service for managing and querying metadata in a data lake. Select an existing DLF data catalog or create a new one to access metadata in your data lake.
  - To create a new DLF data catalog, click Create Catalog. You are redirected to the Data Lake Formation console. For details, see Metadata management.