Engine integration

Data Lake Formation (DLF) integrates with mainstream big data compute engines to support real-time and offline data lakehouse workloads and online analytical processing (OLAP). Native integrations are available for Realtime Compute for Apache Flink (VVP), EMR Serverless Spark, EMR Serverless StarRocks, and EMR on ECS.

Integration methods

DLF supports three integration methods, each suited to a different engine type and access pattern:

  1. Paimon REST: A RESTful metadata service interface compliant with Apache Paimon community standards. Use this method when your compute engine is built on Apache Paimon. It supports table schema management and snapshot queries.

  2. Iceberg REST: A RESTful metadata service interface compliant with Apache Iceberg community standards. Use this method when your compute engine is built on Apache Iceberg. It supports table schema management and snapshot queries.

  3. File access: Uses the Paimon Virtual File System (PVFS) to expose table data as standard file paths, letting you read underlying data files and metadata directly without a full compute engine. Use this method for scripted exploration, debugging, and lightweight data processing.
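
The selection rule described in the list above can be sketched as a small helper. This is only an illustration of the decision logic, not part of any DLF SDK: the names `IntegrationMethod` and `choose_method` are hypothetical, and the sketch assumes the only inputs are the table format the engine is built on, or `None` when no compute engine is involved.

```python
from enum import Enum
from typing import Optional


class IntegrationMethod(Enum):
    """The three integration methods described above (hypothetical type)."""
    PAIMON_REST = "Paimon REST"
    ICEBERG_REST = "Iceberg REST"
    FILE_ACCESS = "File access (PVFS)"


def choose_method(table_format: Optional[str]) -> IntegrationMethod:
    """Pick an integration method from the table format an engine is built on.

    Pass None when no compute engine is involved, for example scripted
    exploration, debugging, or lightweight data processing.
    """
    if table_format is None:
        return IntegrationMethod.FILE_ACCESS

    normalized = table_format.strip().lower()
    if normalized == "paimon":
        return IntegrationMethod.PAIMON_REST
    if normalized == "iceberg":
        return IntegrationMethod.ICEBERG_REST

    # Formats not covered by the list above are rejected rather than guessed.
    raise ValueError(f"no integration method described for format: {table_format!r}")


if __name__ == "__main__":
    for fmt in ("paimon", "iceberg", None):
        print(fmt, "->", choose_method(fmt).value)
```

Rejecting unknown formats with an error, instead of falling back to file access, keeps the helper from silently choosing a method that the list does not describe for that engine.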