Lindorm Column Store Engine is a fully managed, high-performance, stable, and cost-effective online column store database service. It delivers efficient reads and writes, high-ratio data compression, and high-performance online analytics for use cases such as IoT, Internet of Vehicles (IoV), and log processing.
Core Capabilities
Iceberg ecosystem compatibility: The Lindorm Column Store Engine is compatible with the Iceberg data lake ecosystem. It integrates seamlessly with Spark and Ray in the Lindorm compute engine to support batch processing.
Fully managed data lake administration: The Lindorm Column Store Engine provides fully managed data lake administration, including file merge, snapshot cleanup, and hot and cold data separation. It automatically schedules Spark elastic computing resources from the compute engine to perform these tasks.
High-performance writes: The Lindorm Column Store Engine supports high-concurrency real-time writes at millions of queries per second (QPS), and write performance scales horizontally with cluster size. For wide-column data scenarios such as IoT and IoV, the high-performance column store Sink Connector in the Lindorm stream engine enables efficient writes to tens of thousands of tables.
High-freshness queries: The Lindorm Column Store Engine builds a real-time Delta Layer on top of the data lake. The Lindorm compute engine transparently merges the Delta Layer and the Base Layer at query time, so data becomes visible immediately after a successful write.
Primary key table support: The Lindorm Column Store Engine supports primary key tables and provides capabilities such as overwrite, partial update, and delete.
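The difference between overwrite, partial update, and delete on a primary key table can be illustrated with a minimal Python sketch. It is an in-memory model only and does not use any Lindorm API: the class `PrimaryKeyTable`, its method names, and the sample columns (`vin`, `speed`, `soc`) are hypothetical. The behavior for a partial update on a missing key (the row is created) is an assumption made for the sketch.

```python
from typing import Any, Dict, Optional

Row = Dict[str, Any]


class PrimaryKeyTable:
    """Toy in-memory model of a primary key table (hypothetical, not a Lindorm API)."""

    def __init__(self, key_column: str) -> None:
        self.key_column = key_column
        self.rows: Dict[Any, Row] = {}

    def overwrite(self, row: Row) -> None:
        # Replace the entire row stored under the same primary key.
        self.rows[row[self.key_column]] = dict(row)

    def partial_update(self, key: Any, changes: Row) -> None:
        # Change only the listed columns; all other columns keep their values.
        # Assumption: updating a key that does not exist creates the row.
        current = self.rows.setdefault(key, {self.key_column: key})
        current.update(
            {col: val for col, val in changes.items() if col != self.key_column}
        )

    def delete(self, key: Any) -> None:
        # Remove the row; deleting a missing key is a no-op.
        self.rows.pop(key, None)

    def get(self, key: Any) -> Optional[Row]:
        return self.rows.get(key)


table = PrimaryKeyTable("vin")
table.overwrite({"vin": "V1", "speed": 60, "soc": 80})
table.partial_update("V1", {"speed": 72})  # "soc" is left unchanged
after_update = table.get("V1")
table.overwrite({"vin": "V1", "speed": 0})  # the whole row is replaced, "soc" is gone
after_overwrite = table.get("V1")
table.delete("V1")
after_delete = table.get("V1")
```

In this model an overwrite drops any column that the new row does not carry, while a partial update preserves it. That distinction is the one the two operation names suggest; the actual engine semantics should be confirmed against the product's SQL reference.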
Service Architecture
[Figure: Lindorm Column Store Engine architecture diagram]
The following describes each module in the Lindorm Column Store Engine:
Delta Layer: The Delta Layer receives new data writes and serves queries. It ensures persistence of newly written data through LogStore and stores recently ingested data in DeltaStore. It also exposes query interfaces to the compute engine’s OLAP resource group.
Base Layer: The Base Layer stores full data and persists it on LindormDFS in an Iceberg-compatible data lake format.
LakehouseCoordinator: LakehouseCoordinator periodically triggers dump actions to move data from the Delta Layer to the Base Layer. It automatically administers the Base Layer by performing file merge, snapshot cleanup, and hot and cold data separation. It uses the compute engine’s ETL resources to execute dump and Base Layer administration tasks.
Compute Engine - OLAP Resource Group: The Compute Engine - OLAP Resource Group executes ad hoc queries on column-oriented table data. It optionally supports joint queries across the Base Layer and the Delta Layer, enabling real-time data freshness for the Lindorm Column Store Engine.
Compute Engine - ETL Resource Group: The Compute Engine - ETL Resource Group supports batch processing operations on column-oriented tables, such as offline computing, batch import, and update. It is also invoked by the Lindorm Column Store Engine to execute dump and Base Layer administration tasks.
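The interaction between the modules described above can be sketched as a small Python model: writes land in the Delta Layer, a dump moves them to the Base Layer, and a query either merges both layers or reads the Base Layer alone. This is a conceptual sketch under stated assumptions, not the engine's implementation. The class `TwoLayerTable`, the `include_delta` flag, the tombstone marker, and the truncation of the log after a dump are all hypothetical details chosen for illustration.

```python
from typing import Any, Dict, List, Optional, Tuple

Row = Dict[str, Any]
_DELETED = object()  # tombstone marking a key deleted in the Delta Layer


class TwoLayerTable:
    """Toy model of a Delta Layer plus Base Layer (hypothetical, not a Lindorm API)."""

    def __init__(self) -> None:
        self.log: List[Tuple[Any, Optional[Row]]] = []  # stands in for LogStore
        self.delta: Dict[Any, Any] = {}                 # stands in for DeltaStore
        self.base: Dict[Any, Row] = {}                  # stands in for Base Layer files

    def write(self, key: Any, row: Row) -> None:
        # Record the write in the log first, then make it queryable in the delta.
        self.log.append((key, dict(row)))
        self.delta[key] = dict(row)

    def delete(self, key: Any) -> None:
        self.log.append((key, None))
        self.delta[key] = _DELETED

    def query(self, key: Any, include_delta: bool = True) -> Optional[Row]:
        # Joint query: the Delta Layer holds newer data, so it takes precedence.
        if include_delta and key in self.delta:
            value = self.delta[key]
            return None if value is _DELETED else value
        return self.base.get(key)

    def dump(self) -> int:
        # Move everything from the Delta Layer into the Base Layer.
        moved = len(self.delta)
        for key, value in self.delta.items():
            if value is _DELETED:
                self.base.pop(key, None)
            else:
                self.base[key] = value
        self.delta.clear()
        self.log.clear()  # assumption: dumped entries no longer need the log
        return moved


table = TwoLayerTable()
table.write("device-1", {"temp": 20})
fresh = table.query("device-1")                       # visible right after the write
stale = table.query("device-1", include_delta=False)  # not yet in the Base Layer
moved = table.dump()                                  # in the real system, triggered periodically
persisted = table.query("device-1", include_delta=False)
```

In the real service the dump is triggered by LakehouseCoordinator and executed on the compute engine's ETL resources, and queries run in the OLAP resource group. The sketch collapses those roles into one object only to make the data flow visible.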