Sync Kafka data

Last updated: 2026-06-16 18:42:08

Tablestore Sink Connector imports data in batches from Apache Kafka into a data table or time series table in Tablestore.

Background information

Apache Kafka is a distributed message queue system. Kafka Connect allows data systems to import and export data streams through Apache Kafka.

Tablestore Sink Connector is built on Kafka Connect. It pulls message records from the subscribed Apache Kafka topics by polling, parses the records, and imports the data to Tablestore in batches. The connector optimizes the import process and supports custom configurations.
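The poll, parse, and batch-write cycle described above can be illustrated with a minimal sketch. This is not the connector's actual implementation: the function run_sink_loop, its callback parameters, and the batch size are hypothetical names chosen for illustration, and the Kafka and Tablestore sides are replaced by in-memory stand-ins. The sketch assumes that offsets are committed only after a batch has been written, which is one common way to obtain at-least-once behavior.

```python
import json


def run_sink_loop(poll, write_batch, commit, batch_size=2):
    """Hypothetical sink loop: poll records, parse them, write them in batches.

    poll()        returns a list of (offset, raw_json_bytes); an empty list ends the demo loop.
    write_batch() stands in for a batch write to the destination table.
    commit()      stands in for committing the consumer offset.
    """
    buffer = []
    last_offset = None
    while True:
        records = poll()
        if not records:
            break
        for offset, raw_value in records:
            buffer.append(json.loads(raw_value))  # parse the message record
            last_offset = offset
            if len(buffer) >= batch_size:
                write_batch(list(buffer))  # write first ...
                commit(last_offset)        # ... then commit, so a crash in between replays the batch
                buffer.clear()
    if buffer:  # flush the final partial batch
        write_batch(list(buffer))
        commit(last_offset)


# In-memory stand-ins for a topic and a destination table (assumptions for the demo).
polls = iter([[(0, b'{"id": 1}'), (1, b'{"id": 2}')], [(2, b'{"id": 3}')], []])
written, committed = [], []
run_sink_loop(lambda: next(polls), written.append, committed.append)
```

Because the commit follows the write, a failure between the two steps causes the same records to be polled and written again after a restart. Records may therefore arrive more than once, but none are skipped.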

Tablestore is a multi-model data storage service developed by Alibaba Cloud. It stores large amounts of structured data and supports multiple data models, including the Wide Column model and the TimeSeries model. You can synchronize data from Apache Kafka to a data table (Wide Column model) or time series table (TimeSeries model) in Tablestore. For more information, see Sync Kafka data to a data table and Stream Kafka data to a time series table.

Features

Tablestore Sink Connector supports the following features:

  • At-least-once delivery

    Ensures that message records are delivered from Kafka topics to Tablestore at least once.

  • Data mapping

    Deserializes data in Kafka topics by using a converter. To use a converter, modify the key.converter and value.converter attributes in the worker or connector configurations of Kafka Connect. You can use the built-in JsonConverter, a third-party converter, or a custom converter.

  • Automatic creation of destination tables in Tablestore

    If the destination table does not exist in Tablestore, the connector automatically creates one based on the primary key columns and attribute column whitelist that you specify. If no attribute column whitelist is specified, all fields in the record values of Kafka message records are used as the attribute columns.

  • Error handling policy

    Errors may occur when message records are parsed or written to Tablestore during batch import. You can terminate the task, ignore the error, or log the message record and error details in Kafka or Tablestore.
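As an illustration of the data mapping feature above, the following sketch builds the converter settings and renders them as lines for a Kafka Connect properties file. The keys key.converter and value.converter and the JsonConverter class name come from Kafka Connect. The helper to_properties is a hypothetical name, and the schemas.enable values are an assumption that depends on whether your JSON messages carry an embedded schema.

```python
def to_properties(config):
    """Render a dict as key=value lines (hypothetical helper for illustration)."""
    return "\n".join(f"{key}={value}" for key, value in config.items())


converter_config = {
    # Built-in JSON converter that ships with Kafka Connect.
    "key.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    # Assumption: messages are plain JSON without a schema envelope.
    "key.converter.schemas.enable": "false",
    "value.converter.schemas.enable": "false",
}

properties_text = to_properties(converter_config)
print(properties_text)
```

The same keys can be placed in the worker configuration to apply to all connectors, or in a single connector's configuration to override the worker defaults for that connector.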

Working mode

Tablestore Sink Connector supports standalone and distributed modes.

  • In standalone mode, all tasks run in a single process. This mode is easy to configure and suitable for learning about Tablestore Sink Connector.

  • In distributed mode, tasks run in parallel across multiple processes. This mode allocates tasks based on process workloads and provides fault tolerance, offering better stability than standalone mode. We recommend the distributed mode.
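In distributed mode, Kafka Connect workers accept connector definitions over a REST API (POST /connectors, port 8083 by default). The sketch below only builds such a request and does not send it. The connector name, topic, worker URL, and the connector.class value are assumptions for illustration, and Tablestore-specific settings such as the endpoint and credentials are omitted. Check the connector's own documentation for the exact class name and required properties.

```python
import json
import urllib.request


def build_create_request(worker_url, name, config):
    """Build (but do not send) a Kafka Connect 'create connector' request."""
    payload = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{worker_url}/connectors",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    "http://localhost:8083",            # assumed worker address
    "tablestore-sink-demo",             # hypothetical connector name
    {
        "connector.class": "TableStoreSinkConnector",  # assumed class name
        "tasks.max": "2",               # tasks are spread across worker processes
        "topics": "demo-topic",         # hypothetical topic
    },
)
# To submit it against a running worker: urllib.request.urlopen(request)
```

In standalone mode there is no REST submission step: the same settings would typically be kept in a properties file and passed to the worker at startup together with the worker configuration.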
