DataHub is a streaming data processing service provided by MaxCompute. It lets you publish and subscribe to data streams, and archive streaming data directly into MaxCompute tables for downstream analytics.
Use cases
DataHub is a good fit for the following scenarios:
- Ingest application or system logs in real time: Push log data from your servers directly into a DataHub stream. Data is available for processing within seconds and is protected even if the source server fails.
- Run real-time metrics and reporting: Collect event data into DataHub and analyze it as it arrives, without waiting for batch processing cycles.
- Archive streaming data to MaxCompute: Route continuous data streams into MaxCompute tables automatically, enabling SQL-based analytics on fresh data.
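The publish, subscribe, and archive flow behind these scenarios can be pictured with a small in-memory model. This is only a conceptual sketch: `ToyStream`, `publish`, `read`, and `archive` are hypothetical names invented for illustration and are not part of the DataHub SDK, and a plain Python list stands in for a MaxCompute table.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyStream:
    """Hypothetical in-memory stand-in for a data stream (not the DataHub SDK)."""

    records: list[dict[str, Any]] = field(default_factory=list)

    def publish(self, record: dict[str, Any]) -> int:
        """Append a record and return its sequence number."""
        self.records.append(record)
        return len(self.records) - 1

    def read(self, cursor: int, limit: int = 100) -> tuple[list[dict[str, Any]], int]:
        """Return up to `limit` records starting at `cursor`, plus the next cursor."""
        batch = self.records[cursor:cursor + limit]
        return batch, cursor + len(batch)


def archive(stream: ToyStream, table: list[dict[str, Any]], cursor: int) -> int:
    """Copy records from `cursor` onward into `table` (a stand-in for a MaxCompute table)."""
    batch, next_cursor = stream.read(cursor)
    table.extend(batch)
    return next_cursor


# Producers publish log events as they occur.
stream = ToyStream()
stream.publish({"host": "web-1", "level": "ERROR", "msg": "timeout"})
stream.publish({"host": "web-2", "level": "INFO", "msg": "started"})

# A subscriber reads at its own pace by tracking a cursor.
batch, cursor = stream.read(cursor=0, limit=1)

# The archiver keeps an independent cursor and copies what it has not yet seen.
table: list[dict[str, Any]] = []
archive_cursor = archive(stream, table, cursor=0)
```

Keeping a separate cursor per consumer is what lets a real-time reader and an archiver work from the same stream without interfering with each other. In this toy model, calling `archive` again with the returned cursor copies nothing until new records are published.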
Data archiving
DataHub can archive streaming data into MaxCompute tables, where it becomes available for downstream analytics. For details on setting up the real-time data tunnel, see the DataHub documentation.
SDKs
DataHub provides SDKs for multiple languages.
Next steps
- DataHub documentation: Full reference for the real-time data tunnel.