Simple Log Service provides four data processing solutions across the data lifecycle: processing plugins (at collection), ingest processors (at write), data transformation (post-storage), and consumer processors (at consumption). Compare these solutions to choose the right one for your performance, cost, and capability requirements.
Background information
Each solution operates at a different stage of the data pipeline:
- Processing plugin: The Simple Log Service data collector supports processing plugins and SPL statements to process data on the client side before ingestion.
- Ingest processor: An ingest processor is associated with a Logstore. By default, all data written to the Logstore is processed server-side by the ingest processor at write time.
- Data transformation: Data is first written to a source Logstore, then processed based on transformation rules and written to a destination Logstore.
- Consumer processor: A consumer processor uses SPL to process Logstore data in real time during consumption. Consumer processors integrate with SDKs and with services such as Flink and DataWorks.
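The difference between processing at write time and processing after storage can be sketched with a toy model. This is an illustration only, not the Simple Log Service API: Logstores are modeled as plain Python lists, and `mask_phone`, `write_with_ingest_processor`, and `run_transformation` are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def mask_phone(row: Row) -> Row:
    """Hypothetical rule: keep only the last four characters of the phone field."""
    masked = dict(row)
    if "phone" in masked:
        masked["phone"] = "****" + masked["phone"][-4:]
    return masked


def write_with_ingest_processor(logstore: List[Row], row: Row,
                                processor: Callable[[Row], Row]) -> None:
    # Processing happens at write time, so only the processed row is stored.
    logstore.append(processor(row))


def run_transformation(source: List[Row], destination: List[Row],
                       rule: Callable[[Row], Row]) -> None:
    # Processing happens after storage: the source keeps the raw rows,
    # and the processed copies land in the destination.
    for row in source:
        destination.append(rule(row))


raw = {"user": "alice", "phone": "13800001234"}

# Ingest processor path: the Logstore never holds the raw phone number.
ingest_store: List[Row] = []
write_with_ingest_processor(ingest_store, raw, mask_phone)

# Data transformation path: the raw row is stored first, then transformed.
source_store: List[Row] = [raw]
destination_store: List[Row] = []
run_transformation(source_store, destination_store, mask_phone)
```

In this model the ingest path stores only the masked row, while the transformation path leaves the raw row in the source list. That mirrors why the stage at which processing runs matters when raw data must not be stored.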
Capability comparison
Processing plugins, ingest processors, data transformation, and consumer processors span the full data lifecycle: before storage (at collection), during storage (at write), and after storage. All four solutions support SPL, but differ in scope, resource usage, and supported scenarios.
| Dimension | Processing plugin | Ingest processor | Data transformation | Consumer processor |
| --- | --- | --- | --- | --- |
| Processing stage | Before storage (at collection) | During storage (at write) | After storage | After storage |
| Write to multiple Logstores | Not supported by a single collection configuration. Use multiple collection configurations with processing plugins instead. | Not supported | Supported | Not supported |
| SPL support | Supported | Supported | Supported | Supported |
| Supported SPL instructions | Single-row instructions only. Input: one row; output: zero or one row. | Single-row instructions only. Input: one row; output: zero or one row. | Full SPL instruction set | Full SPL instruction set |
| Prevent sensitive data from being written to disk | Supported | Supported | Not supported. Data passes through the source Logstore. | Not supported. Data passes through the source Logstore. |
| Resource usage | Consumes client-side resources | Server-side auto-scaling, transparent to users | Server-side auto-scaling, transparent to users | Server-side auto-scaling, transparent to users |
| Performance impact | Collection performance varies slightly with the number and complexity of plugins. Write performance remains unaffected. | Write latency increases by several milliseconds to tens of milliseconds, depending on the data packet size and SPL statement complexity. | Source Logstore write performance is unaffected. | Source Logstore write performance is unaffected. |
| Scenario coverage | Broad | Moderate | Broad | Broad |
| Cost | No Simple Log Service data processing fees. Client resources are consumed. | Data processing fees apply. In data filtering scenarios, these fees are typically lower than the savings from reduced traffic and storage costs. | Source Logstore fees plus data processing fees. To reduce source Logstore costs, set the data retention period to one day and disable indexing. | Source Logstore fees plus data processing fees. To reduce source Logstore costs, set the data retention period to one day and disable indexing. |
| Fault tolerance | Configure whether to retain original fields when processing fails. | Configure whether to retain original data when processing fails. | Source data is already stored, so you can reprocess data if a transformation rule fails. Create multiple transformation jobs to process data separately. | Source data is already stored. Flink, DataWorks, and SDK consumer groups with SPL consumption rules automatically retry on errors. |
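The single-row constraint on processing plugins and ingest processors (one row in, zero or one row out) can be pictured as a function contract. The sketch below is written in Python rather than SPL and does not reproduce any SPL instruction. The field names and the helpers `single_row_step` and `count_by_level` are assumptions made for illustration.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]


def single_row_step(row: Row) -> Optional[Row]:
    """One row in, zero or one row out."""
    if row.get("level") == "DEBUG":
        return None  # zero rows out: the row is discarded
    out = dict(row)
    out["env"] = "prod"  # hypothetical static field added to the row
    return out


def count_by_level(rows: List[Row]) -> Dict[str, int]:
    """Needs to see many rows at once, so it does not fit the single-row contract."""
    counts: Dict[str, int] = {}
    for row in rows:
        level = row.get("level", "UNKNOWN")
        counts[level] = counts.get(level, 0) + 1
    return counts


rows = [
    {"level": "DEBUG", "msg": "cache probe"},
    {"level": "ERROR", "msg": "timeout"},
    {"level": "ERROR", "msg": "refused"},
]

# Each row is handled independently; discarded rows are dropped from the result.
kept = [out for out in map(single_row_step, rows) if out is not None]
```

Filtering and per-row field changes fit the first shape. Anything that combines several rows, such as the aggregation in `count_by_level`, needs a stage that is not limited to single-row instructions.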
The following table compares the four solutions across typical scenarios. Use the recommendation levels to identify the best fit for each use case.
| Scenario | Processing plugin | Ingest processor | Data transformation | Consumer processor |
| --- | --- | --- | --- | --- |
| Simple data processing that involves single-row operations without complex computational logic | Recommended | Recommended | Recommended | Recommended |
| Complex data processing that involves multi-condition logic, window aggregation, or dimension table enrichment | Adequate | Adequate | Recommended | Recommended |
| Limited client resources, such as when Logtail has restricted compute capacity | Adequate | Recommended | Recommended | Recommended |
| Limited client-side control, such as no permission to modify Logtail configurations or SDK write logic | Not recommended | Recommended | Recommended | Recommended |
| Limited server-side control, such as no permission to modify Logstore or transformation configurations | Recommended | Not recommended | Not recommended | Not recommended |
| Latency-sensitive writes that require raw data to be collected as quickly as possible | Adequate | Adequate | Recommended | Recommended |
| Data masking when sensitive data is allowed to be written to disk | Recommended | Recommended | Recommended | Recommended |
| Data masking when sensitive data must not be written to disk | Recommended | Recommended | Not recommended | Not recommended |
| Data enrichment from internal sources, such as adding a field with a static value or a value extracted from an existing field | Adequate | Recommended | Recommended | Recommended |
| Data enrichment from external sources, such as querying a MySQL table for additional data based on a log field | Not recommended | Not recommended | Recommended | Recommended |
| Data distribution that routes data to different Logstores based on conditions | Adequate | Not recommended | Recommended | Not recommended |
| Data filtering to discard raw data and reduce costs | Adequate | Recommended | Adequate | Adequate |
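When several scenarios apply at once, the recommendation levels above can be combined mechanically. The following is a minimal sketch: the ratings are copied from the scenario table, but the scenario keys, the numeric weights (Recommended = 2, Adequate = 1), and the rule that any "Not recommended" rating excludes a solution are assumptions of this example, not part of Simple Log Service.

```python
from typing import Dict, List, Tuple

SOLUTIONS = ("processing plugin", "ingest processor",
             "data transformation", "consumer processor")

# One letter per solution, in SOLUTIONS order.
# R = Recommended, A = Adequate, N = Not recommended.
RATINGS: Dict[str, str] = {
    "simple_processing":        "RRRR",
    "complex_processing":       "AARR",
    "limited_client_resources": "ARRR",
    "limited_client_control":   "NRRR",
    "limited_server_control":   "RNNN",
    "latency_sensitive_writes": "AARR",
    "masking_disk_allowed":     "RRRR",
    "masking_no_disk":          "RRNN",
    "enrich_internal":          "ARRR",
    "enrich_external":          "NNRR",
    "data_distribution":        "ANRN",
    "data_filtering":           "ARAA",
}

# Assumed weights for ranking; the source table defines no numeric scale.
SCORE = {"R": 2, "A": 1, "N": 0}


def shortlist(scenarios: List[str]) -> List[Tuple[str, int]]:
    """Return solutions never rated 'Not recommended', best total score first."""
    results: List[Tuple[str, int]] = []
    for index, name in enumerate(SOLUTIONS):
        marks = [RATINGS[scenario][index] for scenario in scenarios]
        if "N" in marks:
            continue  # a single 'Not recommended' rules the solution out
        results.append((name, sum(SCORE[mark] for mark in marks)))
    # sorted() is stable, so ties keep the order of SOLUTIONS.
    return sorted(results, key=lambda item: item[1], reverse=True)


no_disk = shortlist(["masking_no_disk", "limited_client_resources"])
routing = shortlist(["enrich_external", "data_distribution"])
```

Here `no_disk` ranks the ingest processor ahead of the processing plugin, and `routing` leaves only data transformation. Treat the ranking as a starting point; the weights are arbitrary, and cost or latency constraints may still change the choice.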