Overview of data transformation

Simple Log Service provides a fully managed, scalable, and highly available data transformation service for data normalization, enrichment, forwarding, masking, and filtering.

Learn about data transformation

Watch the following video to learn about data transformation.

Simple Log Service provides a wide range of videos about data transformation. For more information and tutorials, see Videos for data transformation.

Video subject

Link

Transformation syntax

Syntax overview

Query string syntax

String syntax

Structured data parsing

Mapping-based enrichment

Mapping-based enrichment functions

Data forwarding

Data forwarding

Convert logs to metrics

Use data transformation to convert logs to metrics

Process special data (such as Base64 text, URL data, and IP address data)

Use data transformation to process special data

Transformation process

The data transformation service processes data in the following three steps.

  1. Reads data from the source Logstore using a consumer group.

  2. Transforms each log based on a transformation rule.

  3. Writes the transformed data to the destination Logstore.

    After the data is transformed, you can view the results in the destination Logstore.
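
The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions: in-memory lists stand in for the source and destination Logstores, the consumer group is not modeled, and the names run_transformation and add_level are hypothetical and not part of Simple Log Service.

```python
from typing import Callable, Dict, List, Optional

Log = Dict[str, str]
Rule = Callable[[Log], Optional[Log]]


def run_transformation(source: List[Log], rule: Rule, destination: List[Log]) -> None:
    """Read each log, apply the rule, and write the result."""
    for log in source:                        # step 1: read from the source
        transformed = rule(dict(log))         # step 2: transform a copy of the log
        if transformed is not None:           # a rule may drop a log by returning None
            destination.append(transformed)   # step 3: write to the destination


def add_level(log: Log) -> Log:
    """Hypothetical rule: derive a 'level' field from the raw content."""
    log["level"] = "error" if "ERROR" in log.get("content", "") else "info"
    return log


# Lists standing in for Logstores (an assumption for illustration only)
source_logstore = [{"content": "ERROR disk full"}, {"content": "request ok"}]
destination_logstore: List[Log] = []
run_transformation(source_logstore, add_level, destination_logstore)
```

Because the rule works on a copy of each log, the source data is left unchanged and only the destination receives the transformed logs.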

Features

Simple Log Service provides data transformation features that you can use to normalize, enrich, forward, mask, and filter data. The following list describes these features:

  • Data normalization: Extracts fields from logs in various formats and converts the log formats. This process produces structured data for stream processing and data warehouse computing.

  • Data enrichment: Joins fields from logs (such as order logs) and dimension tables (such as user information tables). This adds more dimensions to your logs for data analytics.

  • Data forwarding: Transfers logs from regions outside China to a central region using the cross-region acceleration feature. This helps you manage global logs in a centralized manner.

  • Data masking: Masks sensitive information in data, such as passwords, phone numbers, and addresses.

  • Data filtering: Filters logs to obtain logs from key services for focused analysis.
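
To make normalization, enrichment, masking, and filtering concrete, here is a minimal Python sketch of each operation applied to one log. It does not use the SLS DSL. The log format, the USER_TABLE dimension table, the field names, and every function name are assumptions made up for this example.

```python
import re
from typing import Dict, Optional

Log = Dict[str, str]

# Hypothetical dimension table keyed by user ID
USER_TABLE = {"u1001": {"user_name": "alice", "tier": "gold"}}

# Hypothetical raw log format: "<method> <path> <status> uid=<id> phone=<digits>"
ACCESS_PATTERN = re.compile(
    r"(?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3}) "
    r"uid=(?P<user_id>\w+) phone=(?P<phone>\d+)"
)


def normalize(log: Log) -> Log:
    """Normalization: extract structured fields and drop the raw text."""
    match = ACCESS_PATTERN.search(log.get("content", ""))
    if match:
        log.pop("content")
        log.update(match.groupdict())
    return log


def enrich(log: Log) -> Log:
    """Enrichment: join the log with the dimension table on user_id."""
    log.update(USER_TABLE.get(log.get("user_id", ""), {}))
    return log


def mask(log: Log) -> Log:
    """Masking: replace all but the last four digits of the phone number."""
    if "phone" in log:
        log["phone"] = re.sub(r"\d(?=\d{4})", "*", log["phone"])
    return log


def keep(log: Log) -> bool:
    """Filtering: keep only logs from the order service."""
    return log.get("path", "").startswith("/api/orders")


def transform(log: Log) -> Optional[Log]:
    """Apply all four operations; return None for logs that are filtered out."""
    result = mask(enrich(normalize(dict(log))))
    return result if keep(result) else None


order_log = transform({"content": "POST /api/orders 200 uid=u1001 phone=13812345678"})
health_log = transform({"content": "GET /health 200 uid=u2002 phone=13900000000"})
```

In this sketch the raw content field is removed during normalization so that the unmasked phone number does not survive in the output.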

Scenarios

  • Data normalization (one-to-one): Reads log data from a source Logstore, transforms the data, and then writes the data to a destination Logstore.

  • Data distribution (one-to-many): Reads log data from a source Logstore, transforms the data, and then writes the data to different destination Logstores.

  • Data aggregation (many-to-one): Reads log data from different source Logstores, transforms the data, and then writes the data to a destination Logstore.
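
The one-to-many and many-to-one scenarios differ only in how logs are routed. The Python sketch below illustrates both with in-memory lists in place of Logstores. The function names, the "service" field, and the "origin_logstore" field are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Log = Dict[str, str]


def distribute(source: List[Log], route: Callable[[Log], str]) -> Dict[str, List[Log]]:
    """One-to-many: send each log to the destination chosen by route()."""
    destinations: Dict[str, List[Log]] = defaultdict(list)
    for log in source:
        destinations[route(log)].append(log)
    return dict(destinations)


def aggregate(sources: Dict[str, List[Log]]) -> List[Log]:
    """Many-to-one: merge logs from several sources and tag each with its origin."""
    merged: List[Log] = []
    for name, logs in sources.items():
        for log in logs:
            merged.append({**log, "origin_logstore": name})
    return merged


source = [
    {"service": "pay", "msg": "a"},
    {"service": "cart", "msg": "b"},
    {"service": "pay", "msg": "c"},
]
by_service = distribute(source, lambda log: log["service"] + "-logs")
merged = aggregate({"web": [{"msg": "x"}], "app": [{"msg": "y"}]})
```

Tagging each merged log with its origin keeps the source recoverable after aggregation, which is a design choice of this sketch rather than a documented behavior of the service.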

Processing syntax

The SLS domain-specific language (DSL) provides more than 200 built-in functions and 400 regular expression patterns. For more information, see Syntax overview.

Benefits

  • Orchestrates various functions using the SLS DSL, including data filtering, normalization, enrichment, forwarding, and masking.

  • Processes data in real time, so transformed data becomes visible within seconds. Computing capacity scales elastically with data volume and sustains high throughput.

  • Provides out-of-the-box functions for various log analysis scenarios.

  • Integrates with real-time dashboards, exception logs, and alerting.

  • Offers a fully managed service that requires no operations and maintenance (O&M) and integrates with Alibaba Cloud big data products and open source ecosystems.

Billing

  • If your Logstore uses the pay-by-ingested-data billing mode, no fees are generated for the data transformation service. However, outbound traffic fees are generated if you pull data over a Simple Log Service public endpoint. These fees are calculated based on the data volume after compression. For more information, see Billable items in pay-by-ingested-data mode.

  • If your Logstore uses the pay-by-feature billing mode, fees are generated because the data transformation service consumes machine and network resources. For more information, see Billable items in pay-by-feature mode.

  • To save costs, you can disable the indexing feature for the source Logstore and set a shorter data retention period. For more information, see Performance guide and Cost optimization guide.