Data transformation FAQ (old version)

After a data transformation task starts, the transformation results are sent to the destination Logstores based on routing rules. This topic explains how to troubleshoot a task that does not behave as expected, for example, when no logs appear in the destination Logstores or a significant delay occurs.

Error analysis

When an error occurs, identifying the stage in which it occurred helps you locate the issue more efficiently.

As described in How it works, a data transformation task consists of four main stages.

Errors can occur in any of these stages. The causes, impacts, and troubleshooting methods vary for each stage.

  • Start the data transformation engine.

    • Errors at this stage primarily occur when the data transformation engine detects issues in your DSL rules during startup, causing internal security checks to fail.

    • If an error occurs at this stage, the data transformation task stops. You must modify the DSL rules and restart the task. If the restart is successful, the task resumes normal operation without data loss or redundancy.

    For information about how to troubleshoot errors at this stage, see Data transformation engine startup errors.

  • Read data from the source Logstore.

    • Errors at this stage can be caused by issues accessing the source Logstore, which can result from misconfigurations, network problems, or changes to the Logstore's settings.

    • The data transformation task continuously retries until it succeeds or is manually stopped. Upon success, the task resumes normal operation without data loss.

    • If an error occurs after some data has already been read, the service saves a checkpoint and keeps retrying. After a successful retry, it continues reading from the checkpoint, preventing data loss or duplication. If the task is stopped during the retry process, no data is lost or duplicated.

    For information about how to troubleshoot errors at this stage, see Source Logstore read errors.

  • Transform log events.

    • Errors at this stage mainly occur when some or all log events are incompatible with the transformation rules.

    • Errors at this stage are classified as WARNING or ERROR level. The level is indicated by the logging.levelname field in the data transformation logs.

      • For ERROR-level errors, the corresponding log event is dropped and will not be included in the transformed output.

      • For WARNING-level errors, such as when an event does not match a regular expression rule, the current DSL step is skipped for that event, and processing continues to the next step.

    For information about how to troubleshoot errors at this stage, see Transformation rule errors.

  • Export to the destination Logstore.

    • Errors at this stage can be caused by issues accessing the destination Logstore, which can result from misconfigurations, network problems, or changes to the Logstore's settings.

    • The data transformation task continuously retries until it succeeds or is manually stopped. Upon success, the task resumes normal operation without data loss.

    • If an error occurs after some data has already been exported, for example, when you have two destinations and one succeeds while the other fails, the task saves a checkpoint and keeps retrying. After a successful retry, no data is lost or duplicated. However, if you stop and restart the task during this process, it will resume from the last checkpoint. In this case, no data is lost, but data redundancy may occur.

    For information about how to troubleshoot errors at this stage, see Troubleshoot write errors for a destination Logstore.
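
The export-stage behavior described above (no duplication when the task retries on its own, possible redundancy after a stop and restart) can be illustrated with a toy model. The following Python sketch is not the service's implementation. The `Destination` class, the `export_batch` function, and the idea that per-destination progress lives only in memory are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Destination:
    """Hypothetical in-memory stand-in for a destination Logstore."""
    name: str
    failures_left: int = 0  # number of writes that fail before one succeeds
    events: list = field(default_factory=list)

    def write(self, batch):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ConnectionError(f"write to {self.name} failed")
        self.events.extend(batch)


def export_batch(batch, destinations, done=None):
    """Write `batch` to every destination whose name is not in `done`.

    Returns the set of destination names that now hold the batch.
    `done` models progress that survives a retry but not a restart
    (an assumption of this sketch).
    """
    done = set(done or ())
    for dest in destinations:
        if dest.name in done:
            continue
        try:
            dest.write(batch)
            done.add(dest.name)
        except ConnectionError:
            pass  # leave this destination for the next attempt
    return done


def run(restart_between_attempts):
    a = Destination("dest-a")
    b = Destination("dest-b", failures_left=1)
    batch = ["event-1", "event-2"]

    done = export_batch(batch, [a, b])        # first attempt: a succeeds, b fails
    if restart_between_attempts:
        done = set()                          # restart: resume from the checkpoint only
    export_batch(batch, [a, b], done)         # second attempt
    return a.events, b.events
```

In this model, `run(False)` leaves each destination with one copy of the batch, while `run(True)` writes the batch to `dest-a` twice, which is the kind of redundancy the text warns about.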

General troubleshooting

  1. Verify that data is being written to the destination Logstore.

    To check whether data has been recently written to a destination Logstore, view the data on the consumption preview page of that Logstore.

    Note

    Querying the Logstore directly may not be accurate for the following reasons:

    • Data transformation processes logs based on their reception time. If you process historical logs, the transformed logs may not fall within the time range of your query.

    • You can query historical logs only after they are indexed, which typically takes several minutes. Therefore, data from a task that writes historical logs may not be immediately queryable.

  2. Check the status of the data transformation task.

    • Check the progress of the task to see if it has started. For more information, see View the status of a data transformation task. Tasks with a fixed time range stop automatically after reaching the end time.

    • Check the consumer group associated with the task. Ensure that it is enabled and that its consumption status is updating.

    • See Viewing error logs to check for exceptions. Then, use the information in Error analysis to find the cause and resolve the issue.

  3. Check if the source Logstore is generating data.

    Verify that logs exist in the source Logstore within the time range of the data transformation task.

    • If no end time is configured, check if new logs are being generated in the source Logstore. The data transformation task cannot proceed if there are no new logs and no historical logs in the specified time range.

    • If you selected a historical time range, verify that logs exist in the source Logstore within that range.

    Open the task's transformation rule (see Modify the transformation rule of a data transformation task), select the relevant time range, and check whether raw logs are present.

  4. Check for issues in the transformation rule.

    Check your transformation rule for code that may cause issues. Examples:

    • The rule modifies the log time, causing the log to fall outside the query time range.

    • The transformation rule drops logs under certain conditions.

      For example, the following code discards all logs where the name field does not exist or is empty. The preceding logic is responsible for creating the name field. If the preceding logic has an issue and the name field is not created correctly, no logs will be produced.

      # ... Preceding logic ...
      # ... Construct the name field ...

      # Keep only events that match the search on the name field; all other events are dropped.
      e_keep(e_search('name: "?"'))

    • If your logic pulls data from a third-party source for enrichment, check if the third-party dataset is too large. This can cause the data transformation task to remain in an initializing state for a long time before it starts consuming data. Example:

      e_dict_map(res_rds_mysql(..database="userinfo", table="user"), "username", ["city", "school", "age"])

    Open the task's transformation rule (see Modify the transformation rule of a data transformation task), select a time range, and then click Preview Data to view the result.

    If you can reproduce the issue, you can debug it by commenting out specific statements and previewing the result again.

  5. Confirm that the number of shards meets expectations.

    If you find that data transformation is too slow, consider whether the configurations of your source and destination Logstores meet performance requirements. We recommend adjusting the number of shards in the source or destination Logstore.
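
The rule check in step 4 can also be reproduced offline on a handful of sample events. The following sketch uses plain Python rather than the transformation DSL, and every name in it (`build_name`, `keep_if_named`, `dropped_ratio`, and the `first` and `last` fields) is hypothetical. It only shows how a keep-style filter produces no output when the upstream step fails to create the field it depends on.

```python
def build_name(event):
    """Hypothetical upstream step: derive `name` from `first` and `last`."""
    out = dict(event)
    if "first" in event and "last" in event:
        out["name"] = f"{event['first']} {event['last']}"
    return out


def keep_if_named(events):
    """Keep only events whose `name` field exists and is non-empty."""
    return [e for e in events if e.get("name")]


def dropped_ratio(source_events):
    """Return the fraction of source events that the rule drops."""
    if not source_events:
        return 0.0
    transformed = [build_name(e) for e in source_events]
    kept = keep_if_named(transformed)
    return 1 - len(kept) / len(source_events)
```

A ratio of 1.0 on representative sample events suggests that the filter, or the logic that feeds it, is the reason the destination receives no logs.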

Viewing error logs

You can view error logs by using the following methods.

  • View the data in the internal-etl-log Logstore.

    Logs generated by data transformation tasks are stored in the Logstore internal-etl-log. This Logstore is automatically created by the system after a data transformation task is executed.

    • internal-etl-log is a dedicated Logstore that is free of charge. You cannot modify its configuration or write other data to it.

    • In internal-etl-log, the __topic__ field of each log event displays the status of the data transformation task. You can use this field to determine whether an error has occurred in the corresponding data transformation task.

    • You can view the specific error information in the message and reason fields of each log event, as shown in the following figure.

      [Figure: Logstore error information]

  • View errors on the dashboard.

    Click the target data transformation task and view the dashboard in the Execution Status area on the Data Transformation Overview page.

    The detailed error message is in the reason column of the exception details, as shown in the figure below.

    [Figure: reason column of the exception details]

  • View errors in the console.

    The Log Service console displays error logs from the preview phase directly. The preview phase simulates the transformation rule's operations and shows the expected result without making actual changes to the source or destination Logstore. Therefore, any errors encountered during the preview phase do not affect your source log events.
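
If log events from internal-etl-log are available locally, a short script can surface the WARNING and ERROR entries. The following Python sketch assumes each event is a dict that carries the logging.levelname, message, and reason fields mentioned in this topic. How the events are fetched is out of scope, and the sample records are made up for illustration.

```python
from collections import Counter

# Levels that this topic describes as transformation problems.
PROBLEM_LEVELS = ("WARNING", "ERROR")


def summarize_errors(records):
    """Count records per level and collect (level, message, reason) for problems.

    `records` is assumed to be an iterable of dicts, one per log event
    read from internal-etl-log.
    """
    records = list(records)
    counts = Counter(r.get("logging.levelname", "UNKNOWN") for r in records)
    problems = [
        (r["logging.levelname"], r.get("message", ""), r.get("reason", ""))
        for r in records
        if r.get("logging.levelname") in PROBLEM_LEVELS
    ]
    return counts, problems


# Made-up records for illustration only; real messages and reasons will differ.
sample = [
    {"logging.levelname": "INFO", "message": "task running"},
    {"logging.levelname": "WARNING", "message": "step skipped", "reason": "regex did not match"},
    {"logging.levelname": "ERROR", "message": "event dropped", "reason": "field missing"},
]
counts, problems = summarize_errors(sample)
```

Grouping by level first makes it easier to tell whether events are being dropped (ERROR) or whether individual steps are only being skipped (WARNING).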

Preview limitations

Data transformation in the preview phase has some limitations compared to an actual data transformation task.

  • The preview phase cannot detect permission issues with the source Logstore's credentials.

    The preview phase does not create a consumer group to consume data, so it does not check for consumer group permissions.

  • The preview phase cannot detect incorrect destination names in the transformation rule.

    The preview phase does not perform an actual write operation to the destination, so it does not check if the configured destination exists.

  • The preview phase cannot detect configuration errors for the destination.

    • This includes misconfigurations of the destination Project, Logstore, or AccessKey permissions.

    • The preview phase does not perform an actual write operation to the destination, so it does not check if the destination's configuration is correct.

  • The preview does not cover all data.

    • By default, the preview phase pulls only 1,000 log events from the source Logstore for transformation.

    • If the first 1,000 log events do not produce any transformation results, the service continues to pull data for up to five minutes until a result is generated.
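
The sampling behavior described in the last limitation can be sketched as follows. This is a simplified Python model, not the service's code: the `preview` function, the batch-by-batch pulling, and the convention that `transform` returns `None` for a dropped event are assumptions. Only the 1,000-event default and the five-minute limit come from the text.

```python
import itertools
import time

PREVIEW_BATCH = 1000          # from the text: events pulled by default
PREVIEW_TIMEOUT_S = 5 * 60    # from the text: keep pulling for up to five minutes


def preview(source, transform, clock=time.monotonic):
    """Return transformed events from a limited sample of `source`.

    `source` is any iterable of events. `transform` returns the transformed
    event, or None if the event is dropped (a convention of this sketch).
    """
    source = iter(source)
    deadline = clock() + PREVIEW_TIMEOUT_S

    # First pass: transform only the first batch of events.
    first_batch = itertools.islice(source, PREVIEW_BATCH)
    results = [out for out in map(transform, first_batch) if out is not None]

    # If nothing came out, keep pulling until a result appears,
    # the source runs dry, or the time limit is reached.
    while not results and clock() < deadline:
        batch = list(itertools.islice(source, PREVIEW_BATCH))
        if not batch:
            break
        results.extend(out for out in map(transform, batch) if out is not None)
    return results
```

Because the preview stops as soon as it has any result, a rule that fails only on rare events can pass the preview and still produce errors in the running task.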