Data transformation for text in specific formats

The use cases in this document are based on requirements from actual support tickets. This document shows how to use Log Service Domain Specific Language (DSL) orchestration to meet these requirements.

Convert non-standard JSON objects to JSON objects and expand them

A dictionary-style string that uses single quotes around keys and values is not valid JSON. To expand such nested data, first convert it to valid JSON, and then use the e_json function to expand it.

  • Raw log

    content: {
      'referer': '-',
      'request': 'GET /phpMyAdmin',
      'status': 404,
      'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
      },
      'data-2': {
        'up_adde': '-',
        'up_host': '-'
      }
    }
  • Data transformation statement

    1. You can convert the single quotes in the content field to double quotes to transform the data into the JSON format.

      e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))

      The processed log is as follows:

      content: {
        'referer': '-',
        'request': 'GET /phpMyAdmin',
        'status': 404,
        'data-1': {
          'aaa': 'Mozilla',
          'bbb': 'asde'
        },
        'data-2': {
          'up_adde': '-',
          'up_host': '-'
        }
      }
      content_json:  {
        "referer": "-",
        "request": "GET /phpMyAdmin",
        "status": 404,
        "data-1": {
          "aaa": "Mozilla",
          "bbb": "asde"
        },
        "data-2": {
          "up_adde": "-",
          "up_host": "-"
        }
      }
    2. You can expand the standardized content_json data. For example, to expand only the first layer, set the depth parameter to 1.

      e_json("content_json",depth=1,fmt='full')

      The expanded log is as follows:

      content_json.data-1.data-1:  {"aaa": "Mozilla", "bbb": "asde"}
      content_json.data-2.data-2:  {"up_adde": "-", "up_host": "-"}
      content_json.referer:  -
      content_json.request:  GET /phpMyAdmin
      content_json.status:  404

      If depth is set to 2, the expanded log is as follows:

      content_json.data-1.aaa:  Mozilla
      content_json.data-1.bbb:  asde
      content_json.data-2.up_adde:  -
      content_json.data-2.up_host:  -
      content_json.referer:  -
      content_json.request:  GET /phpMyAdmin
      content_json.status:  404
    3. In summary, you can use the following Log Service DSL rules:

      e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
      e_json("content_json",depth=2,fmt='full')
  • Processed data

    The following example shows the transformed data when depth is set to 2:

    content:  {
      'referer': '-',
      'request': 'GET /phpMyAdmin',
      'status': 404,
      'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
      },
      'data-2': {
        'up_adde': '-',
        'up_host': '-'
      }
    }
    content_json:  {
      "referer": "-",
      "request": "GET /phpMyAdmin",
      "status": 404,
      "data-1": {
        "aaa": "Mozilla",
        "bbb": "asde"
      },
      "data-2": {
        "up_adde": "-",
        "up_host": "-"
      }
    }
    content_json.data-1.aaa:  Mozilla
    content_json.data-1.bbb:  asde
    content_json.data-2.up_adde:  -
    content_json.data-2.up_host:  -
    content_json.referer:  -
    content_json.request:  GET /phpMyAdmin
    content_json.status:  404
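
To check the two steps outside Log Service, they can be approximated in plain Python. This is a minimal sketch, not the DSL implementation: the helper name flatten is hypothetical, and it expands every nesting level, which for this two-level sample corresponds to the depth=2 output above.

```python
import json


def flatten(value, prefix, sep="."):
    """Recursively expand nested dicts into full-path keys (hypothetical helper)."""
    if not isinstance(value, dict):
        return {prefix: value}
    flat = {}
    for key, child in value.items():
        flat.update(flatten(child, f"{prefix}{sep}{key}", sep))
    return flat


# The raw content field: a dict-style string with single quotes.
content = (
    "{'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, "
    "'data-1': {'aaa': 'Mozilla', 'bbb': 'asde'}, "
    "'data-2': {'up_adde': '-', 'up_host': '-'}}"
)

# Step 1: replace single quotes with double quotes to get valid JSON.
content_json = content.replace("'", '"')

# Step 2: parse the JSON and expand it into full-path keys.
expanded = flatten(json.loads(content_json), "content_json")

for key in sorted(expanded):
    print(f"{key}:  {expanded[key]}")
```

Blind quote replacement produces invalid JSON when a value itself contains an apostrophe, so this approach fits logs whose values are known not to contain single quotes.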

Expand text in other formats to JSON

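The raw log below separates keys and values with => (the Logstash configuration style), so it is not valid JSON. To expand such data, combine two rules: normalize the text to JSON, and then expand the JSON.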

  • Raw log

    content : {
      "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
      }, "node" => {
        "name" => "tw5"
      }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
      }, "container" => {
        "name" => "crm-learning-follow"
      }, "namespace" => "testing1"
    }
  • Data transformation statement

    1. You can convert the log to the JSON format using the str_logstash_config_normalize function.

      e_set("normalize_data",str_logstash_config_normalize(v("content")))
    2. You can use the e_json function to expand the data.

      e_json("normalize_data",depth=1,fmt='full')
    3. In summary, you can use the following Log Service DSL rules:

      e_set("normalize_data",str_logstash_config_normalize(v("content")))
      e_json("normalize_data",depth=1,fmt='full')
  • Processed data

    content : {
      "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
      }, "node" => {
        "name" => "tw5"
      }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
      }, "container" => {
        "name" => "crm-learning-follow"
      }, "namespace" => "testing1"
    }
    normalize_data:  {
      "pod": {
        "name": "crm-learning-follow-7bc48f8b6b-m6kgb"
      },
      "node": {
        "name": "tw5"
      },
      "labels": {
        "pod-template-hash": "7bc48f8b6b",
        "app": "crm-learning-follow"
      },
      "container": {
        "name": "crm-learning-follow"
      },
      "namespace": "testing1"
    }
    normalize_data.container.container:  {"name": "crm-learning-follow"}
    normalize_data.labels.labels:  {"pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow"}
    normalize_data.namespace:  testing1
    normalize_data.node.node:  {"name": "tw5"}
    normalize_data.pod.pod:  {"name": "crm-learning-follow-7bc48f8b6b-m6kgb"}
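
The normalization step can be approximated in plain Python to see what it does to the text. This is a rough sketch under a simplifying assumption: every => that follows a double quote is a key/value separator. The helper name normalize_hash_rockets is hypothetical, and the built-in DSL function may handle more cases than this.

```python
import json
import re


def normalize_hash_rockets(text):
    """Rewrite `"key" => value` as `"key": value` (naive, hypothetical helper)."""
    # Replace a double quote, optional whitespace, and `=>` with `": `.
    return re.sub(r'"\s*=>\s*', '": ', text)


raw = '''{
  "pod" => {"name" => "crm-learning-follow-7bc48f8b6b-m6kgb"},
  "node" => {"name" => "tw5"},
  "labels" => {"pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"},
  "container" => {"name" => "crm-learning-follow"},
  "namespace" => "testing1"
}'''

# After normalization the text is valid JSON and can be parsed directly.
normalize_data = json.loads(normalize_hash_rockets(raw))
print(json.dumps(normalize_data, indent=2))
```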

Convert text with special encoding

Logs sometimes contain text stored as hexadecimal escape sequences, such as \xe4\xbd\xa0, which is unreadable as-is. You can use the str_hex_escape_encode function to convert these escape sequences into readable characters.

  • Raw log

    content : "\xe4\xbd\xa0\xe5\xa5\xbd"
  • Log Service DSL orchestration

    e_set("hex_encode",str_hex_escape_encode(v("content")))
  • Processed data

    content : "\xe4\xbd\xa0\xe5\xa5\xbd"
    hex_encode : "你好"
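
The sample value is the UTF-8 byte sequence for the two Chinese characters 你好 (a greeting meaning "hello"), written as \x escapes. The following Python sketch shows the same conversion outside Log Service. The helper name decode_hex_escapes is hypothetical, and it assumes the escaped bytes are UTF-8.

```python
import re


def decode_hex_escapes(text):
    """Decode runs of literal \\xNN escapes as UTF-8 (hypothetical helper)."""

    def replace_run(match):
        # Strip the "\x" prefixes to leave plain hex digits, e.g. "e4bda0".
        hex_digits = match.group(0).replace("\\x", "")
        return bytes.fromhex(hex_digits).decode("utf-8", errors="replace")

    # Match one or more consecutive \xNN escapes so multi-byte characters
    # are decoded together.
    return re.sub(r"(?:\\x[0-9a-fA-F]{2})+", replace_run, text)


content = r"\xe4\xbd\xa0\xe5\xa5\xbd"
hex_encode = decode_hex_escapes(content)
print(hex_encode)
```

Text outside the escape runs is left unchanged, so the helper also works on fields that mix plain text and escaped bytes.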

Expand XML fields

You may encounter different data types, such as XML. To expand XML data, you can first convert it to JSON using the xml_to_json function.

  • Test log

    str : <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
        <country name="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>
  • Log Service DSL orchestration

    e_set("str_json",xml_to_json(v("str")))
  • Log after transformation

    str : <?xml version="1.0"?>
    <data>
        <country name="Liechtenstein">
            <rank>1</rank>
            <year>2008</year>
            <gdppc>141100</gdppc>
            <neighbor name="Austria" direction="E"/>
            <neighbor name="Switzerland" direction="W"/>
        </country>
        <country name="Singapore">
            <rank>4</rank>
            <year>2011</year>
            <gdppc>59900</gdppc>
            <neighbor name="Malaysia" direction="N"/>
        </country>
        <country name="Panama">
            <rank>68</rank>
            <year>2011</year>
            <gdppc>13600</gdppc>
            <neighbor name="Costa Rica" direction="W"/>
            <neighbor name="Colombia" direction="E"/>
        </country>
    </data>
    str_json : {
      "data": {
        "country": [{
          "@name": "Liechtenstein",
          "rank": "1",
          "year": "2008",
          "gdppc": "141100",
          "neighbor": [{
            "@name": "Austria",
            "@direction": "E"
          }, {
            "@name": "Switzerland",
            "@direction": "W"
          }]
        }, {
          "@name": "Singapore",
          "rank": "4",
          "year": "2011",
          "gdppc": "59900",
          "neighbor": {
            "@name": "Malaysia",
            "@direction": "N"
          }
        }, {
          "@name": "Panama",
          "rank": "68",
          "year": "2011",
          "gdppc": "13600",
          "neighbor": [{
            "@name": "Costa Rica",
            "@direction": "W"
          }, {
            "@name": "Colombia",
            "@direction": "E"
          }]
        }]
      }
    }
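
The shape of the converted output (attributes prefixed with @, repeated child elements collected into a list, a single child kept as an object) can be reproduced with Python's standard library. This is an illustrative sketch only: the helper name element_to_value and the #text key for mixed content are assumptions, not the behavior of xml_to_json.

```python
import json
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Convert an element to a dict or string (hypothetical helper)."""
    # Attributes become "@name" keys.
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_value(child)
        if child.tag not in node:
            node[child.tag] = value
        elif isinstance(node[child.tag], list):
            node[child.tag].append(value)
        else:
            # Second occurrence of the same tag: promote to a list.
            node[child.tag] = [node[child.tag], value]
    text = (elem.text or "").strip()
    if not node:
        return text  # leaf element that holds only text
    if text:
        node["#text"] = text  # assumed key for text next to attributes/children
    return node


xml_text = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(xml_text)
result = {root.tag: element_to_value(root)}
print(json.dumps(result, indent=2))
```

Because a single neighbor element becomes an object while several become a list, downstream rules that read this field need to handle both shapes.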