Data compression

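LZ4 is a lossless compression algorithm that compresses and decompresses at high speed. Some Log Service API operations support LZ4 compression. Using it reduces network traffic, lowers traffic fees, and speeds up API access.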

Compress request data

The following Log Service API operations accept LZ4-compressed data in the HTTP request body:

  • PutLogs(PutLogStoreLogs)

  • PutWebtracking

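To send compressed data, perform the following steps:

  1. Add x-log-compresstype: lz4 to the HTTP request headers.

  2. Compress the HTTP request body with the LZ4 algorithm.

  3. Set the x-log-bodyrawsize request header to the size of the request body before compression.

  4. Set the Content-Length request header to the size of the request body after compression.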
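The four steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the SDK's implementation: build_compressed_request and lz4_block_compress are hypothetical helper names, the compressor is passed in as a parameter so that the header logic can be exercised on its own, and request signing and the other headers that Log Service requires are omitted.

```python
from typing import Callable, Dict, Tuple


def lz4_block_compress(data: bytes) -> bytes:
    """Compress with the raw LZ4 block format (requires `pip install lz4`)."""
    from lz4 import block  # imported lazily so the helper below has no hard dependency

    # store_size=False: do not prepend the original size to the output;
    # the size travels in the x-log-bodyrawsize header instead.
    return block.compress(data, store_size=False)


def build_compressed_request(
    raw_body: bytes,
    compress: Callable[[bytes], bytes],
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for a request whose body is LZ4-compressed.

    Hypothetical helper: authentication and other required headers are omitted.
    """
    compressed = compress(raw_body)  # step 2
    headers = {
        "x-log-compresstype": "lz4",              # step 1
        "x-log-bodyrawsize": str(len(raw_body)),  # step 3: size before compression
        "Content-Length": str(len(compressed)),   # step 4: size after compression
    }
    return headers, compressed
```

For example, `headers, body = build_compressed_request(payload, lz4_block_compress)` returns the three headers together with the compressed bytes to send as the request body, where payload stands for the already serialized request body.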

Receive compressed data

The PullLogs operation of Log Service can return data in LZ4-compressed format.

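Usage:

  • Set Accept-Encoding: lz4 in the request headers. The server then returns LZ4-compressed data.

  • The x-log-bodyrawsize response header gives the original size of the response body before compression. You can pass this value to the decompressor.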
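A minimal Python sketch of the receiving side, under similar assumptions: pull_request_headers, decompress_pulled_body, and lz4_block_decompress are hypothetical helper names, the decompressor is passed in as a parameter, and the HTTP call itself is left out.

```python
from typing import Callable, Dict, Mapping


def lz4_block_decompress(data: bytes, raw_size: int) -> bytes:
    """Decompress a raw LZ4 block (requires `pip install lz4`)."""
    from lz4 import block  # imported lazily so the helpers below have no hard dependency

    return block.decompress(data, uncompressed_size=raw_size)


def pull_request_headers() -> Dict[str, str]:
    """Header that asks the server for an LZ4-compressed response."""
    return {"Accept-Encoding": "lz4"}


def decompress_pulled_body(
    response_headers: Mapping[str, str],
    body: bytes,
    decompress: Callable[[bytes, int], bytes],
) -> bytes:
    """Decompress a response body using the size from x-log-bodyrawsize."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in response_headers.items()}
    raw_size = int(normalized["x-log-bodyrawsize"])
    data = decompress(body, raw_size)
    if len(data) != raw_size:
        raise ValueError(f"expected {raw_size} bytes, got {len(data)}")
    return data
```

With the lz4 package installed, `decompress_pulled_body(resp_headers, resp_body, lz4_block_decompress)` returns the original bytes, where resp_headers and resp_body stand for whatever your HTTP client returned.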

Example

  • Raw logs

    The content of log-sample.json serves as a reference example only. When you call API operations to access Log Service, use the actual data structure.

    {
      "__tags__": {},
      "__topic__": "",
      "__source__": "47.100.XX.XX",
      "__logs__": [
        {
          "__time__": "03/22 08:51:01",
          "content": "*************** RSVP Agent started ***************",
          "method": "main",
          "level": "INFO"
        },
        {
          "__time__": "03/22 08:51:01",
          "content": "Specified configuration file: /u/user10/rsvpd1.conf",
          "method": "locate_configFile",
          "level": "INFO"
        },
        {
          "__time__": "03/22 08:51:01",
          "content": "Using log level 511",
          "method": "main",
          "level": "INFO"
        },
        {
          "__time__": "03/22 08:51:01",
          "content": "Get TCP images rc - EDC8112I Operation not supported on socket",
          "method": "settcpimage",
          "level": "INFO"
        },
        {
          "__time__": "03/22 08:51:01",
          "content": "Associate with TCP/IP image name = TCPCS",
          "method": "settcpimage",
          "level": "INFO"
        },
        {
          "__time__": "03/22 08:51:02",
          "content": "registering process with the system",
          "method": "reg_process",
          "level": "INFO"
        },
        {
          "__time__": "03/22 08:51:02",
          "content": "attempt OS/390 registration",
          "method": "reg_process",
          "level": "INFO"
        },
        {
          "__time__": "03/22 08:51:02",
          "content": "return from registration rc=0",
          "method": "reg_process",
          "level": "INFO"
        }
      ]
    }
  • Test program

    The following Python code shows the compression process:

    from lz4 import block
    with open('log-sample.json', 'rb') as f:
        data = f.read()
        compressed = block.compress(data, store_size=False)  # Compress.
        print(f'out/in: {len(compressed)}/{len(data)} Bytes')
        print(f'Compression ratio: {len(compressed)/len(data):.2%}')
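  • Compression result

    The test result for the sample file log-sample.json is shown below. The compression ratio is 39.30%, that is, the compressed size is 39.30% of the original size. The actual result depends on the file content. In general, data with a high proportion of repeated content compresses better. Use your own measurements as the reference.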

    out/in: 542/1379 Bytes
    Compression ratio: 39.30%

Sample code

  • Go example

    • Install the dependency

      go get github.com/cloudflare/golz4
    • Sample code

      package main
      
      import (
          "fmt"
          "log"
      
          lz4 "github.com/cloudflare/golz4"
      )
      
      func main() {
          data := []byte("hello world, hello golang")
          // Compress.
          compressed := make([]byte, lz4.CompressBound(data))
          compressedSize, err := lz4.Compress(data, compressed)
          if err != nil {
              log.Fatal(err)
          }
          compressed = compressed[:compressedSize]
      
          // Decompress.
          bodyRawSize := len(data) // When decompressing data returned by Log Service, read this value from the x-log-bodyrawsize response header.
          decompressed := make([]byte, bodyRawSize)
          err = lz4.Uncompress(compressed, decompressed)
          if err != nil {
              log.Fatal(err)
          }
          fmt.Println(string(decompressed))
      }
  • Python example

    • Install the dependency

      python3 -m pip install lz4
    • Sample code

      from lz4 import block
      
      data = b'hello world, hello sls'
      # Compress.
      compressed = block.compress(data, store_size=False)
      
      # Decompress.
      body_raw_size = len(data)  # When decompressing data returned by Log Service, read this value from the x-log-bodyrawsize response header.
      decompressed = block.decompress(compressed, uncompressed_size=body_raw_size)
  • Java example

    • Add the Maven dependency

      <dependency>
        <groupId>net.jpountz.lz4</groupId>
        <artifactId>lz4</artifactId>
        <version>1.3.0</version>
      </dependency>
    • Sample code

      package sample;
      
      import net.jpountz.lz4.LZ4Compressor;
      import net.jpountz.lz4.LZ4Factory;
      import net.jpountz.lz4.LZ4FastDecompressor;
      
      import java.nio.charset.StandardCharsets;
      
      public class Sample {
          public static void main(String[] args) {
              byte[] data = "hello world, hello sls".getBytes(StandardCharsets.UTF_8);
              // Compress.
              LZ4Compressor compressor = LZ4Factory.fastestInstance().fastCompressor();
              int maxLen = compressor.maxCompressedLength(data.length);
              byte[] buffer = new byte[maxLen];
              int compressedSize = compressor.compress(data, 0, data.length, buffer, 0, maxLen);
              // Copy the compressed bytes from buffer into an array of the exact size.
              byte[] compressed = new byte[compressedSize];
              System.arraycopy(buffer, 0, compressed, 0, compressedSize);
      
              // Decompress.
              int bodyRawSize = data.length; // When decompressing data returned by Log Service, read this value from the x-log-bodyrawsize response header.
              LZ4FastDecompressor decompressor = LZ4Factory.fastestInstance().fastDecompressor();
              byte[] decompressed = new byte[bodyRawSize];
              decompressor.decompress(compressed, 0, decompressed, 0, bodyRawSize);
          }
      }
  • JavaScript example

    • Use npm or yarn to install the dependencies buffer and lz4.

      npm install buffer lz4
    • Sample code

      import lz4 from 'lz4'
      import { Buffer } from 'buffer'
      
      // Compress.
      const data = 'hello world, hello sls'
      const input = Buffer.from(data) // Sizes are measured in bytes, not in string characters.
      const output = Buffer.alloc(lz4.encodeBound(input.length))
      const compressedSize = lz4.encodeBlock(input, output)
      const compressed = Uint8Array.prototype.slice.call(output, 0, compressedSize)
      
      // Decompress.
      const bodyRawSize = input.length // When decompressing data returned by Log Service, read this value from the x-log-bodyrawsize response header.
      const decompressed = Buffer.alloc(bodyRawSize)
      lz4.decodeBlock(Buffer.from(compressed), decompressed)
      const result = decompressed.toString()
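  • C++ example

    • Copy the lz4 and lib directories from the root of the C++ SDK repository to your target directory. When compiling, add the lz4 library search path (for example, -L./lib) and link against lz4 (for example, -llz4). For more information, see Install the C++ SDK.

      g++ -o your_program your_program.cpp -std=gnu++11 -llz4 -L./lib/
    • Sample code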

      #include "lz4/lz4.h"
      #include <string>
      #include <iostream>
      using namespace std;
      
      int main()
      {
          string data = "hello sls, hello lz4";
          // Compress.
          string compressed;
          compressed.resize(LZ4_compressBound(data.size()));
          int compressed_size = LZ4_compress(data.c_str(), &compressed[0], data.size());
          compressed.resize(compressed_size);
      
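          // Decompress. When decompressing data returned by Log Service, read bodyRawSize from the x-log-bodyrawsize response header.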
          string decompressed;
          int bodyRawSize = data.size();
          decompressed.resize(bodyRawSize);
          LZ4_decompress_safe(compressed.c_str(), &decompressed[0], compressed.size(), bodyRawSize);
          cout << decompressed << endl;
          return 0;
      }