TSQL10分钟快速入门

我们现在以一个机房性能监控为应用场景,通过一个工具产生的样本数据,来展示如何TSQL来完成时序查询。

样本数据

这个样本数据使用了一个时序性能测试工具benchmark(https://github.com/influxdata/influxdb-comparisons)中的样本数据生成工具。在安装并编译这个工具后,你可以使用下面两步生成数据并加载到TSDB引擎中。

生成数据

cd influxdb-comparisons/cmds
bulk_data_gen/bulk_data_gen --seed=123 --use-case=devops --scale-var=10 --format=opentsdb --timestamp-start="2019-03-01T00:00:00Z" --timestamp-end="2019-03-01T00:10:00Z" > tsdb_devops_sf10_10m_seed123.json

下面显示了部分样本数据:

{"metric":"redis.evicted_keys","timestamp":1551398990000,"tags":{"arch":"x86","datacenter":"us-east-1b","hostname":"host_9","os":"Ubuntu16.10","port":"1470","rack":"7","region":"us-east-1","server":"redis_29176","service":"14","service_environment":"production","service_version":"0","team":"LON"},"value":2951}
{"metric":"redis.keyspace_hits","timestamp":1551398990000,"tags":{"arch":"x86","datacenter":"us-east-1b","hostname":"host_9","os":"Ubuntu16.10","port":"1470","rack":"7","region":"us-east-1","server":"redis_29176","service":"14","service_environment":"production","service_version":"0","team":"LON"},"value":2945}
{"metric":"redis.keyspace_misses","timestamp":1551398990000,"tags":{"arch":"x86","datacenter":"us-east-1b","hostname":"host_9","os":"Ubuntu16.10","port":"1470","rack":"7","region":"us-east-1","server":"redis_29176","service":"14","service_environment":"production","service_version":"0","team":"LON"},"value":2944}
{"metric":"redis.instantaneous_ops_per_sec","timestamp":1551398990000,"tags":{"arch":"x86","datacenter":"us-east-1b","hostname":"host_9","os":"Ubuntu16.10","port":"1470","rack":"7","region":"us-east-1","server":"redis_29176","service":"14","service_environment":"production","service_version":"0","team":"LON"},"value":65}
{"metric":"redis.instantaneous_input_kbps","timestamp":1551398990000,"tags":{"arch":"x86","datacenter":"us-east-1b","hostname":"host_9","os":"Ubuntu16.10","port":"1470","rack":"7","region":"us-east-1","server":"redis_29176","service":"14","service_environment":"production","service_version":"0","team":"LON"},"value":58}

加载数据

cat tsdb_devops_sf10_10m_seed123.json | bulk_load_opentsdb/bulk_load_opentsdb --urls=http://your_tsdb_host:port_num -workers=5

查询

  • 时间范围查询:查看一个metric在一个时间段内的所有的列,包括值,时间戳,以及对应的tag key的值。

select *
from tsdb.`cpu.usage_system` 
where `timestamp` between '2019-03-01 00:00:00' and '2019-03-01 00:00:10'
  • 时间范围查询:查询一个metric在一个时间段内的指定的列,包括值,时间戳,以及具体某几个tag key的值。

select `value`, `timestamp`, hostname,  datacenter
from tsdb.`cpu.usage_system` 
where `timestamp` between '2019-03-01 00:00:00' and '2019-03-01 00:00:10'
  • 时间范围+tagkey条件查询:查询一个metric在时间段内的值,时间戳, 并且hostname满足IN-LIST的条件。

select `value`, `timestamp`, hostname,  datacenter
from tsdb.`cpu.usage_system` 
where `timestamp` between '2019-03-01 00:00:00' and '2019-03-01 00:00:10' and
      hostname in ('host_0', 'host_2', 'host_4')
  • 查询结果按时间戳排序:查看一个metric在一个时间段内的值,时间戳,以及对应的tag key的值。

select *
from tsdb.`cpu.usage_system` 
where `timestamp` between '2019-03-01 00:00:00' and '2019-03-01 00:00:10'
order by `timestamp`
  • 含数学计算表达式的值过滤条件查询:查看一个metric在一个时间段内的值,时间戳,以及对应的tag key的值, 值满足其平方根>1.5。

select *
from tsdb.`cpu.usage_system` 
where `timestamp` between '2019-03-01 00:00:00' and '2019-03-01 00:00:10' and
    sqrt(`value`) > 1.5
  • 分组聚合查询:按照hostname, datacenter来分组,计算每个分组最大值,最小值,平均值。

select 
    hostname, 
    datacenter, 
    max(`value`) as maxV, 
    min(`value`) as minV, 
    avg(`value`) as avgV
from tsdb.`cpu.usage_system` 
where `timestamp` between '2019-03-01 00:00:00' and '2019-03-01 00:00:10'
group by hostname, datacenter
  • 分组分时间段聚合查询,按照hostname, datacenter来分组,并且进一步按照2分钟的间隔分组,计算最大值,最小值,平均值。

select 
   hostname, 
   datacenter, 
   tumble(`timestamp`, interval '2' minute) as ts, 
   max(`value`) as maxV, 
   min(`value`) as minV, 
   avg(`value`) as avgV
from tsdb.`cpu.usage_system` 
where `timestamp` between '2019-03-01 00:00:00' and '2019-03-01 00:10:00'
group by hostname, datacenter, ts
  • 分组分时间段聚合查询, 计算聚合后的表达式的值max(value) - min(value) + 0.5* avg(value`)

select 
    hostname, 
    datacenter, 
    tumble(`timestamp`, interval '2' minute) as ts, 
    max(`value`) - min(`value`) + 0.5* avg(`value`) as compV
from tsdb.`cpu.usage_system` 
where `timestamp` between '2019-03-01 00:00:00' and '2019-03-01 00:10:00'
group by hostname, datacenter, ts
  • 计算每台机器上,相邻两个时间戳上的记录值之间的差值。

下面的例子使用window function lag(), lag()函数使用一个基于hostname做分组,时间戳排序的窗口frame, 返回在同一个窗口内当前记录的前一条记录的值,通过计算两者之差,获取每台主机上相邻时间戳的记录值上的差值。

select hostname, `timestamp`, `value`, 
       `value` - lag(`value`) over(partition by hostname order by `timestamp`) as diff
from tsdb.`cpu.usage_system` 
  where `timestamp` between '2019-03-01' and '2019-03-01 00:10:00'
  • 计算每台机器上,相邻两个时间戳上的记录值之间的差值, 如果差值超过异常值,则重设成0。

下面的例子,把上面的查询放在一个子查询中,在子查询之外用了一个case表达式,来表达如果查询中的差值超过50.0(认为是异常值), 则重设成0.0。

select hostname, `timestamp`, `value`, 
        case when diff > 50.0 then 0.0
                else diff
        end
from (
    select hostname, `timestamp`, `value`, 
       `value` - lag(`value`) over(partition by hostname order by `timestamp`) as diff
    from tsdb.`cpu.usage_system` 
      where `timestamp` between '2019-03-01' and '2019-03-01 00:10:00'
);
  • 计算每台机器上一分钟内的记录的最大值,以及相邻分钟之间最大值的差值。

下面的查询区别于之前差值计算,在于要获取每分钟的最大值的差值。我们可以在子查询中计算每分钟的最大值,在子查询外用窗口函数lag()计算获得相邻分钟之间最大值的差值。

select hostname, ts, maxValue, 
             maxValue - lag(maxValue) over(partition by hostname order by ts) as diff
from (
  select hostname, 
              tumble(`timestamp`, interval '1' minute) ts, max(`value`) maxValue
  from tsdb.`cpu.usage_system` 
  where `timestamp` between '2019-03-01' and '2019-03-01 00:10:00'
  group by hostname, ts)