本文介绍时序SQL中支持的常用函数和使用示例。
引擎与版本
常用时序函数列表
函数 | 说明 |
获取系统当前运行时间。 | |
计算时序数据表中Field列的和。 | |
计算时序数据表中Field列的平均值。 | |
统计时序数据表中Field列值的个数。 | |
查询时序数据表中Field列的最小值。 | |
查询时序数据表中Field列的最大值。 | |
获取时序数据表中Field列的第一个值。 | |
获取时序数据表中Field列的最后一个值。 | |
计算时序数据表中与Field列关联的第P个百分位字段值。 | |
计算时序数据表中Field列值的斜率。 | |
计算时序数据表中Field列的值与前一行对应值的差。 | |
获取时序数据表中Field列的最新的N条数据。 | |
对时序数据表中的Field列进行时序预测。 | |
对时序数据表中Field列进行时序异常检测。 | |
对时序数据表中Field列进行特征分箱。 |
数据准备
假设示例表sensor的结构如下:
+-------------+-----------+------------+
| columnName | typeName | columnKind |
+-------------+-----------+------------+
| device_id | VARCHAR | TAG |
| region | VARCHAR | TAG |
| time | TIMESTAMP | TIMESTAMP |
| temperature | DOUBLE | FIELD |
| humidity | DOUBLE | FIELD |
+-------------+-----------+------------+
表中数据如下:
+-----------+----------+---------------------------+-------------+-----------+
| device_id | region | time | temperature | humidity |
+-----------+----------+---------------------------+-------------+-----------+
| F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 12.100000 | 45.000000 |
| F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | 13.200000 | 47.000000 |
| F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | 46.000000 |
| F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 12.100000 | 51.000000 |
| F07A1261 | south-cn | 2021-04-22T15:51:21+08:00 | 13.200000 | 52.000000 |
| F07A1261 | south-cn | 2021-04-22T15:51:27+08:00 | 10.600000 | 53.000000 |
| F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | 20.600000 | 55.000000 |
+-----------+----------+---------------------------+-------------+-----------+
共包含以下两条时间线:
时间线1:Tag为
device_id=F07A1260, region=north-cn
。Field为temperature、humidity。时间线2:Tag为
device_id=F07A1261, region=south-cn
。Field为temperature、humidity。
关于时间线的概念,请参见基本概念。
时间处理函数
CURRENT_TIMESTAMP函数
说明
获取系统当前运行时间。
语法
CURRENT_TIMESTAMP
CURRENT_TIMESTAMP无参数,使用时直接通过关键字
CURRENT_TIMESTAMP
调用即可。CURRENT_TIMESTAMP函数返回的时间戳精确至毫秒,例如
2023-04-23T21:13:15.819+08:00
。CURRENT_TIMESTAMP函数返回的是UTC+8格式的Lindorm实例服务端的系统时间,使用该函数前请确认系统时间是否满足业务需求。
请尽量避免在实际业务写入数据时直接使用该函数的返回结果。建议在开发测试中需要大规模生成测试数据的场景下使用CURRENT_TIMESTAMP函数。
示例
在sensor表中插入一条数据,TAG为device_id=F07A1262, region=north-cn
,时间戳为数据写入的当前时间。
INSERT INTO sensor(device_id,region,time,temperature,humidity) VALUES ('F07A1262','north-cn',CURRENT_TIMESTAMP,19.9,42);
聚合函数
SUM函数
说明
计算时序数据表中Field列的和。
语法
SUM(field_name)
SUM的返回类型为DOUBLE或长整型BIGINT。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
示例
示例1:设备ID为F07A1260和F07A1261,对每条时间线进行降采样操作,时间线按照20s时间窗口计算temperature列的和。
SELECT device_id, region, time, sum(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 25.300000 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 35.900000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,计算每个设备的温度之和。
SELECT device_id, region, time, sum(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 1970-01-01T08:00:00+08:00 | 35.900000 | | F07A1261 | south-cn | 1970-01-01T08:00:00+08:00 | 56.500000 | +-----------+----------+---------------------------+-------------+
示例3:计算所有设备的温度之和。
SELECT sum(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00';
返回结果如下:
+-------------+ | temperature | +-------------+ | 92.400000 | +-------------+
AVG函数
说明
计算时序数据表中Field列的平均值。
语法
AVG(field_name)
AVG的返回类型为DOUBLE类型。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
示例
示例1:设备ID为F07A1260和F07A1261,对每条时间线进行降采样操作,时间线按照20s时间窗口计算temperature列的平均值。
SELECT device_id, region, time, avg(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 12.650000 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 11.966667 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,计算每个设备的温度的平均值。
SELECT device_id, region, time, avg(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 1970-01-01T08:00:00+08:00 | 11.966667 | | F07A1261 | south-cn | 1970-01-01T08:00:00+08:00 | 14.125000 | +-----------+----------+---------------------------+-------------+
示例3:计算所有设备的温度的平均值。
SELECT avg(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00';
返回结果如下:
+-------------+ | temperature | +-------------+ | 13.200000 | +-------------+
COUNT函数
说明
统计时序数据表中Field列的值个数。
语法
COUNT(field_name)
COUNT的返回类型为BIGINT类型。
参数
参数 | 描述 |
field_name | Field列名。 |
示例
示例1:设备ID为F07A1260和F07A1261,对每条时间线进行降采样操作,时间线按照20s时间窗口统计各个窗口中temperature列的值个数。
SELECT device_id, region, time, count(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 2 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 3 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 1 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 1 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,统计每个设备的温度的值个数。
SELECT device_id, region, time, count(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 1970-01-01T08:00:00+08:00 | 3 | | F07A1261 | south-cn | 1970-01-01T08:00:00+08:00 | 4 | +-----------+----------+---------------------------+-------------+
示例3:统计所有设备的温度的值个数。
SELECT count(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00';
返回结果如下:
+-------------+ | temperature | +-------------+ | 7 | +-------------+
选择函数
MIN函数
说明
查询时序数据表中Field列的最小值。
语法
MIN(field_name)
MIN的返回类型为DOUBLE或长整型BIGINT。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
示例
示例1:设备ID为F07A1260和F07A1261,对每条时间线进行降采样操作,时间线按照20s时间窗口查询各个窗口中temperature列的最小值。
SELECT device_id, region, time, min(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 12.100000 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 10.600000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,查询每个设备的温度的最小值。
SELECT device_id, region, time, min(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:27+08:00 | 10.600000 | +-----------+----------+---------------------------+-------------+
示例3:查询所有设备的温度的最小值。
SELECT min(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00';
返回结果如下:
+-------------+ | temperature | +-------------+ | 10.600000 | +-------------+
MAX函数
说明
查询时序数据表中Field列的最大值。
语法
MAX(field_name)
MAX的返回类型为DOUBLE或长整型BIGINT。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
示例
示例1:设备ID为F07A1260和F07A1261,对每条时间线进行降采样操作,时间线按照20s时间窗口查询各个窗口中temperature列的最大值。
SELECT device_id, region, time, max(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 13.200000 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 13.200000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,查询每个设备的温度的最大值。
SELECT device_id, region, time, max(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | 13.200000 | | F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
示例三:查询所有设备的温度的最大值。
SELECT max(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00';
返回结果如下:
+-------------+ | temperature | +-------------+ | 20.600000 | +-------------+
FIRST函数
说明
获取时序数据表中Field列的第一个值。
语法
FIRST(field_name)
FIRST的返回类型与Field列的类型相同。
参数
参数 | 描述 |
field_name | Field列名。 |
示例
示例1:设备ID为F07A1260和F07A1261,对每条时间线进行降采样操作,时间线按照20s时间窗口获取各个窗口中第一个温度值。
SELECT device_id, region, time, first(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 12.100000 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 12.100000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,获取每个设备第一个温度值。
SELECT device_id, region, time, first(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 12.100000 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 12.100000 | +-----------+----------+---------------------------+-------------+
LAST函数
说明
获取时序数据表中Field列的最后一个值。
语法
LAST(field_name)
LAST的返回类型与Field列的类型相同。
参数
参数 | 描述 |
field_name | Field列名。 |
示例
示例1:设备ID为F07A1260和F07A1261,对每条时间线进行降采样操作,时间线按照20s时间窗口获取各个窗口中最后一个温度值。
SELECT device_id, region, time, last(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 13.200000 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 10.600000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,获取每个设备的最后一个温度值。
SELECT device_id, region, time, last(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
PERCENTILE函数
说明
计算时序数据表中与Field列关联的第P个百分位字段值。
语法
PERCENTILE(field_name,P)
PERCENTILE的返回类型为DOUBLE类型。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
P | 取值为整数或浮点数,范围为[0,100],默认值为50。 |
示例
示例1:设备ID为F07A1260和F07A1261,对每条时间线进行降采样操作,时间线按照20s时间窗口获取各个窗口中temperature列的90分位值。
SELECT device_id, region, time, percentile(temperature, 90) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 13.200000 | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | 13.200000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,获取每个设备的温度的90分位值。
SELECT device_id, region, time, percentile(temperature, 90) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 1970-01-01T08:00:00+08:00 | 13.200000 | | F07A1261 | south-cn | 1970-01-01T08:00:00+08:00 | 20.600000 | +-----------+----------+---------------------------+-------------+
转换函数
RATE函数
说明
计算时序数据表中Field列值的斜率。
计算公式:每个数据点的时间戳和值分别是:(t
1
,v
1
),(t
2
,v
2
)...(t
N
,v
N
)
。使用RATE函数后,返回N-1个数据点,每个数据点的时间戳和值分别是:
公式中interval
是指定的变化率,单位为秒。
语法
RATE(field_name, 'interval units')
RATE的返回类型为DOUBLE类型。
SQL语句中只支持RATE算子单独使用。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
interval units | 变化率。默认值为1s。单位包括s(秒)、m(分钟)、h(小时)、d(天)。 |
示例
示例1:设备ID为F07A1260和F07A1261,计算每个设备温度的秒级变化率。
SELECT device_id, region, time, rate(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | 0.110000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | -0.260000 | | F07A1261 | south-cn | 2021-04-22T15:51:21+08:00 | 1.100000 | | F07A1261 | south-cn | 2021-04-22T15:51:27+08:00 | -0.433333 | | F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | 0.500000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,计算每个设备温度的分钟变化率。
SELECT device_id, region, time, rate(temperature, '1m') AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | 6.600000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | -15.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:21+08:00 | 66.000000 | | F07A1261 | south-cn | 2021-04-22T15:51:27+08:00 | -26.000000 | | F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | 30.000000 | +-----------+----------+---------------------------+-------------+
示例3:设备ID为F07A1260和F07A1261,先按照20s时间窗口计算temperature列的平均值,再对降采样后的值计算分钟变化率。
SELECT device_id, region, time, rate(avg(temperature), '1m') AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | -6.150000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 25.900000 | +-----------+----------+---------------------------+-------------+
示例4:SQL语句中只支持RATE算子单独使用,不支持以下混用方式。
错误示例1:
SELECT region, time, rate(temperature), rate(humidity) FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
ERROR 9001 (0A000): UNSUPPORTED: latest/rate/delta must be the only aggregator in a single SQL statement
错误示例2:
SELECT region, time, rate(temperature), latest(humidity) FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
ERROR 9001 (0A000): UNSUPPORTED: latest/rate/delta must be the only aggregator in a single SQL statement
DELTA函数
说明
计算时序数据表中Field列的值与前一行对应值的差。
计算公式:每个数据点的时间戳和值分别是:(t
1
,v
1
),(t
2
,v
2
)...(t
N
,v
N
)
。使用DELTA函数后,返回N-1个数据点,每个数据点的时间戳和值分别是:(t
2
,v
2
-v
1
),(t
3
,v
3
-v
2
)...(t
N
,v
N
-,v
N-1
)
。
语法
DELTA(field_name)
DELTA的返回类型为DOUBLE类型。
SQL语句中只支持DELTA算子单独使用。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
示例
示例1:设备ID为F07A1260和F07A1261,计算每个设备温度与前一行差值。
SELECT device_id, region, time, delta(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | 1.100000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | -2.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:21+08:00 | 1.100000 | | F07A1261 | south-cn | 2021-04-22T15:51:27+08:00 | -2.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | 10.000000 | +-----------+----------+---------------------------+-------------+
示例2:设备ID为F07A1260和F07A1261,先按照20s时间窗口计算temperature列的平均值,再对降采样后的值计算差值。
SELECT device_id, region, time, delta(avg(temperature)) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 20s;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | -2.050000 | | F07A1261 | south-cn | 2021-04-22T15:51:40+08:00 | 8.633333 | +-----------+----------+---------------------------+-------------+
示例3:SQL语句中只支持DELTA算子单独使用,不支持以下混用方式。
错误示例:
错误示例:SELECT region, time, delta(temperature), rate(humidity) FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
ERROR 9001 (0A000): UNSUPPORTED: latest/rate/delta must be the only aggregator in a single SQL statement
LATEST函数
说明
获取时序数据表中Field列的最新的N条数据。
语法
LATEST(field_name, N)
LATEST的返回类型与Field列的类型相同。
SQL语句中只支持LATEST算子单独使用。
参数
参数 | 描述 |
field_name | Field列名。 |
N | 整型,用于指定查询最新的N条数据。默认值为1,表示获取最新的一条数据。 |
示例
示例1:查询设备F07A1260和F07A1261最新的温度值。
SELECT device_id, region, time, latest(temperature) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | 20.600000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | +-----------+----------+---------------------------+-------------+
示例2:查询设备F07A1260和F07A1261最新的两条温度值。
SELECT device_id, region, time, latest(temperature, 2) AS temperature FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+ | device_id | region | time | temperature | +-----------+----------+---------------------------+-------------+ | F07A1261 | south-cn | 2021-04-22T15:51:27+08:00 | 10.600000 | | F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | 20.600000 | | F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | 13.200000 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 10.600000 | +-----------+----------+---------------------------+-------------+
示例3:SQL语句中只支持LATEST算子单独使用,不支持以下混用方式。
错误示例:
SELECT region, time, latest(temperature), rate(humidity) FROM sensor WHERE time <= '2021-04-22T15:51:47+08:00' and time >= '2021-04-22T15:33:00+08:00' sample by 0;
返回结果如下:
ERROR 9001 (0A000): UNSUPPORTED: latest/rate/delta must be the only aggregator in a single SQL statement
FORECAST函数(时序预测)
前提条件
时序预测功能依赖于Lindorm AI引擎,在使用FORECAST函数前,请开通Lindorm AI引擎。开通方式,请参见开通指南。
说明
对时序数据表中的Field列进行时序预测。
语法
FORECAST(field_name, model_name, options)
FORECAST的返回类型为DOUBLE类型。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
model_name | 模型名称。 说明 model_name的类型为VARCHAR类型。 |
options | 调整时序预测的结果。可选参数。格式为:
|
示例
示例1:对时序数据表sensor中指定时间段的温度进行时序预测。
SELECT device_id, region, `time`, forecast(temperature, forecast_model) AS forecast_result FROM sensor WHERE `time` >= '2022-01-01T00:00:00+08:00' and `time` < '2022-01-01T00:01:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+------------------+ | device_id | region | time | forecast_result | +-----------+----------+---------------------------+------------------+ | F07A1260 | north-cn | 2022-01-01T00:00:00+08:00 | 12.40307807 | | F07A1260 | north-cn | 2022-01-01T00:00:20+08:00 | 11.36715841 | | F07A1260 | north-cn | 2022-01-01T00:00:40+08:00 | 10.12969923 | | F07A1261 | south-cn | 2022-01-01T00:00:00+08:00 | 26.51369649 | | F07A1261 | south-cn | 2022-01-01T00:00:20+08:00 | 25.54403301 | | F07A1261 | south-cn | 2022-01-01T00:00:40+08:00 | 24.46405267 | +-----------+----------+---------------------------+------------------+
示例2:对时序数据表sensor中指定时间段的温度进行时序预测,同时指定预测步长为2,输出分位数为0.9。
SELECT device_id, region, `time`, forecast(temperature, forecast_model, 'step=2,quantile_output=0.9') AS forecast_result FROM sensor WHERE `time` >= '2022-01-01T00:00:00+08:00' and `time` < '2022-01-01T00:01:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+------------------+ | device_id | region | time | forecast_result | +-----------+----------+---------------------------+------------------+ | F07A1260 | north-cn | 2022-01-01T00:00:00+08:00 | 13.12353792 | | F07A1260 | north-cn | 2022-01-01T00:00:20+08:00 | 12.14833554 | | F07A1261 | south-cn | 2022-01-01T00:00:00+08:00 | 26.73869304 | | F07A1261 | south-cn | 2022-01-01T00:00:20+08:00 | 24.92990853 | +-----------+----------+---------------------------+------------------+
ANOMALY_DETECT函数(时序异常检测)
说明
对时序数据表中Field列进行时序异常检测。
语法
ANOMALY_DETECT(field_name, [algo_name | model_name], options)
ANOMALY_DETECT的返回类型为BOOLEAN类型。
参数
参数 | 描述 |
field_name | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
algo_name | 异常检测算法名称。支持阿里达摩院自研在线异常检测算法。
说明 algo_identifer参数适用于未开通数据库内机器学习功能,但有使用时序异常检测需求的场景。 |
model_name | 模型名称。 说明
|
options | 调整异常检测算法的检测效果。可选参数。格式为: |
示例
示例1:对时序数据表sensor中指定时间范围的温度使用esd算法进行时序异常检测。
SELECT device_id, region, time, anomaly_detect(temperature, 'esd') AS detect_result FROM sensor WHERE time >= '2022-01-01T00:00:00+08:00' and time < '2022-01-01T00:01:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+---------------+ | device_id | region | time | detect_result | +-----------+----------+---------------------------+---------------+ | F07A1260 | north-cn | 2022-01-01T00:00:00+08:00 | true | | F07A1260 | north-cn | 2022-01-01T00:00:20+08:00 | false | | F07A1260 | north-cn | 2022-01-01T00:00:40+08:00 | true | | F07A1261 | south-cn | 2022-01-01T00:00:00+08:00 | false | | F07A1261 | south-cn | 2022-01-01T00:00:20+08:00 | false | | F07A1261 | south-cn | 2022-01-01T00:00:40+08:00 | false | +-----------+----------+---------------------------+---------------+
示例2:对时序数据表sensor中F07A1260设备指定时间范围的温度使用esd算法进行时序异常检测。
SELECT device_id, region, time, anomaly_detect(temperature, 'esd') AS detect_result FROM sensor WHERE device_id in ('F07A1260') and time >= '2022-01-01T00:00:00+08:00' and time < '2022-01-01T00:01:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+---------------+ | device_id | region | time | detect_result | +-----------+----------+---------------------------+---------------+ | F07A1260 | north-cn | 2022-01-01T00:00:00+08:00 | true | | F07A1260 | north-cn | 2022-01-01T00:00:20+08:00 | false | | F07A1260 | north-cn | 2022-01-01T00:00:40+08:00 | true | +-----------+----------+---------------------------+---------------+
示例3:对时序数据表sensor中F07A1260设备指定时间范围的温度使用esd算法进行时序异常检测,并自定义异常检测算法的参数(异常检测算法参考的时间窗口长度为30,异常检测时值小于90%分位数的点被过滤)。
SELECT device_id, region, time, anomaly_detect(temperature, 'esd', 'lenHistoryWindow=30,maxAnomalyRatio=0.1') AS detect_result FROM sensor where device_id in ('F07A1260') and time >= '2022-01-01T00:00:00+08:00' and time < '2022-01-01T00:01:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+---------------+ | device_id | region | time | detect_result | +-----------+----------+---------------------------+---------------+ | F07A1260 | north-cn | 2022-01-01T00:00:00+08:00 | false | | F07A1260 | north-cn | 2022-01-01T00:00:20+08:00 | false | | F07A1260 | north-cn | 2022-01-01T00:00:40+08:00 | true | +-----------+----------+---------------------------+---------------+
示例4:使用在数据库内机器学习功能中创建的时序异常检测模型,对sensor表中指定时间范围的温度进行时序异常检测。
SELECT device_id, region, `time`, raw(temperature) as temperature, anomaly_detect(temperature, ad_model) AS detect_result FROM sensor WHERE time >= '2022-01-01T00:00:00+08:00' and time < '2022-01-01T00:01:00+08:00' sample by 0;
返回结果如下:
+-----------+----------+---------------------------+-------------+---------------+ | device_id | region | time | temperature | detect_result | +-----------+----------+---------------------------+-------------+---------------+ | F07A1260 | north-cn | 2022-01-01T00:00:00+08:00 | 59.100000 | true | | F07A1260 | north-cn | 2022-01-01T00:00:20+08:00 | 13.200000 | false | | F07A1260 | north-cn | 2022-01-01T00:00:40+08:00 | 64.600000 | true | | F07A1261 | south-cn | 2022-01-01T00:00:00+08:00 | 12.100000 | false | | F07A1261 | south-cn | 2022-01-01T00:00:20+08:00 | 13.200000 | false | | F07A1261 | south-cn | 2022-01-01T00:00:40+08:00 | 10.600000 | false | +-----------+----------+---------------------------+-------------+---------------+
BINS函数(特征分箱)
说明
对时序数据表中Field列进行特征分箱,特征分箱(也称为离散分箱或者离散分段)是一种数据预处理技术。
语法
BINS(field_name, options)
BINS的返回类型为VARCHAR类型。
参数
参数 | 是否必选 | 描述 |
field_name | 是 | Field列名。 说明 Field列的类型不能是VARCHAR和BOOLEAN类型。 |
options | 否 | 指定特征分箱策略和输出方式,具体说明请参见特征分箱。格式为: |
示例
示例1:对时序数据表sensor中指定时间范围的温度进行特征分箱。
SELECT device_id, region, time, bins(temperature) AS temperature_bin FROM sensor WHERE time >= '2021-04-22 00:00:00' AND time < '2022-04-23 00:01:00' sample by 0;
返回结果如下:
+-----------+----------+--------------------------------+--------------------+ | device_id | region | time | temperature_bin | +-----------+----------+--------------------------------+--------------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | [10.60,12.10) | | F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | [12.10,13.20] | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | [10.60,10.60) | | F07A1261 | south-cn | 2021-04-22T15:51:20+08:00 | [10.60,12.10) | | F07A1261 | south-cn | 2021-04-22T15:51:21+08:00 | [12.10,13.20) | | F07A1261 | south-cn | 2021-04-22T15:51:27+08:00 | [10.60,10.60) | | F07A1261 | south-cn | 2021-04-22T15:51:47+08:00 | [13.20,20.60] | +-----------+-----------+-------------------------------+--------------------+
示例2:对时序数据表sensor中F07A1260设备指定时间范围的温度进行特征分箱。
SELECT device_id, region, time, bins(temperature) AS temperature_bin FROM sensor WHERE device_id in ('F07A1260') AND time >= '2021-04-22 00:00:00' AND time < '2022-04-23 00:01:00' sample by 0;
返回结果如下:
+-----------+----------+--------------------------------+--------------------+ | device_id | region | time | temperature_bin | +-----------+----------+--------------------------------+--------------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | [10.60,12.10) | | F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | [12.10,13.20] | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | [10.60,10.60) | +-----------+----------+--------------------------------+--------------------+
示例3:对时序数据表sensor中F07A1260设备指定时间范围的温度进行特征分箱,并自定义分箱算法的参数(生成分箱数为2的等宽分箱,结果以分箱区间的顺序编码形式输出)。
SELECT device_id, region, time, bins(temperature, 'n_bins=2, output_type=ordinal, strategy=uniform') AS temperature_bin FROM sensor WHERE device_id in ('F07A1260') AND time >= '2021-04-22 00:00:00' AND time < '2022-04-23 00:01:00' sample by 0;
返回结果如下:
+-----------+----------+--------------------------------+--------------------+ | device_id | region | time | temperature_bin | +-----------+----------+--------------------------------+--------------------+ | F07A1260 | north-cn | 2021-04-22T15:33:00+08:00 | 1 | | F07A1260 | north-cn | 2021-04-22T15:33:10+08:00 | 1 | | F07A1260 | north-cn | 2021-04-22T15:33:20+08:00 | 0 | +-----------+----------+--------------------------------+--------------------+