This topic describes which modifications to an Over Aggregate are compatible with existing state data and which are not.
Compatible modifications
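- Adding, removing, or modifying non-distinct aggregate metrics (aggregate functions).
  - Adding an aggregate metric is a partially compatible modification. The new metric starts accumulating when the current job starts.
  - Removing an aggregate metric is a fully compatible modification. The state data of the removed metric is discarded.
  - Adding and removing aggregate metrics in the same change is a partially compatible modification. The new metrics start accumulating when the current job starts, and the state data of the removed metrics is discarded.
  - Modifying an aggregate metric is treated as a removal followed by an addition, and is a partially compatible modification. The new metric starts accumulating when the current job starts, and the state data of the removed metric is discarded.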
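Note:
- For aggregate metrics that are left unchanged, the results computed after the state data is reused are consistent with the results of a run over the historical data.
- In addition to the aggregate metrics, an Over Aggregate also outputs the original input data. Therefore, the state is incompatible when the input schema changes.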
    -- Original SQL.
    select a, b, c,
      sum(b) over (partition by a order by ts),
      max(c) over (partition by a order by ts)
    from MyTable;

    -- Add an aggregate metric: count(c). This is a partially compatible modification.
    -- The results of sum(b) and max(c) are not affected. count(c) starts accumulating from 0 when the job starts.
    select a, b, c,
      sum(b) over (partition by a order by ts),
      max(c) over (partition by a order by ts),
      count(c) over (partition by a order by ts)
    from MyTable;

    -- Remove an aggregate metric: sum(b). This is a fully compatible modification.
    -- The result of max(c) is not affected.
    select a, b, c,
      max(c) over (partition by a order by ts)
    from MyTable;

    -- Modify an aggregate metric: max(c) -> min(c). This is a partially compatible modification.
    -- The result of sum(b) is not affected. max(c) is treated as removed, and its state data is discarded.
    -- min(c) is treated as a new metric and starts accumulating when the job starts.
    select a, b, c,
      sum(b) over (partition by a order by ts),
      min(c) over (partition by a order by ts)
    from MyTable;
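The combined case, where one metric is added and another is removed in the same change, is described in the list but has no example of its own. The following is a minimal sketch that reuses the same MyTable and the same original SQL. The outcome in the comments is derived from the rules stated in the list, not from any additional documented behavior.

```sql
-- Original SQL.
select a, b, c,
  sum(b) over (partition by a order by ts),
  max(c) over (partition by a order by ts)
from MyTable;

-- Remove sum(b) and add count(c) in the same change.
-- Per the rules in the list, this is a partially compatible modification.
-- max(c) is unchanged, so its result is not affected.
-- The state data of sum(b) is discarded; count(c) starts accumulating when the job starts.
select a, b, c,
  max(c) over (partition by a order by ts),
  count(c) over (partition by a order by ts)
from MyTable;
```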
- Reordering non-distinct aggregate metrics. This is a fully compatible modification.
    -- Original SQL.
    select a, b, c,
      sum(b) over (partition by a order by ts),
      max(c) over (partition by a order by ts)
    from MyTable;

    -- Reorder the aggregate metrics sum(b) and max(c). This is a fully compatible modification.
    -- The results of sum(b) and max(c) are not affected.
    select a, b, c,
      max(c) over (partition by a order by ts),
      sum(b) over (partition by a order by ts)
    from MyTable;
Incompatible modifications
- Changing the input schema of the Over Aggregate. This is an incompatible modification.
    -- Original SQL.
    select a, b, c,
      sum(b) over (partition by a order by ts),
      max(c) over (partition by a order by ts)
    from MyTable;

    -- Add the input field d. This is an incompatible modification.
    select a, b, c, d,
      max(c) over (partition by a order by ts),
      sum(b) over (partition by a order by ts)
    from MyTable;

    -- Modify the input field c. This is an incompatible modification.
    select a, b, c,
      max(c) over (partition by a order by ts),
      sum(b) over (partition by a order by ts)
    from (
      select a, b, substring(c, 1, 5) as c from MyTable
    );
- Changing the properties of the Over window (Partition By, Order By, or Bound Definition). This is an incompatible modification.
    -- Original SQL.
    select a, b, c,
      max(c) over (partition by a order by ts asc rows between unbounded preceding and current row)
    from MyTable;

    -- Change the partition key: a -> b. This is an incompatible modification.
    select a, b, c,
      max(c) over (partition by b order by ts asc rows between unbounded preceding and current row)
    from MyTable;

    -- Change the order by clause: ts asc -> ts desc. This is an incompatible modification.
    select a, b, c,
      max(c) over (partition by a order by ts desc rows between unbounded preceding and current row)
    from MyTable;

    -- Change the bound definition: unbounded preceding -> 2 preceding. This is an incompatible modification.
    select a, b, c,
      max(c) over (partition by a order by ts asc rows between 2 preceding and current row)
    from MyTable;
- Adding, removing, or modifying distinct aggregate metrics (distinct aggregate functions). This is an incompatible modification.
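    -- Original SQL.
    select a, b, c,
      max(c) over (partition by a order by ts)
    from MyTable;

    -- Add the distinct aggregate metric count(distinct b). This is an incompatible modification.
    -- The partition key stays a, so the added distinct metric is the only change.
    select a, b, c,
      max(c) over (partition by a order by ts),
      count(distinct b) over (partition by a order by ts)
    from MyTable;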
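The rule for distinct aggregate metrics also covers removal and modification, which the example above does not show. The following is a minimal sketch of those two cases. The starting query with count(distinct b) is assumed for illustration only, and the outcomes in the comments simply restate the rule for distinct metrics.

```sql
-- Original SQL (assumed for this illustration).
select a, b, c,
  max(c) over (partition by a order by ts),
  count(distinct b) over (partition by a order by ts)
from MyTable;

-- Remove the distinct aggregate metric count(distinct b).
-- Per the rule for distinct metrics, this is an incompatible modification.
select a, b, c,
  max(c) over (partition by a order by ts)
from MyTable;

-- Modify the distinct aggregate metric: count(distinct b) -> count(distinct c).
-- Per the rule for distinct metrics, this is an incompatible modification.
select a, b, c,
  max(c) over (partition by a order by ts),
  count(distinct c) over (partition by a order by ts)
from MyTable;
```

Unlike non-distinct metrics, the unchanged max(c) cannot be assumed to keep its state in these cases, because the source classifies the whole change as incompatible.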