基于PolarDB的时序数据库:股票金融交易数据分析

本文模拟股票交易数据,基于PolarDBGanos TSDB,在股票交易场景执行时序数据分析,确保交易系统精准追踪价格波动,即时计算技术指标(如移动均线、RSI),并支撑高频交易算法进行历史模式匹配与实时决策。

背景

在股票金融交易场景中,每秒产生海量高频时序数据(如逐笔报价、分时成交量、委托订单流等)。这些数据不仅要求毫秒级实时写入与持久化存储,还需支持高并发查询、快速时间范围检索和复杂时序分析。

时序数据库针对时间戳进行存储结构优化及高效数据压缩算法,能够实时捕获并存储每秒数十万笔市场行情数据,从而确保交易系统能够精准追踪价格波动并即时计算技术指标(如移动均线、RSI),同时也支撑高频交易算法进行历史模式匹配与实时决策。此外,时序数据库的时间窗口聚合功能可快速生成分钟/小时K线图(Candlestick Chart),以满足不同交易策略对时间颗粒度的需求。

最佳实践

创建超表

  1. 创建数据表,也称为源表。

    CREATE TABLE market_trades (
        market_id BIGINT NOT NULL,            -- Market ID
        trade_id BIGINT,                      -- Trade ID
        inst_name VARCHAR,                    -- Instrument name
        price DECIMAL NOT NULL,               -- Trade price
        amount DECIMAL NOT NULL,              -- Trade amount
        trade_ts TIMESTAMP NOT NULL,          -- Trade timestamp
        insert_ts BIGINT NOT NULL             -- insert timestamp
    );
  2. 将数据表转为超表。

    SELECT create_hypertable(
        'market_trades',                     -- 数据表名
        'trade_ts',                           -- 时间分片字段
        chunk_time_interval => INTERVAL '1 day', -- 时间分片间隔
        partitioning_column => 'market_id',   -- 空间分片字段
        number_partitions => 20            -- 空间分片数量
    );

(可选)时区校正

PolarDB集群默认时区为UTC。您可按需调整,本案例将时区设置为中国标准时间,并设置起始时间为交易数据的最小时间。

CREATE OR REPLACE FUNCTION get_start_time() RETURNS timestamptz LANGUAGE SQL AS
$$
  -- 测试数据的最小时间为'2024-01-01 01:01:38.710554+08',按实际数据的开始时间进行修改
  SELECT timezone('Asia/Shanghai', '2024-01-01 01:01:38.710554+08'::timestamp);
$$ VOLATILE;

创建固定K线图

在源表基础上创建K线连续聚合,并设置刷新策略。

K线图

  1. 创建秒K线图。

    CREATE MATERIALIZED VIEW kline_sec
    WITH (timescaledb.continuous) AS
    SELECT
        time_bucket('1 second', trade_ts) AS candle_ts,
        market_id,
        MIN(price) AS low,
        MAX(price) AS high,
        first(price, trade_ts) AS open,
        last(price, trade_ts) AS close,
        last(insert_ts, trade_ts) AS insert_ts,
        SUM(amount) AS vol
    FROM market_trades
    GROUP BY candle_ts, market_id
    WITH NO DATA;
  2. 启用实时查询。

    ALTER MATERIALIZED VIEW kline_sec SET (timescaledb.materialized_only = false);
  3. 创建连续聚合自动刷新策略。

    1. 创建刷新策略。

      CREATE OR REPLACE PROCEDURE kline_sec_refresh(job_id int, config jsonb) LANGUAGE PLPGSQL AS
      $$
      DECLARE
        start_time timestamp without time zone = date_trunc('sec', get_start_time());
      BEGIN
        CALL refresh_continuous_aggregate('kline_sec', start_time, NULL);
      END
      $$;
    2. 执行刷新策略。

      说明

      当每秒写入量较大(例如写入量超过3000条/秒)时,当前采用整秒刷新机制,实时查询可能会出现轻微抖动问题。为了解决该抖动问题,您可以将刷新频率设置得更加频繁,例如调整为0.05秒。

      SELECT add_job('kline_sec_refresh','1 sec');

分钟K线

分钟K线是在秒K线上创建的嵌套聚合。

  1. 创建分钟K线图。

    CREATE MATERIALIZED VIEW kline_min
    WITH (timescaledb.continuous) AS
    SELECT
        time_bucket('1 min', candle_ts) AS candle_ts,
        market_id,
        MIN(low) AS low,
        MAX(high) AS high,
        first(open, candle_ts) AS open,
        last(close, candle_ts) AS close,
        last(insert_ts, candle_ts) AS insert_ts,
        SUM(vol) AS vol
    FROM kline_sec
    GROUP BY time_bucket('1 min', candle_ts), market_id
    WITH NO DATA;
  2. 启用实时查询。

    ALTER MATERIALIZED VIEW kline_min SET (timescaledb.materialized_only = false);
  3. 创建连续聚合自动刷新策略。

    1. 创建刷新策略。

      CREATE OR REPLACE PROCEDURE kline_min_refresh(job_id int, config jsonb) LANGUAGE PLPGSQL AS
      $$
      DECLARE
        start_time timestamp without time zone = date_trunc('min', get_start_time());
      BEGIN
        CALL refresh_continuous_aggregate('kline_min', start_time, NULL);
      END
      $$;
    2. 执行刷新策略。

      SELECT add_job('kline_min_refresh','1 min');

小时K线

小时K线是在分钟K线上创建的嵌套聚合。

  1. 创建小时K线图。

    CREATE MATERIALIZED VIEW kline_hour
    WITH (timescaledb.continuous) AS
    SELECT
        time_bucket('1 hour', candle_ts) AS candle_ts,
        market_id,
        MIN(low) AS low,
        MAX(high) AS high,
        first(open, candle_ts) AS open,
        last(close, candle_ts) AS close,
        last(insert_ts, candle_ts) AS insert_ts,
        SUM(vol) AS vol
    FROM kline_min
    GROUP BY time_bucket('1 hour', candle_ts), market_id
    WITH NO DATA;
  2. 启用实时查询。

    ALTER MATERIALIZED VIEW kline_hour SET (timescaledb.materialized_only = false);
  3. 创建连续聚合自动刷新策略。

    1. 创建刷新策略。

      CREATE OR REPLACE PROCEDURE kline_hour_refresh(job_id int, config jsonb) LANGUAGE PLPGSQL AS
      $$
      DECLARE
        start_time timestamp without time zone = date_trunc('hour', get_start_time());
      BEGIN
        CALL refresh_continuous_aggregate('kline_hour', start_time, NULL);
      END
      $$;
    2. 执行刷新策略。

      SELECT add_job('kline_hour_refresh','1 hour', initial_start => date_trunc('hour', now()));

K线图

K线是在小时K线上创建的嵌套聚合。

  1. 创建天K线图。

    CREATE MATERIALIZED VIEW kline_day
    WITH (timescaledb.continuous) AS
    SELECT
        time_bucket('1 day', candle_ts) AS candle_ts,
        market_id,
        MIN(low) AS low,
        MAX(high) AS high,
        first(open, candle_ts) AS open,
        last(close, candle_ts) AS close,
        last(insert_ts, candle_ts) AS insert_ts,
        SUM(vol) AS vol
    FROM kline_hour
    GROUP BY time_bucket('1 day', candle_ts), market_id
    WITH NO DATA;
  2. 启用实时查询。

    ALTER MATERIALIZED VIEW kline_day SET (timescaledb.materialized_only = false);
  3. 创建连续聚合自动刷新策略。

    1. 创建刷新策略。

      CREATE OR REPLACE PROCEDURE kline_day_refresh(job_id int, config jsonb) LANGUAGE PLPGSQL AS
      $$
      DECLARE
        start_time timestamp without time zone = date_trunc('day', get_start_time());
      BEGIN
        CALL refresh_continuous_aggregate('kline_day', start_time, NULL);
      END
      $$;
    2. 执行刷新策略。

      SELECT add_job('kline_day_refresh','1 day',initial_start => date_trunc('hour', now()));

(可选)创建动态K线图

如果固定K线图不满足查询需求时,您可以在固定K线图基础上创建动态K线图。例如,在分钟K线图基础上创建30分钟K线图。

SELECT
    market_id,
    (EXTRACT(EPOCH FROM time_bucket('30 min', candle_ts)) * 1000)::BIGINT AS candle_ts,
    MIN(low) AS low,
    MAX(high) AS high,
    first(open, candle_ts) AS open,
    last(close, candle_ts) AS close,
    SUM(vol) AS vol
FROM kline_min
WHERE market_id = 100001
GROUP BY time_bucket('30 min', candle_ts), market_id
ORDER BY candle_ts DESC
limit 10;

模拟股票交易数据

CREATE OR REPLACE FUNCTION test_trade_data(cnt int)
    RETURNS void
AS $$
DECLARE
   sql text;
   i int := 0;
   tm timestamp without time zone := '2024-01-01 01:01:38.710554+08'::TIMESTAMPTZ;
BEGIN
    FOR i IN 1..cnt
    LOOP
        -- 数据产生频率为0.1 sec
        tm = tm + '0.1 sec';
        sql = format('INSERT INTO market_trades(market_id, price, amount, trade_ts, insert_ts) VALUES(100001, random(), random(), ''%s'', (EXTRACT(EPOCH FROM ''%s''::timestamp) * 1000)::bigint);', tm ,tm);
        execute sql;
    END LOOP;
END;
$$
LANGUAGE 'plpgsql';

SELECT test_trade_data(1000000);

查看K线图

K线图

SELECT market_id,candle_ts,low,high,open,close,vol FROM kline_sec WHERE market_id = 100001 ORDER BY candle_ts limit 10;

返回结果如下:

 market_id |      candle_ts      |         low         |       high        |        open        |        close        |         vol
-----------+---------------------+---------------------+-------------------+--------------------+---------------------+---------------------
    100001 | 2024-01-01 01:01:38 |   0.279485633997758 | 0.681338873589468 |  0.279485633997758 |   0.681338873589468 |   0.910744891936063
    100001 | 2024-01-01 01:01:39 |  0.0525641673891641 | 0.962955545742446 |  0.385015528170719 |  0.0525641673891641 |   4.620692226670329
    100001 | 2024-01-01 01:01:40 |  0.0633381498328447 | 0.954725391899331 |  0.801216359386196 |   0.242894295894374 |  3.6880741489060102
    100001 | 2024-01-01 01:01:41 |   0.077063096976957 | 0.619752482643143 |  0.619752482643143 |   0.362329000463575 |  5.4477741841190114
    100001 | 2024-01-01 01:01:42 |   0.130160131849806 | 0.904649178467757 |  0.751462633424246 |     0.8657081421063 | 4.60274994611772491
    100001 | 2024-01-01 01:01:43 |  0.0245408224841981 | 0.915085200903377 |  0.659205003206203 |  0.0245408224841981 |   6.882016111031889
    100001 | 2024-01-01 01:01:44 | 0.00436203999301199 | 0.999261582297358 |  0.999261582297358 | 0.00436203999301199 |   6.343432050925331
    100001 | 2024-01-01 01:01:45 |  0.0235861333017553 | 0.998218944784099 |  0.837373493299758 |   0.690308731587692 | 5.02263333895452072
    100001 | 2024-01-01 01:01:46 |   0.167417759341774 |   0.9150153383248 |  0.213189330042447 |   0.796555201647649 |  4.9323075845995145
    100001 | 2024-01-01 01:01:47 |  0.0337299584831712 | 0.765553193903251 | 0.0337299584831712 |   0.160180953095164 |   5.136451153685312
(10 rows)

分钟K线图

SELECT market_id,candle_ts,low,high,open,close,vol FROM kline_min WHERE market_id = 100001 ORDER BY candle_ts limit 10;

返回结果如下:

 market_id |      candle_ts      |         low          |       high        |        open        |        close        |          vol
-----------+---------------------+----------------------+-------------------+--------------------+---------------------+------------------------
    100001 | 2024-01-01 01:01:00 |  0.00436203999301199 | 0.999261582297358 |  0.279485633997758 |   0.660205011565097 |  107.67711307252750133
    100001 | 2024-01-01 01:02:00 | 0.000382727352629786 | 0.997355089315917 |  0.638467919873655 |   0.741995626890457 | 300.032527480741814196
    100001 | 2024-01-01 01:03:00 |  0.00102311960848311 | 0.998987023712075 |  0.494493945315959 | 0.00933623188228694 | 296.670624428911073230
    100001 | 2024-01-01 01:04:00 | 0.000317312327567265 | 0.999400263658501 |  0.914812683364666 |   0.565009458532018 |  308.44719638361391718
    100001 | 2024-01-01 01:05:00 |  0.00165624319932078 | 0.990618859346739 |  0.456251672433702 |   0.825974036343325 |  297.87230623497759724
    100001 | 2024-01-01 01:06:00 |  0.00446447743636469 | 0.994494939263078 | 0.0486883210703937 |   0.690970192417581 | 304.612071727522449497
    100001 | 2024-01-01 01:07:00 | 0.000609624951934506 | 0.999061576948947 |  0.663441135677637 |   0.843439992945548 |  317.33816039424379517
    100001 | 2024-01-01 01:08:00 | 0.000305593115932368 | 0.999032452035479 |  0.595979795892312 |   0.545231376105335 |   293.4771868344440740
    100001 | 2024-01-01 01:09:00 |  0.00180631157653011 | 0.999969521686225 |  0.687172827619953 |    0.52655462636659 |  308.64126233531476704
    100001 | 2024-01-01 01:10:00 |   0.0042720602925499 | 0.999550291053481 | 0.0572916237256678 |   0.827660040200954 |  302.52094683726853244
(10 rows)

小时K线图

SELECT market_id,candle_ts,low,high,open,close,vol FROM kline_hour WHERE market_id = 100001 ORDER BY candle_ts limit 10;

返回结果如下:

 market_id |      candle_ts      |          low           |       high        |       open        |       close       |            vol
-----------+---------------------+------------------------+-------------------+-------------------+-------------------+----------------------------
    100001 | 2024-01-01 01:00:00 |  0.0000311086421334039 | 0.999988514373982 | 0.279485633997758 | 0.122312797390521 |  17496.2413504152012247561
    100001 | 2024-01-01 02:00:00 |  0.0000770308891162585 | 0.999986986486086 | 0.545347700734887 | 0.125192844770879 |  18083.5114260775312946946
    100001 | 2024-01-01 03:00:00 |  0.0000845044836879083 | 0.999952117680969 | 0.426587036254354 | 0.532456849129801 | 18030.73594659831131232683
    100001 | 2024-01-01 04:00:00 |  0.0000918109849195048 | 0.999995562667976 |   0.6330394930823 | 0.373607753625436 |  18015.5229018762197568761
    100001 | 2024-01-01 05:00:00 |  0.0000109588696517449 | 0.999994772802758 | 0.258798587687728 | 0.507375948925699 |  18051.3594940787675161618
    100001 | 2024-01-01 06:00:00 |  0.0000487699150326648 | 0.999988924300197 |  0.28910545006503 | 0.634414585708271 |  18025.7784576000281124843
    100001 | 2024-01-01 07:00:00 | 0.00000107306622965098 | 0.999981224025916 | 0.433356636233988 | 0.559994887160599 |  18010.8893090606375400777
    100001 | 2024-01-01 08:00:00 |  0.0000171556727366351 | 0.999982847376515 | 0.208111940739776 | 0.706701461479895 |  18042.9095973077942234180
    100001 | 2024-01-01 09:00:00 |  0.0000391086173721078 | 0.999991541819419 | 0.270102209152959 | 0.877127614373137 |  18006.6961534152591878249
    100001 | 2024-01-01 10:00:00 |  0.0000039186593809859 | 0.999974658507568 | 0.999237150569492 | 0.266570661557068 |  17965.2763406833559493498
(10 rows)

K线图

SELECT market_id,candle_ts,low,high,open,close,vol FROM kline_day WHERE market_id = 100001 ORDER BY candle_ts limit 10;

返回结果如下:

 market_id |      candle_ts      |          low           |       high        |       open        |       close       |             vol
-----------+---------------------+------------------------+-------------------+-------------------+-------------------+-----------------------------
    100001 | 2024-01-01 00:00:00 | 0.00000107306622965098 | 0.999995695245573 | 0.279485633997758 | 0.400038274292626 | 413100.59225003849417556234
    100001 | 2024-01-02 00:00:00 |  0.0000168010291758947 | 0.999989906749448 | 0.383386779917068 | 0.842509523496489 |  86494.66799924685771865093
(2 rows)

30分钟K线图

SELECT
    market_id,
    time_bucket('30 min', candle_ts) AS candle_ts,
    MIN(low) AS low,
    MAX(high) AS high,
    first(open, candle_ts) AS open,
    last(close, candle_ts) AS close,
    SUM(vol) AS vol
FROM kline_min
WHERE market_id = 100001
GROUP BY time_bucket('30 min', candle_ts), market_id
ORDER BY candle_ts DESC
limit 10;

返回结果如下:

 market_id |      candle_ts      |          low          |       high        |       open        |       close       |            vol
-----------+---------------------+-----------------------+-------------------+-------------------+-------------------+---------------------------
    100001 | 2024-01-02 04:30:00 | 0.0000868399533757724 | 0.999973308725529 | 0.245099633513782 | 0.842509523496489 |   5504.421751602083391644
    100001 | 2024-01-02 04:00:00 | 0.0000322917197408401 | 0.999930785807123 | 0.205706123491304 | 0.461819536540872 |   9105.530763894508472418
    100001 | 2024-01-02 03:30:00 | 0.0000195752894498469 |  0.99995978492796 | 0.864512243732669 | 0.877628779498252 |  8976.6974200916718262393
    100001 | 2024-01-02 03:00:00 | 0.0000965453148040751 | 0.999730903399971 | 0.277744581160881 |  0.40121776736629 |  8950.0175974347619817090
    100001 | 2024-01-02 02:30:00 | 0.0000168010291758947 | 0.999989906749448 | 0.628462923153656 |  0.59543446274796 |  8994.8387945492956960329
    100001 | 2024-01-02 02:00:00 | 0.0000243959481878164 | 0.999908087762357 | 0.397794388402794 | 0.253756033949969 |  8999.4657434897451608921
    100001 | 2024-01-02 01:30:00 |  0.000107806352637851 | 0.999920775786947 |  0.72757380777356 | 0.487782751467137 |   8978.433358958287225616
    100001 | 2024-01-02 01:00:00 |  0.000149064256355302 | 0.999904610171029 | 0.474390355164054 | 0.349888134466767 |  9026.5921875235542230817
    100001 | 2024-01-02 00:30:00 |  0.000112103957143006 | 0.999988398043051 |  0.45504042836459 | 0.491025347273023 |  8951.0063568393831949862
    100001 | 2024-01-02 00:00:00 | 0.0000272969867580741 | 0.999900252112297 | 0.383386779917068 | 0.353689845851306 | 9007.66402486356654603173
(10 rows)

性能对比

查询类型

原始数据量

无连续聚合查询耗时

使用连续聚合查询耗时

性能提升倍数

过去7天的每小时最大值

100,000,000

12

0.2

60

过去1年的每日平均值

1,000,000,000

超时(>5分钟)

0.5

>600