首页 Hologres User Guide Data asset management Analyze table statistics

Analyze table statistics

更新时间: 2026-04-11 03:30:13

Starting from V1.3, Hologres provides the hologres.hg_table_info system view to collect daily statistics about the tables in your instance. You can query this view to monitor and analyze table information and take optimization measures based on the statistics.

Limitations

  • This feature is available only for Hologres V1.3 and later. If your instance is an earlier version, you must upgrade it. For assistance, see Common upgrade preparation errors or join the Hologres DingTalk group. For more information, see How to get more online support?.

  • Data in the hologres.hg_table_info view has T+1 freshness. Data for a given day is typically updated by 05:00 the next day. On the day a Hologres instance is upgraded to V1.3, no table statistics are generated. A query on this day returns the meta warehouse store currently not available error. You can start querying table statistics the day after the upgrade.

Usage notes

  • Table statistics are retained for 30 days by default.

  • For non-partitioned internal tables (type='TABLE'), you can query detailed statistics, such as storage space, the number of files, cumulative access counts, and the number of rows.

  • For other objects, such as views, materialized views, foreign tables, and partitioned parent tables, only basic information is available. This includes the number of partitions, the name of the external table that a foreign table maps to, and definitions of views and materialized views.

  • The hologres.hg_table_info table belongs to the Hologres meta warehouse system. Because a failure to query hologres.hg_table_info does not affect business queries running in the instance, the stability of the hologres.hg_table_info table is not covered by the product's service level agreement (SLA).

hg_table_info view

The hg_table_info view contains the following fields.

Note
  • Table statistics are stored in the hologres.hg_table_info system view. After an instance is upgraded to V1.3, table information is collected daily by default.

  • Some fields may be null for tables created before the instance was upgraded to V1.3 because their creation information was not collected. For tables created after the upgrade, all fields are populated.

Field

Type

Description

Remarks

db_name

text

The name of the database where the table resides.

None

schema_name

text

The name of the schema where the table resides.

None

table_name

text

The name of the table.

None

table_id

text

The unique identifier for the table. For a foreign table, the ID format is db.schema.table.

None

type

text

The type of the object. Valid values:

  • TABLE: a regular table or a partitioned child table.

  • PARTITION TABLE: a physical partitioned parent table.

  • LOGICAL PARTITION TABLE: a logical partitioned table.

    Note

    This type is supported in Hologres V3.1.25/V3.2.8 and later. In earlier versions, the reported type is TABLE.

  • DYNAMIC TABLE: a dynamic table.

    Note

    This type is supported in Hologres V4.0.14 and later. In earlier versions, the reported type is TABLE.

  • FOREIGN TABLE: a foreign table.

  • VIEW: a view.

  • MATERIALIZED VIEW: a materialized view.

  • If type is VIEW, the create_time and last_ddl_time fields are null.

  • If type is VIEW, FOREIGN TABLE, or PARTITION TABLE, the last_modify_time, last_access_time, hot_file_count, cold_file_count, total_read_count, and total_write_count fields are null.

  • When the type is DYNAMIC TABLE, the view_def field displays the task definition query, while special properties such as freshness and auto_refresh_enable are displayed in the dynamic_table_properties section of the table_meta field.

partition_spec

text

The partitioning condition. This field is valid only for partitioned child tables.

None

is_partition

boolean

Indicates whether the table is a partitioned child table.

None

owner_name

text

The username of the table owner. You can join this column with the usename column of the hg_query_log view.

None

create_time

timestamp with time zone

The timestamp when the table was created.

None

last_ddl_time

timestamp with time zone

The timestamp when the table metadata was last updated.

None

last_modify_time

timestamp with time zone

The timestamp when the table data was last modified.

None

last_access_time

timestamp with time zone

The timestamp when the table was last accessed.

None

view_def

text

The definition of the view.

This field is valid only for views.

comment

text

The description of the table or view.

None

hot_storage_size

bigint

The size of hot storage used by the table, in bytes.

The storage size reported in hg_table_info may differ from the result of the pg_relation_size function. This difference is expected because hg_table_info is updated daily and pg_relation_size does not include the storage size of binary logs.

cold_storage_size

bigint

The size of cold storage used by the table, in bytes.

The storage size reported in hg_table_info may differ from the result of the pg_relation_size function. This difference is expected because hg_table_info is updated daily and pg_relation_size does not include the storage size of binary logs.

hot_file_count

bigint

The number of files in hot storage for the table.

None

cold_file_count

bigint

The number of files in cold storage for the table.

None

table_meta

jsonb

The raw metadata of the table, in JSONB format.

None

row_count

bigint

The number of rows in the table or partition.

For a partitioned parent table, this is the total number of rows across all its child tables.

collect_time

timestamp with time zone

The timestamp when the statistics were collected.

None

partition_count

bigint

The number of partitioned child tables.

This field is valid only for partitioned parent tables.

parent_schema_name

text

The schema name of the parent table for a partitioned child table.

This field is valid only for partitioned child tables.

parent_table_name

text

The table name of the parent table for a partitioned child table.

This field is valid only for partitioned child tables.

total_read_count

bigint

The cumulative number of read operations on the table. This is an approximate value, as SELECT, INSERT, UPDATE, and DELETE operations can all increment the count.

This value is an approximation and is not recommended for accurate counting.

total_write_count

bigint

The cumulative number of write operations on the table. This is an approximate value, as INSERT, UPDATE, and DELETE operations can all increment the count.

This value is an approximation and is not recommended for accurate counting.

read_sql_count_1d

bigint

The total number of read operations on the table on the previous day (T-1), from 00:00 to 24:00 (UTC+8).

  • Supported in Hologres V3.0 and later.

  • For a partitioned table, if an SQL statement targets a specific partitioned child table, data is collected only from the child table, not from the parent table.

write_sql_count_1d

bigint

The total number of write operations on the table on the previous day (T-1), from 00:00 to 24:00 (UTC+8).

  • Supported in Hologres V3.0 and later.

  • For a partitioned table, if an SQL query targets a specific child table, data is collected only for the child table, not the parent table.

Grant permissions

You must have specific permissions to query table statistics. This section describes the permission rules and how to grant them.

  • View table statistics for all databases in a Hologres instance

    • Grant the user superuser privileges.

      A superuser can view table statistics for all databases in a Hologres instance. The following command grants superuser privileges.

      -- Replace "Alibaba Cloud account ID" with the actual username. For a RAM user, prefix the ID with "p4_".
      ALTER USER "Alibaba Cloud account ID" SUPERUSER;
    • Add users to the pg_stat_scan_table user group.

      In addition to a superuser, Hologres allows users in the pg_stat_scan_tables user group (for versions earlier than V1.3.44) or the pg_read_all_stats user group (for V1.3.44 and later versions) to view statistics logs for all database tables. If a regular user needs to view all logs, they can contact a superuser to be added to the relevant user group. The authorization command is as follows.

      -- For versions earlier than V1.3.44
      GRANT pg_stat_scan_tables TO "Alibaba Cloud account ID";-- For the expert permission model
      CALL spm_grant('pg_stat_scan_tables', 'Alibaba Cloud account ID');  -- For the simple permission model (SPM)
      CALL slpm_grant('pg_stat_scan_tables', 'Alibaba Cloud account ID'); -- For the schema-level permission model (SLPM)
      
      -- For V1.3.44 and later
      GRANT pg_read_all_stats TO "Alibaba Cloud account ID";-- For the expert permission model
      CALL spm_grant('pg_read_all_stats', 'Alibaba Cloud account ID');  -- For the simple permission model (SPM)
      CALL slpm_grant('pg_read_all_stats', 'Alibaba Cloud account ID'); -- For the schema-level permission model (SLPM)
  • View table statistics for the current database

    If you enable the simple permission model (SPM) or the schema-level permission model (SLPM) and add a user to the db_admin user group, the db_admin role can view the table statistics logs for the current database.

    Note

    A regular user can query statistics only for the tables they own in the current database.

    CALL spm_grant('<db_name>_admin', 'Alibaba Cloud account ID');  -- For the simple permission model (SPM)
    CALL slpm_grant('<db_name>.admin', 'Alibaba Cloud account ID'); -- For the schema-level permission model (SLPM)

Sample queries

Use case 1: Monitor internal table trends

-- Monitor trends for all internal tables in an instance, including storage size, number of files, read/write counts, and number of rows.
SELECT
  db_name,
  schema_name,
  table_name,
  collect_time :: date AS collect_date,
  hot_storage_size,
  cold_storage_size,
  hot_file_count,
  cold_file_count,
  read_sql_count_1d,
  write_sql_count_1d,
  row_count
FROM
  hologres.hg_table_info
WHERE
  collect_time > (current_date - interval '1 week')::timestamptz -- For the last week
  AND type ='TABLE'
  ORDER  BY  collect_date desc ;

Use case 2: Check access patterns of large tables

-- Check access patterns for the top 10 tables by storage size.
SELECT
  db_name,
  schema_name,
  table_name,
  hot_storage_size + cold_storage_size AS total_storage_size,
  row_count,
  sum(read_sql_count_1d) AS total_read_count,
  sum(write_sql_count_1d) AS total_write_count
FROM
  hologres.hg_table_info
WHERE
  collect_time > (current_date - interval '1 week')::timestamptz -- For the last week
  AND type = 'TABLE'
  AND (
    cold_storage_size IS NOT NULL
    OR hot_storage_size IS NOT NULL
  )
GROUP BY db_name,schema_name,table_name,total_storage_size,row_count
ORDER BY total_storage_size DESC
LIMIT 10;

Use case 3: Analyze trends for top 10 tables

-- Analyze access, storage, and data volume trends over the last week for the top 10 tables by storage (based on yesterday's statistics).
WITH top10_table AS (SELECT
  db_name,
  schema_name,
  table_name,
  hot_storage_size + cold_storage_size AS total_storage_size
FROM
  hologres.hg_table_info
WHERE
  collect_time >= (current_date - interval '1 day')::timestamptz -- Yesterday
  AND collect_time < current_date
  AND type = 'TABLE'
  AND ( cold_storage_size IS NOT NULL OR hot_storage_size IS NOT NULL )
ORDER BY total_storage_size DESC
LIMIT 10
)
SELECT
  base.db_name,
  base.schema_name,
  base.table_name,
  base.hot_storage_size + cold_storage_size AS total_storage_size,
  base.row_count,
  base.read_sql_count_1d,
  base.write_sql_count_1d,
  base.collect_time :: date AS collect_date
FROM
  hologres.hg_table_info AS base
LEFT JOIN
  top10_table
ON
  base.db_name = top10_table.db_name
  AND base.schema_name = top10_table.schema_name
  AND base.table_name = top10_table.table_name
WHERE
  collect_time > (current_date - interval '1 week')::timestamptz -- For the last week
  AND type = 'TABLE'
  AND ( cold_storage_size IS NOT NULL OR hot_storage_size IS NOT NULL )
ORDER BY total_storage_size DESC , collect_date DESC;

Use case 4: Analyze trends for bottom 10 tables

-- Analyze access, storage, and data volume trends over the last week for the 10 tables that use the least storage (based on yesterday's statistics).
WITH top10_table AS (SELECT
  db_name,
  schema_name,
  table_name,
  hot_storage_size + cold_storage_size AS total_storage_size
FROM
  hologres.hg_table_info
WHERE
  collect_time >= (current_date - interval '1 day')::timestamptz -- Yesterday
  AND collect_time < current_date
  AND type = 'TABLE'
ORDER BY total_storage_size ASC LIMIT 10
)
SELECT
  base.db_name,
  base.schema_name,
  base.table_name,
  base.hot_storage_size + cold_storage_size AS total_storage_size,
  base.row_count,
  base.read_sql_count_1d,
  base.write_sql_count_1d,
  base.collect_time :: date AS collect_date
FROM
  hologres.hg_table_info AS base
RIGHT JOIN
  top10_table
ON
  base.db_name = top10_table.db_name
  AND base.schema_name = top10_table.schema_name
  AND base.table_name = top10_table.table_name
WHERE
  collect_time > (current_date - interval '1 week')::timestamptz -- For the last week
  AND type = 'TABLE'
ORDER BY total_storage_size ASC  , collect_date DESC ;

Use case 5: Identify large tables with small files

-- View the number of files and storage size for each table, and sort by average file size.
-- The table group shows the shard count only for the current database. For other databases, this value is null.
SELECT
  db_name,
  schema_name,
  table_name,
  cold_storage_size + hot_storage_size AS total_storage_size,
  cold_file_count + hot_file_count AS total_file_count,
  (cold_storage_size + hot_storage_size) / (cold_file_count + hot_file_count) AS avg_file_size,
  tmp_table_info.table_meta ->> 'table_group' AS table_group,
  tg_info.shard_count
FROM
  hologres.hg_table_info tmp_table_info
  LEFT JOIN (
    SELECT
      tablegroup_name,
      property_value AS shard_count
    FROM
      hologres.hg_table_group_properties
    WHERE
      property_key = 'shard_count'
  ) tg_info ON tmp_table_info.table_meta ->> 'table_group' = tg_info.tablegroup_name
WHERE
  collect_time > (current_date - interval '1 day')::timestamptz
  AND type = 'TABLE'
  AND (
    cold_storage_size IS NOT NULL
    OR hot_storage_size IS NOT NULL
  )
  AND (
    cold_file_count IS NOT NULL
    OR hot_file_count IS NOT NULL
  )
  AND cold_file_count + hot_file_count <> 0
ORDER BY avg_file_size;

Use case 6: Check row count changes

-- View the last modification time and the total change in the number of rows from the last modification.
-- If the instance contains many tables, filter the CTE tmp_table_info to prevent long query times from fetching excessive data.
WITH tmp_table_info AS (
  SELECT
    db_name,
    schema_name,
    table_name,
    row_count,
    collect_time,
    last_modify_time
  FROM
    hologres.hg_table_info
  WHERE
    last_modify_time IS NOT NULL
    AND type = 'TABLE'
    -- Add filters for tmp_table_info here.
    -- For example: collect_time > (current_date - interval '14 day'):: timestamptz
    -- For example: table_name like ''
    -- For example: type = 'PARTITION'
)
SELECT
  end_data.db_name AS db_name,
  end_data.schema_name AS schema_name,
  end_data.table_name AS table_name,
  (end_data.row_count - start_data.row_count) AS modify_row_count,
  end_data.row_count AS current_rows,
  end_data.last_modify_time AS last_modify_time
FROM
  (
    SELECT
      db_name,
      schema_name,
      table_name,
      row_count,
      last_modify_time
    FROM
      tmp_table_info
    WHERE
      collect_time > (current_date - interval '1 day')::timestamptz -- Query the last modification time of the table recorded yesterday.
  ) end_data
  LEFT JOIN (
    SELECT
      db_name,
      schema_name,
      table_name,
      row_count,
      collect_time
    FROM
      tmp_table_info
  ) start_data ON (
    end_data.db_name = start_data.db_name
    AND end_data.schema_name = start_data.schema_name
    AND end_data.table_name = start_data.table_name
    AND end_data.last_modify_time::date = (start_data.collect_time + interval '1 day')::date
  );
上一篇: Data lineage 下一篇: Performance testing
阿里云首页 实时数仓 Hologres 相关技术圈