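Databases are fairly large applications, and a busy database consumes substantial memory, CPU, I/O, and network resources. SQL optimization is one means of database optimization. To get the most from it, you first need to identify the SQL statements that consume the most resources (Top SQL), for example the statements with the highest I/O consumption.
Database resources span several dimensions, including CPU, memory, and I/O. To find the most resource-intensive SQL statements along each dimension, you can use the pg_stat_statements extension to collect resource usage statistics and analyze Top SQL.
This topic uses examples to show how to create the pg_stat_statements extension, how to analyze Top SQL, and how to reset the statistics.
Run the following command in the database where you want to query Top SQL to create the pg_stat_statements extension.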
CREATE EXTENSION pg_stat_statements;
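Statistics are collected only when the pg_stat_statements module is loaded through shared_preload_libraries, so it can be worth confirming both the preload setting and the extension itself. The following is a minimal check, assuming your account is allowed to read these settings. Managed instances often preload the module already, but that is an assumption to verify rather than a guarantee.

```sql
-- The module must appear in this list for statistics to be collected.
SHOW shared_preload_libraries;

-- Confirm that the extension exists in the current database and check its version.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pg_stat_statements';
```

If the second query returns no rows, the extension has not been created in the current database.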
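Output of pg_stat_statements
By querying the pg_stat_statements view, you can obtain statistics on database resource usage. Constants in filter conditions are replaced with placeholders in pg_stat_statements, so statements that differ only in their constant values are merged into one entry instead of being listed repeatedly.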
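To illustrate this normalization, the sketch below runs the same query shape with two different constants and then looks up the resulting entry. The table name demo_orders is hypothetical and exists only for this illustration. On PostgreSQL 10 and later the constants appear as $1, $2, and so on, while older versions show a question mark instead.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE demo_orders (id int PRIMARY KEY, amount numeric);

-- Two statements that differ only in the constant value.
SELECT amount FROM demo_orders WHERE id = 1;
SELECT amount FROM demo_orders WHERE id = 2;

-- Both executions should be folded into a single entry with calls = 2.
SELECT query, calls
FROM pg_stat_statements
WHERE query LIKE '%demo_orders WHERE id =%'
  AND query NOT LIKE '%pg_stat_statements%';

-- Clean up the illustration table.
DROP TABLE demo_orders;
```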
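The pg_stat_statements view includes key information such as:
Execution statistics for each SQL statement: number of calls, total time, minimum, maximum, and mean execution time, standard deviation of execution time (which reflects jitter), and the total number of rows retrieved or affected.
Shared buffer usage: blocks hit, blocks read (cache misses), blocks dirtied, and blocks written.
Local buffer usage: blocks hit, blocks read (cache misses), blocks dirtied, and blocks written.
Temp block usage: blocks read and blocks written.
Time spent reading and writing data blocks.
The following table describes the columns in the pg_stat_statements output.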
Column | Type | References | Description |
userid | oid | pg_authid.oid | OID of user who executed the statement. |
dbid | oid | pg_database.oid | OID of database in which the statement was executed. |
queryid | bigint | None | Internal hash code, computed from the statement’s parse tree. |
query | text | None | Text of a representative statement. |
calls | bigint | None | Number of times executed. |
total_time | double precision | None | Total time spent in the statement, in milliseconds. |
min_time | double precision | None | Minimum time spent in the statement, in milliseconds. |
max_time | double precision | None | Maximum time spent in the statement, in milliseconds. |
mean_time | double precision | None | Mean time spent in the statement, in milliseconds. |
stddev_time | double precision | None | Population standard deviation of time spent in the statement, in milliseconds. |
rows | bigint | None | Total number of rows retrieved or affected by the statement. |
shared_blks_hit | bigint | None | Total number of shared block cache hits by the statement. |
shared_blks_read | bigint | None | Total number of shared blocks read by the statement. |
shared_blks_dirtied | bigint | None | Total number of shared blocks dirtied by the statement. |
shared_blks_written | bigint | None | Total number of shared blocks written by the statement. |
local_blks_hit | bigint | None | Total number of local block cache hits by the statement. |
local_blks_read | bigint | None | Total number of local blocks read by the statement. |
local_blks_dirtied | bigint | None | Total number of local blocks dirtied by the statement. |
local_blks_written | bigint | None | Total number of local blocks written by the statement. |
temp_blks_read | bigint | None | Total number of temp blocks read by the statement. |
temp_blks_written | bigint | None | Total number of temp blocks written by the statement. |
blk_read_time | double precision | None | Total time the statement spent reading blocks, in milliseconds (if track_io_timing is enabled, otherwise zero). |
blk_write_time | double precision | None | Total time the statement spent writing blocks, in milliseconds (if track_io_timing is enabled, otherwise zero). |
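Two details in the table affect how the output reads in practice: blk_read_time and blk_write_time stay at zero unless track_io_timing is enabled, and userid and dbid are OIDs rather than names. The sketch below checks the setting and resolves both OIDs to readable names. Ordering by calls is only an illustrative choice.

```sql
-- If this returns "off", blk_read_time and blk_write_time are always zero.
SHOW track_io_timing;

-- Resolve userid and dbid to readable names.
SELECT s.userid::regrole AS role_name,
       d.datname         AS database_name,
       s.calls,
       s.query
FROM pg_stat_statements AS s
JOIN pg_database AS d ON d.oid = s.dbid
ORDER BY s.calls DESC
LIMIT 5;
```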
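Analyze Top SQL
SQL statements with the highest I/O consumption
Run the following command to query the top 5 SQL statements by average I/O time per call.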
SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY (blk_read_time+blk_write_time)/calls DESC LIMIT 5;
Run the following command to query the top 5 SQL statements by total I/O time.
SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY (blk_read_time+blk_write_time) DESC LIMIT 5;
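The two queries above return only the statement text, so it can be useful to show the metrics that drive the ranking as well. The following variant is a sketch that uses the column names from the table in this topic. The aliases io_ms_total and io_ms_per_call are illustrative, and on PostgreSQL 17 and later the timing columns are named shared_blk_read_time and shared_blk_write_time.

```sql
-- Top 5 statements by total I/O time, with the underlying numbers shown.
-- Requires track_io_timing = on, otherwise both timing columns are zero.
SELECT userid::regrole,
       dbid,
       calls,
       round((blk_read_time + blk_write_time)::numeric, 2) AS io_ms_total,
       -- NULLIF guards against a division by zero when calls is 0.
       round(((blk_read_time + blk_write_time) / NULLIF(calls, 0))::numeric, 2) AS io_ms_per_call,
       query
FROM pg_stat_statements
ORDER BY blk_read_time + blk_write_time DESC
LIMIT 5;
```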
Most time-consuming SQL statements
Run the following command to query the top 5 SQL statements by mean execution time per call.
SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 5;
Run the following command to query the top 5 SQL statements by total execution time.
SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY total_time DESC LIMIT 5;
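A ranking by total time is easier to act on when each statement's share of the overall execution time is visible. The sketch below adds that share with a window function. The alias pct_of_total is illustrative, and the column names assume the version described in this topic (PostgreSQL 13 and later rename total_time and mean_time to total_exec_time and mean_exec_time).

```sql
-- Top 5 statements by total execution time, with each statement's share
-- of the total time recorded across all entries in the view.
SELECT userid::regrole,
       dbid,
       calls,
       round(total_time::numeric, 2) AS total_ms,
       round(mean_time::numeric, 2)  AS mean_ms,
       -- NULLIF avoids a division by zero if no time has been recorded yet.
       round((100 * total_time / NULLIF(sum(total_time) OVER (), 0))::numeric, 2) AS pct_of_total,
       query
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 5;
```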
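SQL statements with the most severe response time jitter
Run the following command to query the top 5 SQL statements with the largest standard deviation of execution time.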
SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY stddev_time DESC LIMIT 5;
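Ordering by stddev_time alone tends to favor statements that are slow in absolute terms. One possible refinement is to rank by jitter relative to the mean (the coefficient of variation). The following is a sketch of that idea, not part of the original procedure. The threshold of 10 calls is an arbitrary assumption to filter out rarely executed statements, and PostgreSQL 13 and later name these columns stddev_exec_time and mean_exec_time.

```sql
-- Top 5 statements by relative jitter (standard deviation divided by mean).
SELECT userid::regrole,
       dbid,
       calls,
       round(mean_time::numeric, 2)   AS mean_ms,
       round(stddev_time::numeric, 2) AS stddev_ms,
       round((stddev_time / NULLIF(mean_time, 0))::numeric, 2) AS coeff_of_variation,
       query
FROM pg_stat_statements
WHERE calls >= 10  -- arbitrary threshold, adjust as needed
ORDER BY stddev_time / NULLIF(mean_time, 0) DESC NULLS LAST
LIMIT 5;
```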
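SQL statements that consume the most shared memory
Run the following command to query the top 5 SQL statements by shared buffer usage (blocks hit plus blocks dirtied).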
SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY (shared_blks_hit+shared_blks_dirtied) DESC LIMIT 5;
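Alongside the ranking, the shared buffer hit ratio of each statement shows whether it is served mostly from cache or from disk reads. The following sketch keeps the same ordering and adds that ratio. The alias hit_pct is illustrative.

```sql
-- Top 5 statements by shared buffer usage, with the cache hit percentage.
SELECT userid::regrole,
       dbid,
       shared_blks_hit,
       shared_blks_read,
       shared_blks_dirtied,
       -- NULLIF yields NULL instead of an error when no blocks were accessed.
       round(100.0 * shared_blks_hit
             / NULLIF(shared_blks_hit + shared_blks_read, 0), 2) AS hit_pct,
       query
FROM pg_stat_statements
ORDER BY shared_blks_hit + shared_blks_dirtied DESC
LIMIT 5;
```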
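SQL statements that consume the most temporary space
Run the following command to query the top 5 SQL statements by temp blocks written.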
SELECT userid::regrole, dbid, query FROM pg_stat_statements ORDER BY temp_blks_written DESC LIMIT 5;
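Because temp_blks_written is counted in blocks, converting it to bytes makes the result easier to interpret. The sketch below multiplies the block count by the server's block_size setting. The alias temp_written is illustrative.

```sql
-- Top 5 statements by temporary space written, shown in a readable size.
SELECT userid::regrole,
       dbid,
       calls,
       temp_blks_written,
       pg_size_pretty(temp_blks_written
                      * current_setting('block_size')::bigint) AS temp_written,
       query
FROM pg_stat_statements
WHERE temp_blks_written > 0
ORDER BY temp_blks_written DESC
LIMIT 5;
```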
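Reset statistics
The statistics in pg_stat_statements are cumulative. To view statistics for a specific period, you need to query snapshot data. For details, see "PostgreSQL AWR report (for Alibaba Cloud ApsaraDB PgSQL)".
You can also run the following command periodically to clear historical statistics.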
SELECT pg_stat_statements_reset();
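As a lightweight alternative to resetting, a period can be approximated by saving a baseline copy of the counters and comparing against it later. The following is a simplified sketch of that idea and is not the AWR tooling mentioned above. The table name pgss_snapshot is hypothetical, the join assumes that (userid, dbid, queryid) identifies an entry as in the column list in this topic, and deltas can be misleading if the statistics were reset or an entry was evicted between the two points in time.

```sql
-- Step 1: save a baseline of the cumulative counters (hypothetical table name).
CREATE TABLE pgss_snapshot AS
SELECT now() AS captured_at, userid, dbid, queryid, calls, total_time
FROM pg_stat_statements;

-- Step 2 (later): top 5 statements by execution time accumulated since the baseline.
SELECT cur.userid::regrole,
       cur.dbid,
       cur.calls - COALESCE(snap.calls, 0)           AS calls_delta,
       cur.total_time - COALESCE(snap.total_time, 0) AS total_ms_delta,
       cur.query
FROM pg_stat_statements AS cur
LEFT JOIN pgss_snapshot AS snap
       ON snap.userid  = cur.userid
      AND snap.dbid    = cur.dbid
      AND snap.queryid = cur.queryid
ORDER BY total_ms_delta DESC
LIMIT 5;
```

Statements that first appeared after the baseline have no matching snapshot row, so COALESCE treats their baseline counters as zero.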