Tracking Query Execution Statistics in PostgreSQL

There is a fantastic extension in PostgreSQL for capturing query performance statistics, including total time, mean time, and the number of calls. It is an invaluable tool to keep track of which statements are consuming the most CPU time and to identify slow queries.

Most cloud providers will already have loaded the extension for you, however, if you are self-hosting edit postgresql.conf, add pg_stat_statements to shared_preload_libraries, and restart the service to load it.

Once loaded, it can be enabled in the database you want to monitor by running create extension pg_stat_statements. If already enabled, the statistics can, optionally, be reset with select pg_stat_statements_reset().

Queries are combined into a single entry whenever they have identical structures. As of PostgreSQL 13, the attributes tracked for each query include the ones listed below. Note that older versions of PostgreSQL did not have separated attributes for plan and execution time, i.e., instead of mean_plan_time and mean_exec_time, only mean_time was available.

      Column        |       Type
--------------------+-------------------
userid              | oid
dbid                | oid
queryid             | bigint
query               | text
plans               | bigint
total_plan_time     | double precision
min_plan_time       | double precision
max_plan_time       | double precision
mean_plan_time      | double precision
stddev_plan_time    | double precision
calls               | bigint
total_exec_time     | double precision
min_exec_time       | double precision
max_exec_time       | double precision
mean_exec_time      | double precision
stddev_exec_time    | double precision
rows                | bigint
(...)

One common strategy to reduce the load would be to look at slow queries that run frequently and start by optimizing those by reducing the mean time they take to execute, the number of calls, or both.

The following query will provide an ordered list of queries that are good candidates for optimization. The total time takes into account the time each execution takes and the number of calls which is usually a good place.

Slow queries that rarely run are unlikely to have a big impact on the average load and queries that run frequently but are very fast will probably be hard to optimize in a significant way.