Aiven for PostgreSQL® shared buffers#

Learn what Aiven for PostgreSQL® shared buffers are and why use them. Find out how using shared buffers impacts performance and how to set them up properly. Finally, discover how to inspect the database cache performance and and the query cache performance and check out how to put data into cache manually.

About shared buffers#

There are two primary memory allocations in Aiven for PostgreSQL that drastically impact the performance of queries: shared_buffers (the amount of RAM used for shared memory buffers) and work_mem (the maximum amount of memory to be used by a query operation before writing to temporary disk files).

Purpose of shared buffers#

The shared_buffers parameter controls the amount of memory allocated to the database server for disk page caching. The primary purpose of shared_buffers is to share memory over multiple sessions that may want to access the same blocks concurrently. Managing the access using the memory helps avoid unnecessary locking.

Note

Even with shared_buffers, Aiven for PostgreSQL relies on the filesystem cache for optimization so reading from the disk is still needed.

Allocation and setup#

shared_buffers is allocated only once at startup; thus, it’s not allocated per-session or per-user. It is shared among all sessions and useable for each worker and, therefore, each query.

To obtain a good performance of a database server with 1 GB or more of RAM, it is usually necessary to set this value to ~25% of the system memory. For systems with less than 1 GB of RAM, a smaller portion of RAM is preferable to allow for the operating system.

Allocating a lot of memory to shared_buffers is not always optimal for your configuration since the remaining free memory is allocated to workers (queries) and the filesystem cache. There are workloads where large shared_buffers are effective, but the allocation of more than 40% is unlikely to work better than a smaller-amount allocation.

The optimal setting for any service depends on the available RAM, the working data set, and the workload applied.

To examine the current shared_buffers value, run the following query:

SHOW shared_buffers;

Tuning guidelines#

Aiven for PostgreSQL actively tracks data access patterns, updates the shared_buffers with frequently accessed data, and ejects Least Recently Used (LRU) when necessary. With many applications, only a fraction of the entire data set is accessed regularly. This data set fraction can be referred to as a working set or a frequent read set and often follows the 80/20 rule: 80% of the reads is in 20% of the data.

Tip

For optimal performance, a shared_buffers cache hit rate of 97-99% is ideal.

You can find an overview of shared_buffers cache hit rates using pg_statio_user_tables:

 SELECT * FROM pg_statio_user_tables;

 relid | schemaname | relname | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit | toast_blks_read | toast_blks_hit | tidx_blks_read | tidx_blks_hit
-------+------------+---------+----------------+---------------+---------------+--------------+-----------------+----------------+----------------+---------------
 16415 | public     | records |        1042770 |      88157826 |        184280 |     40282404 |               0 |              0 |              0 |             0
(1 row)

Calculate the database cache hit rate with:

SELECT
  sum(heap_blks_read) as heap_read,
  sum(heap_blks_hit)  as heap_hit,
  sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) as hit_ratio
FROM
  pg_statio_user_tables;

   heap_read | heap_hit |         ratio
  -----------+----------+------------------------
     6942770 | 88157826 | 0.9883098315
  (1 row)

If the cache hit rate is significantly lower than 95%, this may be an indicator of several issues:

  • Insufficient data activity to generate accurate statistics (new database)

  • Current shared_buffers_percentage value too low

  • Size of the working set larger than the maximum available shared_buffers_percentage (60%)

To achieve an optimal performance, the working set needs to fit in shared buffers.

While the shared_buffers_percentage has a maximum value of 60%, exceeding a value of 40% suggests more RAM is required.

Tip

In many cases, the Aiven default value of 20% requires no further modification.

Inspecting the database cache performance#

For a deeper examination into the contents of the shared_buffers, enable the pg_buffercache extension:

CREATE EXTENSION pg_buffercache;

Calculate how many blocks from tables (r), indexes (i), sequences (S), and other objects are currently cached using the following query:

SELECT c.relname, c.relkind
  , pg_size_pretty(count(*) * 8193) as buffered
  , round(100.0 * count(*) / ( SELECT setting FROM pg_settings WHERE name='shared_buffers')::integer,1) AS buffers_percent
  , round(100.0 * count(*) * 8192 / pg_relation_size(c.oid),1) AS percent_of_relation
FROM pg_class c
INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
WHERE c.oid >= 16384
AND pg_relation_size(c.oid) > 0
GROUP BY c.oid, c.relname
ORDER BY 3 DESC
LIMIT 10;

 relname | relkind | buffered | buffers_percent | percent_of_relation
---------+---------+----------+-----------------+---------------------
 records | r       | 781 MB   |            99.7 |                27.2

Relations with object IDs (oid) below 16384 are reserved system objects.

Inspecting the query cache performance#

Queries can also be inspected for the cache hit performance using EXPLAIN:

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT * from records;
                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Seq Scan on public.records  (cost=0.00..480095.20 rows=11207220 width=77) (actual time=0.158..16863.051 rows=11600000 loops=1)
  Output: id, "timestamp", data
  Buffers: shared hit=92345 read=275678 dirtied=10938
Query Identifier: 2582883386000135492
Planning:
  Buffers: shared hit=30 dirtied=2
Planning Time: 1.081 ms
Execution Time: 17467.342 ms
(8 rows)

Using hit / (hit + read) shows ~25% of this full table scan was in the shared_buffers

Putting data into the cache manually#

You may want to prewarm the shared_buffers in anticipation of a specific workload, such as a large analytical query set used for reporting. This can be accomplished using the pg_prewarm extension.

CREATE EXTENSION pg_prewarm;

Example

Call the pg_prewarm function and pass the name of a desired table.

SELECT * FROM pg_prewarm('public.records');
pg_prewarm
------------
     368023

 SELECT pg_size_pretty(pg_relation_size('public.records'));
 pg_size_pretty
 ----------------
 2875 MB

368023 pages have been read into the cache (or ~2875 MB).

If the shared buffers size is less than pre-loaded data, only the tailing end of the data is cached as the earlier data encounters a forced ejection.