05- Patroni Cluster on Ubuntu 24.04 - Continuous Backups with Barman / Streaming-Only Mode (SSL)

In this part, we will install and configure Barman to take continuous backups of our PostgreSQL 18 Patroni cluster using streaming replication.

This setup allows:

  • Continuous WAL streaming (near real-time)

  • Point-In-Time Recovery (PITR)

  • Full filesystem-level base backups

  • ZFS-based optimized backup storage

  • Proper failover handling via the HAProxy VIP

  • No archive_command complexity (simpler and more stable for Patroni)

 

Important Notes

  • PostgreSQL version: 18

  • Cluster uses SSL/TLS encryption

  • Barman uses:

    • Streaming replication (pull) for WALs (WAL is pulled from a replication slot to the Barman server)

    • pg_basebackup for full backups

  • ZFS storage used for backup retention & performance

  • We do NOT use archive_command

 

1) Install Barman and PostgreSQL 18 Client (on Barman Server):

sudo su
apt update
apt install -y barman
barman --version

 

Install PostgreSQL 18 Client

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | \
  gpg --dearmor -o /usr/share/keyrings/postgresql.gpg

echo "deb [signed-by=/usr/share/keyrings/postgresql.gpg] http://apt.postgresql.org/pub/repos/apt noble-pgdg main" \
  > /etc/apt/sources.list.d/pgdg.list

apt update
apt install -y postgresql-client-18
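
Optionally, verify the version-18 client binaries that Barman will rely on (this is the same directory we will later point path_prefix at; pg_basebackup handles base backups and pg_receivewal handles WAL streaming):

/usr/lib/postgresql/18/bin/psql --version
/usr/lib/postgresql/18/bin/pg_basebackup --version
/usr/lib/postgresql/18/bin/pg_receivewal --version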

 

2) Prepare ZFS Backup Storage: 

I added another disk to the Barman server. The following command rescans the SCSI hosts, so you will not have to reboot the server to see the new disk.

for host in /sys/class/scsi_host/*; do
  echo "- - -" > "$host/scan"
done

lsblk

 

Install ZFS utilities

apt install -y zfsutils-linux

Create ZFS Pool

zpool create -f barman-pool /dev/sdb

Create dataset

zfs create barman-pool/backups
zfs set mountpoint=/var/lib/barman barman-pool/backups
chown -R barman:barman /var/lib/barman
chmod 700 /var/lib/barman

Optimize dataset for PostgreSQL workloads

zfs set compression=lz4 barman-pool/backups
zfs set dedup=on barman-pool/backups
zfs set atime=off barman-pool/backups
zfs set recordsize=128K barman-pool/backups
zfs set primarycache=metadata barman-pool/backups

 

Check:

df -h | grep barman
zfs get compression,dedup barman-pool/backups
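
Dedup and compression effectiveness can be watched at the pool and dataset level. A quick check (keep in mind that dedup holds its table in RAM, so watch memory usage on a small backup server):

zpool list barman-pool
zfs get compressratio barman-pool/backups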


3) Patroni Configuration Updates (on ALL PostgreSQL nodes):

nano /etc/patroni/config.yml

 

Add WAL settings

  parameters:
    max_connections: 100
    shared_buffers: 256MB
    wal_level: replica
    archive_mode: off
    archive_command: 'true'
    max_wal_senders: 5
    wal_keep_size: 4GB

 

Add pg_hba rules required for Barman (TLS)

  pg_hba:
    - hostssl replication replicator 127.0.0.1/32 md5   # Patroni/Postgres replication connection to itself
    - hostssl replication barman 192.168.204.11/32 md5  # WAL streaming connection (haproxy01 IP)
    - hostssl all barman 192.168.204.11/32 md5          # base backup connection (haproxy01 IP)
    - hostssl replication barman 192.168.204.12/32 md5  # WAL streaming connection (haproxy02 IP)
    - hostssl all barman 192.168.204.12/32 md5          # base backup connection (haproxy02 IP)

    # replication between nodes
    - hostssl replication replicator 192.168.204.16/32 md5
    - hostssl replication replicator 192.168.204.17/32 md5
    - hostssl replication replicator 192.168.204.18/32 md5

    - hostssl all all 127.0.0.1/32 md5  # localhost
    - hostssl all all 0.0.0.0/0 md5     # all other clients such as pgAdmin

 

Restart Patroni on each node

systemctl restart patroni
journalctl -u patroni -f

 

Verify Cluster

patronictl -c /etc/patroni/config.yml list

 

4) Create Barman PostgreSQL Role (on Leader: postgres01)

psql -U postgres -h 127.0.0.1 -c "CREATE ROLE barman WITH REPLICATION LOGIN PASSWORD 'myrepPASS123';"
psql -U postgres -h 127.0.0.1 -c "ALTER ROLE barman WITH SUPERUSER;"

 

Verify

psql -U postgres -h 127.0.0.1 -c "\du"
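
Granting SUPERUSER is the simplest approach. If your security policy forbids superuser roles, the Barman documentation describes a minimal set of grants instead; the following is a sketch for PostgreSQL 15 and later (verify it against the documentation of your Barman version before relying on it):

psql -U postgres -h 127.0.0.1 <<'SQL'
ALTER ROLE barman NOSUPERUSER;
GRANT EXECUTE ON FUNCTION pg_backup_start(text, boolean) TO barman;
GRANT EXECUTE ON FUNCTION pg_backup_stop(boolean) TO barman;
GRANT EXECUTE ON FUNCTION pg_switch_wal() TO barman;
GRANT EXECUTE ON FUNCTION pg_create_restore_point(text) TO barman;
GRANT pg_checkpoint TO barman;
GRANT pg_read_all_settings TO barman;
GRANT pg_read_all_stats TO barman;
SQL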

 

5) Connection Test from Barman Server:

Note that we do not connect to the Postgres nodes' IP addresses directly; we use the HAProxy VIP instead. This ensures that even if the leader node changes, WAL streaming will not fail and will continue against the new leader.

psql "host=192.168.204.10 port=5432 user=barman dbname=postgres sslmode=require"

#then run any postgres command such as
\du
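
Barman will also open a physical replication connection for WAL streaming (this is what the hostssl replication barman rules are for), so it is worth testing that path as well. A minimal check through the same VIP (IDENTIFY_SYSTEM is a replication-protocol command that returns the system identifier and current timeline):

psql "host=192.168.204.10 port=5432 user=barman dbname=postgres sslmode=require replication=true" -c "IDENTIFY_SYSTEM;"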

 

6) Create Barman Configuration File (On Barman server):

nano /etc/barman.d/postgres.conf

 

As a best practice, the Barman replication slot should be created on the current Patroni leader. Replication slots are not copied to standbys, so a slot created on a replica is lost whenever that node is rebuilt or reassigned, and Barman would miss WAL segments after a failover. The official Barman documentation discusses this with the following statement:

"When using WAL streaming, it is recommended to always stream from the primary node. This is to ensure that all WALs are received by Barman, even in the event of a failover."

On the other hand, the base backup can be taken from the leader or from a standby node. For small or mid-sized environments, taking the base backup from the leader is preferred for the sake of simpler configuration and troubleshooting.

 

Unlike standalone PostgreSQL, a Patroni cluster dynamically manages its replication nodes. Replica nodes may be rebuilt, resynced, or reassigned at any time. Therefore:

  • WAL streaming must always use the current leader
  • The replication slot must exist on the leader
  • Replica nodes must never be used in streaming_conninfo

[postgres]
description = "PostgreSQL 18 Patroni Cluster Backups"

# Base backup ALWAYS via leader (VIP)
conninfo = host=192.168.204.10 port=5432 user=barman password=myrepPASS123 dbname=postgres

backup_method = postgres

# Continuous WAL streaming via replication slot
streaming_conninfo = host=192.168.204.10 port=5432 user=barman password=myrepPASS123 dbname=postgres
# enables pg_receivewal-based streaming
streaming_archiver = on
archiver = off
slot_name = barman_slot
# creates the barman slot automatically on the current leader
create_slot = auto
immediate_checkpoint = true

path_prefix = /usr/lib/postgresql/18/bin/

retention_policy = RECOVERY WINDOW OF 30 DAYS
wal_retention_policy = main

 

Permission:

chmod 644 /etc/barman.d/postgres.conf
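
Note that this file is world-readable and contains the barman password in clear text. If that is a concern, one alternative (a sketch, not required for this setup) is to keep the password in the barman user's ~/.pgpass and remove password= from both conninfo lines:

su - barman
echo "192.168.204.10:5432:*:barman:myrepPASS123" > ~/.pgpass
chmod 600 ~/.pgpass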

 

7) Check Barman Status

su - barman
barman receive-wal --create-slot postgres
barman check postgres

#check barman logs
tail -n 50 -f /var/log/barman/barman.log
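
To confirm that the slot was created on the current leader and that streaming is active, you can cross-check from both sides (barman replication-status on the Barman server, pg_replication_slots on the leader):

barman replication-status postgres

#on the postgres leader
psql -U postgres -h 127.0.0.1 -c "SELECT slot_name, active, restart_lsn FROM pg_replication_slots;"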


First Full Backup:

Time to take the first full backup now.

barman backup postgres

barman list-backup postgres

barman check postgres

 

Barman took the first backup using pg_basebackup. For the backup to be consistent, the WAL files generated during the backup process must also be received.

If the WAL size shows 0B, it means no WAL has been received yet.

 

Let's run these commands to check whether the WAL files have been received.

ls -lh /var/lib/barman/postgres/wals/
OR
barman show-backup postgres <backup-id>
#In my case
barman show-backup postgres 20251202T113138

 

#to check backup details
barman show-backup postgres 20251202T113138

#to delete backup
barman delete postgres 20251202T113138

#for PITR restore
barman recover postgres 20251202T113138 /tmp/recover


Scheduled Jobs (Barman Cron Tasks):

crontab -e -u barman
# 1) Daily full backup at 02:00
0 2 * * * /usr/bin/barman backup postgres >> /var/log/barman/backup.log 2>&1

# 2) Barman maintenance tasks every 5 minutes
#    - archives new WAL files
#    - finalizes backups waiting for WALs
#    - applies retention policies
#    - cleans up expired/failed backups
*/5 * * * * /usr/bin/barman cron >> /var/log/barman/cron.log 2>&1

# (Optional) 3) WAL archive consistency check every hour
#0 * * * * /usr/bin/barman check-wal-archive postgres >> /var/log/barman/check-wal.log 2>&1
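
To verify the schedule is in place (note: on Debian/Ubuntu the barman package usually also ships /etc/cron.d/barman, which already runs barman cron every minute, so the five-minute entry above may be redundant on your system):

crontab -l -u barman
cat /etc/cron.d/barman 2>/dev/null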


Restore Test:

PITR (Point-In-Time Recovery) lets you restore your PostgreSQL database to a specific moment in the past: a timestamp, a transaction ID, or a WAL location (LSN).
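
Besides the --target-time option used below, barman recover accepts other recovery targets as well; illustrative examples (the backup ID and target values here are placeholders):

#restore up to a specific WAL location
barman recover postgres <backup-id> /tmp/recover --target-lsn "0/3000060"

#restore up to a specific transaction ID
barman recover postgres <backup-id> /tmp/recover --target-xid "12345"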

First, I want to take a new base backup for this test. This command will appear to hang, because --wait tells Barman to wait until all WAL files required by the backup have been received.

barman backup postgres --wait

 

Then, on the Postgres leader node, I will switch the WAL file so that Barman can receive the latest WAL segment and the backup command above can complete.

psql "host=127.0.0.1 sslmode=require user=postgres dbname=postgres" \
  -c "SELECT pg_switch_wal();"

 

On the Postgres leader node, run these commands to create a test table and then drop it. Make sure to wait a few minutes between creating and dropping the table.

psql "host=127.0.0.1 sslmode=require user=postgres dbname=postgres"

-- just for reference time
SELECT now();

-- create the test table
CREATE TABLE pitr_final(x int);
INSERT INTO pitr_final SELECT generate_series(1, 50000);

-- close the WAL segment after the INSERT
SELECT pg_switch_wal();

-- time before the DROP
SELECT now();

-- drop the table
DROP TABLE pitr_final;

-- close the WAL segment that includes the DROP
SELECT pg_switch_wal();

-- time after the DROP
SELECT now();

 

On barman

barman list-backup postgres
barman show-backup postgres <backup-id>

We should select the base backup that is closest to, but taken before, the moment the table was dropped.

Table dropped at 11:22.

20251204T100809 - Thu Dec  4 10:08:11 2025    Best option, closest to the table drop
20251204T095858 - Thu Dec  4 09:58:59 2025    Possible, but not as close to the drop as 20251204T100809
20251204T081454 - Thu Dec  4 08:14:55 2025    Possible, but not as close to the drop as 20251204T100809
20251202T113138 - Tue Dec  2 11:31:40 2025    Too old



On Barman, we will restore to 11:22:24, a point in time just before the DROP:

barman recover postgres 20251204T100809 \
  /tmp/pitr_final \
  --target-time "2025-12-04 11:22:24+00"
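
As an alternative to the manual tar/scp copy in the next step, barman recover can also write the restored files directly to the target host over SSH (a sketch; it assumes key-based SSH access from the barman user to the postgres user on postgres01):

barman recover postgres 20251204T100809 \
  /var/lib/postgresql/pitr_final \
  --target-time "2025-12-04 11:22:24+00" \
  --remote-ssh-command "ssh postgres@postgres01"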

 

Copy the restored files to the Postgres node (using a non-root user):

cd /tmp
tar czf pitr_final.tar.gz pitr_final

scp pitr_final.tar.gz <your-non-root-user>@postgres01:/tmp/

 

On postgres01

sudo mv /tmp/pitr_final.tar.gz /var/lib/postgresql/
cd /var/lib/postgresql
sudo tar xzf pitr_final.tar.gz
sudo chown -R postgres:postgres /var/lib/postgresql/pitr_final

#Check
ls -lh /var/lib/postgresql/pitr_final

 

nano /var/lib/postgresql/pitr_final/pg_hba.conf

#Add these to the top
host    all             all             127.0.0.1/32            trust
host    all             all             ::1/128                 trust
local   all             postgres                                trust

 

Create another instance on port 5433

sudo mkdir -p /var/run/pg-restore
sudo chown postgres:postgres /var/run/pg-restore

sudo -u postgres /usr/lib/postgresql/18/bin/postgres \
   -D /var/lib/postgresql/pitr_final \
   -p 5433 \
   -c unix_socket_directories=/var/run/pg-restore


 

Use another terminal on postgres01 and check:
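
First connect to the temporary instance (the trust rules added to pg_hba.conf above allow this local connection without a password):

psql "host=127.0.0.1 port=5433 user=postgres dbname=postgres"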

-- if this query returns true, recovery is paused at the target
SELECT pg_is_in_recovery();

-- then we need to run this to resume (finish recovery)
SELECT pg_wal_replay_resume();

-- check whether the table and its data exist
\d pitr_final
SELECT * FROM pitr_final LIMIT 10;

-- now we can run the actual query, which should return 50000
SELECT COUNT(*) FROM pitr_final;


We can see the data. PITR is successful. We can now restore this on the Patroni cluster.

 

Patroni Cluster Restore:

On postgres01 we already have the restored files (/var/lib/postgresql/pitr_final).

The Patroni data directory is data_dir: /var/lib/postgresql/data.

Let's stop patroni on all nodes

systemctl stop patroni

 

Move the old data folder aside and copy in the restored files

#on postgres01
mv /var/lib/postgresql/data /var/lib/postgresql/data.old
mkdir /var/lib/postgresql/data
cp -a /var/lib/postgresql/pitr_final/* /var/lib/postgresql/data
cd /var/lib/postgresql/data
rm -f standby.signal
rm -f recovery.signal
chown -R postgres:postgres /var/lib/postgresql/data
chmod 700 /var/lib/postgresql/data
#delete the old cluster record in etcd
patronictl -c /etc/patroni/config.yml remove postgresql-cluster
systemctl start patroni

#only on postgres02 and postgres03
rm -rf /var/lib/postgresql/data/*
mkdir -p /var/lib/postgresql/data
chown -R postgres:postgres /var/lib/postgresql/data
chmod 700 /var/lib/postgresql/data
systemctl start patroni


patronictl -c /etc/patroni/config.yml list

Cluster is up and running again.

 

I check the previously dropped table and verify that it has been recovered.

 

It is possible to back up multiple PostgreSQL servers on the same Barman server. The best practice is to have a separate config file for each server. The section name at the top of the config file ([postgres] in our case) defines the server name, which we then use in our barman commands like this:

barman check postgres
barman backup postgres
barman list-backup postgres
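
For example, backing up a second cluster (hypothetical name "sales", reachable behind its own VIP; all values below are placeholders) would only need one more file, and the same commands with the new server name:

nano /etc/barman.d/sales.conf

[sales]
description = "Sales PostgreSQL Cluster Backups"
conninfo = host=192.168.210.10 port=5432 user=barman password=SalesPASS dbname=postgres
streaming_conninfo = host=192.168.210.10 port=5432 user=barman password=SalesPASS dbname=postgres
backup_method = postgres
streaming_archiver = on
archiver = off
slot_name = barman_sales_slot
create_slot = auto
path_prefix = /usr/lib/postgresql/18/bin/
retention_policy = RECOVERY WINDOW OF 30 DAYS

barman check sales
barman backup sales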