PostgreSQL: A Comprehensive Guide to PostgreSQL Streaming Replication

Introduction

PostgreSQL, a powerful and open-source relational database management system, is renowned for its robustness, flexibility, and extensive feature set. According to DB-Engines, PostgreSQL is currently the fourth most popular database in the world and has risen quickly in recent years. It is also the second most popular open-source RDBMS, behind MySQL. 

One of PostgreSQL's most valuable features is streaming replication, which provides high availability, fault tolerance, and load balancing by maintaining multiple copies of a database on different servers. In this post, we'll explore what PostgreSQL streaming replication is, how it works, and the steps to set it up with a hot standby and heartbeat monitoring. 

What is PostgreSQL Streaming Replication?

Streaming replication in PostgreSQL allows you to create a replica (or standby) server that continuously receives updates from a primary (or master) server. This replication method is asynchronous by default, but PostgreSQL also supports synchronous replication, where the primary server waits for at least one standby to confirm receipt of the data before committing the transaction.

Key Benefits of Streaming Replication:

  • High Availability: Minimize downtime by promoting a standby server to primary in the event of a failure.
  • Scalability: Distribute read-only queries across multiple standby servers.
  • Data Protection: Standby servers provide an additional layer of data redundancy.

How Streaming Replication Works

Streaming replication relies on a few key components:

  1. Write-Ahead Logging (WAL): Ensures all changes are logged before they are committed.
  2. Primary and Standby Servers: The primary server is the source of data, while standby servers are replicas.
  3. WAL Sender and WAL Receiver: The primary server streams WAL records to the standby server.
  4. Replication Slots (Optional): Ensures the primary server retains WAL files until the standby has received them.

Setting Up PostgreSQL Streaming Replication with Hot Standby

1. Configure the Primary Server

First, configure the primary server to support streaming replication.

Edit postgresql.conf:

wal_level = replica
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'
max_wal_senders = 3
wal_keep_segments = 32

Enable replication slots if desired:

max_replication_slots = 3

Edit pg_hba.conf:

host replication replicator <standby_ip_address>/32 md5
Create a Replication Role:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'your_password';
Restart PostgreSQL on the primary server:
sudo systemctl restart postgresql

2. Configure the Standby Server

Clone the Primary Server:

pg_basebackup -h <primary_ip_address> -D /path/to/standby/data -U replicator -Fp -Xs -P

Configure the Standby Server:

Create a recovery.conf file:

standby_mode = 'on'
primary_conninfo = 'host=<primary_ip_address> port=5432 user=replicator password=your_password'
trigger_file = '/tmp/standby.trigger'
Modify postgresql.conf:
wal_level = replica
archive_mode = on
archive_command = 'cp %p /path/to/archive/%f'
max_wal_senders = 3
hot_standby = on
Start the Standby Server:
sudo systemctl start postgresql
Check the pg_stat_replication view on the primary server:
SELECT * FROM pg_stat_replication;

Adding Heartbeat Monitoring with repmgr

Heartbeat monitoring ensures that the standby server is alive and responsive. We'll use repmgr, a popular PostgreSQL replication management tool, to implement this.

1. Install repmgr

Install repmgr on both the primary and standby servers:

sudo apt-get install repmgr

2. Configure repmgr on the Primary Server

Edit the repmgr.conf file on the primary server:

node_id=1
node_name='primary'
conninfo='host=<primary_ip_address> user=repmgr dbname=repmgr'
data_directory='/path/to/primary/data'
Initialize repmgr:
repmgr -f /etc/repmgr.conf primary register

3. Configure repmgr on the Standby Server

Edit the repmgr.conf file on the standby server:

node_id=2
node_name='standby'
conninfo='host=<standby_ip_address> user=repmgr dbname=repmgr'
data_directory='/path/to/standby/data'
Clone the primary server using repmgr:
repmgr -f /etc/repmgr.conf standby clone <primary_ip_address>
Register the standby server:
repmgr -f /etc/repmgr.conf standby register

4. Start and Monitor Replication

Start the replication process:

repmgrd -f /etc/repmgr.conf --daemonize
Check the replication status:
repmgr cluster show

Conclusion

PostgreSQL streaming replication with hot standby and heartbeat monitoring provides a robust solution for high availability, fault tolerance, and load balancing. By following this guide, you can set up a resilient PostgreSQL environment that ensures data protection and minimizes downtime in the event of a failure.

If you have any questions or need further assistance, feel free to leave a comment below!

Happy coding !!!

Felix N. Abelardo Santana

Comments

Popular posts from this blog

Rails: Create an API Wrapper of an External Service

Rails: Upgrading Ruby in Rails Applications a Comprehensive Guide