High Availability PostgreSQL Patterns

PostgreSQL is best known for its SQL compliance and excellent extensibility. Learn how to use it efficiently by building in high availability.

A successor to the Ingres database, PostgreSQL (often called "Postgres") is a free and open-source relational database management system.

PostgreSQL is best known for its SQL compliance and excellent extensibility, but using it efficiently requires a focus on high availability.

Here's what that means and how to achieve it, no matter how you manage your data.

What is High Availability?

High availability, or HA, measures how resilient a system is when underlying infrastructure fails.

For PostgreSQL clusters to be highly available, database servers must cooperate to allow a secondary server to take over if a primary server fails. Some key characteristics define "high availability" in a system.

For instance, HA requires the following:

There should be no single point of failure within the system
The health of backend servers should be continuously monitored
In the event of failure, reliable failover should be standard
Ideally, resources will be distributed geographically
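
The monitoring and failover requirements above can be sketched in a few lines of Python. This is a minimal illustration rather than a production monitor: `HealthMonitor`, `failure_threshold`, and the `probe` callable are hypothetical names introduced here, and a real probe would open a database connection instead of calling a stub.

```python
from typing import Callable


class HealthMonitor:
    """Counts consecutive failed probes of the primary (hypothetical helper)."""

    def __init__(self, probe: Callable[[], bool], failure_threshold: int = 3):
        self.probe = probe                      # returns True when the server answers
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def check(self) -> bool:
        """Run one probe; return True once failover should be triggered."""
        try:
            healthy = self.probe()
        except Exception:
            healthy = False                     # a probe that raises counts as a failure
        if healthy:
            self.consecutive_failures = 0       # any success resets the counter
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.failure_threshold
```

Requiring several consecutive failures before acting is one common way to avoid failing over on a single dropped packet; the threshold of 3 is an arbitrary choice for the example.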

Compare high availability to load balancing, another mechanism for improving system resiliency.

With load balancing, several computers work together to serve the same data, so the workload is spread across them all the time rather than handed over only when a server fails.

Both methods have pros and cons, but if you're looking to create a highly available PostgreSQL environment, these are the steps you need to follow.

Setting Up Replication

In order to achieve high availability, you cannot rely on a single database server.

This means all data needs to be replicated to a "slave," or a replica that stands by in the event of a failure.

You can choose between streaming replication and logical replication, which is built on logical decoding.

Using streaming replication, a standby replica (slave) will connect to the primary server (master) and stream WAL records from the master to the slave server.

This is considered fast and safe, as the slave becomes an exact replica with minimal lag in between.
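
The lag mentioned above can be expressed in bytes of WAL. PostgreSQL reports WAL positions as log sequence numbers (LSNs) written as two hexadecimal numbers separated by a slash, for example the values returned by `pg_current_wal_lsn()` on the master and `pg_last_wal_replay_lsn()` on the slave. The sketch below, with helper names that are purely illustrative, converts two such strings and subtracts them; it assumes you have already fetched the LSNs yourself.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '16/B374D848' to a 64-bit integer."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def replication_lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica is behind the primary (never negative)."""
    return max(0, lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn))
```

For instance, `replication_lag_bytes("0/3000060", "0/3000000")` evaluates to 96. Inside the database, the built-in `pg_wal_lsn_diff()` function performs the same subtraction.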

With logical replication, you can selectively replicate a subset of a master's tables.

Whereas streaming replication takes place at the block level, meaning everything in the master database is replicated, logical replication is ideal for scenarios where only certain data needs to be replicated.

Logical replication also allows optional direct writes to the slave database, and it lets a single slave replicate from multiple masters.
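
Logical replication is configured with a publication on the sending side and a subscription on the receiving side. As a rough sketch, the Python helpers below only build the two SQL statements (`CREATE PUBLICATION` and `CREATE SUBSCRIPTION`); the function names, table names, and connection string are hypothetical, and the identifier check is deliberately strict so the example never has to deal with quoting.

```python
import re
from typing import List

_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def _check(name: str) -> str:
    """Accept only simple lowercase identifiers to keep the sketch safe."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def publication_sql(name: str, tables: List[str]) -> str:
    """Statement to run on the publishing (master) server."""
    if not tables:
        raise ValueError("at least one table is required")
    table_list = ", ".join(_check(t) for t in tables)
    return f"CREATE PUBLICATION {_check(name)} FOR TABLE {table_list};"


def subscription_sql(name: str, conninfo: str, publication: str) -> str:
    """Statement to run on the subscribing (slave) server."""
    escaped = conninfo.replace("'", "''")       # escape quotes inside the literal
    return (f"CREATE SUBSCRIPTION {_check(name)} "
            f"CONNECTION '{escaped}' PUBLICATION {_check(publication)};")
```

Listing only some tables in the publication is what makes the replication selective.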

Open-Source Solutions

Countless open-source solutions exist to help make your PostgreSQL environment highly available. Here's a look at the most popular.

Patroni

Patroni is a cluster management framework that stores cluster state in a distributed key-value store (such as etcd, Consul, or ZooKeeper), allowing it to determine the state of a PostgreSQL cluster at any point in time.

A Patroni cluster consists of multiple PostgreSQL instances, which can run in containers, on servers, or within virtual machines.

For failover to work properly, exactly one node must hold the leader role at any time.

If the current leader fails, Patroni promotes one of the healthy slaves to take its place. Countless configuration options exist.
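
One way to picture how a single leader is agreed on is a lock with a time-to-live (TTL) kept in the key-value store: the leader keeps renewing it, and another node can take it only after it expires. The class below is a simplified in-memory model of that idea, not Patroni's code; `LeaderLock` and its methods are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Optional


class LeaderLock:
    """In-memory stand-in for a leader key with a time-to-live."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self.holder: Optional[str] = None
        self.expires_at = 0.0

    def acquire(self, node: str, now: float) -> bool:
        """Take or renew the lock; fails while another node's lease is live."""
        held_by_other = self.holder not in (None, node)
        if held_by_other and now < self.expires_at:
            return False
        self.holder = node
        self.expires_at = now + self.ttl
        return True
```

A leader that crashes simply stops renewing, so after `ttl` seconds the first healthy node to call `acquire` becomes the new leader. A real store would perform this check-and-set atomically across the network.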

For instance, you can use asynchronous replication and specify a maximum lag for failover, allowing you to stop Patroni from promoting a slave that's lagging by more than your set value.
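
The lag limit can be thought of as a filter applied before any promotion. In the sketch below the parameter name mirrors Patroni's `maximum_lag_on_failover` setting, which is expressed in bytes, but the function itself and the replica names are hypothetical and only illustrate the rule.

```python
from typing import Dict, List


def failover_candidates(replica_lag_bytes: Dict[str, int],
                        maximum_lag_on_failover: int) -> List[str]:
    """Return replicas whose lag is within the limit, least-lagged first."""
    eligible = [name for name, lag in replica_lag_bytes.items()
                if lag <= maximum_lag_on_failover]
    return sorted(eligible, key=lambda name: replica_lag_bytes[name])
```

With lags of 4096, 5000000, and 0 bytes and a limit of 1048576, only the first and third replicas remain eligible. If the list comes back empty, no replica is promoted, which trades availability for a bound on data loss.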

Some of the other features and benefits of Patroni include continuous monitoring, manual or scheduled switchover, built-in automation, infrastructure for transparent failover, and integrations with Linux Watchdog to avoid split-brain scenarios.

PgPool-II

As an open-source high availability solution, PgPool-II has vastly improved in recent years.

This connection pooler has a long history but has gained a greater market share thanks to its implementation of Watchdog in version 3.2, making it a much more sought-after HA solution.

In version 3.5, Watchdog was enhanced further to ensure that a quorum is always present, avoiding the issues associated with "split-brain" scenarios.
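
A quorum check is what prevents two halves of a partitioned cluster from both acting as the primary. The function below shows the general majority rule only; it is not PgPool-II's implementation, and `has_quorum` is a name introduced here for illustration.

```python
def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """True when a strict majority of cluster nodes is reachable."""
    if total_nodes <= 0 or not 0 <= reachable_nodes <= total_nodes:
        raise ValueError("invalid node counts")
    return reachable_nodes > total_nodes // 2
```

In a three-node cluster split two against one, only the side with two nodes has a quorum, so only that side may carry out a failover. This is also why an odd number of nodes is usually preferred: a four-node cluster split evenly leaves neither side with a majority.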

By establishing connections with your PostgreSQL servers and reusing them for each new client connection, PgPool-II reduces connection overhead, which can improve the overall throughput of your systems.
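
The idea behind connection pooling can be sketched with a small fixed-size pool. This is a generic illustration, not how PgPool-II is built: `ConnectionPool` and the `connect` callable are hypothetical, and in practice `connect` would be a database driver's connection function.

```python
import queue
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


class ConnectionPool(Generic[T]):
    """Opens a fixed number of connections once and hands them out repeatedly."""

    def __init__(self, connect: Callable[[], T], size: int):
        self._idle: "queue.Queue[T]" = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())           # pay the connection cost up front

    def acquire(self, timeout: float = 5.0) -> T:
        """Borrow an idle connection; raises queue.Empty if none frees up in time."""
        return self._idle.get(timeout=timeout)

    def release(self, conn: T) -> None:
        """Return a borrowed connection so another client can reuse it."""
        self._idle.put(conn)
```

However many clients come and go, the server only ever sees `size` connections being opened, which is where the reduced overhead comes from.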

It can also distribute read queries over multiple servers, allowing you to take advantage of the benefits of load balancing.
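
A very rough picture of read distribution is a router that rotates `SELECT` statements across replicas and sends everything else to the primary. The sketch below is an assumption-laden simplification: `ReadWriteRouter` and the server names are hypothetical, and the prefix test ignores cases a real proxy must handle, such as `SELECT ... FOR UPDATE` or functions that write.

```python
from itertools import cycle
from typing import List


class ReadWriteRouter:
    """Sends SELECT statements to replicas in rotation, everything else to the primary."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        # With no replicas available, reads fall back to the primary.
        self._readers = cycle(replicas or [primary])

    def route(self, sql: str) -> str:
        """Return the name of the server that should run this statement."""
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary
```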

PostgreSQL Automatic Failover (PAF)

One of the most popular high availability management solutions is PostgreSQL Automatic Failover (PAF) by ClusterLabs.

When PostgreSQL synchronous replication is in use, no committed data is lost when failover occurs.

The tool utilizes Pacemaker and Corosync, both of which are industry-standard applications that help detect failures in a PostgreSQL database and respond promptly.

Integrating Pacemaker makes PAF capable of managing a variety of resources using resource agents.

Those agents are responsible for managing specific resources, determining behavior, and informing Pacemaker of updates.

Implementing resource agents requires compliance with the Open Cluster Framework (OCF) specification.

OCF defines actions such as start, stop, promote, and demote, along with how agents interact with Pacemaker, and so guides the behavior of resource agents.
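
To make the agent contract more concrete, here is a toy model of how an agent maps actions to exit codes. The numeric codes are the ones OCF defines, but `ToyAgent` is entirely hypothetical: it tracks state in memory instead of managing a PostgreSQL instance, and real agents (PAF included) are far more involved.

```python
# Exit codes defined for OCF resource agents.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7
OCF_RUNNING_PROMOTED = 8    # older releases call this OCF_RUNNING_MASTER


class ToyAgent:
    """Hypothetical agent that tracks an instance's state in memory."""

    def __init__(self) -> None:
        self.state = "stopped"      # "stopped", "standby", or "primary"

    def handle(self, action: str) -> int:
        """Perform one action and return the exit code reported to the cluster manager."""
        if action == "start":
            if self.state == "stopped":
                self.state = "standby"
            return OCF_SUCCESS
        if action == "stop":
            self.state = "stopped"
            return OCF_SUCCESS
        if action == "monitor":
            if self.state == "stopped":
                return OCF_NOT_RUNNING
            return OCF_RUNNING_PROMOTED if self.state == "primary" else OCF_SUCCESS
        if action in ("promote", "demote"):
            if self.state == "stopped":
                return OCF_ERR_GENERIC      # cannot change the role of a stopped instance
            self.state = "primary" if action == "promote" else "standby"
            return OCF_SUCCESS
        return OCF_ERR_UNIMPLEMENTED
```

The `monitor` result is what lets the cluster manager tell a stopped instance, a running standby, and a running primary apart.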

After building your database cluster, you can use PAF to expose the current status of an instance to Pacemaker with information on each node.

Achieving High Availability in PostgreSQL

High availability isn't just an appealing concept; it's a crucial part of efficiently using PostgreSQL.

Failing to achieve high availability can cost your business big time, but finding the path to high availability is often easier said than done.

Fortunately, with the right experts on your side, an optimized and dependable PostgreSQL database isn't too far out of reach.

Adservio works with PostgreSQL users every day to help them achieve high availability with a seamless and straightforward approach.

Our PostgreSQL experts take a daunting task and make it look easy, offering fast and proven methodologies to help you achieve high availability in less time.

Interested in learning more? Contact Adservio today.

Published on

October 7, 2021