It is not a secret that I’m all in favor of using pgbackrest. I think that it is today the best backup tool for PostgreSQL. It does not mean others are bad, it means they are less good. If you feel the need to have a way to restore your data, I guess you’d rather use the best tool to achieve that goal.
What’s the problem with
pg_basebackup is a tool that should be renamed
because that’s exactly what it does. That’s why I’m in favor of using
pg_basebackup to create (or reinstate) a standby.
A proper backup tool should care about how to restore your data. Having a backup is just a convenient way to perform a restore, but that’s just a step in the journey, not a destination.
pg_basebackup does not care about restoring, it is missing important
- restoring (obviously)
- archiving WALs
- maintaining a backup catalog
- managing backups and WALs retention policy
- testing the backup files as not been corrupted (before Postgres 13 and this is not automatic)
I’m sure I could find others, but those should be enough not to consider using
pg_basebackup as a backup tool.
Barman can help
Barman was created to try to add the features missing in
the missing catalog, the missing WAL archiving problem, and the missing restore
Barman also has a mode
rsync where it does not rely on
I still deeply think that when there’s a design flaw, like
has, re-creating something new from a blank page is the way to go to fix the
The only way to find out your
backup is corrupted is to use the
verify command which will call
pg_verifybackup(only available from Postgres 13) which will fail should you
rsync method without launching the
For a long time, this was not even possible!
Also, I think at one point, we need to discuss
Barman documentation. Back in
2017, I tried to use
Barman to back up a Postgres cluster. I wanted to do it by
myself without any help, just by reading the documentation. I found out that the
documentation was not good, with contradictions in different paragraphs and, in
the end, I was not able to back up (or restore) a standalone Postgres cluster.
Barman is good, but not good enough.
So, why is
pgBackRest my recommended tool? As said,
pgBackRest will try to
save you time. Back in 2017, when I was experimenting with backup tools, I was able
to follow the documentation and create a backup of my test standalone Postgres
cluster and restore it very easily, following the tutorial.
Since then, I created a practical training for my students at Université Lyon I: without
previous knowledge of Postgres or
pgBackRest, they are able in 4 hours to
backup, restore, PITR, and even dig in the WALs to find the timestamp or xID of
a mistake to PITR just before the mistake. It’s that good that when I suggested
pgBackRest to create a replication, I couldn’t find out if they had
stupidly used the tool without understanding anything or if they had completely
understood how setting a replication works (I added questions for the
replication training, don’t worry)!
The feature I love with
pgBackRest is the “delta” feature which will
copy over only the files which changed since the backup. This feature is very
useful for huge clusters. You can first issue a messy rsync command (or restore
a storage snapshot) and then restore with the delta option.
Another great feature of
pgBackRest is the ability to restore just one
database on a dummy cluster so that you can then choose which data you want to
copy over to another cluster.
Of course, those features are only the cherry on the top.
created with secure restore in mind, so that the first feature of the first
pgBackRest was the creation of a manifest file and check-summing of all
The only case when I don’t recommend
pgBackRest is when you’re running Postgres
under Windows (which I don’t recommend either), because there is no Windows
pgBackRest for Windows. I still consider having a great
backup/restore tool for Windows is a must for the community and we’re trying to
work on it.
Other tools are barely used in the community. You might want to favor a tool that is used by the majority of people running Postgres so that you reduce the risk to find a bug, but this is your choice.
Bonus question: is
pg_dump a backup tool
It depends. If you can allow losing all transactions between the beginning of your
dump and the time you need to import your dump, then
pg_dump can be considered
as a backup tool. I’d rather use the term “export” than “backup” so that people
don’t have a false sense of security.
On top of that, you need to consider the time to re-import all those data which can be very long as importing is slower than simply copying files to the file system. This operation needs to be performed frequently to make sure the downtime is affordable.
Most of the time, you won’t be fine with losing that much data or having such
a downtime to re-import your dump, but if you fully understand the consequences
and risks and are ok with them, please use
pg_dump as a backup tool.
pgBackRest and relax
This is my TL; TD for this post: just use
pgBackRest and you’ll be fine.
Remember also that you should never trust a tool and always try to complete
a restore to make sure your backup is usable. I guess you don’t want to discover
what “Backup of Shrödinger means”!