This talk covers various advanced topics in the area of backups:
- incremental backups;
- archive management;
- backup validation;
- retention policies;
etc.
Based on these features, we'll compare various backup/recovery solutions for PostgreSQL.
This information will help you to choose the most appropriate tool for your system.
2. Agenda
- Why backup?
- What is a good backup tool?
- Overview of advanced backup features
- Overview of PostgreSQL backup tools
Spoiler: this talk doesn’t contain any benchmarks.
3. Why do you need a backup?
- To restore the database after an accident
  - hardware failure
  - software bug
  - human error
- To set up a new replica
- To create a test environment
- To inspect data from the past
4. What are the options?
- replica is not a backup
- dump a.k.a. “logical backup”
- storage snapshots
- pg_basebackup
- set of custom scripts
- PostgreSQL specific backup tools
5. What makes a good backup tool?
- Convenience
  - out-of-the-box automation of routine tasks
  - documentation & support
  - convenient and stable API
- Performance
  - parallel execution
  - compression
  - incremental & differential backups
  - WAL prefetch
6. What backup tools exist?
- Barman
- pgBackRest
- pg_probackup
- WAL-G
- BART
  - part of the “EDB Advanced Server”
  - requires pg_basebackup
7. Who is who? Barman
- https://www.pgbarman.org/
- 2ndQuadrant
- GPL v 3.0
- Python
- first release: 2011
- Two methods: basebackup & rsync
Notable features:
Synchronous streaming for “zero data loss”.
8. Who is who? pgBackRest
- https://pgbackrest.org/
- Crunchy Data
- MIT License
- C
- first release: 2014
Notable features:
Performance optimizations for large backups.
9. Who is who? pg_probackup
- https://github.com/postgrespro/pg_probackup
- Postgres Professional
- PostgreSQL License
- C
- first release: 2017 (based on pg_arman)
Notable features:
Page-level incremental backups and built-in validation.
10. Who is who? WAL-G
- https://github.com/wal-g/wal-g
- introduced by Citus Data, now maintained by the Yandex Cloud team
- Apache License, Version 2.0
- Go
- first release: 2017 (“based on” WAL-E)
Notable features:
Out-of-box support for various cloud storages.
13. Documentation
- Barman: user guide & command reference; great overview of backup architectures
- pgBackRest: user guide & command reference
- pg_probackup: user guide & command reference
- WAL-G: README
14. Installation
- Barman: Linux packages, build from source
- pgBackRest: Linux packages, build from source
- pg_probackup: Linux packages, build from source, Windows installer
- WAL-G: Linux binary, build from source
22. Streaming backups
- Recovery Point Objective (RPO): "maximum targeted period in which data might be lost from an IT service due to a major incident"
- “RPO = 0” (zero data loss) can be achieved by synchronous WAL streaming
- replication slot (PostgreSQL feature): prevents the removal of WAL that has not yet been received
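To illustrate the replication slot point, here is a toy model of WAL retention. It is a sketch under simplifying assumptions, not PostgreSQL's implementation: the class `ToyWal` and its methods are hypothetical, and WAL is tracked as whole segment numbers, whereas a real slot tracks a position (LSN).

```python
class ToyWal:
    """Toy model of WAL segments kept on the database server."""

    def __init__(self):
        self.segments = []          # segment numbers still on disk
        self.next_segment = 1
        self.slot_confirmed = None  # None means no replication slot exists

    def create_slot(self):
        # The slot starts with nothing confirmed by the receiver.
        self.slot_confirmed = 0

    def write_segment(self):
        self.segments.append(self.next_segment)
        self.next_segment += 1

    def confirm_received(self, segment):
        # The backup host reports the last segment it has safely stored.
        self.slot_confirmed = segment

    def recycle(self, keep_last=1):
        """Drop old segments, except those a slot still needs."""
        cut = max(len(self.segments) - keep_last, 0)
        removable = self.segments[:cut]
        if self.slot_confirmed is not None:
            removable = [s for s in removable if s <= self.slot_confirmed]
        self.segments = [s for s in self.segments if s not in removable]


# Without a slot: old WAL is removed even if nobody has received it.
no_slot = ToyWal()
for _ in range(5):
    no_slot.write_segment()
no_slot.recycle()

# With a slot: only segments confirmed by the receiver can be removed.
with_slot = ToyWal()
with_slot.create_slot()
for _ in range(5):
    with_slot.write_segment()
with_slot.confirm_received(2)
with_slot.recycle()

print(no_slot.segments)
print(with_slot.segments)
```

The flip side, visible in the model, is that a receiver that stops confirming makes WAL pile up on the server.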
24. 4. Incremental backups
A full backup includes all data files.
A differential backup contains changes since the last full backup.
An incremental backup contains changes since the last backup of any type.
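The three definitions determine which backups are needed for a restore. A minimal sketch, assuming a backup history ordered from oldest to newest; the `Backup` type, the labels and `restore_chain` are hypothetical and not taken from any of the tools discussed:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backup:
    id: str    # hypothetical label, e.g. "F1"
    kind: str  # "full", "diff" or "incr"


def restore_chain(history: List[Backup], target_id: str) -> List[str]:
    """Return ids of the backups needed to restore target_id, oldest first.

    history must be ordered from oldest to newest.
    """
    pos = next(i for i, b in enumerate(history) if b.id == target_id)
    chain = []
    while True:
        backup = history[pos]
        chain.append(backup.id)
        if backup.kind == "full":
            break
        pos -= 1
        if backup.kind == "diff":
            # A differential depends only on the last full backup,
            # so skip everything in between.
            while pos >= 0 and history[pos].kind != "full":
                pos -= 1
        # An incremental depends on the immediately preceding backup.
        if pos < 0:
            raise ValueError("no full backup before " + target_id)
    return chain[::-1]


history = [
    Backup("F1", "full"),
    Backup("I1", "incr"),
    Backup("I2", "incr"),
    Backup("D1", "diff"),
    Backup("I3", "incr"),
]
chain_i2 = restore_chain(history, "I2")
chain_i3 = restore_chain(history, "I3")
print(chain_i2)  # ['F1', 'I1', 'I2']
print(chain_i3)  # ['F1', 'D1', 'I3']
```

Incremental chains grow with every backup, while a differential keeps the chain short at the cost of a larger backup.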
25. Incremental backup methods
- DELTA - read everything, back up what changed
  - independent method (needs neither a WAL archive nor a core patch)
  - read load on data server
- PAGE - scan WAL to determine changed blocks
  - requires WAL archive
  - minimal load on data server
- PTRACK - remember changed blocks in a map
  - requires core patch
  - minimal load during backup
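The difference in read load between DELTA and PTRACK can be shown with a toy model. This is a sketch only: `ToyDataFile` and both backup functions are hypothetical, and change detection in the DELTA case is done here by comparing page digests, which is an assumption for illustration and not necessarily how any real tool decides. The PAGE method is left out because it needs a WAL parser.

```python
import hashlib

PAGE_SIZE = 8192  # PostgreSQL's default page size


class ToyDataFile:
    """A data file as a list of pages, with a PTRACK-like change map."""

    def __init__(self, n_pages):
        self.pages = [bytes(PAGE_SIZE) for _ in range(n_pages)]
        self.changed = set()  # page numbers written since the last backup
        self.reads = 0        # page reads caused by backups

    def write(self, page_no, data):
        self.pages[page_no] = data.ljust(PAGE_SIZE, b"\0")
        self.changed.add(page_no)  # the "core patch" part: track on write

    def read(self, page_no):
        self.reads += 1
        return self.pages[page_no]


def page_digest(page):
    return hashlib.sha256(page).hexdigest()


def full_backup(f):
    """Read every page; return the digests as a reference for DELTA."""
    return {n: page_digest(f.read(n)) for n in range(len(f.pages))}


def delta_backup(f, previous_digests):
    """DELTA: read everything, keep only pages that differ."""
    result = {}
    for n in range(len(f.pages)):
        page = f.read(n)
        if page_digest(page) != previous_digests.get(n):
            result[n] = page
    return result


def ptrack_backup(f):
    """PTRACK: read only the pages recorded in the change map."""
    result = {n: f.read(n) for n in sorted(f.changed)}
    f.changed.clear()
    return result


f = ToyDataFile(100)
digests = full_backup(f)
f.changed.clear()

f.write(3, b"new row")
f.write(42, b"another row")

f.reads = 0
delta = delta_backup(f, digests)
delta_reads = f.reads

f.reads = 0
ptrack = ptrack_backup(f)
ptrack_reads = f.reads

print(sorted(delta), delta_reads)
print(sorted(ptrack), ptrack_reads)
```

Both methods find the same two pages, but DELTA reads all 100 pages to do so while the change map needs only 2 reads.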