2. What’s Streaming Replication?
• Postgres’ built-in replication
– Now under development in the community
– Will be available in v8.5
• Why replication?
– Failover
– Load Balancing
[Figure: a client sends queries to both master and slave; the master streams changes to the slave]
3. Streaming Replication vs Hot Standby
• Streaming Replication (SR)
– Capability to send changes to the slave, to keep it current
• Hot Standby (HS)
– Capability to run queries on the slave
[Figure: Streaming Replication ships changes from master to slave; Hot Standby lets the client run queries on the slave]
4. History
• Historical policy
– Avoid putting replication into core Postgres
– No "one size fits all" replication solution exists
• Replication war!
rubyrep, PL/Proxy, warm-standby, DBmirror, PG on DRBD, Slony-I,
Cybercluster, syncreplicator, PGCluster-II, Londiste, Postgres-R,
Mammoth, Bucardo, pgpool, Sequoia, RepDB, twin, pgpool-II, PyReplica,
GridSQL, PostgresForest, PG on Shared Disk, PGCluster
5. Road to core
• Growing frustration
– Too complex to install and use for simple cases
– Especially compared with other DBMSs
• Proposal of built-in replication
– by NTT OSSC @ PGCon 2008 Ottawa
• Core team statement
– It is time to include a simple, reliable basic
replication feature in the core system
– NOT replace the existing projects
6. Master - Slaves
• One master and one or more slaves
– Only master accepts write queries
– One-way replication: master → slaves
[Figure: clients send write queries to the master only; the master streams changes to the slaves]
7. Log-shipping
• WAL is shipped to convey data changes
– The slave is kept current by replaying the shipped WAL
– WAL records in a partially-filled WAL file can be shipped
⇔ Not possible in warm-standby, which ships only completed WAL files
– H/W architecture and major Postgres release level must be the same on master and slave
[Figure: a client sends write queries to the master; the master ships WAL to the slave, which replays it via recovery into its database]
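As a rough sketch of the slave-side setup under the proposed design (the recovery.conf parameter names here are assumptions borrowed from the warm-standby convention; the final v8.5 interface may differ):

```shell
# Hypothetical slave-side configuration for streaming replication.
# Parameter names are assumptions; the interface is still under discussion.
cat > "$PGDATA/recovery.conf" <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=master.example.com port=5432 user=postgres'
EOF
pg_ctl -D "$PGDATA" start   # the slave connects to the master and replays WAL
```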
8. Log-shipping
• No migration required
– Provides the same interface as stand-alone Postgres
⇔ In Slony-I, tables must have a primary key
⇔ In pgpool, some queries (e.g., on sequences) can return
different values on different servers
– No need to change existing application code for replication
[Figure: a client connects to master or slave exactly as it would to a stand-alone server]
9. Per database cluster granularity
• Replicates all database objects
– No need to specify which objects are replicated
– Does NOT replicate the server log, statistics information, etc.
⇔ Per-table granularity in Slony-I
10. Shared nothing architecture
• WAL is shipped via network
– No single point of failure
– No special H/W like shared disk required
11. Management
• Failover
– The slave can be brought up at any time
– Automatic failover is not supported
⇒ Use clusterware like Heartbeat for that
• Split
– The master can revert to stand-alone operation at any time
[Figure: failover promotes the slave to serve clients; split lets the master run stand-alone]
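A manual failover might look like the following sketch, assuming a trigger-file mechanism along the lines discussed for the proposal (the parameter and file name are hypothetical):

```shell
# On the slave, recovery.conf would name a trigger file (hypothetical):
#   trigger_file = '/tmp/postgresql.trigger'
# Creating that file tells the slave to finish recovery and come up as a
# stand-alone master. Clusterware like Heartbeat would automate this step.
touch /tmp/postgresql.trigger
```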
12. Management
• Online-Resync
– A new slave can be added to the replication setup at any
time, without downtime
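Adding a slave online could be sketched with the existing online base-backup functions (host names and paths are placeholders; the exact v8.5 procedure is still being worked out):

```shell
# Take a base backup of the master without stopping it, copy it to the
# new slave, then start the slave in recovery so it catches up via WAL.
psql -h master -c "SELECT pg_start_backup('add-slave')"
rsync -a --exclude=pg_xlog master:/var/lib/pgsql/data/ newslave:/var/lib/pgsql/data/
psql -h master -c "SELECT pg_stop_backup()"
# then create recovery.conf on the new slave and start it with pg_ctl
```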
14. Synchronization mode
• async
– Asynchronous replication
– Possible risk of data loss
– No master overhead
• write, fsync
– Semi-synchronous
– No data loss
• apply
– Synchronous
– READ COMMITTED on slave is guaranteed
– Large master overhead
15. Synchronization mode
• The mode can be specified per transaction
– Mission-critical transactions, like banking, would
require write, fsync, or apply mode
– async would be sufficient for transactions like
web crawling
– Needs durability: SET mode TO fsync
– Needs speed: SET mode TO async
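Per-transaction control might look like this, using the mode parameter name from the slide (both the name and the SET LOCAL usage are assumptions; the final GUC may differ):

```shell
psql <<'EOF'
BEGIN;
SET LOCAL mode TO fsync;  -- banking-style: wait until the slave has fsynced
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;

BEGIN;
SET LOCAL mode TO async;  -- crawler-style: commit without waiting for the slave
INSERT INTO pages (url) VALUES ('http://example.com/');
COMMIT;
EOF
```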
16. Built-in
• Easy to install and use
– Just install postgres
– Replication connections can be treated the same as normal
backend connections: pg_hba.conf, keepalive, SSL, ...
• Highly active community
– Bugs are fixed quickly
– Active development and collaboration
– Many users
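For example, a pg_hba.conf entry might admit replication connections like any other connection (the replication keyword here is an assumption about the final interface):

```shell
cat >> "$PGDATA/pg_hba.conf" <<'EOF'
# TYPE  DATABASE     USER      CIDR-ADDRESS     METHOD
host    replication  postgres  192.168.0.0/24   md5
EOF
```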
17. Road to v8.5
• Needs your help
– Comment about interface, design and architecture
– Test and review the code
– Correct my English in the docs
• Project site
– http://wiki.postgresql.org/wiki/Streaming_Replication
• Where to comment and report bugs
– pgsql-hackers@postgresql.org
– pgsql-jp@ml.postgresql.jp