PostgreSQL 9.0 HA
Julien Pivotto
April 1, 2012 @ Loadays
;
Overview
PostgreSQL 9.0
Clustering
Backups
Monitoring
Automation
The end
The mission
Before the migration
Table of contents
1 Overview
The mission
Before the migration
2 PostgreSQL 9.0
Intro
Streaming replication
Master configuration
Slave configuration
PostgreSQL specific tricks
Setting up a slave
3 Clustering
Set up of corosync
OCF resource
4 Backups
Cron jobs
BackupPC
5 Monitoring
Nagios
Munin
6 Automation
Puppet module
The node file
#TODO
7 The end
Julien Pivotto PostgreSQL 9.0 HA
;
Who am I
• Julien Pivotto
• Consultant at Inuits since May 2011
• FOSS defender since 2005
;
A.R.S.I.A.
• Association Régionale de Santé et d’Identification Animales
• 30 Linux servers in several locations
• A lot of Open Source
  • CentOS, Samba, Open-Xchange, MailScanner, Cyrus, Puppet, Jenkins, Foreman, OpenVPN, GLPI, RabbitMQ, BackupPC, CUPS, Icinga, Trac, Zope, Plone, Solr, Pentaho, Funambol, Munin, Squid, Asterisk, and PostgreSQL, ...
;
C.E.R.I.S.E
• A web application
  • Plone (Python)
  • 15k+ visits, 500k+ pages and 2,000,000+ hits each month
  • Developed by Affinitic
• Several databases
  • PostgreSQL 9.0
  • Oracle database
• Several servers/services
  • Two reverse proxies in failover HA
  • Two application servers in load-balancing HA
  • Two PostgreSQL servers in failover HA
  • An Oracle database server
  • A development server
  • A Pentaho server
• Being integrated in Jenkins (to be continued...)
;
PostgreSQL before the migration
• PostgreSQL 8.3.7
• No native support of HA
• High availability with heartbeat 2 and DRBD
• Installed on the application servers
• Nothing automated
• Failover: Passive node is not even read only
• Installed in November 2008
;
Monitoring before the installation
• Icinga
  • Check of the DRBD
  • Simple connection check to PostgreSQL
• Graphing with Cacti
  • Size of the databases
  • Connections to the database
  • Checkpoints
;
Backups before the installation
• Backups were done every hour on the same machine
• External backups once a day on disk and on tape
• Backups are made with the pg_dump command
• BackupPC fetches those files
;
PostgreSQL 9.0
• PostgreSQL 9.0 was released in September 2010
• It brings native replication to PostgreSQL
• It does not include any native failover tool
• So we need to use PostgreSQL + Corosync
• The setup of PostgreSQL is managed by Puppet
;
Write-Ahead Logging
• It means that every change to a data file must first be written
to a log file
• Fewer disk writes: only the log file needs to be flushed to disk to
guarantee that a transaction is committed, rather than every
data file changed by the transaction
;
What is streaming replication
• Streaming replication provides the capability to ship WAL records
(xlogs) to standby servers and apply them there
• It’s possible to have multiple standby servers
• Standby servers can accept read-only queries ("Hot standby")
;
Disadvantages/Specifications of streaming replication
• Streaming replication supports only asynchronous log-shipping
  • But when the database is in use, the delay is close to that of
    synchronous log-shipping
• Adding a standby server requires manual action
  • But in our case we will only have one standby server
• PostgreSQL does not provide an HA feature
  • But Corosync does
• Replication is single-threaded
;
Master configuration
The master only needs one configuration file.
Configuration not related to streaming replication (SR)
# PostgreSQL configuration
# http://www.postgresql.org/docs/9.0/interactive/index.html
listen_addresses = '*'
max_connections = 200
shared_buffers = 4096MB
work_mem = 4096MB
effective_cache_size = 10024MB
commit_delay = 100000
effective_cache_size = 2560    # set twice in this file; the last occurrence wins
log_destination = 'stderr'
log_directory = 'pg_log'
logging_collector = on
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_min_messages = notice
log_min_duration_statement = 1000
log_line_prefix = '%t %u '
log_statement = 'none'
datestyle = 'iso, dmy'
;
Master configuration
Configuration related to SR
wal_level = hot_standby
max_wal_senders = 2
wal_keep_segments = 128
;
Master configuration
• wal_level = hot_standby
  Allows the standby server to accept read-only queries
• max_wal_senders = 2
  We allow up to 2 standby nodes
• wal_keep_segments = 128
  The minimum number of WAL segments to keep
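As a rough sizing aid for wal_keep_segments, the sketch below computes how much disk space the retained segments take. It assumes the default WAL segment size of 16 MB (the size is chosen at build time, so check your own build); the function name wal_retained_mb is only illustrative.

```shell
#!/bin/bash
# Disk space (in MB) held back for standbys by wal_keep_segments.
# $1: value of wal_keep_segments
# $2: WAL segment size in MB (assumption: 16, the PostgreSQL default)
wal_retained_mb() {
    local segments=$1
    local segment_mb=${2:-16}
    echo $(( segments * segment_mb ))
}

# With the value used on the master above:
wal_retained_mb 128
```

With 128 segments this is about 2 GB, which is also roughly how far a standby may fall behind before it needs a fresh base backup.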
;
Slave configuration
• The slave requires at least two configuration files
  • A postgresql.conf file
  • A recovery.conf file, used to apply the WAL records (xlogs) shipped by
    the master
• A trigger file to stop replication can be specified
postgresql.conf - Configuration related to SR
wal_level = hot_standby
hot_standby = on
Note that this file also starts with the same general settings as
the master configuration.
;
Slave configuration
recovery.conf
standby_mode = 'on'
primary_conninfo = 'host=192.168.177.2 user=replicuser'
• standby_mode means that this is a standby server
• primary_conninfo is the connection to the master
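The slave configuration mentions that a trigger file can be specified. A minimal sketch of generating a recovery.conf that includes one is given below; the function name, its arguments and the trigger path are assumptions made for illustration, while standby_mode, primary_conninfo and trigger_file are recovery.conf parameters in PostgreSQL 9.0.

```shell
#!/bin/bash
# Write a recovery.conf for a 9.0 standby.
# $1: target file, $2: master host, $3: replication user,
# $4: trigger file path (hypothetical; pick any path the postgres user can read)
write_recovery_conf() {
    local target=$1 master_host=$2 repl_user=$3 trigger=$4
    cat > "$target" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=${master_host} user=${repl_user}'
trigger_file = '${trigger}'
EOF
}
```

It could be called as write_recovery_conf "$PGDATA/recovery.conf" 192.168.177.2 replicuser /tmp/pgsql.trigger. Creating the trigger file on the standby then ends recovery and lets the standby accept writes.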
;
Replication user
• A superuser for replication (called replicuser here) has to be created
• The SQL command to create it is
CREATE USER replicuser SUPERUSER LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD 'foobar';
;
pg_hba.conf
• pg_hba.conf is the file that contains the access rules (a kind of ACL)
for the PostgreSQL connections
• In that file we add both nodes as "trusted", and the
replication user as trusted too
pg_hba.conf
hostnossl all all 10.0.10.8/32 trust
hostnossl all all 10.0.10.9/32 trust
hostnossl replication replicuser 192.168.177.2/24 trust
;
Setting up a slave
• You have to run a few commands on the master when
you add a new standby server
Adding a standby server
psql -c "SELECT pg_start_backup('label', true)"
rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid --exclude '*-master' --exclude '*-slave'
psql -c "SELECT pg_stop_backup()"
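The three commands above can be wrapped so that the master always leaves backup mode, even when the copy fails. This is only a sketch under assumptions: the function name sync_standby and its optional target argument are made up for illustration, and psql is expected to connect as a superuser without further options.

```shell
#!/bin/bash
# Copy the master's data directory to a standby, always closing backup mode.
# $1: rsync target (defaults to the path used above)
sync_standby() {
    local target=${1:-standby:/srv/pgsql/standby/}
    local status=0

    # Refuse to run without PGDATA, otherwise rsync would copy from "/".
    [ -n "$PGDATA" ] || { echo "PGDATA is not set" >&2; return 1; }

    psql -c "SELECT pg_start_backup('label', true)" || return 1

    rsync -a "${PGDATA}/" "$target" \
        --exclude postmaster.pid --exclude '*-master' --exclude '*-slave' || status=$?

    # Leave backup mode even if rsync failed, then report the first failure.
    psql -c "SELECT pg_stop_backup()" || status=$?
    return "$status"
}
```

The point of the design is that a failed rsync does not leave the master stuck in backup mode; the caller still gets a non-zero exit status and can retry.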
;
Corosync configuration
• The goal of Corosync is to switch the master and slave roles
when needed
• It will ensure that a master is online and connected to the
router
• The two servers are connected to each other on eth1
• Corosync is installed by Puppet
• We take it from the Clusterlabs repositories
• We use a custom master/slave OCF resource to manage
the PostgreSQL M/S
;
crm.conf
The main configuration file of corosync is
/etc/corosync/crm.conf. It contains all the
resources, nodes, etc.
Defining the nodes
node babar.interne.arsia.be \
    attributes standby="off"
node dumbo.interne.arsia.be \
    attributes standby="off"
In this code, the two nodes are defined, and we tell corosync that
neither of them is in standby, so both are active at launch.
;
crm.conf
Defining the primitives
primitive pgsql ocf:inuits:pgsql-ms
primitive virt_ip ocf:heartbeat:IPaddr2 \
    params nic="eth0" iflabel="0" ip="10.0.10.10" cidr_netmask="24" broadcast="10.0.10.255" \
    meta target-role="Started" is-managed="true"
primitive ping ocf:pacemaker:ping \
    params host_list="10.0.10.1" \
    op monitor interval="10s" timeout="10s" \
    op start interval="0" timeout="45s" \
    op stop interval="0" timeout="50s"
• We define 3 primitives:
  • pgsql, the PostgreSQL primitive
  • virt_ip, the floating IP address
  • ping, the primitive that will check that the servers are
    connected to the router
;
crm.conf
Configuring the primitives
ms pgsql-ms pgsql \
    params pgsqlconfig="/var/lib/pgsql/data/postgresql.conf" \
    lsb_script="/etc/init.d/postgresql-9.0" \
    pgsqlrecovery="/var/lib/pgsql/data/recovery.conf" \
    meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="false"
clone clone-ping ping \
    meta globally-unique="false"
• We configure the PostgreSQL M/S: the init script, the
configuration files. . .
• We also configure the ping resource as a clone (it will be
launched on both servers)
;
crm.conf
Defining the constraints and the cluster properties
group PSQL virt_ip
location connected PSQL \
    rule $id="connected-rule" -inf: not_defined pingd or pingd lte 0
colocation ip_psql inf: PSQL pgsql-ms:Master
property $id="cib-bootstrap-options" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore" \
    default-resource-stickiness="INFINITY"
rsc_defaults $id="rsc_defaults-options" \
    migration-threshold="INFINITY" \
    failure-timeout="10" \
    resource-stickiness="INFINITY"
• These lines will ensure that the master is always on the same
node as the floating IP address
• And also that the master is connected to the router
;
OCF resource
• There is a custom OCF resource to manage the master/slave
PostgreSQL
• It is based on an example resource written by Andrew
Beekhof from Clusterlabs
• The file has to be in
/usr/lib/ocf/resource.d/inuits/pgsql-ms
;
OCF resource
• The script does the following:
  • It moves postgresql.conf-master to
    postgresql.conf when a node is promoted to master
  • It moves postgresql.conf-slave to postgresql.conf
    when a node is demoted to slave
  • It ensures that recovery.conf-slave is installed as recovery.conf
    on the slave, and that recovery.conf is absent on the master
  • It starts/restarts PostgreSQL when needed
• I will post that file on GitHub soon
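The file handling described above can be sketched as follows. This is not the author's pgsql-ms agent, only an illustration of the promote/demote file switch; the function name switch_role is hypothetical, and it copies the templates rather than moving them, on the assumption that the *-master and *-slave files should stay available for the next role change.

```shell
#!/bin/bash
# Put the configuration files of the requested role in place.
# $1: role ("master" or "slave"), $2: PostgreSQL data directory
switch_role() {
    local role=$1 datadir=$2
    case "$role" in
        master)
            cp -f "$datadir/postgresql.conf-master" "$datadir/postgresql.conf" || return 1
            # A master must not have a recovery.conf, or it would start as a standby.
            rm -f "$datadir/recovery.conf"
            ;;
        slave)
            cp -f "$datadir/postgresql.conf-slave" "$datadir/postgresql.conf" || return 1
            cp -f "$datadir/recovery.conf-slave" "$datadir/recovery.conf" || return 1
            ;;
        *)
            echo "usage: switch_role master|slave DATADIR" >&2
            return 2
            ;;
    esac
}
```

A real OCF agent would call something like this from its promote and demote actions and then start or restart PostgreSQL through the init script.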
;
Backups of the databases
• Sometimes, you need backups (especially when you don’t have
backups. . . )
• We do a backup per hour on each node (one node at minute 0, the
other at minute 30)
• We do a backup per day on each node
• We do a backup per day just before the BackupPC backup, on each
node
• We keep 24 hourly backups and 7 daily backups on disk
• With BackupPC we keep months of backups
;
Hourly backup script
/usr/local/bin/backup_hourly.sh
#!/bin/bash
# The slot is the hour of the day (00-23), so each file is overwritten after 24 hours.
DATE=$(date +%H)
BACKUP_PATH=/var/lib/backups/hourly
for db in foobar_db foobar2_db
do
    /usr/bin/pg_dump "$db" | gzip > "$BACKUP_PATH/${db}_$DATE.pgsql.gz"
    ln -fs "$BACKUP_PATH/${db}_$DATE.pgsql.gz" "$BACKUP_PATH/${db}_current.pgsql.gz"
done
The daily script is almost the same.
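One possible hardening of the script above: in a plain pipeline the exit status is the one of gzip, so a failed pg_dump still produces a file and moves the "current" link. The sketch below only updates the link when the dump succeeded. The function name backup_db and the DUMP_CMD override are assumptions added for illustration (DUMP_CMD makes it possible to try the function without a database).

```shell
#!/bin/bash
# Dump one database into a rotating slot; update the "current" link only on success.
# $1: database name, $2: slot (e.g. hour or weekday), $3: backup directory
# DUMP_CMD: hypothetical override of the dump command, defaults to pg_dump
backup_db() {
    local db=$1 slot=$2 dir=$3
    local target="$dir/${db}_${slot}.pgsql.gz"
    local tmp="$target.tmp"

    # pipefail makes the pipeline fail when the dump fails, not only when gzip fails.
    if ( set -o pipefail; ${DUMP_CMD:-/usr/bin/pg_dump} "$db" | gzip > "$tmp" ); then
        mv -f "$tmp" "$target"
        ln -fs "$target" "$dir/${db}_current.pgsql.gz"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

In the hourly script the loop body would become backup_db "$db" "$DATE" "$BACKUP_PATH". Writing to a temporary file first also keeps the previous backup of that slot intact when a dump fails.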
;
BackupPC script
/usr/local/bin/backup_backuppc.sh
#!/bin/bash
# The slot is the day of the week (1-7), so each file is overwritten after 7 days.
DATE=$(date +%u)
BACKUP_PATH=/var/lib/backups/backuppc
for db in cerise trackitquality trackit zodb_cerise
do
    /usr/bin/pg_dump -U postgres "$db" | gzip > "$BACKUP_PATH/${db}_$DATE.pgsql.gz"
    ln -fs "$BACKUP_PATH/${db}_$DATE.pgsql.gz" "$BACKUP_PATH/${db}_current.pgsql.gz"
done
In the BackupPC config, I added the following:
BackupPC config
$Conf{DumpPreUserCmd} = '$sshPath -t -q -x -l backuppc $host /usr/local/bin/backup_backuppc.sh';
;
check_postgres script
• check_postgres.pl is a Nagios-compatible Perl script
• Available on http://www.bucardo.org/check_postgres/
and on GitHub
• What we check with it:
  • The current connections
  • The status of the replication (the delay)
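To give an idea of what a replication delay is in practice, here is a sketch that turns two xlog locations into a byte difference. The locations would come from pg_current_xlog_location() on the master and pg_last_xlog_replay_location() on the standby. The function names are illustrative, and the factor 0xFF000000 bytes per log id is an assumption that applies to releases before 9.3, which includes 9.0.

```shell
#!/bin/bash
# Convert an xlog location ("logid/offset", both hexadecimal) to a byte position.
# Assumption: 0xFF000000 usable bytes per log id (PostgreSQL before 9.3).
xlog_to_bytes() {
    local logid=${1%%/*}
    local offset=${1##*/}
    echo $(( 16#$logid * 16#FF000000 + 16#$offset ))
}

# Delay in bytes between the master location ($1) and the standby location ($2).
xlog_delay_bytes() {
    echo $(( $(xlog_to_bytes "$1") - $(xlog_to_bytes "$2") ))
}
```

For example, xlog_delay_bytes "0/3000060" "0/3000000" reports a standby that is 96 bytes behind.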
;
Check hot_standby latency
• The check_postgres.pl script has a check for the hot_standby
delay
• But it has to be given the master and the slave, and we do not
know in advance which node holds which role
• So, here is a bash script I wrote to find out the M/S order
Master/slave replication check
#!/bin/bash
/usr/lib64/nagios/plugins/check_postgres.pl --db="$1" \
    --action hot_standby_delay -w 300 -c 600 --host=$(
        crm_resource --resource pgsql-ms --locate |
        awk '/Master/ {master=$6} / $/ {slave=$6} END {print master","slave}'
    )
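The awk part of this check can be tried on its own. The sketch below is a variant that does not rely on a trailing space to recognise the slave line; it assumes that crm_resource --locate prints one line per node of the form "resource pgsql-ms is running on: HOST Master" (with "Master" absent on the slave), which is the layout implied by the use of field 6 above. The function name ms_hosts is illustrative.

```shell
#!/bin/bash
# Read "crm_resource --locate" style lines on stdin and print "master,slave".
# Assumed line layout: resource NAME is running on: HOST [Master]
ms_hosts() {
    awk '
        $NF == "Master" { master = $6; next }   # line of the promoted node
        NF >= 6         { slave = $6 }          # any other node line
        END             { print master "," slave }
    '
}
```

It would be used as crm_resource --resource pgsql-ms --locate | ms_hosts, and the result passed to the --host option of check_postgres.pl.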
;
Munin postgres scripts
• Munin is shipped with Perl plugins for PostgreSQL
• We use four of them:
  • postgres_size,
  • postgres_checkpoints,
  • postgres_connections_db,
  • postgres_cache
;
Puppet module
• The Puppet postgres module is forked from Kris Buytaert’s
GitHub page
• It is modified to remove all references to services, because we
want Corosync to manage them
• It creates the users, the superusers, the databases
• It is a parameterized class, with a "cluster" parameter, so we
can also install a standalone PostgreSQL
• The cache sizes are parameterized too, so we can also use it
in Vagrant boxes
• Here are some examples from the module, which I will upload to
GitHub ASAP
;
Class postgres
The postgres class installs the packages and runs initdb.
init.pp
class postgres (
  $cluster = 'no',
  $running_ip = '127.0.0.1'
){ ...
• The cluster parameter indicates whether or not we want clustering
• running_ip is used for the SQL commands. In case of a
cluster, you have to put the cluster's IP address here.
;
Sqlexec definition
sqlexec.pp
define postgres::sqlexec($username, $database, $sql, $sqlcheck) {
  exec { "psql -h ${postgres::running_ip} --username=${username} ${database} -c \"${sql}\" >> /var/log/puppet-postgresql.sql.log 2>&1 && /bin/sleep 5":
    environment => "PGPASSWORD=${postgres_password}",
    path        => $::path,
    timeout     => 600,
    unless      => "psql -h ${postgres::running_ip} -U ${username} ${database} -c ${sqlcheck}",
    require     => Service['postgresql-9.0'],
  }
}
;
Example in the node file
Here is the result in the node file:
dumbo.pp
node babar {
  class { 'postgres':
    cluster    => 'yes',
    running_ip => '10.0.10.10',
  }
  include postgres::munin
  include postgres::backup
  include cluster::node
  postgres::config { $::fqdn:
    listen => '*',
  }
;
Example in the node file
dumbo.pp
  postgres::hba { $::fqdn:
    allowedrules => [
      "host all all ${::ipaddress}/32 trust",
      'hostnossl all all 10.0.10.8/32 trust',
      'hostnossl all all 10.0.10.9/32 trust',
      'hostnossl all all 10.0.10.10/32 trust',
      'hostnossl replication replicuser 192.168.177.2/24 trust',
    ],
  }
;
Example in the node file
dumbo.pp
  postgres::createsuperuser { 'replicuser':
    passwd => 'foobar',
  }
  postgres::createuser { 'cerise':
    passwd => 'foobar',
  }
  postgres::createdb { 'zodb_cerise':
    owner   => 'cerise',
    require => Postgres::Createuser['cerise'],
  }
}
;
#TODO
• The first synchronisation is not puppetized
• More advanced checks on the database #monitoringsucks
(e.g. slow queries)
• A disaster recovery procedure
• Improve the ocf script
• Check the content of the backups
• . . .
;
Any questions?
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 

Postgresql 9.0 HA at LOADAYS 2012

  • 1. PostgreSQL 9.0 HA. Julien Pivotto, April 1, 2012 @ Loadays
  • 2. Table of contents
      1. Overview: The mission; Before the migration
      2. PostgreSQL 9.0: Intro; Streaming replication; Master configuration; Slave configuration; PostgreSQL specific tricks; Setting up a slave
      3. Clustering: Set up of corosync; OCF resource
      4. Backups: Cron jobs; BackupPC
      5. Monitoring: Nagios; Munin
      6. Automation: Puppet module; The node file; #TODO
      7. The end
  • 3. Who am I
      • Julien Pivotto
      • Consultant at Inuits since May 2011
      • FOSS defender since 2005
  • 4. A.R.S.I.A.
      • Association Régionale de Santé et d'Identification Animales
      • 30 Linux servers in several locations
      • A lot of Open Source: CentOS, Samba, Open-xchange, mailscanner, Cyrus, Puppet, jenkins, foreman, OpenVPN, GLPI, rabbitmq, BackupPC, CUPS, icinga, trac, zope, plone, solr, pentaho, funambol, munin, squid, asterisk, and PostgreSQL, ...
  • 5. C.E.R.I.S.E
      • A web application
          • Plone (Python)
          • 15k+ visits, 500k+ pages and 2,000,000+ hits each month
          • Developed by Affinitic
      • Several databases
          • PostgreSQL 9.0
          • Oracle database
      • Several servers/services
          • Two reverse proxies in failover HA
          • Two application servers in load balancing HA
          • Two PostgreSQL servers in failover HA
          • An oracledb server
          • A development server
          • A pentaho server
      • Being integrated in jenkins (to be continued...)
  • 6. PostgreSQL before the migration
      • PostgreSQL 8.3.7
      • No native support of HA
      • High availability with heartbeat 2 and DRBD
      • Installed on the application servers
      • Nothing automated
      • Failover: the passive node is not even read-only
      • Installed in November 2008
  • 7. Monitoring before the installation
      • Icinga
          • Check of the DRBD
          • Simple connection check to PostgreSQL
      • Graphing with Cacti
          • Size of the databases
          • Connections to the database
          • Checkpoints
  • 8. Backups before the installation
      • Backups were done every hour on the same machine
      • External backups once a day on disk and on tape
      • Backups are made with the pg_dump command
      • BackupPC fetches those files
  • 9. PostgreSQL 9.0
      • PostgreSQL 9.0 was released in September 2010
      • It brings native replication to PostgreSQL
      • There is no native failover tool
      • So we need to use PostgreSQL + Corosync
      • The setup of PostgreSQL is managed by Puppet
  • 10. Write-Ahead Logging
      • It means that every change to a data file must first be written to a log file
      • Fewer disk writes: only the log file needs to be flushed to disk to guarantee that a transaction is committed, rather than every data file changed by the transaction
  • 11. What is streaming replication
      • Streaming replication provides the capability to ship and apply WAL XLOGS to standby servers
      • It's possible to have multiple standby servers
      • Standby servers can be read-only ("Hot standby")
  • 12. Disadvantages/Specifications of streaming replication
      • Streaming replication supports only asynchronous log-shipping
          • But when the database is used, the delay is close to synchronous log-shipping
      • Adding a standby server requires manual action
          • But in our case we will only have one standby server
      • PostgreSQL does not provide an HA feature
          • But Corosync does
      • It is a single-threaded replication
  • 13. Master configuration
      The master only needs one configuration file.
      Configuration not related to SR:
        # PostgreSQL configuration
        # http://www.postgresql.org/docs/9.0/interactive/index.html
        listen_addresses = '*'
        max_connections = 200
        shared_buffers = 4096MB
        work_mem = 4096MB
        effective_cache_size = 10024MB
        commit_delay = 100000
        # effective_cache_size is set a second time here; PostgreSQL uses the last value
        effective_cache_size = 2560
        log_destination = 'stderr'
        log_directory = 'pg_log'
        logging_collector = on
        log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
        log_truncate_on_rotation = on
        log_rotation_age = 1d
        log_rotation_size = 0
        log_min_messages = notice
        log_min_duration_statement = 1000
        log_line_prefix = '%t %u '
        log_statement = 'none'
        datestyle = 'iso, dmy'
  • 14. Master configuration
      Configuration related to SR:
        wal_level = hot_standby
        max_wal_senders = 2
        wal_keep_segments = 128
  • 15. Master configuration
      • wal_level = hot_standby: allows the standby server to be readable
      • max_wal_senders = 2: we allow up to 2 standby nodes
      • wal_keep_segments = 128: the minimum number of WAL segments to keep
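
As a rough sizing aid for wal_keep_segments: assuming the default WAL segment size of 16 MB (a build-time default, not something stated on the slides), 128 segments mean the master keeps at least 128 × 16 MB = 2048 MB of WAL for a lagging standby. A minimal shell sketch of that arithmetic; the helper name wal_keep_mb is hypothetical:

```shell
# wal_keep_mb SEGMENTS [SEGMENT_SIZE_MB]
# Hypothetical helper: minimum amount of WAL kept on the master, in MB.
# 16 MB is PostgreSQL's default segment size (an assumption for this setup).
wal_keep_mb() {
    segments=$1
    segment_mb=${2:-16}
    echo $(( segments * segment_mb ))
}

wal_keep_mb 128
```

If the standby falls further behind than this amount of WAL, it can no longer catch up by streaming and has to be re-seeded from the master.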
  • 16. Slave configuration
      • The slave requires at least two configuration files
          • A postgresql.conf file
          • A recovery.conf file, used to apply the WAL XLOGS shipped by the master
      • A trigger file to stop replication can be specified
      postgresql.conf, configuration related to SR:
        wal_level = hot_standby
        hot_standby = on
      Note that this file also starts with the same non-SR settings as the master configuration.
  • 17. Slave configuration
      recovery.conf:
        standby_mode = 'on'
        primary_conninfo = 'host=192.168.177.2 user=replicuser'
      • standby_mode means that this is a standby server
      • primary_conninfo is the connection to the master
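
The slave configuration slides mention that a trigger file can be specified but do not show it. A minimal sketch of generating a recovery.conf that includes one, using the recovery.conf parameter trigger_file; the helper name and the trigger path /tmp/pgsql.trigger are assumptions for illustration, not values from the talk:

```shell
# write_recovery_conf FILE MASTER_HOST REPLICATION_USER TRIGGER_FILE
# Hypothetical helper that writes a standby recovery.conf.
write_recovery_conf() {
    cat > "$1" <<EOF
standby_mode = 'on'
primary_conninfo = 'host=$2 user=$3'
trigger_file = '$4'
EOF
}

# Demo: write to a scratch file rather than a real data directory.
conf=$(mktemp)
write_recovery_conf "$conf" 192.168.177.2 replicuser /tmp/pgsql.trigger
cat "$conf"
```

With such a setting, creating the trigger file on the standby ends recovery and the server starts accepting writes.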
  • 18. Replication user
      • A superuser for replication has to be created (named replicuser here, to match the primary_conninfo and pg_hba.conf entries on the other slides)
      • The SQL command to create it is:
        CREATE USER replicuser SUPERUSER LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD 'foobar';
  • 19. pg_hba.conf
      • pg_hba.conf is the file that contains some kind of ACLs for the PostgreSQL connections
      • In that file we add both nodes as "trusted", and the replication user as trusted too
      pg_hba.conf:
        hostnossl all all 10.0.10.8/32 trust
        hostnossl all all 10.0.10.9/32 trust
        hostnossl replication replicuser 192.168.177.2/24 trust
  • 20. Setting up a slave
      • You have to type a bunch of commands on the master when you add a new standby server
      Adding a standby server:
        psql -c "SELECT pg_start_backup('label', true)"
        rsync -a ${PGDATA}/ standby:/srv/pgsql/standby/ --exclude postmaster.pid --exclude '*-master' --exclude '*-slave'
        psql -c "SELECT pg_stop_backup()"
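
The three commands above can be wrapped in a small script. The sketch below is a dry run: it only prints the commands in order, so it can be read and tested without a database. The function names run and seed_standby are hypothetical; the data directory in the demo call is an assumption (the path used in the cluster configuration slides).

```shell
# Dry-run sketch: "run" prints each command instead of executing it.
# Replace the body of run with "$@" to execute for real.
run() { echo "+ $*"; }

# seed_standby PGDATA TARGET
# Hypothetical wrapper around the steps from the slide.
seed_standby() {
    pgdata=$1
    target=$2
    run psql -c "SELECT pg_start_backup('label', true)"
    run rsync -a "$pgdata/" "$target" \
        --exclude postmaster.pid --exclude '*-master' --exclude '*-slave'
    run psql -c "SELECT pg_stop_backup()"
}

seed_standby /var/lib/pgsql/data standby:/srv/pgsql/standby/
```

In a real script it is worth making sure pg_stop_backup() still runs when rsync fails, so the master does not stay in backup mode.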
  • 21. Corosync configuration
      • The goal of corosync is to make the switch between master and slave when needed
      • It ensures that a master is online and connected to the router
      • The two servers are connected to each other on eth1
      • Corosync is installed by Puppet
      • We take it from the clusterlabs repositories
      • We use a customized master/slave OCF resource to manage the PostgreSQL M/S
  • 22. crm.conf
      The main configuration file of corosync is /etc/corosync/crm.conf. It contains all the resources, nodes, etc.
      Defining the nodes:
        node babar.interne.arsia.be attributes standby="off"
        node dumbo.interne.arsia.be attributes standby="off"
      In this code, the two nodes are defined, and we tell corosync that they should be started at launch.
  • 23. crm.conf
      Defining the primitives:
        primitive pgsql ocf:inuits:pgsql-ms
        primitive virt_ip ocf:heartbeat:IPaddr2 \
            params nic="eth0" iflabel="0" ip="10.0.10.10" cidr_netmask="24" broadcast="10.0.10.255" \
            meta target-role="Started" is-managed="true"
        primitive ping ocf:pacemaker:ping \
            params host_list="10.0.10.1" \
            op monitor interval="10s" timeout="10s" \
            op start interval="0" timeout="45s" \
            op stop interval="0" timeout="50s"
      • We define 3 primitives:
          • pgsql, the PostgreSQL primitive
          • virt_ip, the floating IP address
          • ping, the primitive that checks that the servers are connected to the router
  • 24. crm.conf
      Configuring the primitives:
        ms pgsql-ms pgsql \
            params pgsqlconfig="/var/lib/pgsql/data/postgresql.conf" lsb_script="/etc/init.d/postgresql-9.0" pgsqlrecovery="/var/lib/pgsql/data/recovery.conf" \
            meta clone-max="2" clone-node-max="1" master-max="1" master-node-max="1" notify="false"
        clone clone-ping ping \
            meta globally-unique="false"
      • We configure the PostgreSQL M/S: the init script, the configuration files...
      • We also configure the ping resource as a clone (it will be launched on both servers)
  • 25. crm.conf
      Defining the nodes:
        group PSQL virt_ip
        location connected PSQL \
            rule $id="connected-rule" -inf: not_defined pingd or pingd lte 0
        colocation ip_psql inf: PSQL pgsql-ms:Master
        property $id="cib-bootstrap-options" \
            cluster-infrastructure="openais" \
            expected-quorum-votes="2" \
            stonith-enabled="false" \
            no-quorum-policy="ignore" \
            default-resource-stickiness="INFINITY"
        rsc_defaults $id="rsc_defaults-options" \
            migration-threshold="INFINITY" \
            failure-timeout="10" \
            resource-stickiness="INFINITY"
      • These lines ensure that the master is always on the same node as the floating IP address
      • And also that the master is connected to the router
  • 26. OCF resource
      • There is a custom OCF resource to manage the master/slave PostgreSQL
      • It is based on an example resource written by Andrew Beekhof from Clusterlabs
      • The file has to be in /usr/lib/ocf/resource.d/inuits/pgsql-ms
  • 27. OCF resource
      • The script does the following:
          • It moves postgresql.conf-master to postgresql.conf when a node is promoted to master
          • It moves postgresql.conf-slave to postgresql.conf when a node is demoted to slave
          • It ensures that recovery.conf-slave is installed as recovery.conf on the slave and absent on the master
          • It starts/restarts PostgreSQL when needed
      • I will post that file on Github soon
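
The file handling described above can be sketched in a few lines of shell. This is not the author's resource agent (which was not published with the slides); the function names are hypothetical, and the sketch copies the templates instead of moving them so that they remain available for the next role change.

```shell
# Hypothetical helpers illustrating the promote/demote file handling.
# They copy (rather than move) the templates; the real agent may differ.
pg_promote_files() {
    datadir=$1
    cp "$datadir/postgresql.conf-master" "$datadir/postgresql.conf"
    rm -f "$datadir/recovery.conf"      # a master must not have recovery.conf
}

pg_demote_files() {
    datadir=$1
    cp "$datadir/postgresql.conf-slave" "$datadir/postgresql.conf"
    cp "$datadir/recovery.conf-slave" "$datadir/recovery.conf"
}

# Demo in a scratch directory with placeholder templates.
demo=$(mktemp -d)
echo "# master settings" > "$demo/postgresql.conf-master"
echo "hot_standby = on" > "$demo/postgresql.conf-slave"
echo "standby_mode = 'on'" > "$demo/recovery.conf-slave"
pg_demote_files "$demo"
pg_promote_files "$demo"
ls "$demo"
```

A full agent would call such helpers from its promote and demote actions and then start or restart PostgreSQL.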
  • 28. Backups of the databases
      • Sometimes, you need backups (especially when you don't have backups...)
      • We do a backup per hour on each node (one node at minute 0, the other at minute 30)
      • We do a backup per day on each node
      • We do a backup per day on each node, just before the BackupPC backup
      • We keep 24 hourly backups and 7 daily backups on disk
      • With BackupPC we keep months of backups
  • 29. Hourly backup script
      /usr/local/bin/backup_hourly.sh:
        #!/bin/bash
        # Dump each database into a file named after the current hour (00-23)
        DATE=$(date +%H)
        BACKUP_PATH=/var/lib/backups/hourly
        for db in foobar_db foobar2_db
        do
            /usr/bin/pg_dump $db | gzip > $BACKUP_PATH/${db}_$DATE.pgsql.gz
            # Keep a stable name pointing at the latest dump
            ln -fs $BACKUP_PATH/${db}_$DATE.pgsql.gz $BACKUP_PATH/${db}_current.pgsql.gz
        done
      The daily script is almost the same.
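
The retention of 24 hourly backups follows from the file naming rather than from an explicit cleanup step: the hour of day is part of the file name, so each slot is overwritten when the same hour comes round again. A minimal sketch of that naming, with a hypothetical helper backup_file:

```shell
# backup_file DIR DB STAMP -> path of the dump for that time slot.
# With STAMP taken from `date +%H` (00..23), at most 24 files per
# database exist; older dumps are overwritten, not deleted separately.
backup_file() {
    echo "$1/${2}_$3.pgsql.gz"
}

backup_file /var/lib/backups/hourly foobar_db "$(date +%H)"
```

The same idea with `date +%u` (day of week, 1..7), as used in the BackupPC script, gives seven rotating files per database.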
  • 30. BackupPC script
      /usr/local/bin/backup_backuppc.sh:
        #!/bin/bash
        # Dump each database into a file named after the day of the week (1-7)
        DATE=$(date +%u)
        BACKUP_PATH=/var/lib/backups/backuppc
        for db in cerise trackitquality trackit zodb_cerise
        do
            /usr/bin/pg_dump -U postgres $db | gzip > $BACKUP_PATH/${db}_$DATE.pgsql.gz
            ln -fs $BACKUP_PATH/${db}_$DATE.pgsql.gz $BACKUP_PATH/${db}_current.pgsql.gz
        done
      In the BackupPC config, I added the following:
        $Conf{DumpPreUserCmd} = '$sshPath -t -q -x -l backuppc $host /usr/local/bin/backup_backuppc.sh';
  • 31. check_postgres script
      • check_postgres.pl is a Nagios-compatible Perl script
      • Available on http://www.bucardo.org/check_postgres/ and on Github
      • What we check with it:
          • The current connections
          • The status of the replication (the delay)
  • 32. Check hot_standby latency
      • The check_postgres.pl script has a check for the hot_standby delay
      • But the script has to be told which node is the master and which is the slave, and we do not know that in advance
      • So, here is a bash script I wrote to find the M/S order
      Master/slave replication check:
        #!/bin/bash
        /usr/lib64/nagios/plugins/check_postgres.pl --db="$1" --action hot_standby_delay -w 300 -c 600 \
            --host=$( crm_resource --resource pgsql-ms --locate | awk '/Master/ {master=$6} / $/ {slave=$6} END {print master","slave}' )
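
To see what the awk part of that check produces, it can be run against sample text. The sample below is an assumption: the exact output of `crm_resource --locate` depends on the Pacemaker version, and this sketch assumes the host name is the 6th field and that only the master line ends with "Master". The slave pattern is therefore written as "does not end with Master" instead of the trailing-space match used on the slide; ms_order is a hypothetical name.

```shell
# Sample text only (assumed format, not captured from a real cluster).
sample='resource pgsql-ms is running on: babar.interne.arsia.be Master
resource pgsql-ms is running on: dumbo.interne.arsia.be'

# ms_order: reads crm_resource-style lines, prints "master,slave".
ms_order() {
    awk '/Master$/ {master=$6} !/Master$/ {slave=$6} END {print master "," slave}'
}

printf '%s\n' "$sample" | ms_order
```

The result is a comma-separated pair with the master first, which is the order in which the slide's script passes the hosts to --host.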
  • 33. Munin postgres scripts
      • Munin is shipped with Perl plugins for PostgreSQL
      • We use four of them:
          • postgres_size
          • postgres_checkpoints
          • postgres_connections_db
          • postgres_cache
  • 35. Puppet module
      • The puppet postgres module is forked from Kris Buytaert's Github page
      • It is modified to remove all references to services, because we want corosync to manage them
      • It creates the users, the superusers and the databases
      • It is a parameterized class with a "cluster" parameter, so we can also install a standalone PostgreSQL
      • The cache sizes are parameterized too, so we can also use the module in Vagrant boxes
      • Here are some examples from the module, which I will upload on Github ASAP
  • 36. Class postgres
      The postgres class installs the packages and runs initdb.
      init.pp:
        class postgres (
          $cluster = 'no',
          $running_ip = '127.0.0.1'
        ){
        ...
      • The cluster parameter indicates whether or not we want clustering
      • running_ip is used for the SQL commands. In case of a cluster, you have to put the cluster's IP address here.
  • 37. Sqlexec definition
      sqlexec.pp:
        define postgres::sqlexec($username, $database, $sql, $sqlcheck) {
          exec { "psql -h $postgres::running_ip --username=${username} $database -c \"${sql}\" >> /var/log/puppet-postgresql.sql.log 2>&1 && /bin/sleep 5":
            environment => "PGPASSWORD=${postgres_password}",
            path        => $::path,
            timeout     => 600,
            unless      => "psql -h $postgres::running_ip -U $username $database -c $sqlcheck",
            require     => Service['postgresql-9.0'],
          }
        }
  • 38. Example in the node file
      Here is the result in the node file.
      dumbo.pp:
        node babar {
          class { 'postgres':
            cluster    => 'yes',
            running_ip => '10.0.10.10',
          }
          include postgres::munin
          include postgres::backup
          include cluster::node
          postgres::config{ $::fqdn:
            listen => '*',
          }
  • 39. Example in the node file
      dumbo.pp (continued):
          postgres::hba { $::fqdn:
            allowedrules => [
              "host all all $::ipaddress/32 trust",
              'hostnossl all all 10.0.10.8/32 trust',
              'hostnossl all all 10.0.10.9/32 trust',
              'hostnossl all all 10.0.10.10/32 trust',
              'hostnossl replication replicuser 192.168.177.2/24 trust',
            ],
          }
  • 40. Example in the node file
      dumbo.pp (continued):
          postgres::createsuperuser{ 'replicuser':
            passwd => 'foobar',
          }
          postgres::createuser{ 'cerise':
            passwd => 'foobar';
          }
          postgres::createdb{ 'zodb_cerise':
            owner   => 'cerise',
            require => Postgres::Createuser['cerise'],
          }
        }
  • 41. #TODO
      • The first synchronisation is not puppetized
      • More advanced checks on the database #monitoringsucks (e.g. slow queries)
      • A disaster recovery
      • Improve the OCF script
      • Check the content of the backups
      • ...