SlideShare une entreprise Scribd logo
1  sur  12
Harder, better, faster, stronger.
HA with RelStorage and Postgres
abstract @ PLOG 2014simone.deponti@abstract.it /
HA?
From the not-very-realiable Wikipedia page
about it:
A = Ut / Tt
Where Ut is the uptime and Tt the total time
Three rules of HA
1. Eliminate single points of failure
2. Have a reliable failover
3. Detect failures as they occur
It’s a short way to HA
Two elephants in a cluster
1. Use PostgreSQL’s native Streaming
Replication
2. Means explicit master/slave roles at any
given time
3. Manual failover (slave promotion) or use
repmgr
repmgr
1. Developed by phoenicians 2ndQuadrant
2. Works in addition to streaming replication
3. Acts as watchdog and can take automatic
actions (run bash scripts)
4. You run it on the slave node
(https://github.com/2ndQuadrant/repmgr)
How it works
1. Continuously and compulsively checks
twitter that the master is alive
2. If the master is unreachable for more than
N seconds, runs a bash script
It also offers convenient command line tools
to check status of cluster, promote nodes,
syncronize.
repmgr’s gotchas
1. Create a database to store replication info
(do not follow the bad example of the
documentation)
2. The suggested wal_keep_segments
setting is too high, will use up to 78GB,
can be reduced with -w option
3. Use custom promote and follow scripts
4. Launch daemon with with --monitoring-
history
Fail scenarios
1. One node has a catastrophic failure
(easy)
2. There is a total network outage (when the
network goes up again, you have a split
brain)
3. There is network partitioning (similar to
above, can be worst)
repmgr and streaming replication do no
perform too well in cases 2 and 3
Our work is never over
1. Always notify when a failure is detected
2. Investigate ASAP, even if automatic
action was taken
3. Have the slave try to exclude the master
upon promotion
Example of #3:
The slave upon promotion contacts all the
clients and tells them to avoid talking to master
RelStorage and PostgreSQL
It has a smal issue with connections IDLE IN
TRANSACTION.
Check with:
SELECT datname, usename,
query_start, state_change, state,
query
FROM pg_stat_activity;
It’s not fatal, but might result in locks during
backups.
Simone Deponti
simone.deponti@abstract.it

Contenu connexe

Similaire à HA with RelStorage and Postgres

Multi threading
Multi threadingMulti threading
Multi threadinggndu
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time ComputationSonal Raj
 
Towards 100% uptime with node
Towards 100% uptime with nodeTowards 100% uptime with node
Towards 100% uptime with nodesandinmyjoints
 
CS844 U1 Individual Project
CS844 U1 Individual ProjectCS844 U1 Individual Project
CS844 U1 Individual ProjectThienSi Le
 
Profiler Guided Java Performance Tuning
Profiler Guided Java Performance TuningProfiler Guided Java Performance Tuning
Profiler Guided Java Performance Tuningosa_ora
 
Java And Multithreading
Java And MultithreadingJava And Multithreading
Java And MultithreadingShraddha
 
Multithreading 101
Multithreading 101Multithreading 101
Multithreading 101Tim Penhey
 
Building resilient applications
Building resilient applicationsBuilding resilient applications
Building resilient applicationsNuno Caneco
 
Describe synchronization techniques used by programmers who develop .pdf
Describe synchronization techniques used by programmers who develop .pdfDescribe synchronization techniques used by programmers who develop .pdf
Describe synchronization techniques used by programmers who develop .pdfexcellentmobiles
 
Debugging Complex Systems - Erlang Factory SF 2015
Debugging Complex Systems - Erlang Factory SF 2015Debugging Complex Systems - Erlang Factory SF 2015
Debugging Complex Systems - Erlang Factory SF 2015lpgauth
 
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionNot Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionParis Carbone
 
Java Threads: Lightweight Processes
Java Threads: Lightweight ProcessesJava Threads: Lightweight Processes
Java Threads: Lightweight ProcessesIsuru Perera
 
Velocity 2015: Building Self-Healing Systems
Velocity 2015: Building Self-Healing SystemsVelocity 2015: Building Self-Healing Systems
Velocity 2015: Building Self-Healing SystemsSOASTA
 
Velocity 2015 building self healing systems (slide share version)
Velocity 2015 building self healing systems (slide share version)Velocity 2015 building self healing systems (slide share version)
Velocity 2015 building self healing systems (slide share version)SOASTA
 
Solr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachSolr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachAlexandre Rafalovitch
 
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...Lucidworks
 

Similaire à HA with RelStorage and Postgres (20)

Multi threading
Multi threadingMulti threading
Multi threading
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time Computation
 
Towards 100% uptime with node
Towards 100% uptime with nodeTowards 100% uptime with node
Towards 100% uptime with node
 
CS844 U1 Individual Project
CS844 U1 Individual ProjectCS844 U1 Individual Project
CS844 U1 Individual Project
 
Profiler Guided Java Performance Tuning
Profiler Guided Java Performance TuningProfiler Guided Java Performance Tuning
Profiler Guided Java Performance Tuning
 
Java And Multithreading
Java And MultithreadingJava And Multithreading
Java And Multithreading
 
Md09 multithreading
Md09 multithreadingMd09 multithreading
Md09 multithreading
 
Multithreading 101
Multithreading 101Multithreading 101
Multithreading 101
 
Building resilient applications
Building resilient applicationsBuilding resilient applications
Building resilient applications
 
Describe synchronization techniques used by programmers who develop .pdf
Describe synchronization techniques used by programmers who develop .pdfDescribe synchronization techniques used by programmers who develop .pdf
Describe synchronization techniques used by programmers who develop .pdf
 
Debugging Complex Systems - Erlang Factory SF 2015
Debugging Complex Systems - Erlang Factory SF 2015Debugging Complex Systems - Erlang Factory SF 2015
Debugging Complex Systems - Erlang Factory SF 2015
 
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionNot Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
 
Java Threads: Lightweight Processes
Java Threads: Lightweight ProcessesJava Threads: Lightweight Processes
Java Threads: Lightweight Processes
 
Velocity 2015: Building Self-Healing Systems
Velocity 2015: Building Self-Healing SystemsVelocity 2015: Building Self-Healing Systems
Velocity 2015: Building Self-Healing Systems
 
Velocity 2015 building self healing systems (slide share version)
Velocity 2015 building self healing systems (slide share version)Velocity 2015 building self healing systems (slide share version)
Velocity 2015 building self healing systems (slide share version)
 
Solr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approachSolr Troubleshooting - TreeMap approach
Solr Troubleshooting - TreeMap approach
 
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
Solr Troubleshooting - Treemap Approach: Presented by Alexandre Rafolovitch, ...
 
Java threading
Java threadingJava threading
Java threading
 
Java
JavaJava
Java
 

Dernier

Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Cizo Technology Services
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Matt Ray
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 

Dernier (20)

Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
Global Identity Enrolment and Verification Pro Solution - Cizo Technology Ser...
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
Open Source Summit NA 2024: Open Source Cloud Costs - OpenCost's Impact on En...
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 

HA with RelStorage and Postgres

  • 1. Harder, better, faster, stronger. HA with RelStorage and Postgres abstract @ PLOG 2014simone.deponti@abstract.it /
  • 2. HA? From the not-very-realiable Wikipedia page about it: A = Ut / Tt Where Ut is the uptime and Tt the total time
  • 3. Three rules of HA 1. Eliminate single points of failure 2. Have a reliable failover 3. Detect failures as they occur
  • 4. It’s a short way to HA
  • 5. Two elephants in a cluster 1. Use PostgreSQL’s native Streaming Replication 2. Means explicit master/slave roles at any given time 3. Manual failover (slave promotion) or use repmgr
  • 6. repmgr 1. Developed by phoenicians 2ndQuadrant 2. Works in addition to streaming replication 3. Acts as watchdog and can take automatic actions (run bash scripts) 4. You run it on the slave node (https://github.com/2ndQuadrant/repmgr)
  • 7. How it works 1. Continuously and compulsively checks twitter that the master is alive 2. If the master is unreachable for more than N seconds, runs a bash script It also offers convenient command line tools to check status of cluster, promote nodes, syncronize.
  • 8. repmgr’s gotchas 1. Create a database to store replication info (do not follow the bad example of the documentation) 2. The suggested wal_keep_segments setting is too high, will use up to 78GB, can be reduced with -w option 3. Use custom promote and follow scripts 4. Launch daemon with with --monitoring- history
  • 9. Fail scenarios 1. One node has a catastrophic failure (easy) 2. There is a total network outage (when the network goes up again, you have a split brain) 3. There is network partitioning (similar to above, can be worst) repmgr and streaming replication do no perform too well in cases 2 and 3
  • 10. Our work is never over 1. Always notify when a failure is detected 2. Investigate ASAP, even if automatic action was taken 3. Have the slave try to exclude the master upon promotion Example of #3: The slave upon promotion contacts all the clients and tells them to avoid talking to master
  • 11. RelStorage and PostgreSQL It has a smal issue with connections IDLE IN TRANSACTION. Check with: SELECT datname, usename, query_start, state_change, state, query FROM pg_stat_activity; It’s not fatal, but might result in locks during backups.