SlideShare a Scribd company logo
1 of 47
Download to read offline
Challenges of Distributing
Postgres: A Citus Story
Ozgun Erdogan
PGConf US Jersey City | April 2018
Developers Love Postgres
PostgreSQL
MySQL
MongoDB
SQL Server +
Oracle
RDBMS: PostgreSQL, MySQL, Microsoft SQL Server, Oracle
Ozgun Erdogan | QCon San Francisco 2017
I love Postgres, too
3 Ozgun Erdogan | QCon San Francisco 2017
Ozgun Erdogan
CTO of Citus Data
Distributed Systems
Distributed Databases
Formerly of Amazon
Love drinking margaritas
4
Our mission at Citus Data
5 Ozgun Erdogan | QCon San Francisco 2017
Make it so SaaS businesses
never have to worry about
scaling their database again
What is the Citus database?
1.Scales out PostgreSQL
2.Extension to PostgreSQL
3.Available in 3 Ways
Ozgun Erdogan | QCon San Francisco 2017
• Using sharding & replication
• Query engine parallelizes SQL queries across many nodes
• Using PostgreSQL extension APIs
Citus, Packaged Three Ways
Ozgun Erdogan | QCon San Francisco 2017
Open
Source
Enterprise
Software
Fully-Managed
Database as a Service
github.com/citusdata/citus
Simplified Citus Architecture
3 Challenges Distributing Postgres
1. PostgreSQL and High Availability
2. PostgreSQL is huge. How to keep up with it
3. Distributed transactions
Ozgun Erdogan | QCon San Francisco 2017
PostgreSQL &
High Availability (HA)
Designing for a Cloud-native world
1
Why is High Availability hard?
PostgreSQL replication uses one primary &
multiple secondary nodes. Two challenges:
1. Most Postgres clients aren’t smart. When the
primary fails, they retry the same IP.
2. Postgres replicates entire state. This makes it
resource intensive to reconstruct new nodes from a
primary.
Ozgun Erdogan | QCon San Francisco 2017
Database Failures Should Be Transparent
Ozgun Erdogan | QCon San Francisco 2017
Database Failures Shouldn’t Be a Big Deal
1. PostgreSQL streaming replication to replicate from
primary to secondary. Back up to S3.
2. Volume level replication to replicate to secondary’s
volume. Back up to S3.
3. Incremental backups to S3. Reconstruct secondary
nodes from S3.
Ozgun Erdogan | QCon San Francisco 2017
3 Methods for HA & Backups in Postgres
Postgres - Streaming Replication (1)
Write-ahead logs
(streaming repl.)
Table foo
Primary –
PostgreSQL
streaming repl.
Table bar
WAL logs
Table foo
Table bar
WAL logs
Secondary –
PostgreSQL
streaming repl.
Monitoring Agents -
streaming repl.
setup & auto failover
S3 / Blob Storage
(Encrypted)
Backup
Process
Ozgun Erdogan | QCon San Francisco 2017
Postgres – AWS RDS & Azure (2)
Postgres
Primary
Monitoring Agents
(Auto node failover)
Persistent Volume
Postgres
Standby
S3 / Blob Storage
(Encrypted)
Table foo
Table bar
WAL logs
Table foo
Table bar
WAL logs
Backup process
Backup
Process
Persistent Volume
Ozgun Erdogan | QCon San Francisco 2017
Postgres – Reconstruct from WAL (3)
Postgres
Primary
Monitoring Agents
(Auto node failover)
Persistent Volume
Postgres
Secondary
Backup
Process
S3 / Blob Storage
(Encrypted)
Table foo
Table bar
WAL logs
Persistent Volume
Table foo
Table bar
WAL logs
Backup process
Ozgun Erdogan | QCon San Francisco 2017
WHO DOES THIS? PRIMARY BENEFITS
Streaming Replication
(local / ephemeral disk)
On-prem
Manual EC2
Simple to set up
Direct I/O: High I/O & large storage
Disk Mirroring
RDS
Azure Preview
Works for MySQL and PostgreSQL
Data durability in cloud environments
Reconstruct from WAL
Heroku
Citus Cloud
Enables Fork and PITR
Node reconstruction in background
(Data durability in cloud environments)
How do these approaches compare?
17 Ozgun Erdogan | QCon San Francisco 2017
Summary
• In PostgreSQL, a database node’s state gets
replicated in its entirety. The replication can be set up
in three ways.
• Reconstructing a secondary node from S3 makes
bringing up or shooting down nodes easy.
• When you shard your database, the state you need to
replicate per node becomes smaller.
Ozgun Erdogan | QCon San Francisco 2017
PostgreSQL is huge
How do you scale all features?
2
3 ways to build a distributed database
1. Build a distributed database from scratch
2. Middleware sharding (mimic the parser)
3. Fork your favorite database (like PostgreSQL)
Ozgun Erdogan | QCon San Francisco 2017
Example Transaction Block
Ozgun Erdogan | QCon San Francisco 2017
Postgres Features, Tools & Frameworks
• PostgreSQL manual (US Letter)
• Clients for diff programming
languages
• ORMs, libraries, GUIs
• Tools (dump, restore, analyze)
• New features
Ozgun Erdogan | QCon San Francisco 2017
At First, Forked PostgreSQL with Style
Ozgun Erdogan | QCon San Francisco 2017
Two Stage Query Optimization
1. Plan to minimize network I/O
2. Nodes talk to each other using SQL over libpq
3. Learned to cooperate with planner / executor bit by bit
(Volcano style executor)
Ozgun Erdogan | QCon San Francisco 2017
Citus Architecture (Simplified)
25
SELECT avg(revenue)
FROM sales
Coordinator
SELECT sum(revenue), count(revenue)
FROM table_1001
SELECT sum … FROM table_1003
Worker node 1
Table metadata
Table_1001
Table_1003
SELECT sum … FROM table_1002
SELECT sum … FROM table_1004
Worker node 2
Table_1002
Table_1004
Worker node N
.
.
.
.
.
.
Each node PostgreSQL with Citus installed
1 shard = 1 PostgreSQL table
Ozgun Erdogan | QCon San Francisco 2017
Unfork Citus using Extension APIs
CREATE EXTENSION citus;
• System catalogs – Distributed metadata
• Planner hook – Insert, Update, Delete, Select
• Executor hook – Insert, Update, Delete, Select
• Utility hook – Alter Table, Create Index, Vacuum, etc.
• Transaction & resources handling – file descriptors, etc.
• Background worker process – Maintenance processes
(distributed deadlock detection, task tracker, etc.)
• Logical decoding – Online data migrations
Ozgun Erdogan | QCon San Francisco 2017
PostgreSQL has transactions.
How to handle distributed transactions
3
BEGIN
INSERT
UPDATE
SELECT
COMMIT
ROLLBACK
Consistency in Distributed Databases
1. 2PC: All participating nodes need to be up
2. Paxos: Achieves consensus with quorum
3. Raft: More understandable alternative to
Paxos
Ozgun Erdogan | QCon San Francisco 2017
Concurrency in Distributed Databases
Ozgun Erdogan | QCon San Francisco 2017
LocksLocks
What is a Lock?
• Protects against concurrent modifications.
• Locks are released at the end of a transaction.
Deadlocks
Transactions Block on 1st Conflicting LockWhat is a lock?
Protects against concurrent modifications
Locks released at end of transaction
BEGIN;
UPDATE data SET y = 2 WHERE x = 1;
<obtained lock on rows with x = 1>
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = 5 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
(Distributed) deadlock!
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
UPDATE data SET y = y + 1 WHERE x = 2;
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
But what if they start blocking each other?
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
(Distributed) deadlock!
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
UPDATE data SET y = y + 1 WHERE x = 2;
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
But what if they start blocking each other?Deadlock detection in PostgreSQL
Deadlock detection builds a graph of processes that
are waiting for each other.
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
(Distributed) deadlock!
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
UPDATE data SET y = y + 1 WHERE x = 2;
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
But what if they start blocking each other?Deadlock detection in PostgreSQL
Deadlock detection builds a graph of processes that
are waiting for each other.
Deadlock detection in PostgreSQL
Transactions are cancelled until the cycle is gone
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
(Distributed) deadlock!
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
UPDATE data SET y = y + 1 WHERE x = 2;
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
But what if they start blocking each other?Deadlock detection in PostgreSQL
Deadlock detection builds a graph of processes that
are waiting for each other.
Deadlock detection in PostgreSQL
Transactions are cancelled until the cycle is gone
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
(Distributed) deadlock!
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
UPDATE data SET y = y + 1 WHERE x = 2;
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
But what if they start blocking each other?Deadlock detection in PostgreSQL
Deadlock detection builds a graph of processes that
are waiting for each other.
Deadlock detection in PostgreSQL
Transactions are cancelled until the cycle is gone
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
PostgreSQL’s deadlock detector still works
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
(Distributed) deadlock!
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
UPDATE data SET y = y + 1 WHERE x = 2;
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
But what if they start blocking each other?Deadlock detection in PostgreSQL
Deadlock detection builds a graph of processes that
are waiting for each other.
Deadlock detection in PostgreSQL
Transactions are cancelled until the cycle is gone
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
PostgreSQL’s deadlock detector still works
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
When deadlocks span across node, PostgreSQL cannot help us
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
When deadlocks span across node, PostgreSQL cannot help us
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
(Distributed) deadlock!
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
UPDATE data SET y = y + 1 WHERE x = 2;
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
But what if they start blocking each other?Deadlock detection in PostgreSQL
Deadlock detection builds a graph of processes that
are waiting for each other.
Deadlock detection in PostgreSQL
Transactions are cancelled until the cycle is gone
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
PostgreSQL’s deadlock detector still works
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
When deadlocks span across node, PostgreSQL cannot help us
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
When deadlocks span across node, PostgreSQL cannot help us
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlock detection in Citus 7
Citus 7 adds distributed deadlock detection
Transactions and Concurrency
• Transactions that don’t modify the same row
can run concurrently.
Transactions block on 1st lock that conflicts
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
COMMIT;
<all locks released>
BEGIN;
UPDATE data SET y = y + 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
<waiting for lock on rows with x = 1>
<obtained lock on rows with x = 1>
COMMIT;
(Distributed) deadlock!
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 1;
UPDATE data SET y = y + 1 WHERE x = 2;
BEGIN;
UPDATE data SET y = y - 1 WHERE x = 2;
UPDATE data SET y = y + 1 WHERE x = 1;
But what if they start blocking each other?Deadlock detection in PostgreSQL
Deadlock detection builds a graph of processes that
are waiting for each other.
Deadlock detection in PostgreSQL
Transactions are cancelled until the cycle is gone
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
Citus delegates transactions to nodes
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
PostgreSQL’s deadlock detector still works
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
When deadlocks span across node, PostgreSQL cannot help us
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlocks in Citus
When deadlocks span across node, PostgreSQL cannot help us
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlock detection in Citus 7
Citus 7 adds distributed deadlock detection
Firstname Lastname | Citus Data | Meeting Name | Month Year
Deadlock detection in Citus 7
Citus 7 adds distributed deadlock detection.
Distributed transactions are a complex
topic
• Most articles on distributed transactions focus on data
consistency.
• Data consistency is only one side of the coin. If you’re
using a relational database, your application benefits
from another key feature: deadlock detection.
• https://www.citusdata.com/blog/2017/08/31/databases
-and-distributed-deadlocks-a-faq
Ozgun Erdogan | QCon San Francisco 2017
So now what? We talked about 3
challenges distributing Postgres
1. PostgreSQL, Replication, High Availability
2. Tradeoffs in building a distributed database—
and how we chose PostgreSQL s extension
APIs
3. Distributed deadlock detection & distributed
transactions
Ozgun Erdogan | QCon San Francisco 2017
45
Solving a very complex problem
46
SQL is hard, not impossible, to scale
© 2017 Citus Data. All right reserved.
ozgun@citusdata.com
Questions?
@citusdata
Ozgun Erdogan
www.citusdata.com

More Related Content

More from Citus Data

Deep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff DavisDeep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff DavisCitus Data
 
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...Citus Data
 
A story on Postgres index types | PostgresLondon 2019 | Louise Grandjonc
A story on Postgres index types | PostgresLondon 2019 | Louise GrandjoncA story on Postgres index types | PostgresLondon 2019 | Louise Grandjonc
A story on Postgres index types | PostgresLondon 2019 | Louise GrandjoncCitus Data
 
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...Citus Data
 
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri FontaineThe Art of PostgreSQL | PostgreSQL Ukraine | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri FontaineCitus Data
 
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...Citus Data
 
When it all goes wrong (with Postgres) | RailsConf 2019 | Will Leinweber
When it all goes wrong (with Postgres) | RailsConf 2019 | Will LeinweberWhen it all goes wrong (with Postgres) | RailsConf 2019 | Will Leinweber
When it all goes wrong (with Postgres) | RailsConf 2019 | Will LeinweberCitus Data
 
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri FontaineThe Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri FontaineCitus Data
 
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...Citus Data
 
How to write SQL queries | pgDay Paris 2019 | Dimitri Fontaine
How to write SQL queries | pgDay Paris 2019 | Dimitri FontaineHow to write SQL queries | pgDay Paris 2019 | Dimitri Fontaine
How to write SQL queries | pgDay Paris 2019 | Dimitri FontaineCitus Data
 
When it all Goes Wrong |Nordic PGDay 2019 | Will Leinweber
When it all Goes Wrong |Nordic PGDay 2019 | Will LeinweberWhen it all Goes Wrong |Nordic PGDay 2019 | Will Leinweber
When it all Goes Wrong |Nordic PGDay 2019 | Will LeinweberCitus Data
 
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire Giordano
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire GiordanoWhy PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire Giordano
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire GiordanoCitus Data
 
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...Citus Data
 
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...Citus Data
 
Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig K...
Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig K...Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig K...
Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig K...Citus Data
 
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas Fittl
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas FittlMonitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas Fittl
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas FittlCitus Data
 
Real time analytics at any scale | PostgreSQL User Group NL | Marco Slot
Real time analytics at any scale | PostgreSQL User Group NL | Marco SlotReal time analytics at any scale | PostgreSQL User Group NL | Marco Slot
Real time analytics at any scale | PostgreSQL User Group NL | Marco SlotCitus Data
 
Scaling Multi-tenant Applications Using the Django ORM & Postgres | PyCon Can...
Scaling Multi-tenant Applications Using the Django ORM & Postgres | PyCon Can...Scaling Multi-tenant Applications Using the Django ORM & Postgres | PyCon Can...
Scaling Multi-tenant Applications Using the Django ORM & Postgres | PyCon Can...Citus Data
 
Python and PostgreSQL: Let's Work Together! | PyConFr 2018 | Dimitri Fontaine
Python and PostgreSQL: Let's Work Together! | PyConFr 2018 | Dimitri FontainePython and PostgreSQL: Let's Work Together! | PyConFr 2018 | Dimitri Fontaine
Python and PostgreSQL: Let's Work Together! | PyConFr 2018 | Dimitri FontaineCitus Data
 
What is HyperLogLog and Why You Will Love It | PostgreSQL Conference Europe 2...
What is HyperLogLog and Why You Will Love It | PostgreSQL Conference Europe 2...What is HyperLogLog and Why You Will Love It | PostgreSQL Conference Europe 2...
What is HyperLogLog and Why You Will Love It | PostgreSQL Conference Europe 2...Citus Data
 

More from Citus Data (20)

Deep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff DavisDeep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
Deep Postgres Extensions in Rust | PGCon 2019 | Jeff Davis
 
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...
Why Postgres Why This Database Why Now | SF Bay Area Postgres Meetup | Claire...
 
A story on Postgres index types | PostgresLondon 2019 | Louise Grandjonc
A story on Postgres index types | PostgresLondon 2019 | Louise GrandjoncA story on Postgres index types | PostgresLondon 2019 | Louise Grandjonc
A story on Postgres index types | PostgresLondon 2019 | Louise Grandjonc
 
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
Why developers need marketing now more than ever | GlueCon 2019 | Claire Gior...
 
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri FontaineThe Art of PostgreSQL | PostgreSQL Ukraine | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine | Dimitri Fontaine
 
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
Optimizing your app by understanding your Postgres | RailsConf 2019 | Samay S...
 
When it all goes wrong (with Postgres) | RailsConf 2019 | Will Leinweber
When it all goes wrong (with Postgres) | RailsConf 2019 | Will LeinweberWhen it all goes wrong (with Postgres) | RailsConf 2019 | Will Leinweber
When it all goes wrong (with Postgres) | RailsConf 2019 | Will Leinweber
 
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri FontaineThe Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri Fontaine
The Art of PostgreSQL | PostgreSQL Ukraine Meetup | Dimitri Fontaine
 
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...
Using Postgres and Citus for Lightning Fast Analytics, also ft. Rollups | Liv...
 
How to write SQL queries | pgDay Paris 2019 | Dimitri Fontaine
How to write SQL queries | pgDay Paris 2019 | Dimitri FontaineHow to write SQL queries | pgDay Paris 2019 | Dimitri Fontaine
How to write SQL queries | pgDay Paris 2019 | Dimitri Fontaine
 
When it all Goes Wrong |Nordic PGDay 2019 | Will Leinweber
When it all Goes Wrong |Nordic PGDay 2019 | Will LeinweberWhen it all Goes Wrong |Nordic PGDay 2019 | Will Leinweber
When it all Goes Wrong |Nordic PGDay 2019 | Will Leinweber
 
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire Giordano
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire GiordanoWhy PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire Giordano
Why PostgreSQL Why This Database Why Now | Nordic PGDay 2019 | Claire Giordano
 
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...
Scaling Multi-Tenant Applications Using the Django ORM & Postgres | PyCaribbe...
 
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
Data Modeling, Normalization, and Denormalisation | FOSDEM '19 | Dimitri Font...
 
Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig K...
Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig K...Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig K...
Five data models for sharding and which is right | PGConf.ASIA 2018 | Craig K...
 
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas Fittl
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas FittlMonitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas Fittl
Monitoring Postgres at Scale | PGConf.ASIA 2018 | Lukas Fittl
 
Real time analytics at any scale | PostgreSQL User Group NL | Marco Slot
Real time analytics at any scale | PostgreSQL User Group NL | Marco SlotReal time analytics at any scale | PostgreSQL User Group NL | Marco Slot
Real time analytics at any scale | PostgreSQL User Group NL | Marco Slot
 
Scaling Multi-tenant Applications Using the Django ORM & Postgres | PyCon Can...
Scaling Multi-tenant Applications Using the Django ORM & Postgres | PyCon Can...Scaling Multi-tenant Applications Using the Django ORM & Postgres | PyCon Can...
Scaling Multi-tenant Applications Using the Django ORM & Postgres | PyCon Can...
 
Python and PostgreSQL: Let's Work Together! | PyConFr 2018 | Dimitri Fontaine
Python and PostgreSQL: Let's Work Together! | PyConFr 2018 | Dimitri FontainePython and PostgreSQL: Let's Work Together! | PyConFr 2018 | Dimitri Fontaine
Python and PostgreSQL: Let's Work Together! | PyConFr 2018 | Dimitri Fontaine
 
What is HyperLogLog and Why You Will Love It | PostgreSQL Conference Europe 2...
What is HyperLogLog and Why You Will Love It | PostgreSQL Conference Europe 2...What is HyperLogLog and Why You Will Love It | PostgreSQL Conference Europe 2...
What is HyperLogLog and Why You Will Love It | PostgreSQL Conference Europe 2...
 

Recently uploaded

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Recently uploaded (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

The Challenges of Distributing Postgres: A Citus Story | PostgresConf US 2018 | Ozgun Erdogan

  • 1. Challenges of Distributing Postgres: A Citus Story Ozgun Erdogan PGConf US Jersey City | April 2018
  • 2. Developers Love Postgres PostgreSQL MySQL MongoDB SQL Server + Oracle RDBMS: PostgreSQL, MySQL, Microsoft SQL Server, Oracle Ozgun Erdogan | QCon San Francisco 2017
  • 3. I love Postgres, too 3 Ozgun Erdogan | QCon San Francisco 2017 Ozgun Erdogan CTO of Citus Data Distributed Systems Distributed Databases Formerly of Amazon Love drinking margaritas
  • 4. 4
  • 5. Our mission at Citus Data 5 Ozgun Erdogan | QCon San Francisco 2017 Make it so SaaS businesses never have to worry about scaling their database again
  • 6. What is the Citus database? 1.Scales out PostgreSQL 2.Extension to PostgreSQL 3.Available in 3 Ways Ozgun Erdogan | QCon San Francisco 2017 • Using sharding & replication • Query engine parallelizes SQL queries across many nodes • Using PostgreSQL extension APIs
  • 7. Citus, Packaged Three Ways Ozgun Erdogan | QCon San Francisco 2017 Open Source Enterprise Software Fully-Managed Database as a Service github.com/citusdata/citus
  • 9. 3 Challenges Distributing Postgres 1. PostgreSQL and High Availability 2. PostgreSQL is huge. How to keep up with it 3. Distributed transactions Ozgun Erdogan | QCon San Francisco 2017
  • 10. PostgreSQL & High Availability (HA) Designing for a Cloud-native world 1
  • 11. Why is High Availability hard? PostgreSQL replication uses one primary & multiple secondary nodes. Two challenges: 1. Most Postgres clients aren’t smart. When the primary fails, they retry the same IP. 2. Postgres replicates entire state. This makes it resource intensive to reconstruct new nodes from a primary. Ozgun Erdogan | QCon San Francisco 2017
  • 12. Database Failures Should Be Transparent Ozgun Erdogan | QCon San Francisco 2017
  • 13. Database Failures Shouldn’t Be a Big Deal 1. PostgreSQL streaming replication to replicate from primary to secondary. Back up to S3. 2. Volume level replication to replicate to secondary’s volume. Back up to S3. 3. Incremental backups to S3. Reconstruct secondary nodes from S3. Ozgun Erdogan | QCon San Francisco 2017 3 Methods for HA & Backups in Postgres
  • 14. Postgres - Streaming Replication (1) Write-ahead logs (streaming repl.) Table foo Primary – PostgreSQL streaming repl. Table bar WAL logs Table foo Table bar WAL logs Secondary – PostgreSQL streaming repl. Monitoring Agents - streaming repl. setup & auto failover S3 / Blob Storage (Encrypted) Backup Process Ozgun Erdogan | QCon San Francisco 2017
  • 15. Postgres – AWS RDS & Azure (2) Postgres Primary Monitoring Agents (Auto node failover) Persistent Volume Postgres Standby S3 / Blob Storage (Encrypted) Table foo Table bar WAL logs Table foo Table bar WAL logs Backup process Backup Process Persistent Volume Ozgun Erdogan | QCon San Francisco 2017
  • 16. Postgres – Reconstruct from WAL (3) Postgres Primary Monitoring Agents (Auto node failover) Persistent Volume Postgres Secondary Backup Process S3 / Blob Storage (Encrypted) Table foo Table bar WAL logs Persistent Volume Table foo Table bar WAL logs Backup process Ozgun Erdogan | QCon San Francisco 2017
  • 17. WHO DOES THIS? PRIMARY BENEFITS Streaming Replication (local / ephemeral disk) On-prem Manual EC2 Simple to set up Direct I/O: High I/O & large storage Disk Mirroring RDS Azure Preview Works for MySQL and PostgreSQL Data durability in cloud environments Reconstruct from WAL Heroku Citus Cloud Enables Fork and PITR Node reconstruction in background (Data durability in cloud environments) How do these approaches compare? 17 Ozgun Erdogan | QCon San Francisco 2017
  • 18. Summary • In PostgreSQL, a database node’s state gets replicated in its entirety. The replication can be set up in three ways. • Reconstructing a secondary node from S3 makes bringing up or shooting down nodes easy. • When you shard your database, the state you need to replicate per node becomes smaller. Ozgun Erdogan | QCon San Francisco 2017
  • 19. PostgreSQL is huge How do you scale all features? 2
  • 20. 3 ways to build a distributed database 1. Build a distributed database from scratch 2. Middleware sharding (mimic the parser) 3. Fork your favorite database (like PostgreSQL) Ozgun Erdogan | QCon San Francisco 2017
  • 21. Example Transaction Block Ozgun Erdogan | QCon San Francisco 2017
  • 22. Postgres Features, Tools & Frameworks • PostgreSQL manual (US Letter) • Clients for diff programming languages • ORMs, libraries, GUIs • Tools (dump, restore, analyze) • New features Ozgun Erdogan | QCon San Francisco 2017
  • 23. At First, Forked PostgreSQL with Style Ozgun Erdogan | QCon San Francisco 2017
  • 24. Two Stage Query Optimization 1. Plan to minimize network I/O 2. Nodes talk to each other using SQL over libpq 3. Learned to cooperate with planner / executor bit by bit (Volcano style executor) Ozgun Erdogan | QCon San Francisco 2017
  • 25. Citus Architecture (Simplified) 25 SELECT avg(revenue) FROM sales Coordinator SELECT sum(revenue), count(revenue) FROM table_1001 SELECT sum … FROM table_1003 Worker node 1 Table metadata Table_1001 Table_1003 SELECT sum … FROM table_1002 SELECT sum … FROM table_1004 Worker node 2 Table_1002 Table_1004 Worker node N . . . . . . Each node PostgreSQL with Citus installed 1 shard = 1 PostgreSQL table Ozgun Erdogan | QCon San Francisco 2017
  • 26. Unfork Citus using Extension APIs CREATE EXTENSION citus; • System catalogs – Distributed metadata • Planner hook – Insert, Update, Delete, Select • Executor hook – Insert, Update, Delete, Select • Utility hook – Alter Table, Create Index, Vacuum, etc. • Transaction & resources handling – file descriptors, etc. • Background worker process – Maintenance processes (distributed deadlock detection, task tracker, etc.) • Logical decoding – Online data migrations Ozgun Erdogan | QCon San Francisco 2017
  • 27. PostgreSQL has transactions. How to handle distributed transactions 3
  • 29. Consistency in Distributed Databases 1. 2PC: All participating nodes need to be up 2. Paxos: Achieves consensus with quorum 3. Raft: More understandable alternative to Paxos Ozgun Erdogan | QCon San Francisco 2017
  • 30. Concurrency in Distributed Databases Ozgun Erdogan | QCon San Francisco 2017
  • 32. What is a Lock? • Protects against concurrent modifications. • Locks are released at the end of a transaction. Deadlocks
  • 33. Transactions Block on 1st Conflicting LockWhat is a lock? Protects against concurrent modifications Locks released at end of transaction BEGIN; UPDATE data SET y = 2 WHERE x = 1; <obtained lock on rows with x = 1> COMMIT; <all locks released> BEGIN; UPDATE data SET y = 5 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT;
  • 34. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT;
  • 35. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT; (Distributed) deadlock! BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; But what if they start blocking each other?
  • 36. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT; (Distributed) deadlock! BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; But what if they start blocking each other?Deadlock detection in PostgreSQL Deadlock detection builds a graph of processes that are waiting for each other.
  • 37. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT; (Distributed) deadlock! BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; But what if they start blocking each other?Deadlock detection in PostgreSQL Deadlock detection builds a graph of processes that are waiting for each other. Deadlock detection in PostgreSQL Transactions are cancelled until the cycle is gone
  • 38. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT; (Distributed) deadlock! BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; But what if they start blocking each other?Deadlock detection in PostgreSQL Deadlock detection builds a graph of processes that are waiting for each other. Deadlock detection in PostgreSQL Transactions are cancelled until the cycle is gone Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes
  • 39. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT; (Distributed) deadlock! BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; But what if they start blocking each other?Deadlock detection in PostgreSQL Deadlock detection builds a graph of processes that are waiting for each other. Deadlock detection in PostgreSQL Transactions are cancelled until the cycle is gone Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus PostgreSQL’s deadlock detector still works
  • 40. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT; (Distributed) deadlock! BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; But what if they start blocking each other?Deadlock detection in PostgreSQL Deadlock detection builds a graph of processes that are waiting for each other. Deadlock detection in PostgreSQL Transactions are cancelled until the cycle is gone Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus PostgreSQL’s deadlock detector still works Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus When deadlocks span across node, PostgreSQL cannot help us Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus When deadlocks span across node, PostgreSQL cannot help us
  • 41. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT; (Distributed) deadlock! BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; But what if they start blocking each other?Deadlock detection in PostgreSQL Deadlock detection builds a graph of processes that are waiting for each other. Deadlock detection in PostgreSQL Transactions are cancelled until the cycle is gone Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus PostgreSQL’s deadlock detector still works Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus When deadlocks span across node, PostgreSQL cannot help us Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus When deadlocks span across node, PostgreSQL cannot help us Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlock detection in Citus 7 Citus 7 adds distributed deadlock detection
  • 42. Transactions and Concurrency • Transactions that don’t modify the same row can run concurrently. Transactions block on 1st lock that conflicts BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; COMMIT; <all locks released> BEGIN; UPDATE data SET y = y + 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; <waiting for lock on rows with x = 1> <obtained lock on rows with x = 1> COMMIT; (Distributed) deadlock! BEGIN; UPDATE data SET y = y - 1 WHERE x = 1; UPDATE data SET y = y + 1 WHERE x = 2; BEGIN; UPDATE data SET y = y - 1 WHERE x = 2; UPDATE data SET y = y + 1 WHERE x = 1; But what if they start blocking each other?Deadlock detection in PostgreSQL Deadlock detection builds a graph of processes that are waiting for each other. Deadlock detection in PostgreSQL Transactions are cancelled until the cycle is gone Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus Citus delegates transactions to nodes Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus PostgreSQL’s deadlock detector still works Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus When deadlocks span across node, PostgreSQL cannot help us Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlocks in Citus When deadlocks span across node, PostgreSQL cannot help us Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlock detection in Citus 7 Citus 7 adds distributed deadlock detection Firstname Lastname | Citus Data | Meeting Name | Month Year Deadlock detection in Citus 7 Citus 7 adds distributed deadlock detection.
  • 43. Distributed transactions are a complex topic • Most articles on distributed transactions focus on data consistency. • Data consistency is only one side of the coin. If you’re using a relational database, your application benefits from another key feature: deadlock detection. • https://www.citusdata.com/blog/2017/08/31/databases -and-distributed-deadlocks-a-faq Ozgun Erdogan | QCon San Francisco 2017
  • 44. So now what? We talked about 3 challenges distributing Postgres 1. PostgreSQL, Replication, High Availability 2. Tradeoffs in building a distributed database— and how we chose PostgreSQL s extension APIs 3. Distributed deadlock detection & distributed transactions Ozgun Erdogan | QCon San Francisco 2017
  • 45. 45 Solving a very complex problem
  • 46. 46 SQL is hard, not impossible, to scale
  • 47. © 2017 Citus Data. All right reserved. ozgun@citusdata.com Questions? @citusdata Ozgun Erdogan www.citusdata.com