SCALING OUR SAAS BACKEND
WITH POSTGRESQL
OLIVER SEEMANN, BIDMANAGEMENT GMBH
BWB MEETUP, 2013-10-28
THIS TALK IS ABOUT …

Gigabytes

Terabytes
PRODUCTIVITY TOOLS FOR
ONLINE MARKETERS

Automatic Bid Management for
Auctioned Ads

“Organic” Search
SIGNIFICANT AMOUNTS OF DATA

10,000 Campaigns
5 million Keywords
4 million Ads
per AdWords account
SIGNIFICANT AMOUNTS OF DATA

Full History for all objects
over full lifetime
SLOW AND FAST DATA

“Slow” / OLAP data for
batch-processing jobs
“Fast” / OLTP data for
human interaction
INITIALLY SEPARATE

Slow
Data

Fast
Data
A LOT OF OVERLAP

Slow
Data

Fast
Data
THEN ONLY ONE

Slow
Data

Fast
Data
CURRENTLY

7 machines running PostgreSQL
3 Terabytes Data
Thousands of Databases
Largest Table: 120GB
HOW IT BEGAN

Experiment
DESIGN BY THE BOOK
[Entity-relationship diagram: tables Scenario, Customer, User, Account, UserAccountAccess, Campaign, Adgroup, Keywords and History, related through primary and foreign keys on customer_id, user_id, account_id, campaign_id, adgroup_id and keyword_id; other columns shown: day, factor.]
MORE CUSTOMERS – MORE DATA
PARTITIONING
All Accounts
Account 1 – Rec 1
Account 2 – Rec 1
Account 1 – Rec 2
Account 3 – Rec 1

Account 2 – Rec 2
Account 2 – Rec 3
Account 1 – Rec 3

Account 3 – Rec 2
PARTITIONING
Account 1

Account 2

Account 3

Account 1 – Rec 1

Account 2 – Rec 1

Account 3 – Rec 1

Account 1 – Rec 2

Account 2 – Rec 2

Account 3 – Rec 2

Account 1 – Rec 3

Account 2 – Rec 3

Account 3 – Rec 3
PARTITION WITH INHERITANCE

SELECT

Child

Parent

INSERT

Child

CHECK CONSTRAINTS

Child
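
A minimal sketch of the inheritance scheme on this slide, written as a Python helper that only generates the DDL text. The table name `history`, the column `account_id` and the `<parent>_account_<id>` naming are assumptions for illustration, not the schema used in the talk.

```python
def child_table_name(parent: str, account_id: int) -> str:
    """Name of the per-account child table (hypothetical naming scheme)."""
    return f"{parent}_account_{int(account_id)}"


def child_table_ddl(parent: str, account_id: int) -> str:
    """DDL for one child table that inherits from `parent`.

    The CHECK constraint states which rows the child may hold, so the
    planner can skip children that cannot match a WHERE clause.
    `parent` is interpolated as-is and must be a trusted identifier.
    """
    child = child_table_name(parent, account_id)
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK (account_id = {int(account_id)})\n"
        f") INHERITS ({parent});"
    )


if __name__ == "__main__":
    for account in (1, 2, 3):
        print(child_table_ddl("history", account))
    # SELECTs go to the parent; INSERTs name the child directly.
    print(f"INSERT target for account 2: {child_table_name('history', 2)}")
```

The application has to pick the child table for every insert, which is the extra application logic the notes refer to.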
ISOLATE ACCOUNTS

One DB

Many DBs
PARTITIONING VIA DATABASES

Excellent horizontal scaling
Easy cloning
pg_dump/pg_restore
Some Overhead
No direct references
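
With one database per account, the application needs some way to find the right database for a request. The sketch below assumes a simple lookup from account ID to host and database name; the directory contents, the `account_<id>` naming and the function name are hypothetical, and the talk does not say where this mapping is stored.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AccountLocation:
    host: str
    dbname: str


# Hypothetical directory; host names follow the SETUP slide.
DIRECTORY: Dict[int, AccountLocation] = {
    1: AccountLocation("machine-0", "account_1"),
    2: AccountLocation("machine-1", "account_2"),
    3: AccountLocation("machine-2", "account_3"),
}


def dsn_for_account(
    account_id: int, directory: Dict[int, AccountLocation] = DIRECTORY
) -> str:
    """Build a libpq-style keyword/value connection string for one account."""
    loc = directory[account_id]
    return f"host={loc.host} dbname={loc.dbname}"


if __name__ == "__main__":
    print(dsn_for_account(2))
```

Because each account lives in its own database, moving an account to another machine only changes its directory entry; queries across accounts, however, cannot use direct references.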
WHY NOT SCHEMAS?

More lightweight
Full References
No easy cloning
No schemas inside schemas
SETUP

main

machine-1

machine-0
machine-2
DB HARDWARE

Data > RAM
⇒ High I/O
EC2?
MIGRATION TO EC2

Must migrate all/most machines
No PostgreSQL in RDS
DB Instances run 24/7 ⇒ costly
EBS Performance limited
EBS I/O LIMITED
[Bar chart: sequential write and sequential read throughput in MB/s (axis 0 to 900) for AWS instance storage SSD (RAID-0), AWS EBS (RAID-0) and real 15k SAS2 disks (RAID-10).]
DEDICATED MACHINES

Moderate CPU / RAM
Fast Disks
Battery-backed caching controller
ALTERNATIVE HW

Use bigger (and slower) SATA drives
Evaluate EC2+EBS in production
SSDs
HARDWARE FAILS

Replication

Master

Slave

Availability
Query Load Balancing
REPLICATION
Account DBs

Main DB
master-1

master

slave-1

master-2

slave-2

slave
BACKUPS

pg_dump
compressed

Backup Server
REPLICATION
Account DBs

Main DB
master-1

master

master-3

master-2

master-4

slave
DISASTER RESTORE

concurrent
pg_restore

Backup Server
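
One way to picture the concurrent restore: spread the per-database dumps over the available servers and run several `pg_restore` processes at once. This is a sketch under assumptions, not the tooling from the talk: the dump paths and host names are made up, the dumps are assumed to be in `pg_dump` custom format, and the empty target databases are assumed to exist already.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle
from typing import Dict, List


def plan_restores(
    dumps: Dict[str, str], hosts: List[str], jobs: int = 2
) -> List[List[str]]:
    """Assign dumps round-robin to hosts; return one pg_restore command each."""
    commands = []
    for (dbname, path), host in zip(sorted(dumps.items()), cycle(hosts)):
        commands.append([
            "pg_restore",
            "--host", host,
            "--dbname", dbname,   # assumes the target database already exists
            "--jobs", str(jobs),  # parallel restore needs a custom or directory dump
            path,
        ])
    return commands


def run_restores(commands: List[List[str]], max_parallel: int = 4) -> List[int]:
    """Run the commands concurrently and return their exit codes."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(lambda cmd: subprocess.run(cmd).returncode, commands))


if __name__ == "__main__":
    plan = plan_restores(
        {"account_1": "/backup/account_1.dump", "account_2": "/backup/account_2.dump"},
        ["machine-1", "machine-2"],
    )
    for cmd in plan:
        print(" ".join(cmd))
```

Only `plan_restores` is exercised in the demo; `run_restores` would actually start the processes.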
PERFORMANCE PROBLEMS
Too many concurrent full table scans
From 300MB/s to 30MB/s
MORE CONCURRENT
QUERIES

LONGER QUERY RUNTIME
DIFFERENT APPS

Web App
Server

Compute
Cluster

Many fast
queries

Few very
slow queries
DIFFERENT APPS
Semaphore

Web App
Server

Many fast queries

Compute
Cluster

Few very slow queries

Simple counting semaphore using Advisory Locks
Implemented in the application
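
A counting semaphore can be built from advisory locks by treating N lock keys as N slots: a job may run only while it holds one of them. The sketch below is an illustration of that idea, not the implementation from the talk. `run_query` is a hypothetical callable that executes one statement on the job's own database session and returns the first column of the first row; `FakeLockServer` is an in-memory stand-in so the example runs without a server.

```python
import time
from typing import Callable, Optional

RunQuery = Callable[[str, tuple], object]


class AdvisoryLockSemaphore:
    """At most `slots` concurrent holders per `namespace`.

    Each holder must use its own session: session-level advisory locks
    can be re-acquired by the session that already holds them.
    """

    def __init__(self, run_query: RunQuery, namespace: int, slots: int):
        self.run_query = run_query
        self.namespace = namespace
        self.slots = slots

    def try_acquire(self) -> Optional[int]:
        """Try every slot once; return the slot number held, or None."""
        for slot in range(self.slots):
            if self.run_query(
                "SELECT pg_try_advisory_lock(%s, %s)", (self.namespace, slot)
            ):
                return slot
        return None

    def acquire(self, poll_seconds: float = 1.0) -> int:
        """Poll until a slot becomes free."""
        while True:
            slot = self.try_acquire()
            if slot is not None:
                return slot
            time.sleep(poll_seconds)

    def release(self, slot: int) -> None:
        self.run_query("SELECT pg_advisory_unlock(%s, %s)", (self.namespace, slot))


class FakeLockServer:
    """In-memory stand-in; treats every call as a different session."""

    def __init__(self) -> None:
        self.held = set()

    def __call__(self, sql: str, params: tuple) -> bool:
        if "pg_try_advisory_lock" in sql:
            if params in self.held:
                return False
            self.held.add(params)
            return True
        self.held.discard(params)
        return True


if __name__ == "__main__":
    sem = AdvisoryLockSemaphore(FakeLockServer(), namespace=42, slots=2)
    print(sem.try_acquire(), sem.try_acquire(), sem.try_acquire())
```

A slow analysis job would call `acquire()` before its query and `release(slot)` afterwards; if the session dies, PostgreSQL releases its session-level advisory locks automatically.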
BULK INSERTS

INSERT
20k – 80k
per sec

50M
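
As a rough worked example, assuming the 50M on this slide is the number of rows to load, the quoted insert rates translate into the following load times.

```python
def load_time_seconds(rows: int, rows_per_second: int) -> float:
    """Time to load `rows` at a constant rate."""
    return rows / rows_per_second


ROWS = 50_000_000  # assumption: the slide's "50M" is a row count

if __name__ == "__main__":
    for rate in (20_000, 80_000):
        seconds = load_time_seconds(ROWS, rate)
        print(f"{rate} rows/s: {seconds:.0f} s ({seconds / 60:.1f} min)")
```

At 20k rows per second the load takes 2500 s (about 42 minutes); at 80k rows per second it takes 625 s (about 10 minutes).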
BULK INSERT BEST PRACTICE

COPY instead of INSERT
Drop indexes + recreate
Truncate
COPY into a new table, swap + drop
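
The last bullet can be sketched as a fixed sequence of statements: load into a fresh table with COPY, build indexes afterwards, then swap the tables inside one transaction. The helper below only produces the CSV payload and the SQL text; the table name, the `_new`/`_old` suffixes and the `index_ddl` parameter are assumptions for illustration.

```python
import csv
import io
from typing import Iterable, List, Sequence


def rows_to_copy_buffer(rows: Iterable[Sequence[object]]) -> io.StringIO:
    """Serialize rows as CSV text to feed to COPY ... FROM STDIN."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    buf.seek(0)
    return buf


def swap_statements(table: str, index_ddl: Sequence[str] = ()) -> List[str]:
    """Statements for 'COPY into a new table, swap + drop'.

    `table` is interpolated as-is and must be a trusted identifier.
    `index_ddl` holds CREATE INDEX statements for the new table; they
    run after the load so COPY does not have to maintain indexes.
    """
    new, old = f"{table}_new", f"{table}_old"
    return [
        f"CREATE TABLE {new} (LIKE {table} INCLUDING DEFAULTS);",
        f"COPY {new} FROM STDIN WITH (FORMAT csv);",
        *index_ddl,
        "BEGIN;",
        f"ALTER TABLE {table} RENAME TO {old};",
        f"ALTER TABLE {new} RENAME TO {table};",
        f"DROP TABLE {old};",
        "COMMIT;",
    ]


if __name__ == "__main__":
    print(rows_to_copy_buffer([(1, "a,b"), (2, "c")]).getvalue(), end="")
    for stmt in swap_statements("history"):
        print(stmt)
```

Readers keep using the old table until the renames commit, so the rewrite never exposes a half-loaded table.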
SIGNUP PROBLEMS

Adspert
Service

Signup
CREATE
DATABASE

Up to 5-10 min
PRE-CREATE DATABASES

Create DBs ahead of time
New signups rename DBs
Periodically create new
Fall back to direct create
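
The four bullets above can be sketched as two small helpers: one that claims a spare database for a new signup (or falls back to creating one), and one that tops the pool back up. The `spare_NNNN` naming and the in-memory list of spares are assumptions for illustration; a real pool would need to be shared and protected against concurrent signups.

```python
from typing import List, Tuple


def claim_database(spares: List[str], account_db: str) -> Tuple[str, bool]:
    """Return (sql, used_spare) for a new signup.

    Renaming a pre-created database is fast; if no spare is left,
    fall back to a direct (possibly slow) CREATE DATABASE.
    """
    if spares:
        spare = spares.pop(0)
        return f'ALTER DATABASE "{spare}" RENAME TO "{account_db}";', True
    return f'CREATE DATABASE "{account_db}";', False


def refill_statements(spares: List[str], target: int, next_id: int) -> List[str]:
    """CREATE DATABASE statements that bring the pool back up to `target`."""
    statements = []
    while len(spares) < target:
        name = f"spare_{next_id:04d}"
        statements.append(f'CREATE DATABASE "{name}";')
        spares.append(name)
        next_id += 1
    return statements


if __name__ == "__main__":
    pool = ["spare_0001"]
    print(claim_database(pool, "account_7"))
    print(claim_database(pool, "account_8"))
    print(refill_statements(pool, target=2, next_id=2))
```

The refill step would run periodically in the background, so that signups normally hit the fast rename path.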
CONCLUDING …

Partitioning into Databases
Physical Hardware
Check out advisory locks
THANKS FOR LISTENING

QUESTIONS?

Editor's notes

  1. Hi, I’m Oliver, I’m a software developer, currently heading the development team at Bidmanagement GmbH in Berlin.
  2. I’m going to talk about PostgreSQL. Not so much about the DBMS itself, but more about how we’re using it as the main datastore in our system.
  3. About how we're running a large PostgreSQL installation in our company, and how we’ve grown our setup.
  4. Meeting notes (27/10/13 11:10): very popular; billions of dollars; very important online marketing channel.
  5. Google provides a very extensive API
  6. Meeting notes (23/10/13 22:22): The different kinds of data we store can be largely separated into two groups.
  7. ... And we decided to go with PostgreSQL, because it has been our go-to tool for storing data for many years. There are problems from time to time, but we never looked back.
  8. But it began much smaller …
  9. Straightforward approach. Nobody thought of scaling.
  10. The pilots were successful and we started to acquire customers. Soon there were more than 10 million rows in some tables. Query performance lagged (many full table scans, FTS). We did not want to scale vertically, because we aspired to much bigger growth (also: expensive).
  11. PostgreSQL supports partitioning via inheritance [insert scheme]. Use CHECK constraints to tell the query planner where to look. Inserts into the parent table are not routed to the children automatically, so the application must insert into the child table. A lot of effort goes into application logic. We tried it on one table and weren’t convinced.
  12. One main DB with non-account-specific data, currently ~1-2 GB. Several machines are dedicated to account databases, with 50-1000 DBs per machine. PostgreSQL 9.0 and 9.3 run on each machine, which allows us to migrate one DB after another.
  13. The partitioning scheme allows easy horizontal scaling ⇒ more machines. But which? The dataset does not fit in RAM ⇒ high I/O requirements. AWS EC2? We would have to migrate all or most machines due to latency. DB instances run 24/7 ⇒ costly. EBS performance is limited (Gbit Ethernet). [ec2 / ebs performance numbers vs. physical] Meeting notes (24/10/13 20:45): Add: not many cores.
  14. Not that much elasticity is required. As a B2B company our growth is more predictable. Batch processing of expensive backend jobs. One year of an EC2 instance ≅ buying one physical server. We use mid-sized machines: good price/value ratio.
  15. SATA: 600 GB vs. 3 TB. EC2: performance and latency unclear; evaluate to make an informed decision. SSDs: expensive. Reliable? RAID?
  16. But when things go awry and data gets deleted …
  17. Big cheap HDDs
  18. But when things go awry and data gets deleted …
  19. But when things go awry and data gets deleted …
  20. The main DB is still replicated to enable quick failover. Here we can’t afford extended downtime.
  21. Capacity doubled, cost reduced by 40%. The more servers, the faster the restore. Gbit Ethernet on the backup server is the limiting factor.
  22. From sequential reads to random reads. Feedback loop:
  23. Webapp queries with humans waiting are quite fast. The problematic queries are done by the analysis jobs: frequent full table scans and queries with huge results. We need a way to synchronize queries and control concurrency. We could use a connection pooler, or an external synchronization mechanism, e.g. ZooKeeper.
  24. Webapp queries with humans waiting are quite fast. The problematic queries are done by the analysis jobs: frequent full table scans and queries with huge results. We need a way to synchronize queries and control concurrency. We could use a connection pooler, or an external synchronization mechanism, e.g. ZooKeeper.
  25. We rewrite the history every day (for various reasons): conversions arrive up to 30 days later, and campaigns are added to optimization. For most accounts that is <1M records, for some 10-100M. We achieve up to 80k inserts/sec. Network is the bottleneck [check this].
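  26. We use COPY for all bulk inserts, even small ones. We drop and recreate indexes with simple PL/pgSQL functions. For complete table rewrites we truncate. Note that TRUNCATE is not MVCC-safe: it can be rolled back, but concurrent transactions using an older snapshot will see the table as empty.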
  27. We added a self-service signup: a 2-minute process to add an AdWords account to the system. OAuth ⇒ User Info ⇒ Optimization Bootstrap. Biggest problem: CREATE DATABASE can take several minutes, depending on the current amount of write activity.
  28. We now always keep 10-20 spare databases in stock. We control the target host for new databases this way. Take care not to have race conditions when applying schema changes.