Why Zalando trusts in PostgreSQL
A developer’s view on using the most advanced open-source database
Henning Jacobs - Technical Lead Platform/Software
Zalando GmbH
Valentine Gogichashvili - Technical Lead Platform/Database
Zalando GmbH
GOTO Berlin 2013, 2013-10-18
Who we are
Henning Jacobs <henning.jacobs@zalando.de>
with Zalando since 2010
NO DBA, NO PostgreSQL Expert
Valentine Gogichashvili <valentine.gogichashvili@zalando.de>
with Zalando since 2010
DBA, PostgreSQL Expert
About Zalando
· 14 countries
· > 1 billion € revenue 2012
· > 150,000 products
· 3 warehouses
· Europe's largest online fashion retailer
Our ZEOS Platform
Our Main Stack
Java → PostgreSQL
[Diagram: Main/Production stack. Application: Tomcat, CXF-WS, SProcWrapper, JDBC. Postgres: API Schema, Data Schemas.]
Java 7, Tomcat 7, PostgreSQL 9.0–9.3
Our Main Stacks
Different Use Cases — same Database
[Diagram: three stacks side by side]
Main/Production: Tomcat, CXF-WS, SProcWrapper, JDBC → Postgres (API Schema, Data Schemas)
CRUD/JPA: Tomcat, CXF-WS, JPA, JDBC → Postgres (Data Schemas)
Scripting/Python: CherryPy, HTTP/REST, SQLAlchemy, psycopg2 → Postgres (Data Schemas)
Java 7, Tomcat 7, Python 2.7, PostgreSQL 9.0–9.3
Some Numbers
· > 90 deployment units (WARs)
· > 800 production Tomcat instances
· > 50 different databases
· > 90 database master instances
· > 5 TB of PostgreSQL data
· > 200 developers, 8 DBAs
Stored Procedures
[Diagram: the application runs SET search_path .... followed by SELECT register_customer(...). The call resolves in a versioned API schema (e.g. z_api_v13_42) that holds stored procedures and custom types: register_customer(..), find_orders(..), ... The procedures operate on the data schema(s), e.g. z_data, which hold tables and views: customer, order, order_address, ...]
Stored Procedures
1:1 Mapping using SProcWrapper
@SProcService
public interface CustomerSProcService {
    @SProcCall
    int registerCustomer(@SProcParam String email, @SProcParam Gender gender);
}
JAVA
CREATE FUNCTION register_customer(p_email text, p_gender z_data.gender)
RETURNS int AS
$$
    INSERT INTO z_data.customer (c_email, c_gender)
    VALUES (p_email, p_gender)
    RETURNING c_id
$$
LANGUAGE 'sql' SECURITY DEFINER;
SQL
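The 1:1 mapping pairs the Java method registerCustomer with the SQL function register_customer. The naming convention can be illustrated with a minimal Python sketch. The helpers camel_to_snake and build_call are hypothetical names chosen for this example; this is not SProcWrapper's actual implementation, only one way a camelCase method name and its argument count could translate into a parameterized call.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase method name to a snake_case function name."""
    # Insert an underscore before each upper-case letter (except a leading one),
    # then lower-case the whole name.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def build_call(method_name: str, arg_count: int) -> str:
    """Build a parameterized SELECT for a stored procedure (hypothetical helper)."""
    placeholders = ", ".join(["%s"] * arg_count)
    return "SELECT {}({})".format(camel_to_snake(method_name), placeholders)


print(build_call("registerCustomer", 2))
```

The placeholders use the %s style that psycopg2 accepts, so the resulting string could be passed to a cursor together with the argument values.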
Stored Procedures
Complex Types
@SProcCall
List<Order> findOrders(@SProcParam String email);
JAVA
CREATE FUNCTION find_orders(p_email text,
    OUT order_id int,
    OUT order_date timestamp,
    OUT shipping_address order_address) RETURNS SETOF RECORD AS
$$
    SELECT o_id, o_date, ROW(oa_street, oa_city, oa_country)::order_address
      FROM z_data.order
      JOIN z_data.order_address ON oa_order_id = o_id
      JOIN z_data.customer ON c_id = o_customer_id
     WHERE c_email = p_email
$$
LANGUAGE 'sql' SECURITY DEFINER;
SQL
Stored Procedures
Experience
· Performance benefits
· Easy to change live behavior
· Validation close to data
· Simple transaction scope
· Makes moving to new software version easy
· Cross language API layer
· CRUD ⇒ JPA
Stored Procedures
Rolling out database changes
· API versioning
  - Automatic roll-out during deployment
  - Java backend selects "right" API version
· Modularization
  - shared SQL gets own Maven artifacts
  - feature/library bundles of Java+SQL
· DB-Diffs
  - SQL scripts for database changes
  - Review process
  - Tooling support
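Selecting the "right" API version can be pictured as pointing search_path at one versioned API schema before calling a procedure. A minimal sketch follows, assuming schemas follow the z_api_v13_42 naming seen on the stored-procedure overview slide. The function names, the meaning of the two version numbers, and the trailing public schema are all assumptions of this example, not details taken from the talk.

```python
def api_schema_name(version_parts):
    """Return a versioned API schema name such as z_api_v13_42 (assumed pattern)."""
    return "z_api_v" + "_".join(str(int(part)) for part in version_parts)


def search_path_statement(version_parts):
    """Build the SET search_path statement that exposes exactly one API version."""
    # Appending "public" as a fallback schema is an assumption of this sketch.
    return "SET search_path TO {}, public".format(api_schema_name(version_parts))


print(search_path_statement((13, 42)))
```

Because each deployed backend would set its own search_path, two application versions could call register_customer(...) at the same time and each would resolve it in its own API schema.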
Stored Procedures & Constraints
Protect Your Most Valuable Asset: Your Data
Constraints
Ensuring Data Quality
Simple SQL expressions:
CREATE TABLE z_data.host (
    h_hostname varchar(63) NOT NULL UNIQUE CHECK (h_hostname ~ '^[a-z][a-z0-9-]*$'),
    h_ip inet UNIQUE CHECK (masklen(h_ip) = 32),
    h_memory_bytes bigint CHECK (h_memory_bytes > 8*1024*1024)
);
COMMENT ON COLUMN z_data.host.h_memory_bytes IS 'Main memory (RAM) of host in Bytes';
SQL
Combining check constraints and stored procedures:
CREATE TABLE z_data.customer (
    c_email text NOT NULL UNIQUE CHECK (is_valid_email(c_email)),
    ..
);
SQL
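To make it concrete which rows the z_data.host constraints accept, the following Python sketch mirrors them outside the database. It is only an approximation for illustration: the function name host_row_is_valid is hypothetical, the UNIQUE constraints are not modeled, and the database remains the authority on what is valid.

```python
import ipaddress
import re

HOSTNAME_RE = re.compile(r"[a-z][a-z0-9-]*")
MIN_MEMORY_BYTES = 8 * 1024 * 1024


def host_row_is_valid(hostname, ip, memory_bytes):
    """Return True if the values would pass the NOT NULL and CHECK rules of z_data.host."""
    # h_hostname: NOT NULL, at most 63 characters, lower-case letter first.
    if hostname is None or len(hostname) > 63 or not HOSTNAME_RE.fullmatch(hostname):
        return False
    # h_ip: masklen = 32 means a single IPv4 host address, not a network.
    # A NULL value passes a CHECK constraint, so None is accepted here.
    if ip is not None and ipaddress.ip_interface(ip).network.prefixlen != 32:
        return False
    # h_memory_bytes: must be strictly greater than 8 MiB when present.
    if memory_bytes is not None and memory_bytes <= MIN_MEMORY_BYTES:
        return False
    return True
```

For example, host_row_is_valid("web-01", "10.0.0.1", 16 * 1024**3) is accepted, while an upper-case hostname, an address such as "10.0.0.0/8", or a memory value of 1024 bytes is rejected.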
Constraints
What about MySQL?
The MySQL manual says:
"The CHECK clause is parsed but ignored by all storage engines."
http://dev.mysql.com/doc/refman/5.7/en/create-table.html
Open Bug ticket since 2004: http://bugs.mysql.com/bug.php?id=3464
Constraints
Custom Type Validation using Triggers
Example: key: recipientEmail, value: "john.doe@example.org", type: EMAIL_ADDRESS
config_type (ct_id : serial, ct_name : text, ct_type : base_type, ct_regex : text, ...)
config_value (cv_id : serial, cv_key : text, cv_value : json, cv_type_id : int)
Custom Type Validation using Triggers
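CREATE FUNCTION validate_value_trigger() RETURNS trigger AS
$$
BEGIN
    -- validate_value raises an exception if the value does not match its type
    PERFORM validate_value(NEW.cv_value, NEW.cv_type_id);
    -- a plpgsql trigger function must end with RETURN; NEW keeps the row as submitted
    RETURN NEW;
END;
$$
LANGUAGE 'plpgsql';
SQL
CREATE FUNCTION validate_value(p_value json, p_type_id int) RETURNS void AS
$$
import json
import re
# ... Python code, see next slide
$$
LANGUAGE 'plpythonu';
SQL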
Custom Type Validation with PL/Python
class TypeValidator(object):
    @staticmethod
    def load_by_id(id):
        tdef = plpy.execute('SELECT * FROM config_type WHERE ct_id = %d' % int(id))[0]
        return _get_validator(tdef)

    def check(self, condition, msg):
        if not condition:
            raise ValueError('Value does not match the type "%s". Details: %s' %
                             (self.type_name, msg))

class BooleanValidator(TypeValidator):
    def validate(self, value):
        self.check(type(value) is bool, 'Boolean expected')

validator = TypeValidator.load_by_id(p_type_id)
validator.validate(json.loads(p_value))
PYTHON
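The slide code only runs inside PostgreSQL (it relies on plpy) and leaves out the validator lookup. The sketch below is a standalone approximation that can be run with plain Python: a dict stands in for a config_type row, and the dispatch table, the TextValidator class, and the base type names "BOOLEAN" and "TEXT" are assumptions made for this example.

```python
import json
import re


class TypeValidator(object):
    def __init__(self, tdef):
        # tdef mimics one row of config_type
        self.type_name = tdef["ct_name"]
        self.regex = tdef.get("ct_regex")

    def check(self, condition, msg):
        if not condition:
            raise ValueError('Value does not match the type "%s". Details: %s'
                             % (self.type_name, msg))


class BooleanValidator(TypeValidator):
    def validate(self, value):
        self.check(type(value) is bool, "Boolean expected")


class TextValidator(TypeValidator):
    def validate(self, value):
        self.check(isinstance(value, str), "Text expected")
        if self.regex:
            self.check(re.match(self.regex, value) is not None, "Regex mismatch")


# Hypothetical dispatch table keyed by ct_type; the slides do not show it.
_VALIDATORS = {"BOOLEAN": BooleanValidator, "TEXT": TextValidator}


def get_validator(tdef):
    return _VALIDATORS[tdef["ct_type"]](tdef)


def validate_value(p_value, tdef):
    """Parse the JSON text and validate it against the given type definition."""
    get_validator(tdef).validate(json.loads(p_value))


email_type = {"ct_name": "EMAIL_ADDRESS", "ct_type": "TEXT",
              "ct_regex": r"^[^@\s]+@[^@\s]+$"}
validate_value('"john.doe@example.org"', email_type)  # passes silently
```

A value that does not match, such as '"not-an-email"', raises ValueError. Inside a trigger, that exception would abort the INSERT or UPDATE.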
Scaling?
Sharding
Scaling Horizontally
[Diagram: App → SProcWrapper → Shard 1 (Cust. A-F), Shard 2 (Cust. G-K), ..., Shard N (Cust. Y-Z)]
Sharding
SProcWrapper Support
Sharding customers by ID:
@SProcCall
int registerCustomer(@SProcParam @ShardKey int customerId,
        @SProcParam String email, @SProcParam Gender gender);
JAVA
Sharding articles by SKU (uses MD5 hash):
@SProcCall
Article getArticle(@SProcParam @ShardKey Sku sku);
JAVA
Collecting information from all shards concurrently:
@SProcCall(runOnAllShards = true, parallel = true)
List<Order> findOrders(@SProcParam String email);
JAVA
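How a shard key might map to a shard can be sketched in a few lines of Python. The talk does not spell out the exact strategy, so the bit-mask rule (which needs a power-of-two shard count) and the function names below are assumptions. The sketch only shows the general idea of using an integer ID directly and hashing a string SKU with MD5 first.

```python
import hashlib


def shard_for_customer(customer_id: int, shard_count: int) -> int:
    """Pick a shard from the lowest bits of the customer ID."""
    # The mask only covers every shard if shard_count is a power of two.
    if shard_count <= 0 or shard_count & (shard_count - 1) != 0:
        raise ValueError("shard_count must be a power of two")
    return customer_id & (shard_count - 1)


def shard_for_sku(sku: str, shard_count: int) -> int:
    """Pick a shard from the MD5 hash of the SKU so string keys spread evenly."""
    digest = hashlib.md5(sku.encode("utf-8")).digest()
    return shard_for_customer(int.from_bytes(digest, "big"), shard_count)
```

With 8 shards, customer 13 (binary 1101) lands on shard 5 (binary 101). A SKU always maps to the same shard because MD5 is deterministic.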
Sharding
Auto Partitioning Collections
@SProcCall(parallel = true)
void updateStockItems(@SProcParam @ShardKey List<StockItem> items);
JAVA
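Auto-partitioning a collection amounts to grouping the items by shard and then issuing one call per shard, optionally in parallel. A minimal sketch of that idea, assuming an integer shard key and a simple modulo rule; the helper names and the callback signature are hypothetical and do not reflect SProcWrapper's internals.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_by_shard(items, shard_key, shard_count):
    """Group items by shard index; shard_key extracts an integer key per item."""
    groups = defaultdict(list)
    for item in items:
        groups[shard_key(item) % shard_count].append(item)
    return dict(groups)


def run_on_shards(groups, call):
    """Invoke call(shard, batch) once per shard concurrently; return results by shard."""
    with ThreadPoolExecutor(max_workers=max(1, len(groups))) as pool:
        futures = {shard: pool.submit(call, shard, batch)
                   for shard, batch in groups.items()}
        return {shard: future.result() for shard, future in futures.items()}


items = [{"id": 1}, {"id": 2}, {"id": 9}]
groups = partition_by_shard(items, lambda item: item["id"], 8)
results = run_on_shards(groups, lambda shard, batch: len(batch))
```

In this example the items with IDs 1 and 9 share a shard, so only two calls are made instead of three.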
Sharding
Bitmap Data Source Providers
Sharding
Experience
· Scalability without sacrificing any SQL features
· Start with a reasonable number of shards (8)
· Some tooling required
· Avoid caching if you can and scale horizontally
Fail Safety
Replication
[Diagram: the App connects through a Service IP; WAL is streamed to a Hot Standby (for read-only queries) and to a WAL Slave with 1 h delay]
· All databases use streaming replication
· Every database has a (hot) standby and a delayed slave
Fail Safety
Failover
· Service IP for switching/failover
· Monitoring with Java and custom web frontend
· Failover is a manual task
Fail Safety
General Remarks
· Good hardware
  - G8 servers from HP
  - ECC RAM, RAID
· No downtimes allowed
  - Major PostgreSQL version upgrades?
· Two data centers
· Dedicated 24x7 team
· Maintenance
  - Concurrent index rebuild, table compactor
Monitoring
You need it...
· Nagios/Icinga
· PGObserver
· pg_view
Monitoring
PGObserver
Monitoring
PGObserver
· Locate hot spots
  - Frequent stored procedure calls
  - Long running stored procedures
  - I/O patterns
· Helps tuning DB performance
· Creating the right indices is often a silver bullet
Monitoring
pg_view
· Top-like command line utility
· DBA's daily tool
· Analyze live problems
· Monitor data migrations
NoSQL
Relational is dead?
If your project is not particularly vital and/or your team is
not particularly good, use relational databases!
Martin Fowler, GOTO Aarhus 2012
People vs. NoSQL, GOTO Aarhus 2012
NoSQL
Relational is dead?
The key goals of F1's design are:
3. Consistency: The system must provide ACID
transactions, [..]
4. Usability: The system must provide full SQL query
support [..]
Features like indexes and ad hoc query are not just nice
to have, but absolute requirements for our business.
F1: A Distributed SQL Database That Scales
NoSQL – Comparison
[Diagram labels: Document, Column-Family, Key-Value, Graph, Aggregate-Oriented, Schemaless]
· Aggregate oriented?
  - SProcWrapper
  - Changes?
· Schemaless?
  - ⇒ Implicit Schema
  - HStore, JSON
· Scaling?
· Auth*?
· Use cases? ⇒ "Polyglot Persistence"
Introduction to NoSQL, GOTO Aarhus 2012
YeSQL
PostgreSQL at Zalando
· Fast, stable and well documented code base
· Performs as well as (and outperforms some) NoSQL databases while retaining deeper flexibility
· Scaling horizontally can be easy
· Know your tools, whatever they are!
PostgreSQL and NoSQL
Thank You!
Please also visit
tech.zalando.com
Please rate this session!
· SProcWrapper – Java library for stored procedure access
  github.com/zalando/java-sproc-wrapper
· PGObserver – monitoring web tool for PostgreSQL
  github.com/zalando/PGObserver
· pg_view – top-like command line activity monitor
  github.com/zalando/pg_view

GOTO 2013: Why Zalando trusts in PostgreSQL

  • 1.
  • 2. Why Zalando trusts in PostgreSQL A developer’s view on using the most advanced open-source database Henning Jacobs - Technical Lead Platform/Software Zalando GmbH Valentine Gogichashvili - Technical Lead Platform/Database Zalando GmbH GOTO Berlin 2013, 2013-10-18
  • 3. Who we are Henning Jacobs <henning.jacobs@zalando.de> with Zalando since 2010 NO DBA, NO PostgreSQL Expert Valentine Gogichashvili <valentine.gogichashvili@zalando.de> with Zalando since 2010 DBA, PostgreSQL Expert 3/37
  • 4. About Zalando 14 countries > 1 billion € revenue 2012 > 150,000 products 3 warehouses Europe's largest online fashion retailer · · · · · 4/37
  • 6. Our Main Stack Java → PostgreSQL Tomcat CXF-WS API Schema Data Schemas JDBC SProcWrapper Main/Production Postgres Java 7, Tomcat 7, PostgreSQL 9.0–9.3 6/37
  • 7. Our Main Stacks Different Use Cases — same Database Tomcat CXF-WS API Schema Data Schemas JDBC SProcWrapper Tomcat CXF-WS Data Schemas JDBC JPA CherryPy HTTP/REST Data Schemas psycopg2 SQLAlchemy Main/Production CRUD/JPA Scripting/Python Postgres Postgres Postgres Java 7, Tomcat 7, Python 2.7, PostgreSQL 9.0–9.3 7/37
  • 8. Some Numbers > 90 deployment units (WARs) > 800 production Tomcat instances > 50 different databases > 90 database master instances > 5 TB of PostgreSQL data > 200 developers, 8 DBAs · · · · · · 8/37
  • 9. Stored Procedures register_customer(..) find_orders(..) ... z_api_v13_42API Schemas stored procedures, custom types, ... register_customer(..) find_orders(..) ... Data Schema(s) tables, views, ... SET search_path .... SELECT register_customer(...) customer order order_address ... z_data 9/37
  • 10. Stored Procedures 1:1 Mapping using SProcWrapper @SProcService public interface CustomerSProcService { @SProcCall int registerCustomer(@SProcParam String email, @SProcParam Gender gender); } JAVA CREATE FUNCTION register_customer(p_email text, p_gender z_data.gender) RETURNS int AS $$ INSERT INTO z_data.customer (c_email, c_gender) VALUES (p_email, p_gender) RETURNING c_id $$ LANGUAGE 'sql' SECURITY DEFINER; SQL 10/37
  • 11. Stored Procedures Complex Types @SProcCall List<Order> findOrders(@SProcParam String email); JAVA CREATE FUNCTION find_orders(p_email text, OUT order_id int, OUT order_date timestamp, OUT shipping_address order_address) RETURNS SETOF RECORD AS $$ SELECT o_id, o_date, ROW(oa_street, oa_city, oa_country)::order_address FROM z_data.order JOIN z_data.order_address ON oa_order_id = o_id JOIN z_data.customer ON c_id = o_customer_id WHERE c_email = p_email $$ LANGUAGE 'sql' SECURITY DEFINER; SQL 11/37
  • 12. Stored Procedures: Experience. Performance benefits; easy to change live behavior; validation close to data; simple transaction scope; makes moving to new software version easy; cross language API layer; CRUD ⇒ JPA.
  • 13. Stored Procedures: Rolling out database changes
      API versioning: automatic roll-out during deployment; Java backend selects "right" API version
      Modularization: shared SQL gets own Maven artifacts; feature/library bundles of Java+SQL
      DB-Diffs: SQL scripts for database changes; review process; tooling support
  • 14. Stored Procedures & Constraints: Protect Your Most Valuable Asset: Your Data
  • 15. Constraints: Ensuring Data Quality
      Simple SQL expressions:
        CREATE TABLE z_data.host (
            h_hostname varchar(63) NOT NULL UNIQUE CHECK (h_hostname ~ '^[a-z][a-z0-9-]*$'),
            h_ip inet UNIQUE CHECK (masklen(h_ip) = 32),
            h_memory_bytes bigint CHECK (h_memory_bytes > 8*1024*1024)
        );
        COMMENT ON COLUMN z_data.host.h_memory_bytes IS 'Main memory (RAM) of host in Bytes';
      Combining check constraints and stored procedures:
        CREATE TABLE z_data.customer (
            c_email text NOT NULL UNIQUE CHECK (is_valid_email(c_email)),
            ..
        );
  • 16. Constraints: What about MySQL? The MySQL manual says: "The CHECK clause is parsed but ignored by all storage engines." (http://dev.mysql.com/doc/refman/5.7/en/create-table.html) Open bug ticket since 2004: http://bugs.mysql.com/bug.php?id=3464
  • 17. Constraints: Custom Type Validation using Triggers
      Table config_type: ct_id : serial, ct_name : text, ct_type : base_type, ct_regex : text, ...
      Table config_value: cv_id : serial, cv_key : text, cv_value : json, cv_type_id : int
      Example value: key: recipientEmail, value: "john.doe@example.org", type: EMAIL_ADDRESS
  • 18. Custom Type Validation using Triggers
      SQL:
        CREATE FUNCTION validate_value_trigger() RETURNS trigger AS
        $$
        BEGIN
            PERFORM validate_value(NEW.cv_value, NEW.cv_type_id);
            RETURN NEW;  -- a trigger function must return a row (or NULL)
        END;
        $$
        LANGUAGE 'plpgsql';
      SQL:
        CREATE FUNCTION validate_value(p_value json, p_type_id int) RETURNS void AS
        $$
        import json
        import re
        # ... Python code, see next slide
        $$
        LANGUAGE 'plpythonu';
  • 19. Custom Type Validation with PL/Python
      Python:
        class TypeValidator(object):
            @staticmethod
            def load_by_id(id):
                tdef = plpy.execute('SELECT * FROM config_type WHERE ct_id = %d' % int(id))[0]
                return _get_validator(tdef)

            def check(self, condition, msg):
                if not condition:
                    raise ValueError('Value does not match the type "%s". Details: %s' % (self.type_name, msg))

        class BooleanValidator(TypeValidator):
            def validate(self, value):
                self.check(type(value) is bool, 'Boolean expected')

        validator = TypeValidator.load_by_id(p_type_id)
        validator.validate(json.loads(p_value))
  • 21. Sharding: Scaling Horizontally. Diagram: App → SProcWrapper → Shard 1 (Cust. A-F), Shard 2 (Cust. G-K), ..., Shard N (Cust. Y-Z).
  • 22. Sharding: SProcWrapper Support
      Sharding customers by ID (Java):
        @SProcCall
        int registerCustomer(@SProcParam @ShardKey int customerId,
                             @SProcParam String email, @SProcParam Gender gender);
      Sharding articles by SKU (uses MD5 hash) (Java):
        @SProcCall
        Article getArticle(@SProcParam @ShardKey Sku sku);
      Collecting information from all shards concurrently (Java):
        @SProcCall(runOnAllShards = true, parallel = true)
        List<Order> findOrders(@SProcParam String email);
  • 23. Sharding: Auto Partitioning Collections
      Java:
        @SProcCall(parallel = true)
        void updateStockItems(@SProcParam @ShardKey List<StockItem> items);
  • 24. Sharding: Bitmap Data Source Providers
  • 25. Sharding: Experience. Scalability without sacrificing any SQL features; start with a reasonable number of shards (8); some tooling required; avoid caching if you can and scale horizontally.
  • 26. Fail Safety: Replication. All databases use streaming replication; every database has a (hot) standby and a delayed slave. Diagram: App → Service IP → master, which ships WAL to a hot standby for read-only queries and to a slave with 1 h delay.
  • 27. Fail Safety: Failover. Service IP for switching/failover; monitoring with Java and custom web frontend; failover is a manual task.
  • 28. Fail Safety: General Remarks
      Good hardware: G8 servers from HP; ECC RAM, RAID
      No downtimes allowed: major PostgreSQL version upgrades?
      Two data centers
      Dedicated 24x7 team
      Maintenance: concurrent index rebuild, table compactor
  • 31. Monitoring: PGObserver
      Locate hot spots: frequent stored procedure calls; long running stored procedures; I/O patterns
      Helps tuning DB performance
      Creating the right indices is often a silver bullet
  • 32. Monitoring: pg_view. Top-like command line utility; DBA's daily tool; analyze live problems; monitor data migrations.
  • 33. NoSQL: Relational is dead? "If your project is not particularly vital and/or your team is not particularly good, use relational databases!" (Martin Fowler, GOTO Aarhus 2012). Source: People vs. NoSQL, GOTO Aarhus 2012.
  • 34. NoSQL: Relational is dead? "The key goals of F1's design are: 3. Consistency: The system must provide ACID transactions, [..] 4. Usability: The system must provide full SQL query support [..] Features like indexes and ad hoc query are not just nice to have, but absolute requirements for our business." Source: F1: A Distributed SQL Database That Scales.
  • 35. NoSQL – Comparison (diagram labels: Document, Column-Family, Key-Value, Graph; Aggregate-Oriented, Schemaless)
      Aggregate oriented? SProcWrapper; changes?
      Schemaless? ⇒ implicit schema; HStore, JSON
      Scaling?
      Auth*?
      Use cases? ⇒ "Polyglot Persistence"
      Source: Introduction to NoSQL, GOTO Aarhus 2012
  • 36. YeSQL: PostgreSQL at Zalando. Fast, stable and well documented code base; performs as well as (and outperforms some) NoSQL databases while retaining deeper flexibility; scaling horizontally can be easy; know your tools, whatever they are! Source: PostgreSQL and NoSQL.
  • 37. Thank You! Please also visit tech.zalando.com. Please rate this session!
      SProcWrapper, Java library for stored procedure access: github.com/zalando/java-sproc-wrapper
      PGObserver, monitoring web tool for PostgreSQL: github.com/zalando/PGObserver
      pg_view, top-like command line activity monitor: github.com/zalando/pg_view
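
Slides 22 to 24 mention sharding articles by SKU via an MD5 hash, "bitmap" data source providers, and automatic partitioning of collections, but do not show the routing itself. The following is a minimal Python sketch of one way such routing could work; it is not SProcWrapper's actual implementation. It assumes the shard index is taken from the lowest bits of the MD5 digest and that the shard count is a power of two (8, as suggested on slide 25). The names `shard_for_key` and `partition_by_shard` are hypothetical.

```python
import hashlib
from collections import defaultdict

SHARD_COUNT = 8  # assumption: a power of two, so a bit mask can select the shard


def shard_for_key(key, shard_count=SHARD_COUNT):
    """Map a string key (e.g. a SKU) to a shard index using the low bits of its MD5."""
    if shard_count <= 0 or shard_count & (shard_count - 1):
        raise ValueError("shard_count must be a power of two")
    digest = hashlib.md5(key.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")
    return value & (shard_count - 1)  # keep only the lowest log2(shard_count) bits


def partition_by_shard(items, key_of, shard_count=SHARD_COUNT):
    """Group items by target shard so that each shard receives one batched call."""
    batches = defaultdict(list)
    for item in items:
        batches[shard_for_key(key_of(item), shard_count)].append(item)
    return dict(batches)
```

With a bit mask, doubling the shard count only adds one more bit to the decision, so each existing shard splits into exactly two; that is one plausible reason to prefer it over a plain modulo.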
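
Slide 22 shows `@SProcCall(runOnAllShards = true, parallel = true)` for collecting results from every shard concurrently. As a rough illustration of that fan-out-and-merge idea (again a sketch, not the library's code), the Python below runs one query function per shard in a thread pool and concatenates the row lists. The in-memory dictionaries stand in for real database connections, and `run_on_all_shards` and `find_orders` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_all_shards(shards, query):
    """Call query(shard) for every shard concurrently and concatenate the row lists."""
    if not shards:
        return []
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # map() yields results in shard order, so the merged list is deterministic
        per_shard_rows = list(pool.map(query, shards))
    return [row for rows in per_shard_rows for row in rows]


# Stand-in data: each "shard" maps a customer email to a list of order ids.
shards = [
    {"a@example.org": [1, 2]},
    {"b@example.org": [3]},
    {"a@example.org": [4]},
]


def find_orders(email):
    """Look the email up on every shard and merge the results."""
    return run_on_all_shards(shards, lambda shard: shard.get(email, []))
```

Threads are sufficient here because the work per shard is I/O-bound (waiting on a database), not CPU-bound.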
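
Slide 19 leaves out how a validator is chosen for a `config_type` row (`_get_validator`) and where `type_name` comes from. The sketch below fills those gaps under stated assumptions so the idea can be run outside the database: the type row is passed in as a plain dict instead of being loaded with `plpy.execute`, and the registry `VALIDATORS`, the base type names `'BOOLEAN'` and `'TEXT'`, and `TextValidator` are all hypothetical, since the slides do not show the real mapping.

```python
import json
import re


class TypeValidator(object):
    def __init__(self, type_name, regex=None):
        self.type_name = type_name
        self.regex = regex

    def check(self, condition, msg):
        if not condition:
            raise ValueError('Value does not match the type "%s". Details: %s'
                             % (self.type_name, msg))


class BooleanValidator(TypeValidator):
    def validate(self, value):
        self.check(type(value) is bool, 'Boolean expected')


class TextValidator(TypeValidator):
    def validate(self, value):
        self.check(isinstance(value, str), 'Text expected')
        if self.regex is not None:
            self.check(re.match(self.regex, value) is not None,
                       'Pattern %s not matched' % self.regex)


# Hypothetical registry keyed by ct_type; the slides do not show the real mapping.
VALIDATORS = {'BOOLEAN': BooleanValidator, 'TEXT': TextValidator}


def get_validator(tdef):
    """Build a validator from a config_type row given as a dict."""
    cls = VALIDATORS[tdef['ct_type']]
    return cls(tdef['ct_name'], tdef.get('ct_regex'))


def validate_value(json_text, tdef):
    """Parse the JSON value and raise ValueError if it does not fit the type."""
    get_validator(tdef).validate(json.loads(json_text))
```

For example, a row such as `{'ct_name': 'EMAIL_ADDRESS', 'ct_type': 'TEXT', 'ct_regex': r'^[^@\s]+@[^@\s]+$'}` accepts the JSON string `"john.doe@example.org"` from slide 17 and rejects a string without an `@`.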