1. ©2014 TransLattice, Inc. All Rights Reserved.
Geographically Distributed PostgreSQL
PGConf NYC
April 3, 2014
Mason Sharp, Chief Architect
msharp@translattice.com
2.
Agenda
- Why geographically distribute your data?
- General replication background
- PostgreSQL options
- Custom PostgreSQL configurations
- Upcoming solutions
3.
Why geographically distribute your data?
- Improved Availability
- Better performance (in some cases…)
– Read vs. Write
– Data closer to applications and users
- Regulatory or Corporate Compliance
– Data placement concerns
5.
Data Center Outages – Causes and Frequency
Primary causes of data center outage1:
- Hardware failure: 34.4%
- Power loss/interruption: 31.5%
- Natural disaster: 13.3%
- Combined: 79.2%
Most recent unplanned data center outage1:
- Within the last 6 months: 42%
- Within the last year: 34%
- 76% experienced one in the last year
1 Survey by Zerto, July 2013, 356 IT professionals from 10 industries
6.
Data Center Outage Costs Increasing
Average cost of an outage is increasing2:
- 2010: $5,617/minute
- 2013: $7,908/minute (a 41% increase)
Length of unplanned outage:
- Average: 86 minutes2
- 25%+ of Oracle users had 8+ hours of unplanned downtime in the last year3
2 2013 Cost of Data Center Outages, Ponemon Institute, December 2013
3 Bringing Continuous Availability to Oracle Environments, 2013 Mission-Critical Application Availability Survey, Unisphere Research
7.
Current State of Data Replication
Top data management issues for IT executives4:
- Providing business continuity at a reasonable cost
- Deploying applications in multiple geographies consistently
- The continued ability to use SQL
“Among respondents with at least two data centers and rapid replication solutions, 46% indicate they are less than satisfied with their current strategies.”5
4 DBMS Evaluation Criteria, IDG Research Services, October 2013
5 Bringing Continuous Availability to Oracle Environments, 2013 Mission-Critical Application Availability Survey, Unisphere Research
8.
Replication
- Master-Slave
– One Master, One or more Slaves
- Multi-master
– Multiple Masters
- Multi-source fan-in
– Example: consolidate multiple sites
- Fan-out
10.
Master-Slave
- All writes go to one master
- Hot Standby reads can be done from any node
- Synchronous / Asynchronous
- Slaves get transactions via either
– Native streaming replication
– Statement-based replication
• Could be synchronous, possibly via 2PC
• Could be a replay mechanism via queues or triggers
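The native streaming replication option above boils down to a few settings; a minimal sketch using 9.3-era parameters, where the host and standby names are illustrative:

```
# Master's postgresql.conf (minimal streaming-replication sketch):
wal_level = hot_standby
max_wal_senders = 3
synchronous_standby_names = 'standby1'   # leave empty for asynchronous

# Standby's recovery.conf:
standby_mode = 'on'
primary_conninfo = 'host=master_host port=5432 user=repluser'

# Standby's postgresql.conf, to allow Hot Standby reads:
hot_standby = on
```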
12.
Multi-master
- Writes can occur at any location
- Synchronous 2PC
– MVCC concerns
– May make sense to always write first at one location, acquiring locks
- Asynchronous
- Conflict Resolution
- Conflict Avoidance through commit ordering
– Paxos
– Raft
13.
Multi-source fan-in
[Diagram: Loc1, Loc2, and Loc3 each replicate into a Central node]
- Consolidated centrally for reporting
15.
Understand Your Requirements
- Availability
– Is read-only access to some data acceptable in a degraded state?
- Immediacy of Data
– Nightly refresh? Immediate? A 2-second lag?
- Performance & Latency
– Read vs. Write
- Correctness versus Performance
- Conflicts: Prevent or Resolve?
16.
Understand Your Requirements (continued)
- Data Segregation
- Data Ownership
– Can each location be the “master” for a subset of data?
– Example: regional customers
– Expressed either as a subtable, or as an expression on a table
• region_code = 'US'
– Different availability requirements?
- “Staticity” Classification
– Static tables that rarely change
– Frequently updated tables
17.
Static Tables
- Less concern about write performance
- Writing to a table:
– BEGIN;
– Execute the DML statement on the agreed “master”
– On success, we have acquired all of the row locks
– Safely execute on the other nodes without risk of deadlock
– PREPARE TRANSACTION;
– COMMIT;
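The steps above can be sketched in SQL as a coordinating client would issue them; the table name and the 'gid_1234' global transaction identifier are illustrative, and max_prepared_transactions must be nonzero on every node:

```sql
-- On the agreed "master" node: the DML acquires all of the row locks.
BEGIN;
UPDATE ref_data SET val = 'new' WHERE id = 1;
PREPARE TRANSACTION 'gid_1234';

-- On each other node: safe now, with no deadlock risk, because the
-- master's prepared transaction already holds the locks.
BEGIN;
UPDATE ref_data SET val = 'new' WHERE id = 1;
PREPARE TRANSACTION 'gid_1234';

-- Once every node has prepared successfully, commit on each node:
COMMIT PREPARED 'gid_1234';
```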
18.
Careful with Reflexive UPDATEs
UPDATE inventory SET qty = qty - 1 WHERE ….;
- What if this happens on multiple nodes?
- If the conflict resolution policy is last-one-wins, inventory is reduced only by 1, not 2
- May expect inventory that is not there
May want to handle some tables specially:
- SELECT FOR UPDATE on a master
– Will block if another transaction is modifying the row
– Locks won’t propagate to other nodes
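The SELECT FOR UPDATE pattern above can be sketched as follows; table and column names are illustrative:

```sql
BEGIN;
-- On the agreed master: blocks if another transaction is modifying
-- the row; the lock stays local to this node.
SELECT qty FROM inventory WHERE item_id = 42 FOR UPDATE;
-- Apply the decrement here first, then replay it on the other nodes
-- instead of letting each node run the reflexive UPDATE independently.
UPDATE inventory SET qty = qty - 1 WHERE item_id = 42;
COMMIT;
```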
19.
Looking in the PostgreSQL Toolbox – Master-Slave
- Native streaming replication
– All databases in the instance are replicated
– Synchronous and asynchronous options
– Hot queryable standby option
- Slony
– Trigger-based, asynchronous replication
– Flexibility to replicate a subset of data
– More complex administration
- Londiste
20.
Looking in the PostgreSQL Toolbox – pgpool-II
- Middle layer
- Synchronous statement-based replication
– Can instead be combined with other replication, incl. native streaming replication
- Load balancer
– All writes must go through the master node
21.
Looking in the PostgreSQL Toolbox – Postgres-XC
- Can connect to any one of multiple nodes
- Good push-down join and operation handling
- Ensures cluster-wide consistency
BUT
- Requires access to the Global Transaction Manager from each node
- Nodes are a modified version of PostgreSQL
- Not currently suited for this use case
22.
Looking in the PostgreSQL Toolbox – PL/Proxy
- Everything is a stored function
– More cumbersome, but flexible
23.
Looking in the PostgreSQL Toolbox – Multi-master
- Bucardo
– Perl-based
– Limited to two masters
– Custom conflict resolution possible
- RubyRep
– Ruby-based
– Limited to two masters
– Custom conflict resolution possible
- Postgres-R
– Modified PostgreSQL 9.0
24.
Looking in the PostgreSQL Toolbox – Custom
- Triggers
- Foreign Data Wrappers
- Subtable Partitioning
- Two-Phase Commit
25.
Looking in the PostgreSQL Toolbox – Considerations
- Connections and MVCC across multiple instances
- Sequence/Serial columns
– UUID as an alternative
- Timestamps
– Use timestamp with time zone
– Network Time Protocol (NTP)
– Custom functions for time lag
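A DDL sketch of the considerations above, assuming the uuid-ossp extension shown in the configuration later in the talk; the table is illustrative:

```sql
CREATE TABLE event (
    -- UUIDs avoid cross-node sequence/serial collisions:
    event_id   uuid DEFAULT uuid_generate_v4() PRIMARY KEY,
    -- timestamp WITH time zone stays comparable across regions
    -- (keep node clocks aligned with NTP):
    created_at timestamp with time zone DEFAULT now()
);
```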
26.
Custom Example
- Multiple locations
- Locations largely independent
- Most writes will occur locally
– Each site is the “master” for local data
- Want to be able to write data on a remote site
- Want local read performance for remote-originating data
- If a remote site is down, local read-only access is acceptable
- Occasional updates to static data require all nodes online
28.
Custom Example
[Diagram: DC1 holds the customer_dc1 master and a hot standby of DC2; DC2 holds the customer_dc2 master and a hot standby of DC1]
29.
Custom Example
[Diagram: same layout as the previous slide, with a “customer” view on each side and an FDW connection reaching the other data center]
30.
Configuration
- configure --with-ossp-uuid
- CREATE EXTENSION "uuid-ossp"
- CREATE EXTENSION "postgres_fdw"
31.
Configuration
From dc1:
CREATE SERVER dc2_master
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'dc2_host', dbname 'dc2', port '5434');
CREATE SERVER dc2_slave
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', dbname 'dc2', port '5434');
32.
Configuration
From dc2:
CREATE SERVER dc1_master
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'dc1_host', dbname 'dc1', port '5433');
CREATE SERVER dc1_slave
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', dbname 'dc1', port '5433');
33.
Configuration
CREATE USER MAPPING
FOR user1 SERVER dc2_master
OPTIONS (user 'user1');
CREATE USER MAPPING
FOR user1 SERVER dc2_slave
OPTIONS (user 'user1');
34.
Configuration
CREATE TABLE customer_dc1
(cust_id UUID,
cust_name varchar,
cust_loc char(5));
35.
Configuration
On dc1:
CREATE FOREIGN TABLE customer_dc2_master
(cust_id UUID,
cust_name varchar,
cust_loc char(5))
SERVER dc2_master;
CREATE FOREIGN TABLE customer_dc2_slave
(cust_id UUID,
cust_name varchar,
cust_loc char(5))
SERVER dc2_slave;
36.
View Handling
- Create a customer view: a union of the local table and the local slave of the remote table
- Include a cust_loc condition
CREATE VIEW customer AS
SELECT *
FROM customer_dc1
WHERE cust_loc = 'DC1'
UNION ALL
SELECT * FROM customer_dc2_slave
WHERE cust_loc = 'DC2';
37.
View Handling
# explain select * from customer;
                          QUERY PLAN
--------------------------------------------------------------------------
 Append  (cost=0.00..140.82 rows=8 width=72)
   ->  Seq Scan on customer_dc1
         Filter: (cust_loc = 'DC1'::bpchar)
   ->  Foreign Scan on customer_dc2_slave
38.
Configuration
- PostgreSQL takes qualifications into account for better plans!
# explain select * from customer where cust_loc = 'DC1';
                       QUERY PLAN
----------------------------------------------------------------
 Append  (cost=0.00..20.04 rows=4 width=72)
   ->  Seq Scan on customer_dc1  (cost=…..)
         Filter: (cust_loc = 'DC1'::bpchar)
- Smart enough to use just one branch of the UNION
– Leaves off the foreign-table branch
– Consider this in application design
39.
Triggers
CREATE TRIGGER tr_customer
INSTEAD OF
INSERT OR UPDATE OR DELETE ON customer
FOR EACH ROW
EXECUTE PROCEDURE update_customer();
40.
Trigger Function
CREATE OR REPLACE FUNCTION update_customer()
RETURNS TRIGGER AS $$
BEGIN
-- TODO: Handle updating cust_loc
IF (TG_OP = 'UPDATE') THEN
IF OLD.cust_loc = 'DC1' THEN
UPDATE customer_dc1
SET cust_name = NEW.cust_name
WHERE cust_id = OLD.cust_id;
ELSEIF OLD.cust_loc = 'DC2' THEN
UPDATE customer_dc2_master
SET cust_name = NEW.cust_name
WHERE cust_id = OLD.cust_id;
END IF;
RETURN NEW;
END IF;
-- INSERT and DELETE branches elided on the slide
END;
$$ LANGUAGE plpgsql;
41.
Caveats
- Performance will be poor for some queries
– No join push-down
- Two-Phase Commit is not used by the FDW
– No consistency guarantees!
– FWIW, it will commit remotely before locally
- Repeatable Read is used by the FDW
– Keeps results consistent when a foreign table is scanned multiple times
- Differing locale settings may cause problems
42.
Custom Example – Further Enhancement
- Want to reduce loss of the ability to write new data
- Add a local table for local inserts when the remote side is down
– Especially helpful for append-only workloads
- Change the trigger functions to use the local table when the remote side is down
- Allow updates and deletes on these as well
- When the remote side is available again, apply the changes to the remote side and truncate the local table
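One way to sketch the fallback table and the later replay; names are illustrative and the trigger-side switchover logic is omitted:

```sql
-- Local holding table, same shape as the remote customer table:
CREATE TABLE customer_dc2_pending
    (cust_id uuid, cust_name varchar, cust_loc char(5));

-- When the remote side is reachable again, replay and clear:
BEGIN;
INSERT INTO customer_dc2_master SELECT * FROM customer_dc2_pending;
TRUNCATE customer_dc2_pending;
COMMIT;
```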
43.
Custom Example – Another Attempt
- Tried using table inheritance and adding a rule on a subtable to instead query a remote table, but encountered issues
44.
Another Custom Example
- All tables in just one database on each node
- No streaming replication
- Changes applied at both locations
– Either via 2PC
– Or asynchronously via triggers
45.
Upcoming PostgreSQL Multi-master Replication
- Logical Log Streaming Replication (LLSR) in PostgreSQL 9.4
- The WAL is read to determine logical commits
- Can be decoded to SQL
- Less overhead than other projects
- Will allow a subset of the data to be replicated, unlike existing streaming replication, which replicates the entire instance
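The 9.4 logical decoding machinery can be exercised with the bundled test_decoding plugin; a sketch that assumes wal_level = logical and a free replication slot:

```sql
SELECT * FROM pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- After some committed DML, read the decoded change stream as text:
SELECT * FROM pg_logical_slot_get_changes('demo_slot', NULL, NULL);

-- Clean up the slot when done:
SELECT pg_drop_replication_slot('demo_slot');
```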
46.
Upcoming PostgreSQL Multi-master Replication
- A goal in a future PostgreSQL release is multi-master replication with last-one-wins conflict resolution (9.5?)
- Possible 9.4 extension for the apply side in the future
- Improvements over subsequent releases
– Improved DDL support may be phased in over time
47.
Bucardo Example
createdb db1
createdb -p 5433 db1
psql db1 -c "CREATE TABLE tab1
  (col1 int, col2 int, PRIMARY KEY(col1))"
psql -p 5433 db1 -c "CREATE TABLE tab1
  (col1 int, col2 int, PRIMARY KEY(col1))"
48.
Bucardo Example
bucardo_ctl install
bucardo_ctl add database db1 name=db1a
bucardo_ctl add database db1 name=db1b port=5433
bucardo_ctl add all tables db=db1a
In psql, connected to the bucardo database:
update bucardo.goat
set standard_conflict = 'latest'
where tablename = 'tab1';
49.
Bucardo Example
bucardo_ctl add sync sync_tab1 type=swap source=db1a targetdb=db1b tables=tab1
bucardo_ctl stop
bucardo_ctl start
→ Updates to tab1 are now visible on both servers
50.
Bucardo Notes
- If having trouble, try “bucardo_ctl install” again
- Also try bucardo_ctl stop and bucardo_ctl start
- It seemed to get confused when table names were the same in multiple databases
51.
Alternative: TransLattice Elastic Database (TED)
- PostgreSQL-based
- Geo-distributed multi-master RDBMS with sharding
- Policy configurable
– Degree of redundancy
– Data location
- Uses Fast Generalized Paxos for global commit ordering
- Easily add nodes
– New locations
– Existing locations, for scalability
- Nodes recover automatically
- Easy transition
– Can operate in conjunction with existing database systems
52.
Each TransLattice Node Delivers Capabilities That Replace Numerous Disparate Technologies
A single node type simplifies scaling and management
[Diagram: one TL node combining replication, storage management, cluster management, compliance tools, a fully relational database, management tools, and data integration tools]
Speaker notes: Even sharding within an instance