Atsushi Mitani from SRA Nishi-Nihon Inc. presented on how to perform write load balancing in PostgreSQL using transactions. He explained that write load distribution is important for systems with high write volumes. PostgreSQL can distribute write load using table partitioning with foreign data wrappers (FDW), which allows partitioning across database instances. Mitani created patches to automate the partitioning setup and load data in parallel to child tables to speed up benchmarking. Benchmark results showed that while increasing child databases improves performance without transactions, increasing parent databases is better with transactions to avoid lock queues. The optimal configuration depends on data size, queries, and hardware.
PostgreSQL Write Load Balancing Using Table Partitioning and Transactions
1. PGConf.ASIA 2019
How did PostgreSQL Write Load
Balancing of Queries Using
Transactions?
Atsushi Mitani
SRA Nishi-Nihon Inc.
4. PGConf.ASIA 2019
Agenda
1 Introduction.
2 Why write load distribution is required?
3 How to distribute write load in PostgreSQL?
4 How fast is PostgreSQL's write load balancing
configuration?
5 Conclusion.
6 Summary.
6. PGConf.ASIA 2019
Who am I
●
Real Time Control System Engineer (1991-
– Power distribution control system.
●
Network Engineer (1995-
– Telephone communication network monitoring.
●
Database Engineer (1999-
– Develop PGCluster.
●
Security administrator (2000-
●
Infrastructure engineer (2005-
– Working in a Data center design division.
●
Web Application Engineer (2008-
– Back-end, Front-end, Android, iOS App ...
7. PGConf.ASIA 2019
Purpose of this session
●
Propose a suitable database configuration for
various system types.
●
Especially with a high write load database.
10. PGConf.ASIA 2019
Suitable DB type for each system
[Figure: chart with read load and write load axes (low to high), showing where RDB, Cache, NoSQL, and multi-master fit]
11. PGConf.ASIA 2019
Which area should RDB aim for
●
The required feature is real-time
processing of high-load read / write
data.
– The problem is how to perform high-load
read / write processing.
– If PostgreSQL can solve the problem, it
becomes a business opportunity.
14. PGConf.ASIA 2019
Pros & Cons of Table Partitioning
●
Pros.
– Easy to use.
●
The parent table automatically reads and
writes to the child table.
●
No modification required on the program side.
A piece of cake!
15. PGConf.ASIA 2019
Pros & Cons of Table Partitioning
●
Cons.
– Does not load balance on a server basis.
●
Both parent and child tables are running on the same
DB instance.
On the same plate...
16. PGConf.ASIA 2019
How to distribute read & write load
●
Foreign Data Wrapper (FDW)
[Figure: Parent DB and Child DB illustrated on a map of Indonesia (Pulau Sumatera, Jawa, Pulau Kalimantan)]
17. PGConf.ASIA 2019
Pros & Cons of FDW
●
Pros.
– Partitioning with FDW.
●
Partitioning can be used from
PostgreSQL 11.
●
Load balancing is possible since the
parent and child are running on different
DB instances.
Dream spreads!
18. PGConf.ASIA 2019
Pros & Cons of FDW
●
Cons.
– ACID transactions cannot be used.
●
Data consistency cannot be
guaranteed.
Oops, like wax fruit...
19. PGConf.ASIA 2019
Let's make it
●
Make a patch to enable
ACID transactions.
– Investigate why ACID transactions
cannot be used with FDW.
– Modify the program to use ACID
transactions.
– Confirm that ACID transactions can
be used with FDW once the patch is applied.
Yes, let’s make it!
20. PGConf.ASIA 2019
Why ACID transactions cannot be used
with FDW
●
FDW can only use SERIALIZABLE isolation
level or REPEATABLE READ isolation level.
– In order to get snapshot-consistent result for multiple
table scans.
●
READ COMMITTED isolation level cannot be
used.
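The rule on this slide matches what the postgres_fdw documentation states: the remote transaction runs at SERIALIZABLE when the local transaction is SERIALIZABLE, and at REPEATABLE READ otherwise. A minimal Python sketch of that mapping follows; the function name is hypothetical and exists only for illustration.

```python
# Sketch of the isolation level postgres_fdw picks for the remote transaction.
# remote_isolation_level is a hypothetical helper name, not part of any library.

KNOWN_LEVELS = {
    "READ UNCOMMITTED",
    "READ COMMITTED",
    "REPEATABLE READ",
    "SERIALIZABLE",
}


def remote_isolation_level(local_level: str) -> str:
    """Return the remote isolation level for a given local isolation level."""
    normalized = " ".join(local_level.upper().split())
    if normalized not in KNOWN_LEVELS:
        raise ValueError(f"unknown isolation level: {local_level!r}")
    # SERIALIZABLE stays SERIALIZABLE; every other level is raised to
    # REPEATABLE READ so that multiple remote scans share one snapshot.
    if normalized == "SERIALIZABLE":
        return "SERIALIZABLE"
    return "REPEATABLE READ"


if __name__ == "__main__":
    for level in sorted(KNOWN_LEVELS):
        print(f"{level:>16} -> {remote_isolation_level(level)}")
```

A local READ COMMITTED transaction therefore still runs its remote part at REPEATABLE READ, which is the restriction the slide describes.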
24. PGConf.ASIA 2019
Problems in benchmark measurement
(1/2)
●
Settings are complicated
– Each parent and child DB needs its own settings.
Where is the child DB?
Which is the partition key?
Where is the parent DB?
What is the threshold value for each DB?
What is the access information for each DB?
25. PGConf.ASIA 2019
Example of partitioning table with FDW
Parent DB: table partitioning with FDW
Child DB 1 (child_host_1): 1 – 500,000,000
Child DB 2 (child_host_2): 500,000,001 – 1,000,000,000
27. PGConf.ASIA 2019
Settings (2/10)
●
Create Server for FDW
CREATE SERVER db1 FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'child_host_1', port '5432', dbname 'db');
CREATE SERVER db2 FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'child_host_2', port '5432', dbname 'db');
28. PGConf.ASIA 2019
Settings (3/10)
●
Create User Mapping
CREATE USER MAPPING FOR postgres SERVER db1
OPTIONS (user 'postgres');
CREATE USER MAPPING FOR postgres SERVER db2
OPTIONS (user 'postgres');
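Because these statements repeat once per child DB, they can be generated from a list of connection settings. The following is a minimal Python sketch under stated assumptions: the helper names and the `db1`, `db2`, ... server naming are illustrative, taken from the pattern on the slides, and this is not the code of the author's patch.

```python
# Sketch: generate CREATE SERVER / CREATE USER MAPPING statements per child DB.
# server_statements and sql_literal are hypothetical helper names.


def sql_literal(value: str) -> str:
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def server_statements(children, local_user="postgres"):
    """Return the DDL for every child, in the order the slides show it."""
    statements = []
    for index, child in enumerate(children, start=1):
        server = f"db{index}"  # naming pattern assumed from the slides
        statements.append(
            f"CREATE SERVER {server} FOREIGN DATA WRAPPER postgres_fdw\n"
            f"OPTIONS (host {sql_literal(child['host'])}, "
            f"port {sql_literal(child['port'])}, "
            f"dbname {sql_literal(child['dbname'])});"
        )
        statements.append(
            f"CREATE USER MAPPING FOR {local_user} SERVER {server}\n"
            f"OPTIONS (user {sql_literal(child['user'])});"
        )
    return statements


if __name__ == "__main__":
    example_children = [
        {"host": "child_host_1", "port": "5432", "dbname": "db", "user": "postgres"},
        {"host": "child_host_2", "port": "5432", "dbname": "db", "user": "postgres"},
    ]
    print("\n".join(server_statements(example_children)))
```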
29. PGConf.ASIA 2019
Settings (4/10)
●
Create Parent Table (pgbench_accounts)
CREATE TABLE public.pgbench_accounts (
aid integer NOT NULL,
bid integer,
abalance integer,
filler character(84)
)
PARTITION BY RANGE (aid) ;
30. PGConf.ASIA 2019
Settings (5/10)
●
Create Foreign Table (pgbench_accounts)
CREATE FOREIGN TABLE public.pgbench_accounts_1
PARTITION OF public.pgbench_accounts
FOR VALUES FROM (1) TO (500000001)
SERVER db1
OPTIONS ( table_name 'pgbench_accounts');
CREATE FOREIGN TABLE public.pgbench_accounts_2
PARTITION OF public.pgbench_accounts
FOR VALUES FROM (500000001) TO (1000000001)
SERVER db2
OPTIONS ( table_name 'pgbench_accounts');
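In PostgreSQL range partitioning the FROM bound is inclusive and the TO bound is exclusive, which is why the first partition ends at 500000001 rather than 500000000. A minimal Python sketch that emits one foreign-table partition per child from a list of such half-open bounds is shown below; the function name is hypothetical and the `dbN` server names are assumed from the slides.

```python
# Sketch: build CREATE FOREIGN TABLE ... PARTITION OF statements from
# half-open (lower inclusive, upper exclusive) bounds, one per child server.
# foreign_partition_statements is a hypothetical helper name.


def foreign_partition_statements(table, bounds, schema="public"):
    """Return one DDL statement per (lower, upper) pair in bounds."""
    statements = []
    for index, (lower, upper) in enumerate(bounds, start=1):
        if lower >= upper:
            raise ValueError(f"empty range for partition {index}: {lower}..{upper}")
        statements.append(
            f"CREATE FOREIGN TABLE {schema}.{table}_{index}\n"
            f"PARTITION OF {schema}.{table}\n"
            f"FOR VALUES FROM ({lower}) TO ({upper})\n"
            f"SERVER db{index}\n"
            f"OPTIONS (table_name '{table}');"
        )
    return statements


if __name__ == "__main__":
    account_bounds = [(1, 500_000_001), (500_000_001, 1_000_000_001)]
    for statement in foreign_partition_statements("pgbench_accounts", account_bounds):
        print(statement)
```

Reusing the upper bound of one partition as the lower bound of the next leaves neither a gap nor an overlap between children.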
31. PGConf.ASIA 2019
Settings (6/10)
●
Create Parent Table (pgbench_branches)
CREATE TABLE public.pgbench_branches (
bid integer NOT NULL,
bbalance integer,
filler character(88)
)
PARTITION BY RANGE (bid) ;
32. PGConf.ASIA 2019
Settings (7/10)
●
Create Foreign Table (pgbench_branches)
CREATE FOREIGN TABLE public.pgbench_branches_1
PARTITION OF public.pgbench_branches
FOR VALUES FROM (1) TO (5001)
SERVER db1
OPTIONS ( table_name 'pgbench_branches');
CREATE FOREIGN TABLE public.pgbench_branches_2
PARTITION OF public.pgbench_branches
FOR VALUES FROM (5001) TO (10001)
SERVER db2
OPTIONS ( table_name 'pgbench_branches');
33. PGConf.ASIA 2019
Settings (8/10)
●
Create Parent Table (pgbench_tellers)
CREATE TABLE public.pgbench_tellers (
tid integer NOT NULL,
bid integer,
tbalance integer,
filler character(84)
)
PARTITION BY RANGE (tid) ;
34. PGConf.ASIA 2019
Settings (9/10)
●
Create Foreign Table (pgbench_tellers)
CREATE FOREIGN TABLE public.pgbench_tellers_1
PARTITION OF public.pgbench_tellers
FOR VALUES FROM (1) TO (50001)
SERVER db1
OPTIONS ( table_name 'pgbench_tellers');
CREATE FOREIGN TABLE public.pgbench_tellers_2
PARTITION OF public.pgbench_tellers
FOR VALUES FROM (50001) TO (100001)
SERVER db2
OPTIONS ( table_name 'pgbench_tellers');
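The bounds used for the three tables follow from pgbench's standard sizing: each scale unit adds 1 row to pgbench_branches, 10 rows to pgbench_tellers, and 100,000 rows to pgbench_accounts. The Python sketch below derives the per-child bounds from a scale factor and a child count. It assumes an even split of scale units across children, as in the two-child example on the slides, and the function name is hypothetical.

```python
# Sketch: derive half-open partition bounds for the pgbench tables
# from the scale factor, assuming scale units are split evenly across children.
# child_bounds is a hypothetical helper name.

ROWS_PER_SCALE_UNIT = {
    "pgbench_branches": 1,
    "pgbench_tellers": 10,
    "pgbench_accounts": 100_000,
}


def child_bounds(scale, children):
    """Map each table name to a list of (lower, upper) bounds, one per child."""
    if children < 1 or scale % children != 0:
        raise ValueError("scale must be a positive multiple of the child count")
    units_per_child = scale // children
    result = {}
    for table, rows_per_unit in ROWS_PER_SCALE_UNIT.items():
        rows_per_child = units_per_child * rows_per_unit
        result[table] = [
            (i * rows_per_child + 1, (i + 1) * rows_per_child + 1)
            for i in range(children)
        ]
    return result


if __name__ == "__main__":
    for table, bounds in child_bounds(scale=10_000, children=2).items():
        print(table, bounds)
```

With scale 10000 and two children this reproduces the FROM and TO values shown on the Settings slides.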
35. PGConf.ASIA 2019
Settings (10/10)
●
Create Index (primary key)
ALTER TABLE ONLY public.pgbench_accounts
ADD CONSTRAINT pgbench_accounts_pkey PRIMARY KEY (aid);
ALTER TABLE ONLY public.pgbench_branches
ADD CONSTRAINT pgbench_branches_pkey PRIMARY KEY (bid);
ALTER TABLE ONLY public.pgbench_tellers
ADD CONSTRAINT pgbench_tellers_pkey PRIMARY KEY (tid);
36. PGConf.ASIA 2019
Problems in benchmark
measurement (2/2)
●
It takes time to load data.
– It takes 36 hours to load one billion rows.
– Measurements must be repeated while
changing the data scale and the
number of partitions.
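The 36-hour figure implies a load rate of roughly 7,700 rows per second. The short Python sketch below does that arithmetic and also shows an idealized lower bound on load time if the rows were split evenly across child DBs loaded in parallel. The linear-scaling assumption is mine, not a measured result from the talk, and both function names are hypothetical.

```python
# Sketch: load-rate arithmetic for the 36-hour, one-billion-row figure.
# ideal_parallel_hours assumes perfect linear scaling, which real hardware
# will not reach; it is a best-case estimate, not a benchmark result.

SECONDS_PER_HOUR = 3600


def rows_per_second(rows, hours):
    """Average load rate for loading `rows` rows in `hours` hours."""
    return rows / (hours * SECONDS_PER_HOUR)


def ideal_parallel_hours(serial_hours, children):
    """Best-case load time when the data is split evenly over `children` DBs."""
    if children < 1:
        raise ValueError("children must be at least 1")
    return serial_hours / children


if __name__ == "__main__":
    rate = rows_per_second(1_000_000_000, 36)
    print(f"serial load rate: about {rate:,.0f} rows/s")
    for n in (2, 4):
        print(f"{n} children: at best {ideal_parallel_hours(36, n):.1f} hours")
```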
38. PGConf.ASIA 2019
Solution for complicated settings
●
Create a patch.
– Patch to automatically generate table settings for
pgbench.
I’m good at following the program!
39. PGConf.ASIA 2019
Solution for long load times
●
Create a patch.
– Patch that loads multiple child tables in parallel with
pgbench.
Parallel processing is the true value of scale-out!
[Figure: db1, db2, db3, db4 loaded in parallel]
40. PGConf.ASIA 2019
How to use it (1/2)
●
Initialize the parent DB, specifying the child
DB configuration file.
– pgbench -i -s 10000 -W child.conf db
●
“child.conf” format.
{children:[
{'host':'child_host_1','port':'5432','dbname':'db','user':'postgres','password':''},
{'host':'child_host_2','port':'5432','dbname':'db','user':'postgres','password':''}
]}
41. PGConf.ASIA 2019
How to use it (2/2)
●
Initialize each child DB, specifying its start key.
– In child db 1
●
pgbench -i -w 1 -s 5000 db
– In child db 2
●
pgbench -i -w 5001 -s 10000 db
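The two commands above suggest a pattern: `-w` is the first key a child holds and `-s` is the last scale unit it holds. That reading is inferred only from these two examples, so check the patch itself before relying on it. Under that assumption, and assuming an even split, a Python sketch that prints the command for each child could look like this; the function name is hypothetical.

```python
# Sketch: build the per-child pgbench initialization commands.
# The meaning of -w (start key) and -s (last scale unit on this child)
# is inferred from the two examples on the slide, not from the patch source.


def child_init_commands(total_scale, children, dbname="db"):
    """Return one patched-pgbench command line per child DB."""
    if children < 1 or total_scale % children != 0:
        raise ValueError("total_scale must be a positive multiple of children")
    units_per_child = total_scale // children
    commands = []
    for i in range(children):
        start_key = i * units_per_child + 1
        last_unit = (i + 1) * units_per_child
        commands.append(f"pgbench -i -w {start_key} -s {last_unit} {dbname}")
    return commands


if __name__ == "__main__":
    for number, command in enumerate(child_init_commands(10_000, 2), start=1):
        print(f"child db {number}: {command}")
```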
42. PGConf.ASIA 2019
You can find the patch on GitHub
●
https://github.com/at-mitani/pgbench-fdw-patch
[Figure: Parent DB with Child DB 1 to Child DB n]
44. PGConf.ASIA 2019
What were measured?
●
Child DB itself.
●
Multi parent / child DB without ACID
transaction.
●
Multi parent / child DB with ACID transaction.
47. PGConf.ASIA 2019
Multi Parent / Child
[Figure: pgbench, LB, Parent DB 1 to Parent DB 5, Child DB 1 to Child DB 10]
Machine spec: 1 CPU, 4G RAM, 30G SSD
48. PGConf.ASIA 2019
5 Child DB on multi parent DB
without ACID transaction
[Figure: TPS (0 to 1400) versus number of connections (2, 4, 8, 16, 32, 64, 128, 256, 512, 768); series: normal, p=1, p=2, p=3, p=4, p=5]
50. PGConf.ASIA 2019
Without ACID query
●
This measurement uses a query that
takes no record locks through a
transaction.
●
The number of child DBs therefore has a
direct effect on performance.
53. PGConf.ASIA 2019
With ACID query
●
Since record locks due to transactions occur in
the child DB, performance will not improve
even if the number of child DBs is increased.
●
Increasing the number of parent DBs that receive
queries will spread the lock queue and improve
performance.
55. PGConf.ASIA 2019
Why write load distribution is required
●
An on-premise high-performance DB is
expensive, so it becomes a business opportunity.
– Cache is more advantageous in a system with a high data read
load.
– NoSQL is more advantageous in a system with a high data write
load (non-real-time system).
56. PGConf.ASIA 2019
How to distribute write load in
PostgreSQL?
●
Table partitioning using FDW is effective.
– PostgreSQL can do it!
57. PGConf.ASIA 2019
How fast is PostgreSQL's write load
balancing configuration?
●
Child DB itself is faster than non-partitioned DB.
●
More child DBs are better without ACID transactions.
●
More parent DBs are better with ACID transactions.
58. PGConf.ASIA 2019
However...
●
The optimal number of DBs depends on
– Data scale
– Query type
– Machine specifications.
●
Benchmark is important.
– Don't guess. Measure and find out way.
by Tim Bray
59. PGConf.ASIA 2019
Thank you for your attention!
Terima kasih atas perhatian Anda!
Please get the patch and try it.
https://github.com/at-mitani/pgbench-fdw-patch