Deep Dive on Amazon Aurora with PostgreSQL Compatibility (DAT305-R1) - AWS re:Invent 2018
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
2. Deep Dive on Amazon Aurora with PostgreSQL Compatibility
Grant McAlister, AWS Senior Principal Engineer
DAT305
3. Amazon RDS is . . .
Cloud native engine Open source engines Commercial engines
Amazon RDS platform
• Automatic failover
• Backup & recovery
• X-region replication
• Isolation & security
• Industry compliance
• Automated patching
• Advanced monitoring
• Routine maintenance
• Push-button scaling
Image credit: By Mackphillips - Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=55946550
4. Amazon RDS PostgreSQL Universe
Diagram: clients connect to RDS PostgreSQL (on EBS) and to Aurora PostgreSQL (on Aurora Storage).
Postgres 9.6/10 — same extensions
Backup/Recovery - PITR
High Availability & Durability
Secure – IAM Auth
Read Replicas
Cross Region Snapshots
Scale Compute – Online Scale Storage
Cross Region Replication
Outbound Logical Replication
Preview 11
6. Concurrency - Remove log buffer
PostgreSQL: queued work, log buffer, storage.
Aurora PostgreSQL: queued work items A, B, C, D, E go to storage with per-item durability tracking.
Durability tracking counters for A B C D E at successive steps:
0 0 0 0 0
2 2 1 0 1
4 3 4 2 4
6 5 6 3 5
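The durability-tracking counters on this slide can be read as per-item acknowledgement counts from storage. The following is a minimal sketch of that reading, not Aurora's actual code. It assumes Aurora's six storage copies with a four-of-six write quorum (the quorum size is not stated on the slide), and the names `WRITE_QUORUM` and `durable_items` are hypothetical.

```python
# Hypothetical model of per-item durability tracking; not Aurora's implementation.
WRITE_QUORUM = 4  # assumption: 4 of the 6 storage copies must acknowledge a write


def durable_items(ack_counts, quorum=WRITE_QUORUM):
    """Return the work items whose acknowledgement count has reached the quorum."""
    return [item for item, acks in ack_counts.items() if acks >= quorum]


# Counter snapshots copied from the slide, for work items A to E.
snapshots = [
    {"A": 0, "B": 0, "C": 0, "D": 0, "E": 0},
    {"A": 2, "B": 2, "C": 1, "D": 0, "E": 1},
    {"A": 4, "B": 3, "C": 4, "D": 2, "E": 4},
    {"A": 6, "B": 5, "C": 6, "D": 3, "E": 5},
]

for step, counts in enumerate(snapshots):
    print(step, durable_items(counts))
```

Under this reading, items become durable independently and out of order: in the last snapshot A, B, C, and E have reached the quorum while D is still waiting, and no shared log buffer has to be flushed in sequence.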
7. Aurora PostgreSQL - Writing less
Statement: update t set y = 6;
PostgreSQL: block in memory (t-v1, t-v2, t-v3); WAL (t-v1, t-v2, full block, t-v3) and archive; checkpoint writes to the datafile. Diagram size labels: 4K, 4K, 8K.
Aurora: block in memory (t-v1, t-v2, t-v3); Aurora Storage (t-v1, t-v2, t-v3); no checkpoint = no FPW (full-page writes).
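To make the "no checkpoint = no FPW" point concrete, here is a rough back-of-the-envelope model. It assumes PostgreSQL's default 8 KiB block size and a hypothetical 100-byte change record. The function names and the record size are illustrative and do not come from the slide. The model ignores record headers, compression, and the fact that Aurora stores several copies.

```python
BLOCK_SIZE = 8192   # PostgreSQL default block size, in bytes
RECORD_SIZE = 100   # assumption: size of one change record, in bytes


def postgres_bytes(updates, checkpoints):
    """Simplified PostgreSQL model for a single block.

    The first update after each checkpoint logs a full block image, every
    other update logs one change record, and each checkpoint writes the
    dirty block to the datafile.
    """
    full_images = min(checkpoints, updates)
    wal = full_images * BLOCK_SIZE + (updates - full_images) * RECORD_SIZE
    datafile = full_images * BLOCK_SIZE
    return wal + datafile


def aurora_bytes(updates):
    """Simplified Aurora model: only change records go to storage."""
    return updates * RECORD_SIZE


# Three versions of the block (t-v1, t-v2, t-v3) with one checkpoint in between.
print(postgres_bytes(updates=3, checkpoints=1))
print(aurora_bytes(updates=3))
```

The absolute numbers depend entirely on the assumed record size. The point of the sketch is only that full block images and checkpoint writes dominate when few updates touch each block between checkpoints.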
8. Diagram: Aurora RW instance, storage layer (coalesce), peer storage nodes.
9. Insert Test
Test Table
• UUID PK—Random
• ID int—Right Lean Sequence
• VARCHAR(100)—Random
• VARCHAR(50)—Small Set of Words
• INT—Random
• INT—Random (smaller set)
• BOOLEAN—Random (50/50)
• BOOLEAN—Somewhat Random (75/25)
• Timestamp—Right Lean
10. Insert Workload - PostgreSQL 9.6
Chart: inserts per second (0 to 30,000) over minutes (1 to 271). Series: BASE, 16GB Max WAL, Aurora PostgreSQL.
11. Update Workload - PostgreSQL 9.6
Bar chart: TPS (2 updates per transaction), 0 to 20,000. Series: BASE, 16GB Max WAL, Aurora PostgreSQL. Bar values as extracted: 3,729; 4,871; 17,158.
12. Amazon Aurora Recovers Up to 97% Faster
Chart: recovery time from crash under load. X axis: writes / second (more is better), 0 to 140,000. Y axis: recovery time in seconds (less is better), 0 to 160. Bubble size represents the redo log, which must be recovered.
PostgreSQL: 3 GiB redo recovered in 19 seconds; 10 GiB redo recovered in 50 seconds; 30 GiB redo recovered in 123 seconds.
As PostgreSQL throughput goes up, so do log size and crash recovery time.
Amazon Aurora has no redo. Recovered in 3 seconds while maintaining significantly greater throughput.
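The "up to 97%" headline can be checked against the recovery times on this slide. The sketch below does only that arithmetic; the helper name `percent_faster` is illustrative.

```python
def percent_faster(baseline_seconds, improved_seconds):
    """Percentage reduction in recovery time relative to the baseline."""
    return 100.0 * (baseline_seconds - improved_seconds) / baseline_seconds


# Recovery times from the slide: GiB of redo -> PostgreSQL recovery seconds.
postgres_recovery = {3: 19, 10: 50, 30: 123}
aurora_recovery_seconds = 3

for redo_gib, seconds in postgres_recovery.items():
    print(redo_gib, round(percent_faster(seconds, aurora_recovery_seconds), 1))
```

The largest redo case (30 GiB, 123 seconds) is the one that gives roughly 97%; the smaller redo cases give a smaller but still large reduction.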
14. Aurora storage and replicas
Diagram: applications connect to one RW instance and several RO instances spread across Availability Zones 1, 2, and 3. The RW instance writes log records to Aurora storage and reads blocks from it. RO instances receive asynchronous invalidation and update. Aurora storage is automatically scalable to 64TB.
16. Typical synchronous replication - 3 locations
Diagram: one Region with Availability Zones 1, 2, and 3, each with an EBS volume; COMMIT.
17. Cost of additional synchronous replicas
Chart: High Concurrency Sync Write Test, latency (ms) by percentile.
Percentile: 50, 90, 99.9, 99.99
2 Node (4 copy), ms: 6, 10, 21, 31
3 Node (6 copy), ms: 7, 12, 28, 123
18. Amazon Aurora - 3 AZs - 6 copies
Diagram: one Region with Availability Zones 1, 2, and 3; six Aurora Storage copies; COMMIT.
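The slides contrast synchronous replication, where a commit waits on every copy, with Aurora's six storage copies. The following is a small simulation sketch of why a quorum helps tail latency. It assumes a four-of-six write quorum (not stated on the slide) and independent, exponentially distributed acknowledgement delays with a 5 ms mean. Both are modelling assumptions, and all names are hypothetical.

```python
import random


def commit_latency(latencies, acks_needed):
    """Commit completes when the acks_needed-th fastest copy has acknowledged."""
    return sorted(latencies)[acks_needed - 1]


def simulate(trials=10_000, copies=6, quorum=4, seed=1):
    """Return per-trial commit latencies for wait-for-all and wait-for-quorum."""
    rng = random.Random(seed)
    wait_all, wait_quorum = [], []
    for _ in range(trials):
        # Assumption: each copy acknowledges after an independent delay
        # drawn from an exponential distribution with a 5 ms mean.
        delays = [rng.expovariate(1 / 5.0) for _ in range(copies)]
        wait_all.append(commit_latency(delays, copies))
        wait_quorum.append(commit_latency(delays, quorum))
    return wait_all, wait_quorum


def percentile(values, pct):
    """Simple nearest-rank style percentile over a list of samples."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(len(ordered) * pct / 100))
    return ordered[index]


wait_all, wait_quorum = simulate()
for pct in (50, 90, 99.9):
    print(pct, round(percentile(wait_all, pct), 1), round(percentile(wait_quorum, pct), 1))
```

Waiting for the fourth-fastest of six acknowledgements ignores the two slowest copies on every commit, so a single slow copy cannot stretch the tail the way it does when all copies must respond.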
19. Amazon Aurora gives >2x lower response times
Chart: sysbench response time (p95), 30 GiB, 1024 clients. Y axis: response time, ms (0 to 600). X axis: minute (0 to 20). Series: PostgreSQL (Single AZ, No Backup); Amazon Aurora (Three AZs, Continuous Backup).
21. Replicas - PostgreSQL
Diagram labels: PostgreSQL RW, EBS, snapshot, PostgreSQL RO, EBS, update.
22. Replicas - Amazon Aurora
Diagram: Aurora RW and Aurora RO over shared Aurora Storage; an update on the RW instance is an update in memory on the RO instance.
23. pgbench benchmark
PostgreSQL: async replication and apply. The RW and RO instances each show accounts, tellers, branches, and history.
Aurora PostgreSQL: async replication and memory update. The RW instance shows accounts, tellers, branches, and history; the RO instance shows accounts.
24. Replicas - backfill on PostgreSQL
pgbench RW 8K tps on primary – RO 200k tps on replica
backfill
pgbench_history
25. Replicas - Backfill on Amazon Aurora
pgbench RW 8K tps on primary – RO 200k tps on replica
backfill
pgbench_history
26. Fast Clones
Diagram: RW and RO applications use the primary instances; a reporting application uses an RW clone. Instances write log records to Aurora storage and read blocks from it. Aurora storage spans Availability Zones 1, 2, and 3 and holds primary storage and clone storage.
27. Fast Clone example
Chart: PGBench RW Scale 10K - Target Rate 20K TPS. Y axis: transactions per second (TPS), 0 to 25,000. X axis: minutes, 0 to 78. Series: Main Database, Clone Database.
29. Logical replication support
Diagram label: PostgreSQL instance
30. Cross region replication
Diagram: Region A and Region B each have Aurora Storage spanning Availability Zones 1, 2, and 3, with database instances (labeled RW or RO) serving applications.
32. Caching changes - No double buffering
Example host: 488 GB RAM.
PostgreSQL: PG + OS processes; shared_buffers 25%; Linux pagecache 50+% (duplicate buffers). A select of data checks for the block in shared_buffers; if it is not in shared_buffers, it is loaded from pagecache/disk (EBS).
Aurora PostgreSQL: PG + OS processes; shared_buffers 75% (survivable cache). A select of data checks for the block in shared_buffers or loads it from Aurora storage.
33. Caching changes - No double buffering
Chart: pgbench read only - scale 22,000 - r4.16xlarge, approx 350GB working set. Y axis: transactions per second (tps), 0 to 800,000.
Values in legend order: Aurora 75% Cache 689,068; PostgreSQL 25% Cache 417,496; PostgreSQL 10% Cache 334,691; PostgreSQL 75% Cache 682,931.
Annotations: 1.6x; 2.0x; 18K read iops; no reads; heavy double buffering; no double buffering; no survivable cache.
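The "1.6x" and "2.0x" annotations can be reproduced from the bar values, assuming the four values follow the legend order (Aurora 75%, PostgreSQL 25%, PostgreSQL 10%, PostgreSQL 75%). That mapping is inferred from the extraction order, and the helper `speedup` is illustrative.

```python
# Throughput values from the chart, assuming they follow the legend order.
tps = {
    "Aurora 75% Cache": 689_068,
    "PostgreSQL 25% Cache": 417_496,
    "PostgreSQL 10% Cache": 334_691,
    "PostgreSQL 75% Cache": 682_931,
}


def speedup(series, baseline):
    """Throughput of one series relative to a baseline series."""
    return tps[series] / tps[baseline]


for baseline in ("PostgreSQL 25% Cache", "PostgreSQL 10% Cache", "PostgreSQL 75% Cache"):
    print(baseline, round(speedup("Aurora 75% Cache", baseline), 2))
```

Under that mapping the ratios come out at about 1.65 and 2.06 against the 25% and 10% PostgreSQL configurations, which is consistent with the slide's annotations if they were truncated to one decimal.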
34. Cluster Cache Management - Failover
Chart: PGBench 20X RO / 1X RW, 160GB cached, failover at 600 seconds. Y axis: transactions per second (TPS), 0 to 400,000. X axis: seconds, 0 to 1200. Series: Baseline. Annotations: 340 seconds; 32 seconds.
35. Cluster Cache Management (CCM) Feature
Diagram: RW and RO applications; RW and RO instances across Availability Zones 1, 2, and 3 over Aurora storage; async invalidation and update.
apg_ccm_enabled=on
36. Cluster Cache Management
Chart: PGBench 20X RO / 1X RW, 160GB cached, failover at 600 seconds. Y axis: transactions per second (TPS), 0 to 400,000. X axis: seconds, 0 to 1200. Series: Baseline, CCM Enabled. Annotations: 32 seconds; 340 seconds.
38-40. Performance Insights
41. Plan change
Before
Aggregate (cost=3804.15..3804.16 rows=1 width=16)
  -> Nested Loop (cost=12.67..3802.61 rows=307 width=8)
        -> Index Scan using pgbench_branches_pkey on pgbench_branches b (cost=0.29..16.60 rows=2 width=8)
              Index Cond: (bid = ANY ('{1,4}'::integer[]))
        -> Bitmap Heap Scan on pgbench_history h (cost=12.39..1891.47 rows=154 width=8)
              Recheck Cond: (bid = b.bid)
              Filter: ((mtime >= (now() - '01:00:00'::interval)) AND (mtime <= (now() - '00:30:00'::interval)))
              -> Bitmap Index Scan on i_p_bid (cost=0.00..12.35 rows=522 width=0)
                    Index Cond: (bid = b.bid)
After
Aggregate (cost=171092.96..171092.97 rows=1 width=16)
  -> Hash Join (cost=329.02..171091.42 rows=307 width=8)
        Hash Cond: (h.bid = b.bid)
        -> Seq Scan on pgbench_history h (cost=0.00..166712.20 rows=1542280 width=8)
              Filter: ((mtime >= (now() - '01:00:00'::interval)) AND (mtime <= (now() - '00:30:00'::interval)))
        -> Hash (cost=329.00..329.00 rows=2 width=8)
              -> Seq Scan on pgbench_branches b (cost=0.00..329.00 rows=2 width=8)
                    Filter: (bid = ANY ('{1,4}'::integer[]))
• enable_bitmapscan=off
• enable_indexscan=off
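The size of the regression between the two plans is easiest to see by comparing the planner's estimated total cost on the top-level Aggregate node. A minimal sketch that parses the two lines from this slide follows; the helper `total_cost` and its regular expression are illustrative, not part of any PostgreSQL tooling.

```python
import re

# Matches the "cost=startup..total" estimate in an EXPLAIN output line.
COST_RE = re.compile(r"cost=(\d+\.\d+)\.\.(\d+\.\d+)")


def total_cost(plan_line):
    """Extract the estimated total cost (the second number) from an EXPLAIN line."""
    match = COST_RE.search(plan_line)
    if match is None:
        raise ValueError("no cost estimate found in line")
    return float(match.group(2))


before = "Aggregate (cost=3804.15..3804.16 rows=1 width=16)"
after = "Aggregate (cost=171092.96..171092.97 rows=1 width=16)"

ratio = total_cost(after) / total_cost(before)
print(total_cost(before), total_cost(after), round(ratio, 1))
```

These are planner estimates in arbitrary cost units, not measured run times, so the ratio only indicates how much more expensive the planner itself considers the second plan.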
42. Query Plan Management - QPM
Use baseline
43. QPM – Use plan baselines
45. Updates - No Vacuum running
Chart: TPS (4,000 to 10,000) over minutes (1 to 1261). Annotation: transaction id wraparound.
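For context on the wraparound annotation: PostgreSQL transaction IDs are 32-bit, so about 2^31 of them can be consumed before old rows must have been frozen by vacuum, and autovacuum is forced once a table's age exceeds `autovacuum_freeze_max_age` (200 million by default). The sketch below estimates how quickly a write workload reaches those thresholds. The 8,000 TPS rate is an assumption picked from within the chart's axis range, and it assumes each transaction consumes one transaction ID.

```python
XID_WRAP_LIMIT = 2**31                    # transaction IDs that can be "in the past"
AUTOVACUUM_FREEZE_MAX_AGE = 200_000_000   # PostgreSQL default setting


def hours_to_consume(xids, tps):
    """Hours needed to consume `xids` transaction IDs at a steady write rate."""
    return xids / tps / 3600


assumed_tps = 8_000  # assumption: steady write transactions per second

print(round(hours_to_consume(AUTOVACUUM_FREEZE_MAX_AGE, assumed_tps), 1))
print(round(hours_to_consume(XID_WRAP_LIMIT, assumed_tps), 1))
```

At that assumed rate the forced-autovacuum threshold is reached in well under a day, which is why a sustained update workload cannot run without vacuum indefinitely.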
46. Intelligent Vacuum prefetch
PostgreSQL: 402 seconds
Aurora PostgreSQL: submits batch I/O of up to 256 blocks; 163 seconds
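The "batch I/O up to 256 blocks" idea can be illustrated with a generic batching helper. This is only a sketch of grouping block reads into bounded batches. Aurora's actual prefetch logic is not shown on the slide, and the name `batches` is hypothetical.

```python
def batches(block_numbers, max_batch=256):
    """Yield consecutive slices of block_numbers, each at most max_batch long."""
    for start in range(0, len(block_numbers), max_batch):
        yield block_numbers[start:start + max_batch]


# Example: 600 blocks to visit are submitted as three batched requests
# instead of 600 individual reads.
blocks_to_vacuum = list(range(600))
sizes = [len(batch) for batch in batches(blocks_to_vacuum)]
print(sizes)
```

Submitting reads in batches lets the storage layer work on many blocks in parallel rather than serving one synchronous read at a time, which is the effect the two timings on the slide are meant to show.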
48. Aurora Serverless
• Starts up on demand, shuts down when not in use
• Scales up/down automatically
• No application impact when scaling
• Pay per second, 1 minute minimum
Diagram: application, database end-point, request router, Aurora instances drawn from a warm pool of instances, Aurora storage.
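The "pay per second, 1 minute minimum" rule can be written as a one-line function. This is a hypothetical helper for the time component only. It assumes partial seconds round up, and it does not model capacity units or prices, which the slide does not cover.

```python
import math


def billed_seconds(active_seconds, minimum=60):
    """Seconds billed for one active period: per-second with a one-minute floor."""
    if active_seconds <= 0:
        return 0
    # Assumption: a partial second is rounded up to the next whole second.
    return max(minimum, math.ceil(active_seconds))


for active in (10, 60, 61.2, 3600):
    print(active, billed_seconds(active))
```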
49. Scaling up & down
51. Migration to Aurora PostgreSQL
Methods
• PostgreSQL - pg_dump / pg_restore
• AWS Database Migration Service (DMS)
• RDS PostgreSQL - Snapshot import
• RDS PostgreSQL – Read replica
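For the first method in the list, a dump-and-restore migration could be scripted roughly as follows. This is a sketch that only builds the command lines. The host names, user, and database are placeholders, the helper names are hypothetical, and running the commands (for example with `subprocess.run(cmd, check=True)`) requires the PostgreSQL client tools and network access to both databases.

```python
def dump_command(host, user, dbname, outfile):
    """Build a pg_dump command that writes a custom-format archive."""
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--format=custom",
        "--file", outfile,
        dbname,
    ]


def restore_command(host, user, dbname, infile):
    """Build a pg_restore command that loads the archive into the target."""
    return [
        "pg_restore",
        "--host", host,
        "--username", user,
        "--dbname", dbname,
        "--no-owner",
        infile,
    ]


# Placeholder endpoints; replace with the real source and Aurora cluster endpoints.
dump = dump_command("source.example.com", "postgres", "appdb", "appdb.dump")
restore = restore_command("aurora-cluster.example.com", "postgres", "appdb", "appdb.dump")
print(" ".join(dump))
print(" ".join(restore))
```

This approach takes the application offline for writes between the dump and the cutover, which is the main reason the slide also lists DMS and read-replica methods.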
52. AWS DMS - Logical replication
Diagram: application users; source on customer premises, EC2, or RDS; VPN/network; target Aurora PostgreSQL.
• Start a replication instance
• Connect to source and target databases
• Select tables, schemas, or databases
• Let the AWS Database Migration Service create tables and load data
• Uses change data capture to keep them in sync
• Switch applications over to the target at your convenience
53. Migration – Read replica
Snapshot
Catchup via PostgreSQL asynchronous replication
54. Related breakouts
Tuesday, Nov 27
DAT428 - Deep Dive on Amazon Aurora PostgreSQL Performance Tuning
2:30 PM - 3:30 PM | Mirage, Antigua A
Wednesday, Nov 28
DAT324 - Deep Dive on PostgreSQL Databases on Amazon RDS
12:15 PM - 1:15 PM | Venetian, Level 2, Venetian F
Thursday, Nov 29
DAT428 - Deep Dive on Amazon Aurora PostgreSQL Performance Tuning
12:15 PM - 1:15 PM | Aria West, Level 3, Starvine 2
55. Thank you!
Grant McAlister, AWS Senior Principal Engineer