DAT316_Report from the field on Aurora PostgreSQL Performance
1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Report from the field on Aurora PostgreSQL Performance
Tatsuo Ishii,
Japan President
SRA OSS, Inc.
Mark Porter
General Manager
Amazon RDS, Aurora, RDS for PostgreSQL
DAT 316
2.
• Introduction to Aurora PostgreSQL
• Performance Results from SRA OSS
• Aurora Architecture
• Pgpool-II Announcement
• Performance Insights
• Q&A
4.
Reimagining the relational database
What if you were inventing the database today?
You would break apart the stack
You would build something that:
Can scale out…
Is self-healing…
Leverages distributed services…
You would use open source software
5.
A service-oriented architecture applied to the database
(1) Move the logging and storage layer into a multitenant, scale-out, database-optimized storage service
(2) Integrate with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control and monitoring
(3) Make it a managed service using Amazon RDS, which takes care of management and administrative functions
[Diagram: SQL, Transactions, and Caching above a Logging + Storage layer; Amazon DynamoDB, Amazon SWF, Amazon Route 53, Amazon S3, Amazon RDS; callouts 1 to 3]
6.
Why PostgreSQL?
• Open source database
• In active development for 20 years
• Owned by a foundation, not a single company
• Permissive innovation-friendly open source license
• High performance out of the box
• Object-oriented and ANSI-SQL:2008 compatible
• Most geospatial features of any open source database
• Supports stored procedures in 12 languages (Java, Perl,
Python, Ruby, Tcl, C/C++, its own Oracle-like PL/pgSQL, etc.)
• Most Oracle-compatible open source database
• Highest AWS Schema Conversion Tool automatic conversion
rates are from Oracle to PostgreSQL
7.
In 2014, we launched Amazon Aurora MySQL
Now we have added PostgreSQL compatibility—creating
Amazon Aurora PostgreSQL
Customers can now choose how to use Amazon’s
cloud-optimized relational database, with the performance and
availability of commercial databases and the simplicity and
cost-effectiveness of open source databases
8.
• Introduction to Aurora PostgreSQL
• Performance Results from SRA OSS
• Aurora Architecture
• Pgpool-II Announcement
• Performance Insights
• Q&A
9.
Environment
Amazon Aurora PostgreSQL
  db.r3.8xlarge (vCPU 32, mem 244GB)
  Multi-AZ
Amazon RDS for PostgreSQL
  db.r3.8xlarge (vCPU 32, mem 244GB)
  Provisioned IOPS 10,000
  Multi-AZ
Client
  Amazon EC2 m4.10xlarge (vCPU 40, mem 160GB)
  pgbench
Same number of vCPUs and same memory size on Aurora and RDS
10,000 Provisioned IOPS storage on RDS, chosen to match Aurora's hourly price
PostgreSQL 9.6.2 on both
[Diagram: client connected to Aurora and to RDS, each spanning AZ1 and AZ2; labels: Write & Read, Read Only, Multi-AZ (Backup)]
10.
Scenario
• Using pgbench
• 250, 500, 750, and 1,000 connections
• Loading data, creating index, and executing vacuum for each test
• DB size is 30 GB; the largest table contains 200 million rows
• Executing one SELECT, three UPDATE, and one INSERT within a transaction
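for NUM in 250 500 750 1000
do
    # Initialization: load data at scale factor 2000
    pgbench -i -s 2000
    # Benchmark: one-hour run with $NUM client connections and $NUM worker threads
    pgbench --progress=1 --protocol=prepared -T 3600 -r -c "$NUM" -j "$NUM" -s 2000
done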
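The table size in the scenario follows from the pgbench scale factor: pgbench creates 100,000 rows in pgbench_accounts per unit of scale, so -s 2000 gives 200 million rows. The sketch below checks that arithmetic and pulls the TPS figure out of a pgbench summary so that runs can be compared. The helper names are illustrative assumptions, not part of the original test harness, and the parser assumes the summary format printed by pgbench 9.6.

```shell
#!/bin/sh
# Rows created in pgbench_accounts for a given scale factor
# (pgbench uses 100,000 account rows per unit of scale).
accounts_rows() {
    echo $(( $1 * 100000 ))
}

# Extract the "excluding connections establishing" TPS value from
# pgbench output read on stdin. Assumes the 9.6-style summary format.
extract_tps() {
    awk '/^tps = .*excluding connections establishing/ { print $3 }'
}

# Ratio of two TPS values, rounded to one decimal place.
tps_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

accounts_rows 2000    # rows for the scale factor used in the scenario
```

In practice the benchmark output could be piped straight into the parser, for example by appending `| extract_tps` to the benchmark command, and the two results passed to `tps_ratio`.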
11.
Loading Data
[Bar chart: elapsed time (lower is better) for copy, vacuum, index, and total; Aurora vs. RDS; average of 4 tests; y-axis from 0:00:00 to 0:20:10. Ratio annotations on the chart: 1/2, 3/4, 1/3, 1/8]
12.
Throughput
[Bar chart: TPS (higher is better) at 250, 500, 750, and 1,000 connections; Aurora vs. RDS; y-axis from 2,000 to 20,000. Ratio annotations on the chart: x1.7, x2.2, x2.7, x3]
13.
Average wait time of transactions
[Line charts: average transaction wait time in ms (lower is better), Aurora vs. RDS. First pair of panels: elapsed seconds with 1,000 connections. Second pair of panels: a 300-second window, elapsed seconds + 1,800 sec]
Aurora is more stable!
14.
Comparison of CPU utilization
• CPU Utilization on CloudWatch (1,000 connections)
• Aurora uses CPU more efficiently than RDS
# IO waits consume CPU time on RDS
[Graphs: CPU utilization for Aurora and RDS]
15.
Comparison of Write IOPS (count/second)
• Write IOPS on CloudWatch
• Write IOPS of Aurora is lower than that of RDS
  This means Aurora is handling writes more efficiently
[Graphs: write IOPS for Aurora and RDS]
16.
Comparison of Replica Lag
• Added a read replica on RDS to compare RDS streaming replication with Aurora replication
• Replica lag on Aurora was low; replication completed within tens of milliseconds
• Streaming replication on RDS could not catch up during the benchmark
[Graphs: replica lag for Aurora and RDS]
17.
Summary
Compared to RDS, Amazon Aurora PostgreSQL is:
• 3 times faster on data loading
• 3 times faster on throughput
• Quick and stable response
• No performance degradation with increasing connections
• Low replica lag on replication
18.
• Introduction to Aurora PostgreSQL
• Performance Results from SRA OSS
• Aurora Architecture
• Pgpool-II Announcement
• Performance Insights
• Q&A
19.
Amazon Aurora PostgreSQL
Performance Architecture
20.
How does Amazon Aurora achieve high performance?
DO LESS WORK
• Do fewer IOs
• Minimize network packets
• Offload the database engine
BE MORE EFFICIENT
• Process asynchronously
• Reduce latency path
• Use lock-free data structures
• Batch operations together
DATABASES ARE ALL ABOUT I/O
NETWORK-ATTACHED STORAGE IS ALL ABOUT PACKETS/SECOND
HIGH-THROUGHPUT PROCESSING NEEDS CPU AND MEMORY OPTIMIZATIONS
21.
Write IO Traffic in an Amazon Aurora database node
IO FLOW
• Boxcar log records, fully ordered by LSN
• Shuffle to appropriate segments, partially ordered
• Boxcar to storage nodes and issue writes
OBSERVATIONS
• Only write WAL records; all steps asynchronous
• No data block writes (checkpoint, cache replacement)
• 6X more log writes, but 9X less network traffic
• Tolerant of network and storage outlier latency
PERFORMANCE
• 2x or better PostgreSQL Community Edition performance on write-only or mixed read-write workloads
[Diagram: AMAZON AURORA; primary database node and read replica/secondary nodes across AZ 1, AZ 2, and AZ 3; Amazon S3; ASYNC 4/6 QUORUM, DISTRIBUTED WRITES. Legend, TYPE OF WRITE: WAL, DATA, COMMIT LOG & FILES]
22.
IO Traffic in Aurora Replicas
POSTGRESQL READ SCALING
• Physical: Ship redo (WAL) to Replica
• Write workload similar on both instances
• Independent, duplicated storage
• Heavy write load impairs read performance
[Diagram: PostgreSQL Master (30% Read, 70% Write) and PostgreSQL Replica (30% New Reads, 70% Write), each with its own data volume; SINGLE-THREADED WAL APPLY]
AMAZON AURORA READ SCALING
• Physical: Ship redo (WAL) from Master to Replica
• Cached pages have redo applied
• Replica shares storage: no writes performed
• Replica can do more read work
• Advance read view as commits seen from master
[Diagram: Aurora Master (30% Read, 70% Write) and Aurora Replica (100% New Reads) on shared Multi-AZ storage; PAGE CACHE UPDATE]
23.
Write IO Traffic in an Amazon Aurora storage node
IO FLOW
① Receive record and add to in-memory queue
② Persist record and acknowledge
③ Organize records and identify gaps in log
④ Gossip with peers to fill in holes
⑤ Coalesce log records into new data block versions
⑥ Periodically stage log and new block versions to Amazon S3
⑦ Periodically garbage-collect old versions
⑧ Periodically validate CRC codes on blocks
OBSERVATIONS
• All steps are asynchronous
• Only steps 1 and 2 are in foreground latency path
• Input queue is far smaller than PostgreSQL's
• Favors latency-sensitive operations
• Uses disk space to buffer against spikes in activity
[Diagram: primary database node sending log records to a storage node; components: incoming queue, update queue, hot log, data blocks, point-in-time snapshot, Amazon S3 backup, peer storage nodes (peer-to-peer gossip); operations: ACK, SORT/GROUP, COALESCE, GC, SCRUB; steps numbered 1 to 8]
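Step ③ of the IO flow can be pictured as scanning the received log records for missing sequence numbers, which step ④ would then request from peers. The sketch below is a toy model only: it assumes log records carry consecutive integer LSNs, a simplification chosen for illustration (the slides do not describe the real LSN format), and `find_gaps` is a hypothetical helper name.

```shell
# Print every LSN missing from the input, one per line.
# Input: integer LSNs on stdin, one per line, in any order.
# Assumption: LSNs are consecutive integers (toy model).
find_gaps() {
    sort -n | awk '
        NR > 1 { for (lsn = prev + 1; lsn < $1; lsn++) print lsn }
        { prev = $1 }
    '
}

# Records 3 and 6 never arrived, so they are reported as gaps.
printf '%s\n' 1 2 4 5 7 | find_gaps
```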
24.
Amazon Aurora PostgreSQL
Durability and Availability
Architecture
25.
Amazon Aurora storage engine overview
• Data is replicated 6 times across 3 Availability Zones
• Continuous backup to Amazon S3 (built for 11 9s durability)
• Continuous monitoring of nodes and disks for repair
• 10 GB segments as unit of repair or hotspot rebalance
• Quorum system for read/write; latency tolerant
• Quorum membership changes do not stall writes
• Storage volume automatically grows up to 64 TB
[Diagram: database node and six storage nodes across AZ 1, AZ 2, and AZ 3, with storage monitoring and Amazon S3]
26.
Scale-out, distributed, log structured storage
[Diagram: an AWS Region with Availability Zones 1, 2, and 3; a master (primary database node) and three replicas (read replica/secondary nodes) on top of a transaction-aware shared storage volume; storage monitoring; database and instance monitoring]
27.
Amazon Aurora Storage fault tolerance
What can fail?
• Segment failures (disks)
• Node failures (machines)
• AZ failures (network or datacenter)
Optimizations
• 4 out of 6 write quorum
• 3 out of 6 read quorum
• Peer-to-peer replication for repairs
[Two diagrams: SQL, Transaction, and Caching layers above storage spanning AZ 1, AZ 2, and AZ 3]
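The quorum numbers above can be checked with simple arithmetic. The sketch below is a minimal illustration, assuming the six copies are spread evenly as two per Availability Zone (the slides state 6 copies across 3 AZs but not the per-AZ split); the function names are hypothetical. It shows that losing a whole AZ still leaves a write quorum, and losing an AZ plus one more copy still leaves a read quorum.

```shell
COPIES=6          # total copies of the data (from the slides)
WRITE_QUORUM=4    # 4 out of 6 write quorum
READ_QUORUM=3     # 3 out of 6 read quorum
COPIES_PER_AZ=2   # assumption: 6 copies spread evenly over 3 AZs

# can_write N / can_read N: succeed if N surviving copies meet the quorum.
can_write() { [ "$1" -ge "$WRITE_QUORUM" ]; }
can_read()  { [ "$1" -ge "$READ_QUORUM" ]; }

# Report availability after losing the given number of copies.
report() {
    alive=$(( COPIES - $1 ))
    w=no; r=no
    if can_write "$alive"; then w=yes; fi
    if can_read "$alive"; then r=yes; fi
    echo "lost=$1 alive=$alive write=$w read=$r"
}

report "$COPIES_PER_AZ"            # one full AZ down
report $(( COPIES_PER_AZ + 1 ))    # one AZ plus one more copy down
```

Because the write and read quorums sum to more than the number of copies (4 + 3 > 6), any read quorum overlaps any write quorum in at least one copy.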
28.
Amazon Aurora Replicas
Availability
• Failing database nodes are automatically detected and replaced
• Failing database processes are automatically detected and recycled
• Replicas are automatically promoted to primary if needed (failover)
• Customer-specifiable failover order
Performance
• Customer applications can scale out read traffic across read replicas
• Read balancing across read replicas
[Diagram: primary database node and read replicas across AZ 1, AZ 2, and AZ 3, with database and instance monitoring]
29.
Faster, more predictable failover with Amazon Aurora
Amazon RDS for PostgreSQL is good: failover times of ~60 seconds
[Timeline: App Running, Database Failure, Failure Detection, DNS Propagation, Recovery, App Running]
Amazon Aurora is better: failover times < 30 seconds
[Timeline: Replica-Aware App Running, Database Failure, Failure Detection, DNS Propagation, Recovery, App Running; timing annotations: 15-20 sec, 3-10 sec]
30.
• Introduction to Aurora PostgreSQL
• Performance Results from SRA OSS
• Aurora Architecture
• Pgpool-II Announcement
• Performance Insights
• Q&A
31.
Introducing Pgpool-II
• Provides clustering features between applications and PostgreSQL
• Normally used with PostgreSQL’s streaming replication
• Open source software (BSD license)
• One major version release per year
• Latest version is 3.7
[Diagram: Client, Pgpool-II, and PostgreSQL (one primary, two standbys); arrows labeled Read/Write Query, Write, Read, Replication]
Pgpool-II details: pgpool.net/mediawiki/index.php/Main_Page
32.
WE ARE ANNOUNCING…
Pgpool-II 3.7 supports Amazon Aurora PostgreSQL and provides:
• Automatic distribution of queries (UPDATE to the master, SELECT to read replicas)
• Connection pooling and query cache
• Configuration sample is included
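The query distribution idea above can be illustrated with a toy router that looks only at the first keyword of a statement. This is a deliberately simplified sketch and not Pgpool-II's actual logic; `route_query` is a hypothetical name, and a real router must also handle transactions, SELECT ... FOR UPDATE, functions that write, and similar cases.

```shell
# Decide where a statement would go: "replica" for SELECT, "primary" otherwise.
# Toy rule for illustration only.
route_query() {
    # Take the first word of the statement and upper-case it.
    first=$(printf '%s\n' "$1" | awk '{ print toupper($1); exit }')
    case "$first" in
        SELECT) echo replica ;;
        *)      echo primary ;;
    esac
}

route_query "SELECT abalance FROM pgbench_accounts WHERE aid = 1"
route_query "UPDATE pgbench_accounts SET abalance = 0 WHERE aid = 1"
```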
33.
• Introduction to Aurora PostgreSQL
• Performance Results from SRA OSS
• Aurora Architecture
• Pgpool-II Announcement
• Performance Insights
• Q&A
34.
What is RDS Performance Insights?
Customers ask for
• Visibility into performance of RDS databases
• Want to optimize cloud database workloads
• Easy tool
• Often only part-time DBA or no DBA
• Single pane of glass
35.
First Step: RDS Enhanced Monitoring
Released 2016
OS Metrics
Process/thread list
Up to 1 second granularity
36.
Introducing: RDS Performance Insights
Dashboard
• DB load
• Adjustable timeframe
• Filterable by attribute (SQL, User, Host, Wait)
• SQL causing load
Phased RDS delivery
• Aurora, MySQL/MariaDB, PostgreSQL, Oracle, SQL Server
Guided discovery of performance problems
• For both beginners and experts
• Core metric is “Database Load”
37.
What is “Database Load”?
All engines have a connections list showing
• Active
• Idle
We sample every second
• For each active session, collect
  • SQL
  • State: CPU, I/O, Lock, Commit log wait, etc.
  • Host
  • User
Expose as “Average Active Sessions” (AAS)
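The sampling described above reduces to a simple average: count the active sessions seen in each one-second sample and divide by the number of samples in the window. The sketch below is a minimal illustration under assumed conventions: the input format (one line per active session per sample, "<second> <state>") and the function name are hypothetical, not the Performance Insights data format.

```shell
# Compute Average Active Sessions (AAS) from one-second samples.
# stdin: one line per active session observed in a sample: "<second> <state>"
# $1:    number of seconds (samples) in the window, including idle seconds
average_active_sessions() {
    awk -v window="$1" '
        { total++; by_state[$2]++ }
        END {
            printf "AAS %.2f\n", total / window
            for (s in by_state) printf "%s %.2f\n", s, by_state[s] / window
        }
    ' | sort
}

# Four one-second samples: 2, 2, 1, and 1 active sessions.
printf '%s\n' '1 CPU' '1 CPU' '2 CPU' '2 IO' '3 CPU' '4 IO' |
    average_active_sessions 4
```

The per-state lines show how the total load splits across wait states, which is the breakdown the dashboard filters on.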
38.
RDS Performance Insights dashboard
39.
RDS Performance Insights
40.
Sampling Every Second
Query run often
Fast query, run rarely
Slow query
User 1
User 2
User 3
Time
41.
Sampling is like film
42.
AAS load graph
[Figure: per-user activity timelines (User 1 to User 4) shown as an equivalent active-sessions graph with levels 1 to 4]
43.
Active Session State
[Figure: a session timeline alternating between idle periods and Query 1, Query 2, Query 3; legend: CPU, IO, Wait; x-axis: time]
44.
AAS over 1 minute averages
45.
Access to RDS Performance Insights
46.
Access to RDS Performance Insights
High
Load
47.
Summary: Amazon RDS Performance Insights
DB Load: Average Active Sessions
Identifies database bottlenecks
Easy
Powerful
Top SQL
Identifies source of bottleneck
Enables problem discovery
Adjustable timeframe
Hour, day, week, and longer
Questions:
rdspi@amazon.com
48.
• Introduction to Aurora PostgreSQL
• Performance Results from SRA OSS
• Aurora Architecture
• Pgpool-II Announcement
• Performance Insights
• Questions?
49.
THANK YOU!
Tatsuo Ishii,
Japan President
SRA OSS, Inc.
Mark Porter
General Manager
Amazon RDS, Aurora, RDS for PostgreSQL
(And please fill out your session reviews)