Developing a database server: software engineer's view
1. Developing a Database Server: Software Engineer’s View
Laurynas Biveinis / Percona
laurynas.biveinis@{gmail|percona}.com
Big Data Strategy 2015 Vilnius
2. Which database server?
Percona Server
http://www.percona.com/software/percona-server
A drop-in compatible fork of MySQL
An open-source relational database management system
Approaching 2,000,000 downloads
3. A part of MySQL ecosystem
Enabled by GNU General Public License
Forks abound
Healthy and thriving
Lots of politics
14. A case of super_read_only
Facebook patch implemented it first
Facebook contributed it to WebScaleSQL
Percona Server merged it from WebScaleSQL, sent some bugfixes back to WebScaleSQL
Oracle re-implemented it from scratch for the next major MySQL release
MariaDB did not like it
16. Back to Percona Server
Tracks MySQL closely
Diagnostics and management
Performance and scalability
17. Why diagnostics and management?
Early Percona Server:
Ad-hoc patch for extra diagnostics by Percona consultants
Get billed-per-hour work done more efficiently
18. Why (InnoDB) performance and scalability?
In 2010, InnoDB was performing worse on a 4-core machine than on a 1-core one
And fixes were not forthcoming at the time
Addressed the need then, built the reputation since
19. Why not other features?
Feature benefit / feature cost ratio has to be very, very high
Case 1: implement low-hanging fruit
Case 2: implement extremely beneficial features
No rewrites, no refactorings, no code base cleanups
21. Lesson 1: stand on the shoulders of giants
You probably do not need to write a DBMS from scratch
So find a good project to fork
22. Lesson 2: do not diverge
Do not add a single line of code difference without a very good reason
Unless your engineering team is as big as the upstream one
Improvements such as O(n^2) -> O(n log n) algorithms are often not good enough in cold code paths
Plugins are very good
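Why an O(n^2) -> O(n log n) fix may not justify diverging from upstream can be sketched with Amdahl's law. The slides do not name this law; it is used here only as one way to put numbers on the argument. The runtime fractions and speedup factors below are made up for illustration.

```python
def overall_speedup(fraction_of_runtime: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only one part gets faster."""
    if not 0.0 <= fraction_of_runtime <= 1.0:
        raise ValueError("fraction_of_runtime must be within [0, 1]")
    if local_speedup <= 0.0:
        raise ValueError("local_speedup must be positive")
    return 1.0 / ((1.0 - fraction_of_runtime) + fraction_of_runtime / local_speedup)


# Hypothetical numbers, not measurements:
# a cold path taking 0.1% of runtime, made 100x faster by a better algorithm,
# versus a hot path taking 40% of runtime, made only 1.5x faster.
cold = overall_speedup(0.001, 100.0)
hot = overall_speedup(0.40, 1.5)
print(f"cold path fix: {cold:.4f}x overall")
print(f"hot path fix:  {hot:.4f}x overall")
```

Under these assumed numbers the impressive cold-path fix buys about 0.1% overall, while the modest hot-path fix buys about 15%, yet both cost the same in merge conflicts with every upstream release.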
23. Lesson 3: listen to users
Easier said than done, especially if done right
Listening and then ignoring / downplaying users’ pain
Listening to wrong users
We have the best users! :)
$$$ / €€€ add weight to users’ opinions
Both right and wrong
24. Lesson 4: Continuous QC
Was not something Percona Server had on Day One
MySQL always had an automated feature/regression test suite
But 3rd parties did not always add tests for their features
Step 1: require developers to actually run the test suite
Step 2: Jenkins per-push
Step 3: …
25. Lesson 5: wrong ways and slightly less wrong ways to do performance
28. Same performance graph, different view
[Chart: Product A vs. Product B over time, 00:00 to 00:06; vertical axis 0 to 80000]
29. Is Product B still better?
How to provision capacity for B?
What response time guarantee will it give?
Will your automated failover work correctly in the presence of stalls?
[Chart: Product A vs. Product B over time, 00:00 to 00:06; vertical axis 0 to 80000]
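The questions above can be made concrete with a small sketch that compares two throughput series by more than their average. The sample values are invented, and it is an assumption that the charted quantity is throughput per interval; the chart data itself is not recoverable from the slides.

```python
import math
import statistics


def summarize(throughput):
    """Return average, spread, and worst-case figures for a throughput series."""
    ordered = sorted(throughput)
    # Nearest-rank 5th percentile: the level that 95% of intervals reach or exceed.
    p5_index = max(0, math.ceil(0.05 * len(ordered)) - 1)
    return {
        "mean": statistics.mean(ordered),
        "stdev": statistics.pstdev(ordered),
        "min": ordered[0],
        "p5": ordered[p5_index],
    }


# Hypothetical samples: A is steady, B is faster on average but stalls periodically.
product_a = [40000] * 12
product_b = [70000, 70000, 70000, 0] * 3

for name, series in (("A", product_a), ("B", product_b)):
    print(name, summarize(series))
```

With these made-up series, B wins on the mean but its minimum and 5th percentile are zero, which is the figure that matters for capacity planning, response time guarantees, and failover health checks.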
30. Engineering low variance > engineering max peak performance
Where does variance come from anyway?
From the query code path requesting resources with variable availability
C, C++, CPU, memory: caches, heap, mutexes, rwlocks
Memory/disk: data on disk, which could be cached
RDBMS: free space in the write-ahead log (WAL), etc.
Client-server and clusters: network roundtrips
31. Database servers love being in homeostasis
All the required resources for queries readily available
In the presence of unpredictable load
Do not make query threads work for this
Monitor them in background and make them available as needed
In the presence of unpredictable workload
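The idea of keeping resources ready in the background, so query threads never pay for replenishing them, might look roughly like the sketch below. `FreePagePool`, its low-water mark, and the integer "pages" are all hypothetical stand-ins; this is not how Percona Server or InnoDB is implemented.

```python
import queue
import threading
import time


class FreePagePool:
    """Hypothetical pool of ready-to-use pages, kept stocked by a background thread."""

    def __init__(self, capacity, low_water_mark):
        self._pages = queue.Queue(maxsize=capacity)
        self._low_water_mark = low_water_mark
        self._stop = threading.Event()
        self._next_id = 0
        self._refiller = threading.Thread(target=self._refill_loop, daemon=True)

    def start(self):
        self._refiller.start()

    def stop(self):
        self._stop.set()
        self._refiller.join()

    def _refill_loop(self):
        # Background monitor: top the pool up whenever it drops below the mark.
        while not self._stop.is_set():
            if self._pages.qsize() < self._low_water_mark:
                # In a real server this would be the expensive step
                # (for example, flushing a dirty page to free it).
                while not self._pages.full() and not self._stop.is_set():
                    self._pages.put(self._next_id)
                    self._next_id += 1
            else:
                time.sleep(0.001)

    def take(self, timeout=1.0):
        # Query threads only take a ready page; they never do the refill work.
        return self._pages.get(timeout=timeout)


pool = FreePagePool(capacity=8, low_water_mark=4)
pool.start()
pages = [pool.take() for _ in range(20)]  # simulated query thread
pool.stop()
print(len(pages), "pages served")
```

The design choice is that `take` stays cheap and predictable as long as the refiller keeps ahead of demand; tuning the low-water mark against an unpredictable workload is the hard part that the sketch leaves out.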
32. If you want to develop a DBMS:
Find an existing one to fork!
And then do not diverge
Listen to your users
Control quality continuously
Ensure stable performance