MySQL Cluster performance best practices

MySQL Cluster: Delivering Breakthrough Performance
26th July 2012

Andrew Morgan, Senior Product Manager – MySQL HA, andrew.morgan@oracle.com, clusterdb.com
Mat Keep, Senior Product Manager – MySQL HA, mat.keep@oracle.com

The presentation is intended to outline our general
product direction. It is intended for information
purposes only, and may not be incorporated into any
contract. It is not a commitment to deliver any
material, code, or functionality, and should not be
relied upon in making purchasing decisions.
The development, release, and timing of any
features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.

Copyright 2012 Oracle Corporation - 26th July 2012

Session Agenda
•  Introduction to MySQL Cluster
•  Where does MySQL Cluster fit?
•  Benchmarks
•  WHILE (cluster.measurePerformance() < target) {
cluster.optimize();
}

•  Boosting performance
•  Scaling out
•  Further resources


MySQL Cluster – Users & Applications
Extreme Scalability, Availability and Affordability

•  Web
•  High volume OLTP
•  eCommerce
•  On-Line Gaming
•  Digital Marketing
•  User Profile Management
•  Session Management & Caching
•  Content Management

•  Telecoms
•  Service Delivery Platforms
•  VAS: VoIP, IPTV & VoD
•  Mobile Content Delivery
•  Mobile Payments
•  LTE Access

http://www.mysql.com/customers/cluster/


MySQL Cluster Architecture

[Diagram: application nodes (including JPA and REST access) sit above four data nodes, Node 1 to Node 4, arranged in Node Group 1 and Node Group 2 and holding fragments F1 to F4, with a cluster management node on either side]

When to Consider MySQL Cluster
l  What are the consequences of downtime or failing to meet
performance requirements?
l  How much effort and $ is spent in developing and managing HA in
your applications?
l  Are you considering sharding your database to scale write
performance? How does that impact your application and
developers?
l  Do your services need to be real-time?
l  Will your services have unpredictable scalability demands,
especially for writes ?
l  Do you want the flexibility to manage your data
with more than just SQL ?


Where would I not use MySQL Cluster?
•  “Hot” data sets >3TB
–  Replicate cold data to InnoDB
•  Long running transactions
•  Large rows, without using BLOBs
•  Foreign Keys
–  Check out MySQL Cluster 7.3 Early Access: http://labs.mysql.com/
•  Many full table scans
•  Geo-Spatial indexes
•  In these scenarios, the InnoDB storage engine would be the right choice

MySQL Cluster Evaluation Guide
http://mysql.com/why-mysql/white-papers/mysql_cluster_eval_guide.php


General Design Considerations

•  MySQL Cluster is designed for
–  Short transactions
–  Many parallel transactions
•  Use simple access patterns to fetch data
–  Use efficient scans and batching interfaces
•  Analyze what your most typical use cases are
–  optimize for those

Overall design goal
Minimize network roundtrips for your
most important requests!


Servicing the Most Performance-Intensive Workloads




Comparing MySQL Cluster Performance
8x Higher Performance per Node
[Bar chart: reads per second (millions), axis from 0 to 20, for MySQL Cluster 7.1 and MySQL Cluster 7.2]

•  1 Billion+ Reads per Minute, 8 node Intel Xeon cluster
•  Multi-Threaded Data Node Extensions
•  NoSQL C++ API, flexAsynch benchmark


1.2 Billion UPDATEs per Minute
[Chart: millions of UPDATEs per second (axis from 0 to 25) against the number of MySQL Cluster data nodes (2 to 30)]

•  30 x Intel E5-2600 servers
•  NoSQL C++ API, flexAsynch benchmark
•  ACID Transactions, with Synchronous Replication


WHILE (cluster.measurePerformance() < target)

•  Don't optimize for the sake of it
–  Wastes effort
–  Introduces unnecessary compromises/complications
•  Forget about database benchmarks
–  You care about the end-to-end performance of your application on your database
–  If possible, use your application to drive the database
•  Measurements need to be based on representative traffic
•  Measurements need to be repeatable
–  Easily see impact of each optimization


Where has the time gone?
•  Enable the slow query log
–  set global slow_query_log=1;
–  set global long_query_time=3; -- 3 seconds
–  set global log_queries_not_using_indexes=1;
–  Slow queries will be written in the slow query log:
mysql> show global variables like 'slow_query_log_file';
+---------------------+------------------------------+
| Variable_name       | Value                        |
+---------------------+------------------------------+
| slow_query_log_file | /data1/mysql/mysqld-slow.log |
+---------------------+------------------------------+

•  Queries will be written in plain text


Query Analyzer in MySQL Enterprise Monitor (take the easy option!)


Simple database traffic generation
create.sql:
CREATE TABLE sub_name (sub_id INT NOT NULL PRIMARY KEY, name VARCHAR(30)) engine=ndb;
CREATE TABLE sub_age (sub_id INT NOT NULL PRIMARY KEY, age INT) engine=ndb;
INSERT INTO sub_name VALUES (1,'Bill'),(2,'Fred'),(3,'Bill'),(4,'Jane'),(5,'Andrew'),
(6,'Anne'),(7,'Juliette'),(8,'Awen'),(9,'Leo'),(10,'Bill');
INSERT INTO sub_age VALUES (1,40),(2,23),(3,33),(4,19),(5,21),(6,50),(7,31),(8,65),
(9,18),(10,101);

query.sql:
SELECT sub_age.age FROM sub_name, sub_age
WHERE sub_name.name='Bill' AND sub_name.sub_id=sub_age.sub_id;


Simple database traffic generation
shell> mysqlslap --concurrency=5 --iterations=100 --query=query.sql --create=create.sql
Benchmark
Average number of seconds to run all queries: 0.132 seconds
Minimum number of seconds to run all queries: 0.037 seconds
Maximum number of seconds to run all queries: 0.268 seconds
Number of clients running queries: 5
Average number of queries per client: 1
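
When the traffic comes from your own application rather than mysqlslap, the same average/minimum/maximum summary can be collected with a small timing harness. The following is only a sketch: `benchmark` and the `workload` callable are hypothetical names, and `workload` is assumed to run one pass of representative application traffic against the cluster.

```python
import statistics
import time


def benchmark(workload, iterations=100):
    """Run workload() repeatedly and summarise the wall-clock time per run."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        workload()  # hypothetical: one pass of representative traffic
        timings.append(time.perf_counter() - start)
    return {
        "average": statistics.mean(timings),
        "minimum": min(timings),
        "maximum": max(timings),
    }


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without a database.
    result = benchmark(lambda: sum(range(10000)), iterations=20)
    for name, seconds in result.items():
        print("{}: {:.6f} seconds".format(name, seconds))
```

Running the same harness before and after each change keeps the measurements repeatable and makes the impact of each optimization easy to see.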


What on Earth is it doing?

EXPLAIN is your friend

mysql> EXPLAIN <query>;

mysql> EXPLAIN PARTITIONS <query>;

mysql> EXPLAIN EXTENDED <query>;
mysql> SHOW WARNINGS;


Boosting Performance

•  ANALYZE TABLE
•  Access patterns
•  AQL (fast JOINs)
•  Distribution aware
•  Batching
•  Schema
•  Connection pools
•  Multi-threaded data nodes
•  NoSQL APIs
•  Hardware
•  More tips


Before you do anything else, ANALYZE!

•  New for MySQL Cluster 7.2
•  Lets the MySQL optimizer figure out how to best use
indexes etc.
•  Instantly speed up queries by many times

mysql> ANALYZE TABLE <tab-name>;

•  Repeat after changing schema, adding/removing
indexes or making major data changes
•  Only needs running on one mysqld in the cluster


Access patterns

•  Primary key reads/writes -> O(1)
–  Independent of database size and number of nodes
•  Index searches -> O(log n)
–  n = number of tuples
•  BLOBs are stored in a second table -> take longer to access
•  JOINs massively faster in MySQL Cluster 7.2
•  Partition pruning
–  By allowing a query to be satisfied with a single data node, reduce resource consumption -> greater throughput
–  If result sets not large, will also reduce latency


Adaptive Query Localization
Scaling Distributed Joins: 70x More Performance

•  Perform Complex Queries across Shards
–  JOINs pushed down to data nodes
–  Executed in parallel
–  Returns single result set to MySQL
•  Opens Up New Use-Cases
–  Real-time analytics
–  Recommendations engines
–  Analyze click-streams

DON'T COMPROMISE FUNCTIONALITY TO SCALE-OUT!!

[Diagram: mysqld servers and data nodes]


MySQL Cluster 7.2 AQL Test Query
Web-Based Content Management System

[Diagram: one MySQL Server and two data nodes, Node1 and Node2]


Web-Based CMS: 70x More Performance
Query execution time (seconds):
•  MySQL Cluster 7.1: 87.23 seconds
•  MySQL Cluster 7.2: 1.26 seconds

Must Analyze tables for best results
mysql> ANALYZE TABLE <tab-name>;


Did I mention ANALYZE TABLE?


AQL – How to Use it
•  Activated when ndb_join_pushdown is on (default)
•  Rules for a Join to be pushed down:
1.  Joined columns use identical types
2.  No reference to BLOB or TEXT columns
3.  No explicit lock
4.  Child tables in the Join must be accessed using one of the ref, eq_ref, or const access types
5.  Tables not explicitly partitioned by [LINEAR] HASH, LIST, or RANGE
6.  Query plan doesn't select 'Using join buffer'
7.  If root of Join is an eq_ref or const, child tables must be joined by eq_ref
•  Run ANALYZE TABLE <tab-name> on each table once
•  Use EXPLAIN to see what components are being pushed down:
–  Extra: Child of 'd' in pushed join@1
–  EXPLAIN EXTENDED <query>; SHOW WARNINGS;


Distribution Aware Apps
•  Partition selected using hash on Partition Key
–  Primary Key by default
–  User can override in table definition
•  MySQL Server (or NDB API) will attempt to send transaction to the correct data node
–  If all data for the transaction are in the same partition, less messaging -> faster
•  Aim to have all rows for high-running queries in same partition

Example table towns (the diagram marks which columns form the Partition Key and the Primary Key):

town        country  population
Maidenhead  UK       78000
Paris       France   2193031
Boston      UK       58124
Boston      USA      617594

SELECT SUM(population) FROM towns WHERE country='UK';
SELECT SUM(population) FROM towns WHERE town='Boston';
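
The effect of hashing on the Partition Key can be illustrated with a short sketch. This is not the hash function MySQL Cluster uses: `zlib.crc32` is a stand-in chosen only because it is deterministic, and `partition_for`, `partitions_touched` and the choice of four partitions are assumptions made for illustration.

```python
import zlib

TOWNS = [
    ("Maidenhead", "UK", 78000),
    ("Paris", "France", 2193031),
    ("Boston", "UK", 58124),
    ("Boston", "USA", 617594),
]


def partition_for(partition_key, num_partitions=4):
    """Stand-in hash: the same key value always maps to the same partition."""
    digest = zlib.crc32(str(partition_key).encode("utf-8"))
    return digest % num_partitions


def partitions_touched(rows, key_of, num_partitions=4):
    """Return the set of partitions holding the given rows."""
    return {partition_for(key_of(row), num_partitions) for row in rows}


if __name__ == "__main__":
    bostons = [row for row in TOWNS if row[0] == "Boston"]
    # Partition key = town: every 'Boston' row hashes to one partition,
    # so a query on town='Boston' only needs to visit that partition.
    print(partitions_touched(bostons, key_of=lambda row: row[0]))
    # Partition key = (town, country): the two rows are hashed separately
    # and may or may not end up in the same partition.
    print(partitions_touched(bostons, key_of=lambda row: (row[0], row[1])))
```

The point of the sketch is only that rows sharing a partition key value are guaranteed to be co-located, which is what lets a query be satisfied by a single data node.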


Distribution Aware – Multiple Tables

•  Extend partition awareness over multiple tables
•  Same rule: aim to have all data for an instance of high running transactions in the same partition

First table (the diagram marks its Partition Key and Primary Key columns):

sub_id  age  gender
19724   25   male
84539   43   female
19724   16   female
74574   21   female

Second table, service_ids (the diagram marks its Partition Key and Primary Key columns):

service   sub_id  svc_id
twitter   19724   76325732
twitter   84539   67324782
facebook  19724   83753984
facebook  73642   87324793

ALTER TABLE service_ids PARTITION BY KEY(sub_id);

EXPLAIN PARTITIONS <query>;


Validate if "partition pruning" is working
mysql> SHOW GLOBAL STATUS LIKE 'ndb_pruned_scan_count';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Ndb_pruned_scan_count | 12    |
+-----------------------+-------+
mysql> SELECT * FROM services WHERE sub_id=1;
+--------+--------------+--------------+
| sub_id | service_name | service_parm |
+--------+--------------+--------------+
|      1 | IM           |          878 |
|      1 | ssh          |          666 |
|      1 | Video        |          654 |
+--------+--------------+--------------+

mysql> SHOW GLOBAL STATUS LIKE 'ndb_pruned_scan_count';
+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Ndb_pruned_scan_count | 13    |
+-----------------------+-------+


Batching
•  MySQL Cluster allows batching on
–  Inserts, index scans (when not part of a JOIN), PK reads, PK deletes, and (most) PK updates
•  Batching means that one network round trip is used to read/modify a number of records → less ping-pong!
•  If you can batch - do it!
•  Example: insert 1M records
–  No batching:
   INSERT INTO t1(data) VALUES (<data>);
   765 seconds to insert 1M records
–  Batching (batches of 16):
   INSERT INTO t1(<columns>) VALUES (<data>), (<data>)...
   50 seconds to insert 1M records
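
An application can build the multi-row INSERT statements itself. The sketch below assumes a Python client whose driver uses `%s` placeholders (as the common MySQL drivers do); `batched_inserts` is a hypothetical helper, and the table and column names are taken from the example above.

```python
def batched_inserts(table, columns, rows, batch_size=16):
    """Yield (sql, params) pairs, one multi-row INSERT per batch of rows."""
    column_list = ", ".join(columns)
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        values = ", ".join([row_placeholder] * len(batch))
        sql = "INSERT INTO {} ({}) VALUES {}".format(table, column_list, values)
        # Flatten the rows so the parameters line up with the placeholders.
        params = [value for row in batch for value in row]
        yield sql, params


if __name__ == "__main__":
    rows = [("record {}".format(i),) for i in range(40)]
    for sql, params in batched_inserts("t1", ["data"], rows):
        # With a real connection this would be cursor.execute(sql, params).
        print(len(params), "rows in one round trip")
```

Each yielded statement carries up to 16 rows, so 40 rows cost three round trips instead of 40.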


Batching, more examples
SELECT * FROM t1 WHERE userid=1 AND serviceid IN (1,2,3,4,5,7,8,9,10);

SET transaction_allow_batching=1; -- must be set on the connection
BEGIN;
INSERT INTO t1 ....;
DELETE FROM t5 ....;
UPDATE t1 SET value='new value' WHERE id=1;
COMMIT;


Optimizing schema - denormalization

Two tables, voip (userid, voip_data) and bb (userid, bb_data), each holding rows for userid 1 to 4:

SELECT * FROM bb,voip WHERE bb.userid=voip.userid AND bb.userid=1;

Denormalized into a single table, voip_bb (userid, voip_data, bb_data), holding rows for userid 1 to 4:

mysql> SELECT * FROM voip_bb WHERE userid=1;

1.7x improvement


Connection Pools
•  Network hops increase latency (e.g. compared with InnoDB read of cached data)
•  Increase throughput by sending in lots of parallel operations
–  Multiple client connections (sessions) to each MySQL Server
–  Multiple MySQL Servers
•  Connection pooling between MySQL Servers and data nodes
–  Set ndb-cluster-connection-pool > 1 in my.cnf
–  Ensure enough [api] sections in config.ini
–  Don't assign hostnames!

[Diagram: several app threads connect to two mysqld servers, each of which uses the NDB API to reach the data nodes]
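
Each pooled connection from a mysqld occupies its own API node slot, so the number of [api] sections needed grows with the pool size. A rough sketch of that arithmetic follows; `api_slots_needed` and `api_sections` are hypothetical helper names, and the pool size of 4 is only an example value.

```python
def api_slots_needed(num_mysqld, connection_pool_size, other_api_clients=0):
    """Count the API node slots that config.ini must provide."""
    return num_mysqld * connection_pool_size + other_api_clients


def api_sections(count):
    """Render that many empty [api] sections (no hostnames assigned)."""
    return "\n".join(["[api]"] * count)


if __name__ == "__main__":
    # Example: two MySQL Servers, each with ndb-cluster-connection-pool=4.
    slots = api_slots_needed(num_mysqld=2, connection_pool_size=4)
    print(slots, "API slots required")
    print(api_sections(slots))
```

Leaving the sections without hostnames lets any of the pooled connections claim any free slot.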


Multi-threaded data nodes
•  Scaling out on commodity hardware is the standard way to increase performance
–  Add more data nodes and API nodes as required
•  MySQL Cluster 7.2 increases the ability to also scale-up each data node
–  Increases maximum number of utilised threads from 8 to 59

[Diagram: application nodes above data nodes Node 1 to Node 4, arranged in Node Group 1 and Node Group 2]



•  Threads:
–  ldm: 1, 2, 4, 8 or 12 Local Query Handler threads
–  tc: typically ldm/4 Transaction Coordinator threads
–  send: ~2-3 Send threads
–  recv: ~2-4 Receive threads
–  main: 1 Main thread
–  rep: 1 Replication thread
–  io: 1 I/O thread

[Diagram: application nodes above Data Node 1, which contains recv, send, main, tc, ldm, rep and io threads]



•  Applies to ndbmtd only
•  Configure through either:
–  MaxNoOfExecutionThreads
   Single value for number of threads
   System will allocate these threads to blocks in a reasonable way
–  ThreadConfig
   Specify explicitly how many threads for each block type
   Lock threads to CPUs for further performance gains
   ThreadConfig=main={cpubind=0},ldm={count=4,cpubind=1,2,5,6},io={count=2,cpubind=3,4}


Multi-threaded data nodes – starting point for ThreadConfig

        24 threads  32 threads  40 threads  48 threads
ldm     8           12          16          16
tc      4           6           8           12
recv    3           3           4           5
send    3           3           4           4
rep     1           1           1           1

Note that some threads are left for other data node blocks as well as the OS
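
The starting points in the table can be turned into a ThreadConfig line mechanically. The sketch below assumes the column headings refer to the number of hardware threads available on the host; `STARTING_POINTS` and `thread_config` are hypothetical names, and the output omits cpubind settings, which depend on the machine. Check that your MySQL Cluster version accepts these thread types and counts before using the result.

```python
# Thread counts copied from the table above, keyed by available threads.
STARTING_POINTS = {
    24: {"ldm": 8, "tc": 4, "recv": 3, "send": 3, "rep": 1},
    32: {"ldm": 12, "tc": 6, "recv": 3, "send": 3, "rep": 1},
    40: {"ldm": 16, "tc": 8, "recv": 4, "send": 4, "rep": 1},
    48: {"ldm": 16, "tc": 12, "recv": 5, "send": 4, "rep": 1},
}


def thread_config(available_threads):
    """Build a ThreadConfig line (without cpubind) from the table."""
    counts = STARTING_POINTS[available_threads]
    parts = [
        "{}={{count={}}}".format(name, count)
        for name, count in counts.items()
    ]
    return "ThreadConfig=" + ",".join(parts)


if __name__ == "__main__":
    print(thread_config(24))
```

The generated line is a starting point only; measure with representative traffic before and after adjusting the counts.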


NoSQL APIs

[Diagram: clients reach the data nodes through the NDB API; access layers shown are JDBC / ODBC, PHP / PERL, Python / Ruby, Native, memcached and HTTP/REST]

Mix & Match
•  SQL: Complex, relational queries
•  Memcached: Key-Value web services
•  Java: Enterprise Apps
•  NDB API: Real-time services


Hardware

•  High bandwidth, low latency network
•  Turn off firewalls if you can
•  Use multiple disks
–  Checkpoints
–  Log files
–  Table spaces
•  SSDs
–  Biggest benefit for Table spaces
•  Refer to MySQL Cluster Evaluation Guide for more details


MySQL Query Cache
•  Don't enable the Query Cache!
–  It is very expensive to invalidate over multiple MySQL servers
–  A write on one server will force the others to purge their cache
•  If you have tables that are read only (or change very seldom):
my.cnf:
query_cache_size=1000000
query_cache_type=2 # ON DEMAND

mysql> SELECT SQL_CACHE <cols> .. FROM table;

•  SQL_CACHE tells (demands) MySQL to cache the results from this SELECT
•  This can be good for STATIC data


Non-Durable tables

•  Some types of tables account for a lot of WRITEs, but do not need to be recovered (e.g. session tables)
•  Unnecessary to persist such tables - no REDO LOGs or CHECKPOINTs
•  Create these tables as 'NO LOGGING' tables:

mysql> set ndb_table_no_logging=1;

mysql> create table session_table (..) engine=ndb;

mysql> set ndb_table_no_logging=0;

•  After a system restart the table will still be there, but empty!


More optimisation tips
•  When using auto-increment columns, increase ndb-autoincrement-prefetch-sz
•  Set RedoBuffer=32-64M
•  Disk-based tables:
–  Increase UNDO_BUFFER for write-intensive apps
–  Increase DiskIOThreadPool
–  Increase DiskPageBufferMemory for better caching in the data nodes; monitor effectiveness using NDBINFO / MEM
•  FragmentLogFileSize=256M
•  NoOfFragmentLogFiles = 6 x DataMemory (in MB) / (4 x 256MB)
•  Use OPTIMIZE TABLE and perform rolling restarts if memory fragmentation is an issue
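
The NoOfFragmentLogFiles rule of thumb is easy to evaluate for a given DataMemory. In the sketch below, `no_of_fragment_log_files` is a hypothetical helper; the factor of 4 reflects that each fragment log "file" is a set of four files of FragmentLogFileSize, and rounding up to a whole number is an assumption of this sketch rather than something stated on the slide.

```python
import math


def no_of_fragment_log_files(data_memory_mb, fragment_log_file_size_mb=256):
    """Apply 6 x DataMemory (MB) / (4 x FragmentLogFileSize), rounded up."""
    raw = 6 * data_memory_mb / (4 * fragment_log_file_size_mb)
    return math.ceil(raw)


if __name__ == "__main__":
    # Example: DataMemory of 4096 MB with FragmentLogFileSize=256M.
    print("NoOfFragmentLogFiles =", no_of_fragment_log_files(4096))
```

With DataMemory of 4096 MB the formula gives 6 x 4096 / 1024 = 24.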


Scaling out with MySQL Cluster Manager

[Diagram, before: a client above hosts 192.168.0.10 and 192.168.0.11, each running mysqld, ndb_mgmd and an agent; hosts 192.168.0.12 and 192.168.0.13 each run two ndbd processes and an agent]

[Diagram, after: a client above hosts 192.168.0.10 and 192.168.0.11, each running two mysqld processes, ndb_mgmd and an agent; hosts 192.168.0.12, 192.168.0.13, 192.168.0.14 and 192.168.0.15 each run two ndbd processes and an agent]


mcm> add hosts --hosts=192.168.0.14,192.168.0.15 mysite;
mcm> add package --basedir=/usr/local/mysql_7_0_9 --hosts=192.168.0.14,192.168.0.15 7.0.9;
mcm> add process --processhosts=mysqld@192.168.0.10,mysqld@192.168.0.11,ndbd@192.168.0.14,ndbd@192.168.0.15,ndbd@192.168.0.14,ndbd@192.168.0.15 -s port:mysqld:52=3307,port:mysqld:53=3307 mycluster;
mcm> start process --added mycluster;
mysql> ALTER ONLINE TABLE <table-name> REORGANIZE PARTITION;
mysql> OPTIMIZE TABLE <table-name>;


Download MySQL Cluster Today!

http://www.mysql.com/downloads/cluster/#downloads

Further resources

•  MySQL Cluster Performance white paper:
http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster_performance.php
•  MySQL Cluster Forum:
http://forums.mysql.com/list.php?25
•  MySQL Cluster Evaluation guide:
mysql_cluster_eval_guide.php
•  MySQL Cluster in Web-Scale Architectures:
mysql_cluster_eval_guide.php
