Tips to Drive MariaDB Cluster Performance for Nextcloud

Copyright 2021 Severalnines AB
Welcome note
Hello! I'm Vinay from the Severalnines Team and I'm your host for today's webinar!
Feel free to ask any questions via the Q&A chat.
You can also contact me directly via the chat box or via email:
info@severalnines.com during or after the webinar.
Our Presenters:
Björn Schiessle
Co-founder and PreSales Lead at Nextcloud
Ashraf Sharif
Senior Support Engineer at Severalnines

Agenda:
• Overview of Nextcloud architecture
• Database architecture design
• Database proxy
• MariaDB and InnoDB performance tuning
• Nextcloud performance tuning
• Q&A

Poll #1

PRESENTER:
Björn Schiessle, Co-founder and PreSales Lead at Nextcloud
bjoern.schiessle@nextcloud.com | www.nextcloud.com | @nextclouders
Nextcloud Architecture
June 15th, 2021

Nextcloud Hub
Files Talk Groupware Office

LAMP Stack

Nextcloud Hub Architecture Overview

General Architecture (Files & Groupware)

From small to large Enterprise setups
Single Server Setup (<= 150 users)
Cluster (5,000 - 100,000 users)
Global Scale (Millions of users)

Large Nextcloud Clusters

Poll #2

PRESENTER:
Ashraf Sharif, Senior Support Engineer, Severalnines
ashraf@severalnines.com | www.severalnines.com
Tips to Drive MariaDB Cluster Performance for Nextcloud
June 15th, 2021

Agenda:
• Nextcloud Database Characteristics
• Architecture Design
• Performance Tuning & Best Practice

Nextcloud Database Characteristics

PHP Driver
● PHP PDO module:
○ Database Access Abstraction Layer
○ Offers a unified interface to access many different databases
○ Benefits:
■ Security (usable prepared statements)
■ Usability (many helper functions to automate routine operations)
■ Reusability (unified API to access a multitude of databases, from SQLite to Oracle)
● Supported databases:
○ MySQL (MariaDB)
○ PostgreSQL
○ SQLite (good for testing only)
○ Oracle 11g (Nextcloud Enterprise only)
● The database stores metadata and administration data. Physical files are stored inside {path}/nextcloud/data/{user}

Database Workload
● R/W ratio estimation:
○ Read % = read_operations / total_operations * 100
○ Write % = write_operations / total_operations * 100
● Read intensive:
○ Focus on read scalability and availability
○ More replicas
○ Cache resource-intensive queries
○ Indexing, query optimization, parallelism
● Write intensive:
○ Focus on write scalability and availability
○ More shards
○ Conflict resolutions
● Generally, Nextcloud DB workload is read-intensive
○ Around 85-90% reads
/* Calculate read-write ratio */
SELECT
  @total_com := SUM(IF(variable_name IN ('Com_select', 'Com_delete', 'Com_insert', 'Com_update', 'Com_replace'), variable_value, 0)) AS `Total`,
  @total_reads := SUM(IF(variable_name = 'Com_select', variable_value, 0)) AS `Total reads`,
  @total_writes := SUM(IF(variable_name IN ('Com_delete', 'Com_insert', 'Com_update', 'Com_replace'), variable_value, 0)) AS `Total writes`,
  ROUND((@total_reads / @total_com * 100), 2) AS `Reads %`,
  ROUND((@total_writes / @total_com * 100), 2) AS `Writes %`
FROM information_schema.GLOBAL_STATUS\G
****************** 1. row **********************
Total: 67293
Total reads: 61300
Total writes: 5993
Reads %: 91.09
Writes %: 8.91
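The same ratio can be computed outside the database once the Com_* counters have been fetched. This is a minimal Python sketch, not part of the original slides: the function name read_write_ratio is hypothetical, and the split of the 5,993 writes across INSERT, UPDATE, DELETE and REPLACE is invented so that the totals match the sample output shown for this slide.

```python
# Counters that count as writes, matching the SQL query on the slide.
WRITE_COUNTERS = ("Com_delete", "Com_insert", "Com_update", "Com_replace")


def read_write_ratio(status):
    """Return (reads %, writes %) from a dict of Com_* status counters."""
    reads = int(status.get("Com_select", 0))
    writes = sum(int(status.get(name, 0)) for name in WRITE_COUNTERS)
    total = reads + writes
    if total == 0:
        return 0.0, 0.0
    return round(reads / total * 100, 2), round(writes / total * 100, 2)


# Com_select and the write total come from the slide; the per-statement
# split of the writes is a made-up illustration.
sample = {
    "Com_select": 61300,
    "Com_insert": 3000,
    "Com_update": 2500,
    "Com_delete": 400,
    "Com_replace": 93,
}

reads_pct, writes_pct = read_write_ratio(sample)
print(f"Reads %: {reads_pct}, Writes %: {writes_pct}")
```

Counters are cumulative since server start, so sampling them twice and using the difference gives a ratio for a specific time window.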

Nextcloud Database Access Patterns
Task | Queries | DB Connections
Notifications (every ~30 secs) | 18 SELECTs | 1
Open a file | 18 SELECTs, 1 INSERT, 1 UPDATE | 1
Upload a file | 170 SELECTs, 20 INSERTs, 15 UPDATEs | 5
Upload 5 files | 715 SELECTs, 100 INSERTs, 85 UPDATEs | 18
Delete a file | 138 SELECTs, 9 INSERTs, 29 UPDATEs, 5 DELETEs | 4
Delete 5 files | 358 SELECTs, 20 INSERTs, 82 UPDATEs, 25 DELETEs | 7
Move 1 file | 117 SELECTs, 5 INSERTs, 17 UPDATEs, 1 DELETE | 4
Move 5 files | 475 SELECTs, 25 INSERTs, 86 UPDATEs, 5 DELETEs | 10
Copy a directory containing 1 file | 189 SELECTs, 15 INSERTs, 17 UPDATEs | 6
Copy a directory containing 5 files | 309 SELECTs, 39 INSERTs, 45 UPDATEs | 6
** 1 Nextcloud user, connecting via web browser. Nextcloud Server running with standalone MariaDB, without Redis & load balancer.
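The per-task statement counts listed for this slide can be turned into a read share per task, which shows how read-heavy even file operations are. A minimal Python sketch, assuming the counts are copied by hand from the slide; the names TASKS and read_share are hypothetical.

```python
# (SELECTs, INSERTs, UPDATEs, DELETEs) per task, copied from the slide.
TASKS = {
    "Open a file": (18, 1, 1, 0),
    "Upload a file": (170, 20, 15, 0),
    "Upload 5 files": (715, 100, 85, 0),
    "Delete a file": (138, 9, 29, 5),
    "Delete 5 files": (358, 20, 82, 25),
    "Move 1 file": (117, 5, 17, 1),
    "Move 5 files": (475, 25, 86, 5),
    "Copy a directory containing 1 file": (189, 15, 17, 0),
    "Copy a directory containing 5 files": (309, 39, 45, 0),
}


def read_share(selects, inserts, updates, deletes):
    """Percentage of statements in a task that are SELECTs."""
    total = selects + inserts + updates + deletes
    return round(selects / total * 100, 1)


for task, counts in TASKS.items():
    print(f"{task}: {read_share(*counts)}% reads")
```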

Architecture Design

What is MariaDB Cluster?
[Diagram labels: WSREP API (x3), Galera]
● MariaDB Server patched with:
○ MySQL-writeset replication API (wsrep)
○ Galera wsrep provider library for group communication and writeset replication
● Linux only, XtraDB & InnoDB storage engines only
● Part of MariaDB since MariaDB 10.1
● 3 nodes for HA ("1-node cluster" is also possible)
● Features:
○ Active-active multi-master
○ Automatic membership control
○ Automatic node joining
○ Parallel slave applying
○ Native MySQL look & feel
● Benefits:
○ Almost no slave lag
○ No lost transactions
○ Read scalability
○ Small client latency

Supported Versions
MariaDB version | Galera version | EOL
10.1 | 3.x | 17th Oct 2020
10.2 | 3.x | 23rd May 2022
10.3 | 3.x | 25th May 2023
10.4 | 4.x | 18th June 2024
10.5 | 4.x | 24th June 2025

How does MariaDB Cluster work?
1. During the commit stage on the originator node:
a. Pre-commit the transaction on the node
b. Encapsulate it into a writeset (key, seqno, binlog events)
c. Broadcast the writeset to all nodes
2. After receiving the writeset, on all receiver nodes:
a. Galera certifies the replicated writeset
b. If certification succeeds, queue the replicated writeset to be applied
c. If certification fails, discard the writeset
3. Back on the originator node:
a. If certification succeeded (2b), return OK to the client
b. If certification failed (2c), return ERROR and roll back the transaction on the originator node
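The certification steps can be illustrated with a toy model. This is a simplified Python sketch, not Galera's actual implementation: it assumes a writeset conflicts when any of its keys was changed by a writeset the transaction had not yet seen, and all names (Writeset, Node, commit, the "files:42" key) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Writeset:
    seqno: int        # global order assigned when the writeset is broadcast
    last_seen: int    # newest seqno the transaction had seen when it executed
    keys: frozenset   # row keys modified by the transaction


class Node:
    def __init__(self):
        self.last_writer = {}   # key -> seqno of the last certified writeset
        self.apply_queue = []   # certified writesets waiting to be applied

    def certify(self, ws):
        """Pass only if no key was changed by a writeset the transaction never saw."""
        for key in ws.keys:
            if self.last_writer.get(key, 0) > ws.last_seen:
                return False            # step 2c: conflict, discard the writeset
        for key in ws.keys:
            self.last_writer[key] = ws.seqno
        self.apply_queue.append(ws)     # step 2b: queue for applying
        return True


def commit(nodes, ws):
    """Step 3: every node runs the same test, so the results agree."""
    results = [node.certify(ws) for node in nodes]
    return "OK" if all(results) else "ERROR"


nodes = [Node(), Node(), Node()]
first = Writeset(seqno=1, last_seen=0, keys=frozenset({"files:42"}))
second = Writeset(seqno=2, last_seen=0, keys=frozenset({"files:42"}))  # ran concurrently with first
third = Writeset(seqno=3, last_seen=2, keys=frozenset({"files:42"}))   # ran after seeing first
outcomes = [commit(nodes, ws) for ws in (first, second, third)]
print(outcomes)
```

The second writeset fails because it touches a row that the first writeset changed without the second transaction having seen that change. This is the kind of conflict that writing to a single node avoids.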

Nextcloud Writesets
Task | Queries | DB Connections | Writesets
Notifications (every ~30 secs) | 18 SELECTs | 1 | not listed
Open a file | 18 SELECTs, 1 INSERT, 1 UPDATE | 1 | 1
Upload a file | 170 SELECTs, 20 INSERTs, 15 UPDATEs | 5 | 33
Upload 5 files | 715 SELECTs, 100 INSERTs, 85 UPDATEs | 18 | 161
Delete a file | 138 SELECTs, 9 INSERTs, 29 UPDATEs, 5 DELETEs | 4 | 18
Delete 5 files | 358 SELECTs, 20 INSERTs, 82 UPDATEs, 25 DELETEs | 7 | 108
Move 1 file | 117 SELECTs, 5 INSERTs, 17 UPDATEs, 1 DELETE | 4 | 17
Move 5 files | 475 SELECTs, 25 INSERTs, 86 UPDATEs, 5 DELETEs | 10 | 60
Copy a directory containing 1 file | 189 SELECTs, 15 INSERTs, 17 UPDATEs | 6 | 10
Copy a directory containing 5 files | 309 SELECTs, 39 INSERTs, 45 UPDATEs | 6 | 30
** 1 Nextcloud user, connecting via web browser. Nextcloud Server running with standalone MariaDB, without Redis & load balancer.

MariaDB Cluster Performance Factors
● Network latency (RTT to the farthest node)
● Flow control (throttle the replication if necessary)
● Parallel slave threads (number of threads to use in applying slave write-sets)
● Hardware specs (CPU clock, cores, disk subsystem, RAM, swap, NIC, etc.)
● MariaDB + InnoDB configuration (I/O, threads, buffers, caches, etc.)

Design Concept
● Galera Cluster write and replication performance depend on:
○ Network latency: the lower the round trip time (RTT) from the originator node to the farthest cluster node, the better.
○ Flow control: Galera is only as fast as the slowest node in the cluster. Galera slows down its replication to let the slowest node keep up, in an attempt to minimize slave lag.
● Standard MySQL & InnoDB optimizations can be applied for buffers, caches, threads, queries and indexing.
● To take full advantage of a Galera cluster, use a database load balancer (reverse proxy), e.g.:
○ ProxySQL
○ MariaDB MaxScale
○ HAProxy
● The importance of a database load balancer:
○ Distribute database load to multiple servers
○ Split writes from reads to avoid Galera deadlocks (requires ProxySQL or MaxScale)
○ Automatically redirect connections to the next available node
○ Connection queueing and overload protection
○ Single endpoint for applications

Database Load Balancer
● Centralized:
○ Independent tier
○ Simple and easy to manage
○ Additional hosts for LBs
○ Combine with Virtual IP to eliminate SPOF
● Distributed:
○ Co-locate with each application server
○ Faster from application standpoint (caching, query rerouting, query firewall)
○ Harder to manage
○ Might affect the health check performance with too many LB instances
[Diagrams: Centralized Reverse Proxies; Distributed Reverse Proxies]

Read-Write Splitting
[Diagram labels: ProxySQL (Active), ProxySQL (Passive), Virtual IP Address, RW and RO connections, Primary writer, Backup writer (x2)]
● For the Nextcloud workload, read-write splitting is mandatory, because:
○ PHP PDO does not support multiple DB hosts per PDO instance.
○ Some of the Nextcloud tables are missing primary keys (which are critical for certification and conflict resolution in Galera replication). Thus, write to a single node in Galera to eliminate the risk of write conflicts.
● Use a database load balancer that supports read-write splitting natively:
○ ProxySQL (via query rules)
○ MariaDB MaxScale (via readwritesplit router)
○ HAProxy (via source IP hash, a.k.a. session stickiness)
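Rule-based read-write splitting can be pictured as a first-match list of patterns. This is a minimal Python sketch of the idea, not ProxySQL configuration: the hostgroup IDs and the function name route are hypothetical, and it ignores transactions, which a real proxy has to keep on a single node.

```python
import re

WRITER_HOSTGROUP = 10   # hypothetical ID for the single writer node
READER_HOSTGROUP = 20   # hypothetical ID for the read-only nodes

# Rules are checked in order; the first matching pattern wins.
RULES = [
    # Locking reads must go to the writer.
    (re.compile(r"^\s*SELECT\b.*\bFOR\s+UPDATE\b", re.IGNORECASE | re.DOTALL), WRITER_HOSTGROUP),
    # Every other SELECT can go to a reader.
    (re.compile(r"^\s*SELECT\b", re.IGNORECASE), READER_HOSTGROUP),
]


def route(sql):
    """Return the hostgroup for a statement; anything unmatched goes to the writer."""
    for pattern, hostgroup in RULES:
        if pattern.search(sql):
            return hostgroup
    return WRITER_HOSTGROUP


for statement in (
    "SELECT * FROM oc_file_locks",
    "SELECT * FROM oc_file_locks WHERE id = 1 FOR UPDATE",
    "UPDATE oc_file_locks SET ttl = 0 WHERE id = 1",
):
    print(route(statement), statement)
```

Defaulting unmatched statements to the writer is the conservative choice: a write sent to a reader could cause conflicts, while a read sent to the writer only costs some capacity.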

Disk I/O Subsystem
● There are two types of I/O operations on MySQL files:
○ Sequential I/O-oriented files:
■ InnoDB system tablespace (ibdata)
■ MySQL log files:
● Binary/Relay logs
● REDO logs (ib_logfile*)
● General logs
● Slow query logs
● Error log
○ Random I/O-oriented files:
■ InnoDB file-per-table data file (*.ibd)
● Storage layer is critical in InnoDB:
○ Use SSD or faster storage for the MySQL data directory (or at least for the random I/O-oriented files).
○ RAID 10 is generally good for all workloads.
○ Use the ext4 or XFS file system on Linux.
● Mount your file systems with the -o noatime option to skip updates to the last access time in inodes on the file system, which avoids some disk seeks.

Recommended Architecture I
Target Nextcloud active users: 1,000 - 10,000 users
[Diagram labels: Nextcloud App (APP x2), Virtual IP Address, ProxySQL (Active), ProxySQL (Passive), RW and RO connections]

Recommended Architecture II
Target Nextcloud active users: 10,000 - 100,000 users.
Beyond this, use the Nextcloud Federation feature.
[Diagram labels: Nextcloud App (APP x2), Virtual IP Address, ProxySQL (Active), ProxySQL (Passive), RW and RO connections, Redis (x3), Cache, Asynchronous slave]

Performance Tuning

Performance Tuning
● Tuning motivation:
○ Improve database read & write performance.
○ As few trade-offs as possible.
● Requirements and specs:
○ Database vendor and server version: MariaDB 10.5.4
○ Database cluster type: Galera cluster
○ Nextcloud version: Stable 19.0.1
○ Target Nextcloud users: 10,000 users
○ Target Nextcloud database size: ~10 GB
○ Database load balancer: ProxySQL 2.0.13
● Some of the suggested MariaDB configuration options are applicable to MySQL 5.6 & MySQL 5.7.

General MariaDB Tuning
Parameter | Recommended Value | Justification
max_connections | 500 | ProxySQL can handle thousands of client connections with connection pooling and multiplexing. Don't set this too high.
performance_schema | OFF | Disabled by default in MariaDB (enabled by default in MySQL). 3-5% performance overhead if enabled.
log_bin | OFF (unset) | Default is disabled. 10-15% performance overhead if enabled.
skip_name_resolve | ON | If you need to use a hostname or FQDN, make sure /etc/hosts on every Galera node maps all hosts that are going to connect to the cluster. Do not rely on a DNS resolver.
max_allowed_packet | 64M | The default is 16M. You can go up to 1G to improve packet transmission and speed up mysqldump. Increase in doubling steps (16M, 32M, 64M).
sync_binlog | OFF (unset) | Default is disabled (usually this would be enabled together with log_bin). 20-30% performance overhead if enabled.
*Based on MariaDB 10.5.4 default values.

InnoDB Tuning (1 of 2)
Parameter | Recommended Value | Justification
innodb_buffer_pool_size | 50 - 70% of RAM, specified with K, M, G. Example: 20G | Try to fit total data + index into memory to avoid hitting disk.
innodb_buffer_pool_instances | Start with 8, maximum is 64. | If the buffer pool is > 8G, start with 8. Go up to 16, 32, 64 incrementally if necessary. Specify a combination of innodb_buffer_pool_instances and innodb_buffer_pool_size so that each buffer pool instance is at least 1 GB.
tx_isolation | 'READ-COMMITTED' | Transaction isolation level recommended by Nextcloud.
innodb_io_capacity | SATA SSD: start with 1000. NVMe: start with 10000. | The default value is 200. Should be set to around the number of IOPS that the system can handle. Ideally, opt for a lower setting, as at higher values data is removed from the buffers too quickly, reducing the effectiveness of caching.
innodb_io_capacity_max | SATA SSD: start with 4000. NVMe: start with 20000. | The default value is 2000. Hard limit of innodb_io_capacity.
innodb_flush_method | 'O_DIRECT' | Default is fsync. Use this if the MariaDB data directory is located directly on a physical disk.
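The buffer pool sizing rules from this slide can be combined into a small calculation. This is a minimal Python sketch under stated assumptions: the function name suggest_buffer_pool and the default fraction of 0.6 are hypothetical, and using fewer than 8 instances for pools under 8 GB is one reading of the "at least 1 GB per instance" rule, not a value given on the slide.

```python
def suggest_buffer_pool(ram_gb, fraction=0.6):
    """Return (pool size in GB, instance count) for a host with ram_gb of RAM.

    fraction must stay within the 50-70% range recommended on the slide.
    """
    if not 0.5 <= fraction <= 0.7:
        raise ValueError("fraction should be between 0.5 and 0.7")
    pool_gb = max(1, int(ram_gb * fraction))
    # Start with 8 instances, but never let an instance drop below 1 GB.
    instances = max(1, min(8, pool_gb))
    return pool_gb, instances


pool_gb, instances = suggest_buffer_pool(32)
print(f"innodb_buffer_pool_size = {pool_gb}G")
print(f"innodb_buffer_pool_instances = {instances}")
```

The result is only a starting point. The slide's advice to raise the instance count to 16, 32 or 64 when necessary still applies.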

InnoDB Tuning (2 of 2)
Parameter | Recommended Value | Justification
innodb_read_io_threads | Number of CPU cores | Default is 4. Max is 64. The total number of innodb_*_io_threads should not surpass the total number of CPU cores available to MariaDB.
innodb_write_io_threads | Number of CPU cores | Default is 4. Max is 64. The total number of innodb_*_io_threads should not surpass the total number of CPU cores available to MariaDB.
innodb_log_file_size | 512M | Default is 96M. The larger the value, the less checkpoint flush activity is required in the buffer pool, saving disk I/O with a tradeoff of slower recovery. However, it doesn't really matter in Galera since we have other operational nodes to cover the service availability while the node is recovering.
innodb_flush_log_at_trx_commit | 0 | Default is 1 for full durability. This can be relaxed in Galera, since we always have other nodes to cover write durability. Set to 0 so the InnoDB log buffer is written to file once per second. Up to 100% to 200% improvement.

InnoDB Buffer Pool Best Practice
• Reading a row from disk is about 100,000 times slower than
reading the same row from RAM.
• Therefore, InnoDB uses its own buffer,
innodb_buffer_pool_size as the "working area".
• The first read of data copies the data to RAM, and then
subsequent reads take advantage of the faster performance
of RAM.
• Basic rule-of-thumb:
◦ If your database fits into the buffer pool, MariaDB will very likely not hit the disk.
◦ Try to reach 1000/1000 in the buffer pool hit rate.
mysql> SHOW ENGINE INNODB STATUS\G
...
----------------------
BUFFER POOL AND MEMORY
----------------------
...
Buffer pool hit rate 929 / 1000 ...
...
The hit rate of 929 / 1000 indicates that out of 1000
row reads, it was able to read the row in RAM 929
times. The remaining 71 times, it had to read the row
of data from disk.
This is not a bad ratio, but if your frequently-accessed
data fits fully in RAM, you'll see this ratio increase to
999 / 1000 or even round up to 1000 / 1000.
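The hit rate line can also be extracted programmatically, for example to track it over time. A minimal Python sketch, assuming the status text has already been captured as a string; the function name buffer_pool_hit_rate is hypothetical.

```python
import re


def buffer_pool_hit_rate(status_text):
    """Return the buffer pool hit rate as a fraction, or None if the line is absent."""
    match = re.search(r"Buffer pool hit rate\s+(\d+)\s*/\s*(\d+)", status_text)
    if match is None:
        return None
    hits, reads = int(match.group(1)), int(match.group(2))
    return hits / reads if reads else None


sample = """
----------------------
BUFFER POOL AND MEMORY
----------------------
Buffer pool hit rate 929 / 1000
"""

rate = buffer_pool_hit_rate(sample)
print(f"Hit rate: {rate:.1%}")
```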

Galera Tuning
Parameter | Recommended Value | Justification
wsrep_slave_threads | Number of CPU cores x 2 | The minimum value for the slave threads should be 2 x number of CPU cores, and must not exceed the wsrep_cert_deps_distance value.
wsrep_provider_options | "gcache.size=2G" | Default is 128M. The bigger the gcache size, the higher the chance that a joiner can join with incremental state transfer (IST). IST is faster and has less impact than a full state snapshot transfer (SST).
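The two constraints on wsrep_slave_threads can pull in opposite directions when the observed wsrep_cert_deps_distance is low. This is a minimal Python sketch, assuming the cap takes precedence over the 2 x cores target; that precedence and the function name suggest_wsrep_slave_threads are assumptions, not statements from the slide.

```python
import math


def suggest_wsrep_slave_threads(cpu_cores, cert_deps_distance):
    """Aim for 2 x CPU cores, capped by the observed wsrep_cert_deps_distance."""
    target = 2 * cpu_cores
    ceiling = max(1, math.floor(cert_deps_distance))
    return min(target, ceiling)


# cert_deps_distance is read from the wsrep_cert_deps_distance status variable;
# the values below are made up for illustration.
print(suggest_wsrep_slave_threads(8, 40.5))
print(suggest_wsrep_slave_threads(8, 5.2))
```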

Galera Best Practice
● If you opt for a multi-writer Galera setup, make sure every table has an explicit primary key defined.
○ Some tables in Nextcloud do not have a PK defined. Simply add a primary key with an auto-increment column.
● DDL operations (ALTER, CREATE, TRUNCATE, DROP) might cause the cluster to stall until the operation completes (when wsrep_OSU_method is 'TOI', the default).
○ Perform Nextcloud upgrades (which commonly contain DDL operations) during a maintenance window or low-peak hours, or use online schema change tools like pt-online-schema-change.
● Due to flow control, if one of the nodes slows down, the replication performance will very likely slow down as well.
○ Run heavy operations like backup, reporting or analytics during low-peak hours.
○ Attach an asynchronous replication slave that replicates from one of the Galera nodes.
/* Get tables without primary keys (also lists non-InnoDB tables and tables with FULLTEXT or SPATIAL indexes) */
SELECT DISTINCT
  CONCAT(t.table_schema, '.', t.table_name) AS tbl,
  t.engine,
  IF(ISNULL(c.constraint_name), 'NOPK', '') AS nopk,
  IF(s.index_type = 'FULLTEXT', 'FULLTEXT', '') AS ftidx,
  IF(s.index_type = 'SPATIAL', 'SPATIAL', '') AS gisidx
FROM information_schema.tables AS t
LEFT JOIN information_schema.key_column_usage AS c
  ON (t.table_schema = c.constraint_schema
      AND t.table_name = c.table_name
      AND c.constraint_name = 'PRIMARY')
LEFT JOIN information_schema.statistics AS s
  ON (t.table_schema = s.table_schema
      AND t.table_name = s.table_name
      AND s.index_type IN ('FULLTEXT', 'SPATIAL'))
WHERE t.table_schema NOT IN ('information_schema', 'performance_schema', 'mysql')
  AND t.table_type = 'BASE TABLE'
  AND (t.engine <> 'InnoDB'
       OR c.constraint_name IS NULL
       OR s.index_type IN ('FULLTEXT', 'SPATIAL'))
ORDER BY t.table_schema, t.table_name;
/* Add an auto-increment primary key */
ALTER TABLE nextcloud.oc_collres_accesscache
  ADD COLUMN `id` INT PRIMARY KEY AUTO_INCREMENT;

Caching
● Enable Redis to offload the transactional file locking job from the database cluster.
● Gives a performance boost on installations ranging from tens of thousands to hundreds of thousands of users.
○ E.g., when editing a single text file, database writes decrease by ~40-50% (26 writes vs 14 writes).
○ Reduces write contention on table oc_file_locks.
● Package php-redis must be installed.
○ If Redis is running on the same system as Nextcloud, configure it to listen on a Unix socket.
○ Redis Cluster is also supported, while authentication requires php-redis 4.2.1+.
○ With cluster mode enabled, there is no need to use a load balancer since php-redis is already cluster-aware (php-redis 3.0.0+).
● You can also use Redis as the PHP session handler (requires changes to php.ini).
// Nextcloud config.php
'filelocking.enabled' => true,
'memcache.locking' => '\OC\Memcache\Redis',
'memcache.distributed' => '\OC\Memcache\Redis',
'redis' => [
    'host' => '/var/run/redis/redis.sock',
    'port' => 0,
],
// Nextcloud config.php
'redis.cluster' => [
    'seeds' => [
        '192.168.10.101:6379',
        '192.168.10.103:6379',
        '192.168.10.102:6379',
    ],
    'timeout' => 0.0,
    'read_timeout' => 0.0,
    'failover_mode' => \RedisCluster::FAILOVER_ERROR,
    'password' => '',
],
; php.ini
session.save_handler = redis ; or rediscluster
session.save_path = "tcp://127.0.0.1:6379"
