4. MariaDB latest developments
2017
EIB (European Investment Bank): $27M investment
Alibaba: $27M investment
Microsoft: Platinum Sponsor
2018
MammothDB acquisition
Alibaba Cloud: ApsaraDB RDS for MariaDB TX
ServiceNow investment / Clustrix acquisition
Azure Database for MariaDB: GA
5. MariaDB releases
Feb 2010 MariaDB 5.1
Nov 2010 MariaDB 5.2
Feb 2012 MariaDB 5.3
Apr 2012 MariaDB 5.5
Mar 2014 MariaDB 10.0
Oct 2015 MariaDB 10.1
May 2017 MariaDB 10.2.6 GA
May 2018 MariaDB 10.3.7 GA
Nov 2018 MariaDB 10.4.0 Alpha
https://downloads.mariadb.org/mariadb/+releases/
https://github.com/MariaDB/server/releases
6. 12 Million Users in 45 Countries Trust Critical Business Data to MariaDB
Technology & Internet
Telecom
Retail & Ecommerce
Travel
Financial Services
Government & Education
Media & Social
8. Customer and Use Cases
• Multi-terabyte DB
• 80M transactions / month
• 50+ Node Cluster
• Multi-billion rows
• 600 Million reads/second
• 250 servers, 600G + 1.5T archive
• 10M travelers/quarter
• 4M transactions / month
• ~14TB in MariaDB production clusters
• Over 150 servers
• 150-200k queries / sec on the MariaDB Cluster
• 6TB and millions of Call Data Records
• Over 60 TB
• 70 million rows per day
• 4 billion impressions per month
• 3 to 10 TB
• Over a billion rows; most tables hold hundreds of millions of rows
• Over 5 TB in Pay Per click application
14. Michael “Monty” Widenius
Founder & CTO of MariaDB
The Soul of Open Source
“MariaDB was created to preserve openness and community, so that we can push ahead faster with the capabilities for tomorrow’s applications.”
17. Feature comparison
Feature                          | MariaDB | MySQL | EnterpriseDB (PostgreSQL) | Oracle
INTERSECT/EXCEPT                 | Yes     | No    | Yes                       | Yes
User-defined aggregate functions | Yes     | No    | Yes                       | Yes
Oracle compatibility: PL/SQL     | Yes     | No    | Proprietary               | Yes
Oracle compatibility: SEQUENCE   | Yes     | No    | Proprietary               | Yes
Temporal tables                  | Yes     | No    | No                        | Yes
Data obfuscation/masking         | Yes     | No    | No                        | Yes
Instant ADD COLUMN               | Yes     | Yes   | No                        | Yes
INVISIBLE column                 | Yes     | No    | No                        | Yes
COMPRESSED column                | Yes     | No    | No                        | No
Purpose-built storage engines    | Yes     | No    | No                        | No
29. Purpose-built storage: MyRocks
MyRocks introduction and production deployment - Yoshinori Matsunobu
https://www.slideshare.net/matsunobu/myrocks-introduction-and-production-deployment
M|18 How to use MyRocks with MariaDB - Sergei Petrunia
https://www.slideshare.net/MariaDB/m18-myrocks-in-mariadb
Diagram: Google Bigtable, LevelDB (Google), RocksDB (Facebook), MyRocks (Facebook), MyRocks (MariaDB)
Also shown: Apache HBase
44. Schema evolution: invisible columns
CREATE TABLE users (
  id INT PRIMARY KEY,
  name VARCHAR(50),
  bio TEXT(2000) COMPRESSED,
  secret VARCHAR(20) INVISIBLE
);
SQL Server = HIDDEN (period columns only)
Db2 = IMPLICITLY HIDDEN
Oracle = INVISIBLE
45. Schema evolution: invisible columns
CREATE TABLE users2 (
  id INT PRIMARY KEY,
  name VARCHAR(50),
  bio TEXT(2000) COMPRESSED,
  secret VARCHAR(20) INVISIBLE NOT NULL DEFAULT 'OOPS'
);
SQL Server = HIDDEN (period columns only)
Db2 = IMPLICITLY HIDDEN
Oracle = INVISIBLE
46. Invisible Columns : SELECT with *
SELECT * FROM users;
+----+---------+----------------------------------------+
| id | name    | bio                                    |
+----+---------+----------------------------------------+
|  1 | Shane   | Once deleted a table in production…    |
|  2 | William | Was caught listening to Spice Girls…   |
|  3 | Aneesh  | Was with William listening to…         |
+----+---------+----------------------------------------+
The secret column is hidden (invisible)
47. Invisible Columns: explicit column list
SELECT id, name, secret FROM users;
+----+---------+-------------+
| id | name    | secret      |
+----+---------+-------------+
|  1 | Shane   | Gojira      |
|  2 | William | Spice Girls |
|  3 | Aneesh  | Maria Carey |
+----+---------+-------------+
The column is returned when it is named explicitly
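The behaviour on slides 46 and 47 can be modelled in a few lines of Python. This is only a sketch of the visible effect, not MariaDB's implementation; `Column` and `expand_select` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    invisible: bool = False


# Column layout of the `users` table from slide 44.
USERS = [
    Column("id"),
    Column("name"),
    Column("bio"),
    Column("secret", invisible=True),
]


def expand_select(columns, requested=None):
    """Return the column names a query would output.

    requested=None models SELECT *, which skips invisible columns.
    An explicit list is returned unchanged once the names are validated.
    """
    if requested is None:
        return [c.name for c in columns if not c.invisible]
    known = {c.name for c in columns}
    missing = [n for n in requested if n not in known]
    if missing:
        raise KeyError(f"unknown column(s): {missing}")
    return list(requested)


star_columns = expand_select(USERS)
explicit_columns = expand_select(USERS, ["id", "name", "secret"])
print(star_columns)
print(explicit_columns)
```

The invisible flag only affects the `*` expansion; a column that is named explicitly is returned like any other.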
56. Temporal table query: as of a point in time
SELECT * FROM cust_notifications
FOR SYSTEM_TIME AS OF '2017-12-31'
WHERE cid = 1;
cid newsletter product_updates security_alerts
1 FALSE FALSE FALSE
57. Temporal table query: time range (BETWEEN)
SELECT * FROM cust_notifications
FOR SYSTEM_TIME BETWEEN '2018-02-01' AND '2018-03-30'
WHERE cid = 1;
cid newsletter product_updates security_alerts
1 FALSE FALSE TRUE
1 TRUE FALSE TRUE
BETWEEN includes the start and end
58. Temporal table query: time range (FROM ... TO)
SELECT * FROM cust_notifications
FOR SYSTEM_TIME FROM '2018-02-01' TO '2018-03-30'
WHERE cid = 1;
cid newsletter product_updates security_alerts
1 TRUE FALSE TRUE
FROM includes the start, but not the end
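The difference between AS OF, BETWEEN and FROM ... TO comes down to how the end of the range is compared. The Python sketch below models each row version as a half-open validity interval. The version history (`HISTORY`) and its dates are assumptions chosen so that the three queries return the rows shown on slides 56 to 58; the slides do not state the actual row timestamps.

```python
from datetime import date

FOREVER = date.max

# Hypothetical history for cid = 1: (row_start, row_end, values).
# values = (newsletter, product_updates, security_alerts).
# Each version is valid over the half-open interval [row_start, row_end).
HISTORY = [
    (date(2017, 1, 1), date(2018, 2, 1), (False, False, False)),
    (date(2018, 2, 1), date(2018, 3, 30), (True, False, True)),
    (date(2018, 3, 30), FOREVER, (False, False, True)),
]


def as_of(history, t):
    """FOR SYSTEM_TIME AS OF t: the version valid at instant t."""
    return [v for start, end, v in history if start <= t < end]


def from_to(history, lo, hi):
    """FOR SYSTEM_TIME FROM lo TO hi: hi itself is excluded."""
    return [v for start, end, v in history if start < hi and end > lo]


def between(history, lo, hi):
    """FOR SYSTEM_TIME BETWEEN lo AND hi: hi itself is included."""
    return [v for start, end, v in history if start <= hi and end > lo]


lo, hi = date(2018, 2, 1), date(2018, 3, 30)
rows_as_of = as_of(HISTORY, date(2017, 12, 31))
rows_from_to = from_to(HISTORY, lo, hi)
rows_between = between(HISTORY, lo, hi)
print(rows_as_of, rows_from_to, rows_between, sep="\n")
```

In this model the only row that separates the two range forms is the version that became visible exactly at the end of the range.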
59. Temporal tables: partitioning
CREATE TABLE cust_notifications (
  cid INT WITHOUT SYSTEM VERSIONING,
  status VARCHAR(10),
  newsletter BOOLEAN,
  product_updates BOOLEAN,
  security_alerts BOOLEAN
) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME INTERVAL 1 YEAR (
  PARTITION p_year_one HISTORY,
  PARTITION p_year_two HISTORY,
  PARTITION p_year_three HISTORY,
  PARTITION p_year_current CURRENT
);
60. Temporal tables: deleting change history
DELETE HISTORY FROM cust_notifications;
DELETE HISTORY FROM cust_notifications
BEFORE SYSTEM_TIME '2018-01-01';
ALTER TABLE cust_notifications
DROP PARTITION p_year_three;
62. MariaDB Server 10.3+ : SEQUENCE
CREATE SEQUENCE seq_customer_id
START WITH 100 INCREMENT BY 10;
SELECT seq_customer_id.NEXTVAL;
CREATE TABLE customers (
  id INT DEFAULT seq_customer_id.NEXTVAL
);
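As a rough illustration of what `START WITH 100 INCREMENT BY 10` produces, the following Python sketch mimics a sequence and its use as a column default. The `Sequence` class and `insert_customer` helper are hypothetical; they only model the numbering, not caching, cycling or concurrency.

```python
class Sequence:
    """Toy counterpart of CREATE SEQUENCE ... START WITH s INCREMENT BY i."""

    def __init__(self, start=1, increment=1):
        self._next = start
        self._increment = increment

    def nextval(self):
        value = self._next
        self._next += self._increment
        return value


seq_customer_id = Sequence(start=100, increment=10)

customers = []


def insert_customer(row_id=None):
    # Models `id INT DEFAULT seq_customer_id.NEXTVAL`:
    # the sequence is consulted only when no id is supplied.
    if row_id is None:
        row_id = seq_customer_id.nextval()
    customers.append({"id": row_id})
    return row_id


generated = [insert_customer() for _ in range(3)]
explicit = insert_customer(7)
print(generated, explicit)
```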
63. PL/SQL compatibility
Data types: VARCHAR2, NUMBER, DATE, RAW, BLOB, CLOB
Variable declarations: %TYPE
Records: %ROWTYPE
Control statements: IF THEN, CASE WHEN, LOOP/END LOOP, WHILE
Static SQL: CURRVAL, NEXTVAL
Dynamic SQL: EXECUTE IMMEDIATE USING
64. PL/SQL compatibility
Implicit cursors: SQL%ISOPEN, SQL%FOUND, SQL%NOTFOUND, SQL%ROWCOUNT
Explicit cursors: CURSOR IS, FETCH INTO, parameters, FOR IN LOOP
Blocks: DECLARE, BEGIN, EXCEPTION, WHEN THEN, END
Stored procedures: CREATE OR REPLACE PROCEDURE IS|AS, OUT, IN OUT
Functions: CREATE OR REPLACE FUNCTION AS|IS
Triggers: CREATE OR REPLACE TRIGGER, BEFORE|AFTER, FOR EACH ROW, NEW, OLD
Packages: CREATE PACKAGE, CREATE PACKAGE BODY
79. Get Started with MariaDB
1. Download: https://mariadb.com/downloads
2. Read the technical overviews: https://mariadb.com/resources/datasheets-guides
3. Knowledge Base: https://mariadb.com/kb
4. Smart Style TECH BLOG: https://www.s-style.co.jp/blog
88. MariaDB ColumnStore Architecture
Clients: BI Tool | SQL Client | Custom Big Data App | Application
Layers: MariaDB SQL Front End | Distributed Query Engine | Data Storage
Modules: User Module (UM) | Performance Module (PM)
Columnar Distributed Data Storage: Local Storage | SAN | EBS | GlusterFS | Ceph
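89. Row-oriented vs. column-oriented
• Row-oriented (InnoDB etc.)
– Each row is written sequentially to the data file
– Every row is scanned at query time
• Column-oriented (ColumnStore)
– Each column is stored in its own data file
– Only the columns relevant to the query are scanned
Row-oriented layout:
ID | Fname    | Lname | State | Zip   | Phone          | Age | Sex
1  | Bugs     | Bunny | NY    | 11217 | (718) 938-3235 | 34  | M
2  | Yosemite | Sam   | CA    | 95389 | (209) 375-6572 | 52  | M
3  | Daffy    | Duck  | NY    | 10013 | (212) 227-1810 | 35  | M
4  | Elmer    | Fudd  | ME    | 04578 | (207) 882-7323 | 43  | M
5  | Witch    | Hazel | MA    | 01970 | (978) 744-0991 | 57  | F
Column-oriented layout:
ID: 1, 2, 3, 4, 5
Fname: Bugs, Yosemite, Daffy, Elmer, Witch
Lname: Bunny, Sam, Duck, Fudd, Hazel
State: NY, CA, NY, ME, MA
Zip: 11217, 95389, 10013, 04578, 01970
Phone: (718) 938-3235, (209) 375-6572, (212) 227-1810, (207) 882-7323, (978) 744-0991
Age: 34, 52, 35, 43, 57
Sex: M, M, M, M, F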
SELECT Fname FROM table1 WHERE State = 'NY'
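To make the I/O difference concrete, the Python sketch below runs the query above against both layouts and counts how many values each one has to read. It is a deliberately simplified model (real engines read pages or blocks, not single values), and the counting functions are hypothetical helpers.

```python
COLUMNS = ("ID", "Fname", "Lname", "State", "Zip", "Phone", "Age", "Sex")

# Row-oriented layout: one tuple per row (sample data from the slide).
ROWS = [
    (1, "Bugs", "Bunny", "NY", "11217", "(718) 938-3235", 34, "M"),
    (2, "Yosemite", "Sam", "CA", "95389", "(209) 375-6572", 52, "M"),
    (3, "Daffy", "Duck", "NY", "10013", "(212) 227-1810", 35, "M"),
    (4, "Elmer", "Fudd", "ME", "04578", "(207) 882-7323", 43, "M"),
    (5, "Witch", "Hazel", "MA", "01970", "(978) 744-0991", 57, "F"),
]

# Column-oriented layout: one list per column.
COLUMN_STORE = {
    name: [row[i] for row in ROWS] for i, name in enumerate(COLUMNS)
}


def query_row_store(rows):
    """Scan whole rows; every value of every row is read."""
    state_idx, fname_idx = COLUMNS.index("State"), COLUMNS.index("Fname")
    values_read, result = 0, []
    for row in rows:
        values_read += len(row)
        if row[state_idx] == "NY":
            result.append(row[fname_idx])
    return result, values_read


def query_column_store(store):
    """Read the State column, then only the matching Fname entries."""
    states = store["State"]
    matches = [i for i, state in enumerate(states) if state == "NY"]
    result = [store["Fname"][i] for i in matches]
    return result, len(states) + len(matches)


row_result, row_values_read = query_row_store(ROWS)
col_result, col_values_read = query_column_store(COLUMN_STORE)
print(row_result, row_values_read)
print(col_result, col_values_read)
```

Both layouts return the same names; only the amount of data touched differs, and the gap widens as the table gains columns the query does not need.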
90. Scalability
• MPP architecture
– Linear scalability
• Horizontal scale-out
– Scale out by adding PM nodes
– Read queries keep running while nodes are being added
Diagram layers: User Module (UM), Performance Module (PM), Data Storage
Shared-Nothing Distributed Data Storage, compressed by default
91. ColumnStore storage architecture
• Columnar storage
– Each column is stored in a separate file
– No index definitions required
– Schema changes can be made online
• Automatic horizontal partitioning (extents)
– A new partition every 8,000,000 rows
– The minimum/maximum values of each partition are kept as metadata
– No manual partition management required
• Data compression
– Reduces table I/O
Data automatically arranged by
• Column – Acts as vertical partitioning
• Extents – Act as horizontal partitioning
Diagram: Column 1, Column 2, Column 3, ... Column N are vertical partitions; Extent 1 (8 million rows, 8MB~64MB), Extent 2 (8 million rows), ... Extent M (8 million rows) are horizontal partitions.
92. Fast data import (cpimport)
• Fast parallel bulk load
– Loads into all PMs in parallel
– Several tables can be loaded at the same time
– Read queries can run while data is being loaded
• Streaming data load (CDC)
Diagram: Column 1, Column 2, ... Column N, each split into Extent 1 (8 million rows, 8MB~64MB), Extent 2 (8 million rows), ... Extent M (8 million rows) as horizontal partitions. A High Water Mark separates the data accessed by running queries from the new data being loaded.
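The high water mark idea from the diagram can be sketched as follows. This is a toy model under stated assumptions, not cpimport's internals: the `ColumnExtent` class and its method names are hypothetical, and the "commit" step simply stands for whatever makes a finished load visible.

```python
class ColumnExtent:
    """Toy append-only column segment with a high water mark (HWM)."""

    def __init__(self):
        self._values = []
        self._hwm = 0  # rows below this index are visible to queries

    def begin_query(self):
        # A query remembers the HWM it started with.
        return self._hwm

    def scan(self, hwm):
        # Only rows below the remembered HWM are read.
        return list(self._values[:hwm])

    def load(self, batch):
        # New rows land above the HWM and stay invisible for now.
        self._values.extend(batch)

    def commit(self):
        # Publishing the load moves the HWM to the end of the data.
        self._hwm = len(self._values)


extent = ColumnExtent()
extent.load(["NY", "CA", "NY"])
extent.commit()

running_query = extent.begin_query()
extent.load(["ME", "MA"])  # bulk load in progress
during_load = extent.scan(running_query)

extent.commit()
after_load = extent.scan(extent.begin_query())
print(during_load, after_load)
```

Because readers never look above the mark they started with, the loader can append without blocking them.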
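93. Fast query execution
• The MariaDB Front End on the UM parses the query
• The Storage Engine Plugin distributes the query to the PMs
• The PMs execute the query in parallel across the cluster
• GROUP BY / aggregation functions run against each PM's local data
• The PMs return intermediate query results to the UM
Massively parallel, distributed query processing, Shared nothing architecture
Diagram: SQL enters the User Module (UM); the UM sends column primitives down to the Performance Modules (PM); the PMs send intermediate results back up to the UM; the PMs sit on Shared Nothing Distributed Data Storage.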
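The split between local aggregation on the PMs and the final merge on the UM can be sketched in Python. The data, the function names `pm_partial_aggregate` and `um_merge`, and the use of a thread pool to stand in for separate nodes are all assumptions made for illustration; ColumnStore's actual primitives are not shown on the slide.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each PM owns its own (state, revenue) rows.
PM_DATA = [
    [("NY", 10), ("CA", 5), ("NY", 7)],
    [("CA", 3), ("ME", 4)],
    [("NY", 1), ("ME", 6), ("CA", 2)],
]


def pm_partial_aggregate(rows):
    """Runs on a PM: GROUP BY state over local rows only."""
    partial = defaultdict(lambda: [0, 0])  # state -> [sum, count]
    for state, revenue in rows:
        partial[state][0] += revenue
        partial[state][1] += 1
    return {state: tuple(acc) for state, acc in partial.items()}


def um_merge(partials):
    """Runs on the UM: combine intermediate results into final values."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for state, (total, count) in partial.items():
            merged[state][0] += total
            merged[state][1] += count
    return {
        state: {"sum": total, "count": count, "avg": total / count}
        for state, (total, count) in merged.items()
    }


# The thread pool stands in for PMs working in parallel.
with ThreadPoolExecutor() as pool:
    intermediate = list(pool.map(pm_partial_aggregate, PM_DATA))

result = um_merge(intermediate)
print(result)
```

Averages are carried as (sum, count) pairs because averaging the per-PM averages would give the wrong answer when shards differ in size.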
94. High query performance
Low-I/O storage architecture
• Only the columns relevant to the query are accessed
• Blocks that are not relevant to the query's WHERE/JOIN are not accessed
SELECT Fname FROM table1 WHERE State = 'NY'
Extent 1 (horizontal partition, 8 million rows, IDs 1 to 8M): Min State: CA, Max State: NY
Extent 2 (horizontal partition, 8 million rows, IDs 8M+1 to 16M): Min State: OR, Max State: WY (ELIMINATED PARTITION)
Extent 3 (horizontal partition, 8 million rows, IDs 16M+1 to 24M): Min State: IA, Max State: TN
Sample State values per extent:
Extent 1: NY, CA, NY, ME, ..., MN
Extent 2: WY, TX, OR, ..., VA
Extent 3: TN, IA, NY, ..., PA
Each column (ID, Fname, Lname, State, Zip, Phone, Age, Sex) is a vertical partition.
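Extent elimination with min/max metadata reduces to a range check per extent. The sketch below applies it to the three extents on this slide for `State = 'NY'`; the `Extent` class and function names are hypothetical, and only equality predicates are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    name: str
    min_state: str
    max_state: str


# Min/max metadata as shown on the slide.
EXTENTS = [
    Extent("Extent 1", "CA", "NY"),
    Extent("Extent 2", "OR", "WY"),
    Extent("Extent 3", "IA", "TN"),
]


def may_contain(extent, value):
    """An extent can hold `value` only if min <= value <= max."""
    return extent.min_state <= value <= extent.max_state


def plan_scan(extents, value):
    """Split extents into those to scan and those skipped outright."""
    to_scan = [e.name for e in extents if may_contain(e, value)]
    eliminated = [e.name for e in extents if not may_contain(e, value)]
    return to_scan, eliminated


to_scan, eliminated = plan_scan(EXTENTS, "NY")
print("scan:", to_scan)
print("eliminated:", eliminated)
```

The check can produce false positives (an extent whose range covers the value but holds no matching row), never false negatives, so skipping is always safe.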
95. Analytics
• Analytics
– Complex joins, aggregation, window functions
– UDFs (user-defined functions)
– Cross Engine Join with other storage engines such as InnoDB
• BI connectivity
– Integration with existing BI tools through Java, C/C++, and ODBC connectors
– e.g. Tableau (certified)
Daily Running Average Revenue for each item
SELECT item_id, server_date, revenue,
AVG(revenue) OVER
(PARTITION BY item_id ORDER BY server_date
RANGE INTERVAL '1' DAY PRECEDING) running_avg
FROM web_item_sales;
Item ID | Server_date | Revenue   | Running Average
1       | 02-01-2014  | 20,000.00 | 20,000.00
1       | 02-02-2014  | 5,001.00  | 12,500.50
2       | 02-01-2014  | 15,000.00 | 15,000.00
2       | 02-04-2014  | 34,029.00 | 34,029.00
2       | 02-05-2014  | 7,138.00  | 20,583.50
3       | 02-01-2014  | 17,250.00 | 17,250.00
3       | 02-03-2014  | 25,010.00 | 25,010.00
3       | 02-04-2014  | 21,034.00 | 23,022.00
3       | 02-05-2014  | 4,120.00  | 12,577.00
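The window frame `RANGE INTERVAL '1' DAY PRECEDING` covers, for each row, the rows of the same item dated from one day earlier up to the row's own date. A minimal Python sketch of that frame, using the revenue figures from the slide and assuming the dates are written MM-DD-YYYY:

```python
from datetime import date, timedelta

# (item_id, server_date, revenue) from the slide; dates read as MM-DD-YYYY.
SALES = [
    (1, date(2014, 2, 1), 20000.00),
    (1, date(2014, 2, 2), 5001.00),
    (2, date(2014, 2, 1), 15000.00),
    (2, date(2014, 2, 4), 34029.00),
    (2, date(2014, 2, 5), 7138.00),
    (3, date(2014, 2, 1), 17250.00),
    (3, date(2014, 2, 3), 25010.00),
    (3, date(2014, 2, 4), 21034.00),
    (3, date(2014, 2, 5), 4120.00),
]


def running_avg(rows, preceding=timedelta(days=1)):
    """AVG(revenue) per item over [server_date - preceding, server_date]."""
    out = []
    for item_id, day, revenue in rows:
        frame = [
            r
            for other_item, other_day, r in rows
            if other_item == item_id and day - preceding <= other_day <= day
        ]
        out.append((item_id, day, revenue, sum(frame) / len(frame)))
    return out


averages = [avg for _, _, _, avg in running_avg(SALES)]
for row in running_avg(SALES):
    print(row)
```

Because the frame is defined by date range rather than row count, a gap in the dates (for example item 2 between 02-01 and 02-04) leaves the row alone in its frame.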
96. Security / HA
• Enterprise security
– Access control through SSL and roles
• Flexible platform choice
– On-premises
– Cloud
• High availability (HA)
– Automatic UM failover
– Automatic PM failover (using EBS, GlusterFS, or a similar distributed file system)
Diagram layers: User Module (UM), Performance Module (PM), Data Storage
Shared-Nothing Distributed Data Storage, compressed by default
99. IHME - Institute for Health Metrics and Evaluation
● Use Case:
○ Public Health Data Analytics
● Competition:
○ Percona InnoDB, MemSQL
● Why ColumnStore:
○ InnoDB (Percona) reached its performance limit with maximum tuning for 4 TB of data
○ ColumnStore easy to use
● Data Volume: Started with 4.2 TB, with a goal of reaching 30 TB of data in 5 years
Wrong database/storage engine (MySQL InnoDB) for analytics use case
http://www.healthdata.org/results/data-visualizations
105. ClustrixDB: scale-out, fault-tolerant, MySQL-compatible
ClustrixDB
ACID Compliant
Transactions & Joins
Optimized for OLTP
Built-In Fault Tolerance
Flex-Up and Flex-Down
Minimal DB Admin
Built to run in the Cloud
Also runs great in the Data Center