MySQL and MonetDB
Benchmarks
A corrected comparison between the databases
Author: Tyler Weatherby
Advisor: Dr. Feng Yu
Overview
• History
• The Difference Between Tables
• Engine Background
• Goals of This Project
• TPC-H Background
• Installing TPC-H
• Main Project Issue
• Issue Resolved
• Expansion of the Original Data
• Creating Tables and Loading Data
• Query Scripts
• Graphical Interpretation of Results
• Numeric Interpretation of Results
• Breakdown and Comparison
• Challenges Encountered
• Interpretation
• Possibilities to Further Expand
• Conclusion
History
• MySQL: Developed 1994
• MySQL acquired in 2008 by Sun Microsystems then by Oracle in 2010
• MySQL is dual-licensed: an open source (GPL) Community Edition and a proprietary commercial license
• MySQL is a row store database
• MonetDB: Developed around 1996
• MonetDB is open source and cross-platform
• R and Python support (2014-Present)
• MonetDB is a column store database
The Difference Between Tables
• Row Store
• Stores data by rows like a typical table
• Uses Primary and Foreign Keys
• Primary Key: Unique identifier
• Foreign Key: References a Primary Key in another table
• Column Store
• Stores data within the columns instead of rows
• Only the columns a query references need to be read
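The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration with made-up sample records; it does not show how MySQL or MonetDB store data internally.

```python
# Illustrative only: the same three records in a row layout and a column layout.
# The field names and values are made-up sample data.
rows = [
    {"id": 1, "name": "alpha", "balance": 10.0},
    {"id": 2, "name": "beta", "balance": 20.0},
    {"id": 3, "name": "gamma", "balance": 30.0},
]

# Column layout: one list per column; positions line up across columns.
columns = {
    "id": [r["id"] for r in rows],
    "name": [r["name"] for r in rows],
    "balance": [r["balance"] for r in rows],
}


def total_balance_row_store(table):
    # Every whole record is visited even though only one field is used.
    return sum(record["balance"] for record in table)


def total_balance_column_store(table):
    # Only the "balance" column is touched; "id" and "name" are never read.
    return sum(table["balance"])


print(total_balance_row_store(rows), total_balance_column_store(columns))
```

Both functions return the same total; the column layout simply reaches it without touching the other two columns.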
Engine Background
• MySQL: MyISAM: Stored on disk in three files
• Row store; the default engine before MySQL 5.5 (InnoDB is the default since then)
• MySQL: Memory: Contents loaded into memory
• Row store
• Vulnerable to crashes, hardware issues, and power loss
• MonetDB: Uses main memory for processing
• Column store
• Does not require all data be active in physical memory at once
Goals of This Project
• This project was intended to provide benchmark comparisons between MySQL engines and MonetDB
• Expand upon the existing benchmark data
• Provide fairness for an accurate interpretation
• Use TPC-H to achieve this goal and benchmark 1 GB of data
TPC-H Background
• Decision support benchmark
• Useful tools to quickly generate data
• Can handle large volumes of data
• Can produce queries with great complexity
• Generates 8 tables
• Some tables hold millions of records
Installing TPC-H: Step 1
• It is recommended that you make a directory to store the TPC-H files
• mkdir tpch
• Download the TPC-H files with the following command
• wget 'http://www.tpc.org/TPC_Documents_Current_Versions/download_programs/tools-download-request.asp?bm_type=TPC-H&bm_vers=2.17.2&mode=CURRENT-ONLY'
• Extract the downloaded files from the compressed archive into that directory
• unzip TPCH_FileName.zip -d tpch
Installing TPC-H: Step 2
• Create the makefile before building; it sets some parameters we need
• CC = gcc
• DATABASE = ORACLE
• MACHINE = LINUX
• WORKLOAD = TPCH
• After we have set the proper parameters for the machine, we can build TPC-H by simply running the following command
• make
• The TPC-H tools should now be installed
Main Project Issue: Running Time Analysis
• Claim: MonetDB was as much as 141,000 times faster than MySQL engines (InnoDB & MyISAM)
• The MySQL MyISAM engine queried the previous data with times ranging from ten to thirty minutes
• The original theory was to attribute this speed difference to the memory hierarchy
Issue Resolved: Not Memory Hierarchy
• Examination of the original data showed that the benchmark's proper table schema had not been followed
• Turns out that keys are useful in a database
• The old benchmarks are therefore invalid because they failed to provide fairness
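Why a missing key can change running time so much can be seen from a query plan. The sketch below is an assumption-laden illustration: it uses SQLite from Python's standard library as a stand-in (neither of the benchmarked databases), and the two table names are hypothetical. It only shows that the same lookup becomes a full scan when the key is left out of the schema.

```python
import sqlite3

# Assumption: SQLite stands in for a real server here; it is neither of the
# benchmarked databases. Table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part_no_key (p_partkey INTEGER, p_name TEXT)")
conn.execute(
    "CREATE TABLE part_with_key (p_partkey INTEGER PRIMARY KEY, p_name TEXT)"
)


def query_plan(table):
    # The last column of each EXPLAIN QUERY PLAN row describes the step taken.
    sql = f"EXPLAIN QUERY PLAN SELECT p_name FROM {table} WHERE p_partkey = 42"
    return " ".join(row[-1] for row in conn.execute(sql))


plan_no_key = query_plan("part_no_key")      # planner has to scan the table
plan_with_key = query_plan("part_with_key")  # planner can search by the key
print(plan_no_key)
print(plan_with_key)
```

The exact wording of the plan text varies between SQLite versions, so the sketch prints it rather than assuming a fixed string.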
Expansion of the Original Data
• Generated 1 GB of data using the TPC-H benchmarking tools
• -s is the scale factor in gigabytes, so -s 0.1 would be 100 MB and -s 1 would be 1 GB
• ./dbgen -s 1
• Generated queries using the TPC-H benchmarking tools
• ./qgen (random seed)
• After you’ve generated the data and queries, you can begin to focus on the database side of things
Creating Tables and Loading Data
• The tables are defined in the TPC-H documentation; there are 8 of them
• Loading the data into a MySQL table: the MySQL client must be started from the directory that holds the *.tbl files, because the relative file name is resolved from wherever the user started the program
• LOAD DATA LOCAL INFILE 'supplier.tbl' INTO TABLE supplier FIELDS TERMINATED BY '|';
• Loading the data into MonetDB tables was a bit trickier
• copy into customer from '/home/teweatherby/tpch_2_17_0/dbgen/1g/customer.tbl';
• In MonetDB you must give the full (absolute) path of the file to load data into the table!
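Both load statements above rely on the *.tbl files being pipe-delimited text. As a rough sketch of that format, the Python function below splits one record into fields. The sample line is made up for illustration (it is not real dbgen output), and the function name is hypothetical; dbgen records normally end with a trailing '|', which is why the function drops one before splitting.

```python
def parse_tbl_line(line):
    """Split one pipe-delimited record into its fields."""
    record = line.rstrip("\n")
    # dbgen normally ends each record with a trailing '|'; drop exactly one
    # so it does not produce a spurious empty last field.
    if record.endswith("|"):
        record = record[:-1]
    return record.split("|")


# Made-up sample record with seven fields, not real dbgen output.
sample = "1|Supplier#1|some address|17|27-918-335-1736|5755.94|a comment|\n"
fields = parse_tbl_line(sample)
print(len(fields), fields[0], fields[-1])
```

Checking a few lines this way before loading is a cheap way to confirm that the field count matches the number of columns in the target table.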
Query 1: 2.sql
select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
from part, supplier, partsupp, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey
and p_size = 4 and p_type like '%STEEL' and s_nationkey = n_nationkey
and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST'
and ps_supplycost = (select min(ps_supplycost)
from partsupp, supplier, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey
and s_nationkey = n_nationkey and n_regionkey = r_regionkey
and r_name = 'MIDDLE EAST') order by s_acctbal desc, n_name, s_name, p_partkey;
Note: Spacing reduced to preserve readability
Query 2: 3.sql
select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue,
o_orderdate, o_shippriority from customer, orders, lineitem
where c_mktsegment = 'AUTOMOBILE' and c_custkey = o_custkey
and l_orderkey = o_orderkey and o_orderdate < date '1995-03-27' and
l_shipdate > date '1995-03-27' group by l_orderkey, o_orderdate,
o_shippriority order by revenue desc, o_orderdate;
Note: Spacing reduced to preserve readability
Query 3: 18.sql
select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice,
sum(l_quantity) from customer, orders, lineitem
where o_orderkey in (select l_orderkey from lineitem
group by l_orderkey having sum(l_quantity) > 315)
and c_custkey = o_custkey and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate,
o_totalprice order by o_totalprice desc, o_orderdate;
Note: Spacing reduced to preserve readability
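Each query was run for three trials. The slides do not say how the times were captured, so the following Python sketch is only one possible way to do it: `run_trials` and `fake_query` are hypothetical names, and `fake_query` is a placeholder for "send the query to the database and fetch all rows".

```python
import time


def run_trials(run_query, trials=3):
    """Call run_query `trials` times and return each elapsed time in ms."""
    timings_ms = []
    for _ in range(trials):
        start = time.perf_counter()
        run_query()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


# Hypothetical stand-in for executing a query such as 2.sql.
def fake_query():
    sum(range(100_000))


timings = run_trials(fake_query)
print([round(t, 2) for t in timings])
```

Keeping the individual trial times, rather than only their mean, makes it possible to see how stable each engine is from run to run.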
Results: Query 1: Three Trials Each
[Bar chart: Query 1 Results. Time (ms) for MyISAM (MySQL), Memory (MySQL), and MonetDB; Trial 1, Trial 2, Trial 3]
Time is listed in milliseconds
Results: Query 2: Three Trials Each
[Bar chart: Query 2 Results. Time (ms) for MyISAM (MySQL), Memory (MySQL), and MonetDB; Trial 1, Trial 2, Trial 3]
Time is listed in milliseconds
Results: Query 3: Three Trials Each
[Bar chart: Query 3 Results. Time (ms) for MyISAM (MySQL), Memory (MySQL), and MonetDB; Trial 1, Trial 2, Trial 3]
Time is listed in milliseconds
Total Results: Query 1
Time in milliseconds     MySQL (MyISAM)   MySQL (Memory)   MonetDB
Trial 1                  4930             1060             48
Trial 2                  4950             1090             51
Trial 3                  4970             1080             62
Average Running Time     4950             1077             54
Total Results: Query 2
Time in milliseconds     MySQL (MyISAM)   MySQL (Memory)   MonetDB
Trial 1                  6040             1810             138
Trial 2                  6070             1820             146
Trial 3                  6020             1800             136
Average Running Time     6043             1810             140
Total Results: Query 3
Time in milliseconds     MySQL (MyISAM)   MySQL (Memory)   MonetDB
Trial 1                  8440             5600             231
Trial 2                  8410             5530             209
Trial 3                  8420             5540             205
Average Running Time     8423             5557             215
Breakdown and Comparison
Query     MySQL (MyISAM)   MySQL (Memory)   MonetDB   MyISAM/Memory                   MyISAM/MonetDB                   Memory/MonetDB
Query 1   4950             1077             54        4.60 times faster than MyISAM   91.7 times faster than MyISAM    19.94 times faster than Memory
Query 2   6043             1810             140       3.34 times faster than MyISAM   43.16 times faster than MyISAM   12.93 times faster than Memory
Query 3   8423             5557             215       1.52 times faster than MyISAM   39.18 times faster than MyISAM   25.85 times faster than Memory
Note: When we say “… times faster than Memory” we are referring to the MySQL Memory engine
Time is listed in milliseconds
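The averages and speedup ratios can be recomputed from the trial times. The Python sketch below copies the trial values from the result tables; the function names are hypothetical. It assumes the ratios were taken from the averages rounded to whole milliseconds, which is what reproduces the published figures.

```python
# Trial times in milliseconds, copied from the result tables.
trials = {
    "Query 1": {
        "MyISAM": [4930, 4950, 4970],
        "Memory": [1060, 1090, 1080],
        "MonetDB": [48, 51, 62],
    },
    "Query 2": {
        "MyISAM": [6040, 6070, 6020],
        "Memory": [1810, 1820, 1800],
        "MonetDB": [138, 146, 136],
    },
    "Query 3": {
        "MyISAM": [8440, 8410, 8420],
        "Memory": [5600, 5530, 5540],
        "MonetDB": [231, 209, 205],
    },
}


def averages(per_engine):
    # Round to whole milliseconds, as the result tables do.
    return {engine: round(sum(t) / len(t)) for engine, t in per_engine.items()}


def speedups(avg):
    # Each ratio is "slower engine's average / faster engine's average".
    return {
        "MyISAM/Memory": round(avg["MyISAM"] / avg["Memory"], 2),
        "MyISAM/MonetDB": round(avg["MyISAM"] / avg["MonetDB"], 2),
        "Memory/MonetDB": round(avg["Memory"] / avg["MonetDB"], 2),
    }


summary = {query: speedups(averages(t)) for query, t in trials.items()}
for query, ratios in summary.items():
    print(query, averages(trials[query]), ratios)
```

Dividing the unrounded averages instead gives slightly different ratios in some cells, so the rounding step matters when comparing against the table.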
Challenges Encountered
• Learning Ubuntu command lines and proficiently manipulating the
environment
• Had to increase MySQL’s maximum in-memory table size to hold 1 GB of data in memory; otherwise MySQL reports a "table is full" error
• SET GLOBAL tmp_table_size = 1024 * 1024 * 1024 * 2; SET GLOBAL max_heap_table_size = 1024 * 1024 * 1024 * 2;
• MonetDB administrative structure
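The size used in the two SET GLOBAL statements is easier to read once the product is worked out. The short Python sketch below does the arithmetic; treating the 1 GB data set as 10**9 bytes of raw text is an assumption, and the in-memory size of the loaded tables may well be larger than the raw files.

```python
# The expression used in both SET GLOBAL statements.
limit_bytes = 1024 * 1024 * 1024 * 2

GIB = 1024 ** 3
print(limit_bytes, limit_bytes / GIB)  # the limit in bytes and in GiB

# Assumption: the 1 GB data set is counted as 10**9 bytes of raw text.
data_bytes = 10 ** 9
print(limit_bytes >= data_bytes)
```

The limit works out to 2 GiB, roughly twice the raw data size, which leaves some headroom for the Memory engine's own overhead.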
Interpretation
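• Certainly, MySQL Memory is faster than MySQL MyISAM (1.52 to 4.60 times in these queries)
• MonetDB is faster than the MySQL MyISAM engine on all three queries (39.18 to 91.7 times)
• MonetDB is also faster than the MySQL Memory engine on all three queries (12.93 to 25.85 times)
• Keys are useful for databases!!!
• Is MonetDB better?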
Possibilities to Further Expand
• Only query performance was compared; how would the databases perform for data modification?
• Is MonetDB simpler? Easier to understand?
• System resource limitation (memory)
• Other databases (Cassandra)
Conclusion
• Keys in a database matter
• MonetDB seems to have an edge over MySQL’s Memory engine
• MonetDB certainly has an advantage over MySQL’s MyISAM engine
• There are opportunities to further expand on this examination

Contenu connexe

Tendances

Troubleshooting redis
Troubleshooting redisTroubleshooting redis
Troubleshooting redisDaeMyung Kang
 
Version Control History and Git Basics
Version Control History and Git BasicsVersion Control History and Git Basics
Version Control History and Git BasicsSreedath N S
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howAltinity Ltd
 
Hadoop, Pig, and Twitter (NoSQL East 2009)
Hadoop, Pig, and Twitter (NoSQL East 2009)Hadoop, Pig, and Twitter (NoSQL East 2009)
Hadoop, Pig, and Twitter (NoSQL East 2009)Kevin Weil
 
Hdfs ha using journal nodes
Hdfs ha using journal nodesHdfs ha using journal nodes
Hdfs ha using journal nodesEvans Ye
 
Hive 3 - a new horizon
Hive 3 - a new horizonHive 3 - a new horizon
Hive 3 - a new horizonThejas Nair
 
대용량 로그분석 Bigquery로 간단히 사용하기 20160930
대용량 로그분석 Bigquery로 간단히 사용하기 20160930대용량 로그분석 Bigquery로 간단히 사용하기 20160930
대용량 로그분석 Bigquery로 간단히 사용하기 20160930Jaikwang Lee
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesEd Hunter
 
Redis in Practice
Redis in PracticeRedis in Practice
Redis in PracticeNoah Davis
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersCloudera, Inc.
 
Design cube in Apache Kylin
Design cube in Apache KylinDesign cube in Apache Kylin
Design cube in Apache KylinYang Li
 
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)NAVER D2
 
[261] 실시간 추천엔진 머신한대에 구겨넣기
[261] 실시간 추천엔진 머신한대에 구겨넣기[261] 실시간 추천엔진 머신한대에 구겨넣기
[261] 실시간 추천엔진 머신한대에 구겨넣기NAVER D2
 
Elastic Stack & Data pipeline
Elastic Stack & Data pipelineElastic Stack & Data pipeline
Elastic Stack & Data pipelineJongho Woo
 
Time-Series Apache HBase
Time-Series Apache HBaseTime-Series Apache HBase
Time-Series Apache HBaseHBaseCon
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase强 王
 

Tendances (20)

Troubleshooting redis
Troubleshooting redisTroubleshooting redis
Troubleshooting redis
 
Version Control History and Git Basics
Version Control History and Git BasicsVersion Control History and Git Basics
Version Control History and Git Basics
 
Hadoop Oozie
Hadoop OozieHadoop Oozie
Hadoop Oozie
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
Hadoop, Pig, and Twitter (NoSQL East 2009)
Hadoop, Pig, and Twitter (NoSQL East 2009)Hadoop, Pig, and Twitter (NoSQL East 2009)
Hadoop, Pig, and Twitter (NoSQL East 2009)
 
Introduction to Git and GitHub
Introduction to Git and GitHubIntroduction to Git and GitHub
Introduction to Git and GitHub
 
Hdfs ha using journal nodes
Hdfs ha using journal nodesHdfs ha using journal nodes
Hdfs ha using journal nodes
 
Hive 3 - a new horizon
Hive 3 - a new horizonHive 3 - a new horizon
Hive 3 - a new horizon
 
대용량 로그분석 Bigquery로 간단히 사용하기 20160930
대용량 로그분석 Bigquery로 간단히 사용하기 20160930대용량 로그분석 Bigquery로 간단히 사용하기 20160930
대용량 로그분석 Bigquery로 간단히 사용하기 20160930
 
File Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and ParquetFile Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and Parquet
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slides
 
Redis in Practice
Redis in PracticeRedis in Practice
Redis in Practice
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
 
Design cube in Apache Kylin
Design cube in Apache KylinDesign cube in Apache Kylin
Design cube in Apache Kylin
 
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)
 
[261] 실시간 추천엔진 머신한대에 구겨넣기
[261] 실시간 추천엔진 머신한대에 구겨넣기[261] 실시간 추천엔진 머신한대에 구겨넣기
[261] 실시간 추천엔진 머신한대에 구겨넣기
 
Elastic Stack & Data pipeline
Elastic Stack & Data pipelineElastic Stack & Data pipeline
Elastic Stack & Data pipeline
 
Time-Series Apache HBase
Time-Series Apache HBaseTime-Series Apache HBase
Time-Series Apache HBase
 
Facebook Messages & HBase
Facebook Messages & HBaseFacebook Messages & HBase
Facebook Messages & HBase
 

Similaire à MySQL vs MonetDB Bencharmarks

Monomi: Practical Analytical Query Processing over Encrypted Data
Monomi: Practical Analytical Query Processing over Encrypted DataMonomi: Practical Analytical Query Processing over Encrypted Data
Monomi: Practical Analytical Query Processing over Encrypted DataMostafa Arjmand
 
Real World Performance - Data Warehouses
Real World Performance - Data WarehousesReal World Performance - Data Warehouses
Real World Performance - Data WarehousesConnor McDonald
 
Secrets of highly_avail_oltp_archs
Secrets of highly_avail_oltp_archsSecrets of highly_avail_oltp_archs
Secrets of highly_avail_oltp_archsTarik Essawi
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...Amazon Web Services
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at AlibabaMichael Stack
 
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_SummaryHiram Fleitas León
 
EM12c: Capacity Planning with OEM Metrics
EM12c: Capacity Planning with OEM MetricsEM12c: Capacity Planning with OEM Metrics
EM12c: Capacity Planning with OEM MetricsMaaz Anjum
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...Edgar Alejandro Villegas
 
MariaDB ColumnStore
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStoreMariaDB plc
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelDaniel Coupal
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevAltinity Ltd
 
Silicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionSilicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionDaniel Coupal
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Database
Best Practices – Extreme Performance with Data Warehousing on Oracle DatabaseBest Practices – Extreme Performance with Data Warehousing on Oracle Database
Best Practices – Extreme Performance with Data Warehousing on Oracle DatabaseEdgar Alejandro Villegas
 
30334823 my sql-cluster-performance-tuning-best-practices
30334823 my sql-cluster-performance-tuning-best-practices30334823 my sql-cluster-performance-tuning-best-practices
30334823 my sql-cluster-performance-tuning-best-practicesDavid Dhavan
 
5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDB5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDBTim Callaghan
 
Challenges of Building a First Class SQL-on-Hadoop Engine
Challenges of Building a First Class SQL-on-Hadoop EngineChallenges of Building a First Class SQL-on-Hadoop Engine
Challenges of Building a First Class SQL-on-Hadoop EngineNicolas Morales
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightDataStax Academy
 

Similaire à MySQL vs MonetDB Bencharmarks (20)

MySQL vs. MonetDB
MySQL vs. MonetDBMySQL vs. MonetDB
MySQL vs. MonetDB
 
Monomi: Practical Analytical Query Processing over Encrypted Data
Monomi: Practical Analytical Query Processing over Encrypted DataMonomi: Practical Analytical Query Processing over Encrypted Data
Monomi: Practical Analytical Query Processing over Encrypted Data
 
Real World Performance - Data Warehouses
Real World Performance - Data WarehousesReal World Performance - Data Warehouses
Real World Performance - Data Warehouses
 
Secrets of highly_avail_oltp_archs
Secrets of highly_avail_oltp_archsSecrets of highly_avail_oltp_archs
Secrets of highly_avail_oltp_archs
 
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
AWS re:Invent 2016| DAT318 | Migrating from RDBMS to NoSQL: How Sony Moved fr...
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
 
memcached Distributed Cache
memcached Distributed Cachememcached Distributed Cache
memcached Distributed Cache
 
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
[DBA]_HiramFleitas_SQL_PASS_Summit_2017_Summary
 
EM12c: Capacity Planning with OEM Metrics
EM12c: Capacity Planning with OEM MetricsEM12c: Capacity Planning with OEM Metrics
EM12c: Capacity Planning with OEM Metrics
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...Best Practices –  Extreme Performance with Data Warehousing  on Oracle Databa...
Best Practices – Extreme Performance with Data Warehousing on Oracle Databa...
 
Performance Tuning
Performance TuningPerformance Tuning
Performance Tuning
 
MariaDB ColumnStore
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStore
 
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
Silicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in productionSilicon Valley Code Camp 2016 - MongoDB in production
Silicon Valley Code Camp 2016 - MongoDB in production
 
Best Practices – Extreme Performance with Data Warehousing on Oracle Database
Best Practices – Extreme Performance with Data Warehousing on Oracle DatabaseBest Practices – Extreme Performance with Data Warehousing on Oracle Database
Best Practices – Extreme Performance with Data Warehousing on Oracle Database
 
30334823 my sql-cluster-performance-tuning-best-practices
30334823 my sql-cluster-performance-tuning-best-practices30334823 my sql-cluster-performance-tuning-best-practices
30334823 my sql-cluster-performance-tuning-best-practices
 
5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDB5 Pitfalls to Avoid with MongoDB
5 Pitfalls to Avoid with MongoDB
 
Challenges of Building a First Class SQL-on-Hadoop Engine
Challenges of Building a First Class SQL-on-Hadoop EngineChallenges of Building a First Class SQL-on-Hadoop Engine
Challenges of Building a First Class SQL-on-Hadoop Engine
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
 

Dernier

Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendArshad QA
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....ShaimaaMohamedGalal
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfCionsystems
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 

Dernier (20)

Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Clustering techniques data mining book ....
Clustering techniques data mining book ....Clustering techniques data mining book ....
Clustering techniques data mining book ....
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Active Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdfActive Directory Penetration Testing, cionsystems.com.pdf
Active Directory Penetration Testing, cionsystems.com.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 

MySQL vs MonetDB Bencharmarks

  • 1. MySQL and MonetDB Benchmarks A corrected comparison between the databases Author: Tyler Weatherby Advisor: Dr. Feng Yu
  • 2. Overview • History • The Difference Between Tables • Engine Background • Goals of This Project • TPC-H Background • Installing TPC-H • Main Project Issue • Issue Resolved • Expansion of the Original Data • Creating Tables and Loading Data • Query Scripts • Graphical Interpretation of Results • Numeric Interpretation of Results • Breakdown and Comparison • Challenges Encountered • Interpretation • Possibilities to Further Expand • Conclusion
  • 3. History • MySQL: Developed 1994 • MySQL AB was acquired by Sun Microsystems in 2008; Oracle then acquired Sun in 2010 • MySQL is dual-licensed (GPL and commercial) • MySQL is a row store database • MonetDB: Developed around 1996 • MonetDB is open source and cross-platform • R and Python support (2014-Present) • MonetDB is a column store database
  • 4. The Difference Between Tables • Row Store • Stores data by rows like a typical table • Uses Primary and Foreign Keys • Primary Key: Unique identifier • Foreign Key: References a Primary Key in another table • Column Store • Stores data by columns instead of rows • Only the columns a query touches need to be read
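The row store versus column store distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-row table and its column names are made up, and neither MySQL nor MonetDB stores data as Python lists. It shows why an aggregate over one column needs to read less data in a columnar layout.

```python
# Hypothetical three-row table; the column names are made up for illustration.
rows = [
    {"id": 1, "name": "bolt", "price": 2.5},
    {"id": 2, "name": "nut", "price": 1.0},
    {"id": 3, "name": "gear", "price": 7.5},
]

# Row store: each record is kept together as one unit.
row_store = [tuple(r.values()) for r in rows]

# Column store: each column is kept together as its own array.
column_store = {key: [r[key] for r in rows] for key in rows[0]}


def total_price_row_store(table):
    # Walks every full record just to pick out the price field.
    return sum(record[2] for record in table)


def total_price_column_store(table):
    # Reads only the "price" column; "id" and "name" are never touched.
    return sum(table["price"])


row_total = total_price_row_store(row_store)
column_total = total_price_column_store(column_store)
print(row_total, column_total)
```

Both functions return the same total; the difference is how much of the table each one has to visit to get there.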
  • 5. Engine Background • MySQL: MYISAM: Stored on disk in three files • Row store; the default engine in MySQL versions before 5.5 • MySQL: Memory: Contents loaded into memory • Row store • Vulnerable to crashes, hardware issues, and power loss • MonetDB: Uses main memory for processing • Column store • Does not require all data to be active in physical memory at once
  • 6. Goals of This Project • This project was intended to provide benchmark comparisons between MySQL engines and MonetDB • Expand upon current benchmarked data • Provide fairness for an accurate interpretation • Use TPC-H to achieve this goal and benchmark 1GB of data
  • 7. TPC-H Background • Decision support benchmark • Useful tools to quickly generate data • Can handle large volumes of data • Can produce queries with great complexity • Generates 8 tables • Some tables have millions of records
  • 8. Installing TPC-H: Step 1 • It is recommended that you make a directory to store the TPC-H files • mkdir tpch • Download the TPC-H files with the following command (the URL is quoted so the shell does not interpret the & characters) • wget "http://www.tpc.org/TPC_Documents_Current_Versions/download_programs/tools-download-request.asp?bm_type=TPC-H&bm_vers=2.17.2&mode=CURRENT-ONLY" • Extract the downloaded archive into that directory • unzip TPCH_FileName.zip -d tpch
  • 9. Installing TPC-H: Step 2 • Create the makefile before building; this sets the parameters we need • CC = gcc, DATABASE = ORACLE • MACHINE = LINUX, WORKLOAD = TPCH • After setting the proper parameters for the machine, build TPC-H by running the following command • make • TPC-H should now be installed
  • 10. Main Project Issue: Running Time Analysis • Claim: MonetDB was as much as 141,000 times faster than MySQL engines (InnoDB & MYISAM) • In the previous data, queries on the MySQL MYISAM engine took from ten to thirty minutes • The original theory attributed this speed difference to the memory hierarchy
  • 11. Issue Resolved: Not Memory Hierarchy • Examination of the original data showed that the benchmark's proper table schema had not been followed • Turns out that keys are useful in a database • The old benchmarks are therefore invalid because they failed to provide fairness
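Why a missing key changes the picture so much can be sketched with Python's built-in sqlite3 module. SQLite is only a stand-in here, an assumption made so the example runs anywhere; it is not one of the engines benchmarked in these slides, and the two table names are hypothetical. The sketch asks the database how it would look up one supplier with and without a primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The same hypothetical table twice: once without a key, once with a primary key.
conn.execute("CREATE TABLE supplier_nokey (s_suppkey INTEGER, s_name TEXT)")
conn.execute("CREATE TABLE supplier_key (s_suppkey INTEGER PRIMARY KEY, s_name TEXT)")


def plan(table):
    # EXPLAIN QUERY PLAN reports how SQLite would execute the lookup;
    # the human-readable description is the last column of each result row.
    sql = f"EXPLAIN QUERY PLAN SELECT s_name FROM {table} WHERE s_suppkey = 42"
    return " ".join(row[-1] for row in conn.execute(sql))


plan_nokey = plan("supplier_nokey")
plan_key = plan("supplier_key")
print(plan_nokey)  # a full table scan
print(plan_key)    # a direct search using the primary key
```

Without the key, every row has to be examined; with it, the lookup goes straight to the matching row. On joins over millions of rows, that difference compounds.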
  • 12. Expansion of the Original Data • Generated 1 GB of data using the TPC-H benchmarking tools • -s sets the scale factor in gigabytes, so -s 0.1 would be 100 MB and -s 1 would be 1 GB • ./dbgen -s 1 • Generated queries using the TPC-H benchmarking tools • ./qgen (random seed) • After you’ve generated the data and queries, you can begin to focus on the database side of things
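To get a feel for what a scale factor means in rows rather than gigabytes, here is a small Python sketch. The base cardinalities are the nominal per-table figures from the TPC-H specification (lineitem is approximate there); the function name is made up for this example, and dbgen's actual output may differ slightly.

```python
# Nominal rows per table at scale factor 1, per the TPC-H specification.
BASE_ROWS = {
    "supplier": 10_000,
    "part": 200_000,
    "partsupp": 800_000,
    "customer": 150_000,
    "orders": 1_500_000,
    "lineitem": 6_000_000,  # approximate; the real count varies slightly
}

# These two tables do not grow with the scale factor.
FIXED_ROWS = {"nation": 25, "region": 5}


def estimated_rows(scale_factor):
    """Estimate the row count of each of the 8 tables for a dbgen -s value."""
    rows = {table: round(n * scale_factor) for table, n in BASE_ROWS.items()}
    rows.update(FIXED_ROWS)
    return rows


one_gb = estimated_rows(1)
hundred_mb = estimated_rows(0.1)
print(one_gb)
print(hundred_mb)
```

At -s 1, orders and lineitem are the tables with millions of records; at -s 0.1, no table reaches a million rows.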
  • 13. Creating Tables and Loading Data • Tables are defined in the TPC-H documentation; there are 8 of them • Loading the data into a MySQL table: the MySQL client must be started from the same directory as the *.tbl files (the path is relative to wherever the user started the program) • LOAD DATA LOCAL INFILE 'TableName.tbl' INTO TABLE supplier FIELDS TERMINATED BY '|'; • Loading the data into MonetDB tables was a bit trickier • copy into customer from '/home/teweatherby/tpch_2_17_0/dbgen/1g/customer.tbl'; • In MonetDB you have to give the full path of the file to load data into the table!
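The *.tbl files are plain text with '|' between fields, and dbgen also ends each record with a trailing '|'. A minimal Python sketch of reading one such record follows; the sample line is made up for illustration and is not real dbgen output, and the function name is hypothetical.

```python
def parse_tbl_line(line):
    """Split one '|'-delimited .tbl record into its fields."""
    record = line.rstrip("\n")
    # dbgen ends each record with one trailing '|'; drop it before splitting
    # so it does not produce an extra empty field.
    if record.endswith("|"):
        record = record[:-1]
    return record.split("|")


# Made-up sample in the shape of a region record (key, name, comment).
sample = "0|AFRICA|made-up comment|\n"
fields = parse_tbl_line(sample)
print(fields)
```

This is the same delimiter that the FIELDS TERMINATED BY '|' clause tells MySQL to expect.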
  • 14. Query 1: 2.sql select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment from part, supplier, partsupp, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and p_size = 4 and p_type like '%STEEL' and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST' and ps_supplycost = (select min(ps_supplycost) from partsupp, supplier, nation, region where p_partkey = ps_partkey and s_suppkey = ps_suppkey and s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'MIDDLE EAST') order by s_acctbal desc, n_name, s_name, p_partkey; Note: Spacing reduced to preserve readability
  • 15. Query 2: 3.sql select l_orderkey, sum(l_extendedprice * (1 - l_discount)) as revenue, o_orderdate, o_shippriority from customer, orders, lineitem where c_mktsegment = 'AUTOMOBILE' and c_custkey = o_custkey and l_orderkey = o_orderkey and o_orderdate < date '1995-03-27' and l_shipdate > date '1995-03-27' group by l_orderkey, o_orderdate, o_shippriority order by revenue desc, o_orderdate; Note: Spacing reduced to preserve readability
  • 16. Query 3: 18.sql select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity) from customer, orders, lineitem where o_orderkey in (select l_orderkey from lineitem group by l_orderkey having sum(l_quantity) > 315) and c_custkey = o_custkey and o_orderkey = l_orderkey group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice order by o_totalprice desc, o_orderdate; Note: Spacing reduced to preserve readability
  • 17. Results: Query 1: Three Trials Each • [Bar chart "Query 1 Results": Time (ms) for Trial 1, Trial 2, and Trial 3 on MYISAM (MySQL), Memory (MySQL), and MonetDB] • Time is listed in milliseconds
  • 18. Results: Query 2: Three Trials Each • [Bar chart "Query 2 Results": Time (ms) for Trial 1, Trial 2, and Trial 3 on MYISAM (MySQL), Memory (MySQL), and MonetDB] • Time is listed in milliseconds
  • 19. Results: Query 3: Three Trials Each • [Bar chart "Query 3 Results": Time (ms) for Trial 1, Trial 2, and Trial 3 on MYISAM (MySQL), Memory (MySQL), and MonetDB] • Time is listed in milliseconds
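The slides do not say how the three trials were timed. One possible approach, shown here as a hedged Python sketch, is to wrap each run in a wall-clock timer. The helper name is hypothetical, and an in-memory SQLite table stands in for the real databases purely so the example is runnable.

```python
import sqlite3
import time


def time_query(conn, sql, trials=3):
    """Run a query several times and return each wall-clock time in ms."""
    timings_ms = []
    for _ in range(trials):
        start = time.perf_counter()
        conn.execute(sql).fetchall()  # fetch all rows so the query fully runs
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


# Stand-in data: a small hypothetical table, not TPC-H.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

timings = time_query(conn, "SELECT sum(x) FROM t")
print(timings)
```

Repeating the query matters because the first run often pays one-time costs such as loading data into cache, which is one reason to report every trial rather than a single number.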
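  • 20. Total Results: Query 1 (time in milliseconds) • Trial 1: MySQL (MYISAM) 4930, MySQL (Memory) 1060, MonetDB 48 • Trial 2: MySQL (MYISAM) 4950, MySQL (Memory) 1090, MonetDB 51 • Trial 3: MySQL (MYISAM) 4970, MySQL (Memory) 1080, MonetDB 62 • Average Running Time: MySQL (MYISAM) 4950, MySQL (Memory) 1077, MonetDB 54
  • 21. Total Results: Query 2 (time in milliseconds) • Trial 1: MySQL (MYISAM) 6040, MySQL (Memory) 1810, MonetDB 138 • Trial 2: MySQL (MYISAM) 6070, MySQL (Memory) 1820, MonetDB 146 • Trial 3: MySQL (MYISAM) 6020, MySQL (Memory) 1800, MonetDB 136 • Average Running Time: MySQL (MYISAM) 6043, MySQL (Memory) 1810, MonetDB 140
  • 22. Total Results: Query 3 (time in milliseconds) • Trial 1: MySQL (MYISAM) 8440, MySQL (Memory) 5600, MonetDB 231 • Trial 2: MySQL (MYISAM) 8410, MySQL (Memory) 5530, MonetDB 209 • Trial 3: MySQL (MYISAM) 8420, MySQL (Memory) 5540, MonetDB 205 • Average Running Time: MySQL (MYISAM) 8423, MySQL (Memory) 5557, MonetDB 215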
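The average running times can be re-derived from the per-trial figures on the three results slides. The Python sketch below copies the trial times from those slides and rounds each mean to the nearest millisecond; the variable names are just for this example.

```python
from statistics import mean

# Trial times in milliseconds, copied from the Total Results slides.
trials_ms = {
    "Query 1": {
        "MySQL (MYISAM)": [4930, 4950, 4970],
        "MySQL (Memory)": [1060, 1090, 1080],
        "MonetDB": [48, 51, 62],
    },
    "Query 2": {
        "MySQL (MYISAM)": [6040, 6070, 6020],
        "MySQL (Memory)": [1810, 1820, 1800],
        "MonetDB": [138, 146, 136],
    },
    "Query 3": {
        "MySQL (MYISAM)": [8440, 8410, 8420],
        "MySQL (Memory)": [5600, 5530, 5540],
        "MonetDB": [231, 209, 205],
    },
}

# Mean of the three trials per engine, rounded to whole milliseconds.
averages = {
    query: {engine: round(mean(times)) for engine, times in engines.items()}
    for query, engines in trials_ms.items()
}

for query, engines in averages.items():
    print(query, engines)
```

The computed averages agree with the "Average Running Time" rows on the slides.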
  • 23. Breakdown and Comparison (average time in milliseconds) • Query 1: MySQL (MYISAM) 4950, MySQL (Memory) 1077, MonetDB 54; Memory is 4.60 times faster than MYISAM, MonetDB is 91.7 times faster than MYISAM, MonetDB is 19.94 times faster than Memory • Query 2: MySQL (MYISAM) 6043, MySQL (Memory) 1810, MonetDB 140; Memory is 3.34 times faster than MYISAM, MonetDB is 43.16 times faster than MYISAM, MonetDB is 12.93 times faster than Memory • Query 3: MySQL (MYISAM) 8423, MySQL (Memory) 5557, MonetDB 215; Memory is 1.52 times faster than MYISAM, MonetDB is 39.18 times faster than MYISAM, MonetDB is 25.85 times faster than Memory • Note: When we say “… times faster than Memory” we are referring to the MySQL Memory engine
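Each "times faster" figure is the slower engine's average time divided by the faster engine's average time. A short Python sketch of that arithmetic, using the averages from the breakdown slide (the function name is made up for this example):

```python
# Average times in ms per query: (MYISAM, Memory, MonetDB), from the slide.
averages_ms = {
    "Query 1": (4950, 1077, 54),
    "Query 2": (6043, 1810, 140),
    "Query 3": (8423, 5557, 215),
}


def speedups(myisam, memory, monetdb):
    """Return (Memory vs MYISAM, MonetDB vs MYISAM, MonetDB vs Memory)."""
    return (
        round(myisam / memory, 2),
        round(myisam / monetdb, 2),
        round(memory / monetdb, 2),
    )


ratios = {query: speedups(*times) for query, times in averages_ms.items()}
for query, values in ratios.items():
    print(query, values)
```

Rounded to two decimals, the ratios agree with the slide; the slide reports the Query 1 MonetDB versus MYISAM figure to one decimal place (91.7).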
  • 24. Challenges Encountered • Learning the Ubuntu command line and manipulating the environment proficiently • Had to increase MySQL’s maximum in-memory table size to store 1GB of data in memory; otherwise MySQL reports a "table is full" error • SET GLOBAL tmp_table_size = 1024 * 1024 * 1024 * 2; SET GLOBAL max_heap_table_size = 1024 * 1024 * 1024 * 2; • MonetDB administrative structure
  • 25. Interpretation • Certainly, MySQL Memory is faster than MySQL MYISAM • MonetDB is faster than the MySQL MYISAM engine • MonetDB seems to be faster than the MySQL Memory engine • Keys are useful for databases!!! • Is MonetDB better?
  • 26. Possibilities to Further Expand • Only querying was compared; how would the databases perform on modification? • Is MonetDB simpler? Easier to understand? • System resource limitation (memory) • Other databases (Cassandra)
  • 27. Conclusion • Keys in a database matter • MonetDB seems to have an edge over MySQL’s Memory Engine • MonetDB certainly has an advantage over MySQL’s MYISAM Engine • There are opportunities to further expand on this examination

Editor's Notes

  1. Aikin’s original data had shown MonetDB benchmarking anywhere from 10,000 to 141,000 times faster than the MySQL engines he tested. We speculated that the benchmarking software was just doing something weird, because it’s a benchmark. However, MySQL databases have been around for a while and are used in enterprise systems, so this couldn’t be a realistic benchmark.
  2. Aikin had forgotten the primary and foreign keys, invalidating most of his project.
  3. Mention that I did not think the original 100 MB of data was sufficient to benchmark again.
  4. Note the huge difference from Aikin’s original 141,000 times and 32,000 times faster under the MYISAM category.
  5. List the possible ways we can expand this study
  6. Concluding thoughts