How Database Convergence Impacts the
Coming Decades of Data Management
Nikita Shamgunov
CEO and co-founder of MemSQL
MemSQL at a Glance
MISSION
Growth of digital business impacting data architectures
We make every company a real-time enterprise
PRODUCT
Top Ranked Operational Data Warehouse
MemSQL provides you the ability to learn and react in real time
ABOUT
Founders are former Facebook and SQL Server database engineers
$85M in funding from top-tier investors
Enterprise customers: [logos not captured]
Converge Transactions and Analytics in a Relational Database
● New breed of applications
○ Analytics as part of a transaction
○ Analytics when the data is born
○ In-database AI/ML
● Scalable OLAP and OLTP in one system
○ Fewer systems to manage
○ Utility database consumption
○ Supports HTAP
Traditional + Future Architecture
[Diagram; labels: IoT data, social data, data integration, in-memory data store (RAM), HTAP apps, DMSA, analytic apps. Captions: "Transactions + Operational Analytics", "Traditional Reports and Analytics", "Analytics, Historical Reporting and Data Discovery".]
The New Data Architecture without DMSA
[Diagram; labels: IoT data, social data, in-memory data store (RAM), HTAP apps, analytic apps. Captions: "Transactions + Operational Reports", "Analytics, Historical Reporting and Data Discovery".]
The Enterprise Requires Performance
● FAST data loading
○ Stream data
○ Real-time loading
○ Full data access
● LOW query latency
○ Vectorized queries
○ Real-time dashboards
○ Live data access
● HIGH concurrency
○ Multi-threaded processing
○ Transactions and analytics
○ Scalable performance
North Star: Build a New Category of Databases
● Focus on analytics and deliver a hybrid cloud data warehouse
○ Hybrid-cloud
○ Scalable with integration to data lakes
○ Real-time
○ Simplicity
● Converge transactions and analytics
○ Transaction support
○ Multi-cloud reliability
○ Application support
Real-time and Query Performance
Goals: Eliminate batching and deliver instant results to user or app
● Investments
○ Streaming ingest
■ Kafka
■ Kinesis
○ Transactional consistency
■ Ability to change data rapidly
■ Ability to scale analytics to millions of requests per second to enable self-service customer analytics
○ Query performance
■ Scale out
■ Vectorization
● Results
○ Dramatic query performance improvements for BI use cases
○ PIPELINES adoption is growing
Simplicity
● Goals
○ No knobs where you don’t need them
○ Data warehousing workloads work out of the box
○ No hints for queries
○ No scaling limits
● Investments
○ Query optimization and query execution
● Timelines
○ Several releases in 2017
Access Methods
▪ Columnstore
• On disk with working set in memory
• Super fast scans
• Supports analytical and data warehousing workloads
• One index
• Petabyte scale
▪ Rowstore
• Fully in memory
• Submillisecond point updates
• Multiple indexes
Scale out and Transactional
▪ Supports multi-statement transactions
▪ Supports MVCC
▪ Scalable on commodity hardware
▪ Data hash partitioned and stored in two copies
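As a rough illustration of "hash partitioned and stored in two copies", here is a minimal Python sketch. The partition count, the node names, and the rule that places the second copy on the next node are all assumptions made for the example; the slides do not describe how MemSQL actually places partitions.

```python
import zlib

NUM_PARTITIONS = 8                                  # hypothetical partition count
NODES = ["leaf-0", "leaf-1", "leaf-2", "leaf-3"]    # hypothetical leaf nodes


def partition_for(shard_key: str) -> int:
    """Map a shard key to a partition using a stable hash (CRC32 here)."""
    return zlib.crc32(shard_key.encode("utf-8")) % NUM_PARTITIONS


def placement(partition: int) -> tuple:
    """Return (primary, replica); the replica always lives on a different node."""
    primary = NODES[partition % len(NODES)]
    replica = NODES[(partition + 1) % len(NODES)]
    return primary, replica


def route(shard_key: str) -> tuple:
    """Return (partition, primary node, replica node) for a row's shard key."""
    p = partition_for(shard_key)
    primary, replica = placement(p)
    return p, primary, replica
```

Because the hash is deterministic, every row with the same shard key lands on the same partition, and losing one node still leaves one copy of each partition available.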
Query Processing
MemSQL Confidential
Query Performance
▪ Group-By/Aggregate Performance
• Operations on encoded data
• Single-instruction multiple data (SIMD)
▪ Filter pushdown to column store
▪ Preference for dictionary compression
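To make "operations on encoded data" concrete, the following Python sketch counts rows per group using only the integer dictionary ids and decodes each distinct value once at the end. The function names are hypothetical and the code models the idea only, not MemSQL's implementation.

```python
def encode(values):
    """Dictionary-encode a column: return (dictionary, ids)."""
    dictionary = []   # id -> value
    index = {}        # value -> id
    ids = []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        ids.append(index[v])
    return dictionary, ids


def group_count_encoded(dictionary, ids):
    """COUNT(*) grouped by the encoded column, touching only integer ids."""
    counts = [0] * len(dictionary)
    for i in ids:
        counts[i] += 1
    # Decode once per distinct value, not once per row.
    return {dictionary[i]: c for i, c in enumerate(counts)}
```

The per-row work is an array increment on a small integer, which is the kind of loop that lends itself to SIMD execution.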
SIMD overview
▪ Intel AVX2
▪ 256-bit registers
▪ Pack multiple values per register
▪ Special instructions for SIMD register operations
▪ Arithmetic, logic, load, store etc.
▪ Allows multiple operations in 1 instruction
▪ Example: [1 2 3 4] + [1 1 1 1] = [2 3 4 5]
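The packed addition on this slide can be modeled in plain Python. This sketch only mirrors the semantics (four 64-bit lanes in one 256-bit register, all lanes added by one operation); it says nothing about real performance, and the function names are hypothetical.

```python
LANES = 4               # a 256-bit register holds four 64-bit lanes
MASK = (1 << 64) - 1    # each lane wraps around at 64 bits


def simd_add(a, b):
    """Model one packed add: every lane is added by the same 'instruction'."""
    assert len(a) == len(b) == LANES
    return [(x + y) & MASK for x, y in zip(a, b)]


def add_constant(column, k):
    """Add k to every value, consuming the column LANES values at a time."""
    out = []
    broadcast = [k] * LANES                  # the constant copied into every lane
    for i in range(0, len(column), LANES):
        chunk = column[i:i + LANES]
        pad = LANES - len(chunk)             # pad the tail to a full register
        result = simd_add(chunk + [0] * pad, broadcast)
        out.extend(result[:len(chunk)])
    return out
```

A column of N values needs roughly N / 4 packed adds here instead of N scalar adds.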
Filter pushdown to dictionary
▪ Example:
• FactClick(id, region_id, …)
• select region_id, count(*) from FactClick
where region_id like '%east%' group by region_id
• region_id has only a few dozen values
• It is dictionary-encoded
Segment-level filter pushdown
▪ E.g. 6 regions
▪ 1M rows per segment
▪ WHERE region_id like '%east%'
▪ 6 string comparisons/segment
▪ Cache lookup table L:
[true, true, false, false, false, false]
▪ Output only rows where L[dictionary_id] = true
Dictionary
dictionary_id  Region
0              Northeast
1              Southeast
2              North-central
3              South-central
4              Northwest
5              Southwest
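The steps on this slide can be sketched in Python as follows. The predicate is evaluated once per dictionary entry (6 string comparisons) and each row is then filtered with a single array lookup. The function names are hypothetical, and the case-insensitive substring test is an assumed stand-in for LIKE '%east%'.

```python
def build_lookup(dictionary, predicate):
    """Evaluate the predicate once per distinct value in the segment."""
    return [predicate(value) for value in dictionary]


def filter_segment(dictionary_ids, lookup):
    """Return positions of rows whose dictionary id passes the filter."""
    return [row for row, d in enumerate(dictionary_ids) if lookup[d]]


dictionary = ["Northeast", "Southeast", "North-central",
              "South-central", "Northwest", "Southwest"]

# Stand-in for: region_id like '%east%'
lookup = build_lookup(dictionary, lambda region: "east" in region.lower())
```

For example, `filter_segment([0, 3, 1, 5, 0, 2], lookup)` keeps rows 0, 2 and 4. With 1M rows per segment, the string comparison cost stays at 6 no matter how many rows are scanned.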
Performance
▪ Improved Group-By/Aggregate, up to 80X
▪ Columnstore string filter pushdown
▪ Improved sort performance (by as much as 2-3X)
▪ Unenforced uniqueness constraints with RELY option
▪ Query optimizer improvements
▪ Columnstore update
▪ Columnstore JSON
Automatic Statistics
▪ Always-on cardinality statistics for every column
▪ For columnstore tables only
▪ On by default
▪ Yields better query plans with less need for DBAs to run ANALYZE TABLE and tune queries
Columnstore update performance
▪ Ability to update rows identified via columnstore sort key
▪ Uses in-memory index on row store segment
[Diagram; labels: row store segment, col store segments, index on sort key, seek]
New Query Features
▪ Cross-database queries (joins, insert-select)
▪ UPDATE/DELETE with joins
▪ UPDATE with subselect in SET clause
▪ reference_table LEFT JOIN …; (select …) LEFT JOIN… now supported
▪ Window functions with complex frames
• E.g. avg(a) over (order by b rows between 5 preceding and current row)
▪ New window functions
• first_value, last_value, nth_value, percentile_cont, percentile_disc
▪ Unenforced unique constraint + RELY
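To show what the window frame in the example above computes, here is a small Python model of avg(a) over (order by b rows between 5 preceding and current row). It assumes rows are dictionaries with keys a and b and that a is never NULL; the function name is hypothetical.

```python
def moving_avg(rows, preceding=5):
    """Average of `a` over the current row and up to `preceding` rows before it."""
    ordered = sorted(rows, key=lambda r: r["b"])      # order by b
    out = []
    for i, row in enumerate(ordered):
        frame = ordered[max(0, i - preceding):i + 1]  # rows in the frame
        out.append((row["b"], sum(r["a"] for r in frame) / len(frame)))
    return out
```

The frame is shorter than six rows at the start of the ordering, so the first few averages cover fewer values.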
Extensibility Features
▪ Major, release-defining feature set
▪ User-defined
• Stored procedures (SPs)
• Scalar-valued functions (UDFs)
• Table-valued functions (TVFs)
• Aggregate functions (UDAFs)
▪ Highlights
• SQL-developer friendly, clean syntax (no @, $ etc.)
• Compiled to machine code for speed
• Array and record support
Example UDF: normalize_string()
select normalize_string("     Abc    XYZ  ");
abc xyz
Implementation of normalize_string()
delimiter //
create or replace function normalize_string(str varchar(255)) returns varchar(255) as
declare
  r varchar(255) = "";   -- result, built up one character at a time
  i int;
  previousChar char;
  nextChar char;
  s varchar(255);
begin
  -- lowercase and strip leading/trailing spaces
  s = lower(trim(str));
  if length(s) = 0 then return s; end if;
  previousChar = substr(s, 1, 1);
  r = concat(r, previousChar);
  i = 2;
  while i <= length(s) loop
    nextChar = substr(s, i, 1);
    -- skip a space that directly follows another space
    if not(previousChar = ' ' and nextChar = ' ') then
      r = concat(r, substr(s, i, 1));
    end if;
    previousChar = nextChar;
    i += 1;
  end loop;
  return r;
end //
delimiter ;
Example SP: Move data more than 5 minutes old from t1 to t2
create table t1(a int, ts datetime);
create table t2(a int, ts datetime);
…
delimiter //
create or replace procedure myMove() as
declare
  boundary datetime = date_add(now(), interval -5 minute);
begin
  insert into t2 select * from t1 where ts < boundary;
  delete from t1 where ts < boundary;
end //
delimiter ;
Example TVF
create table t (i int);
insert into t values (1),(2),(3),(4),(5);
create function basic(l int) returns table as
return select * from t limit l;
memsql> select * from basic(0);
Empty set (0.00 sec)
memsql> select * from basic(2);
+------+
| i    |
+------+
|    3 |
|    2 |
+------+
2 rows in set (0.01 sec)
User-Defined Aggregate Functions (UDAFs)
▪ Used like built-in aggregates like SUM()
▪ Based on 4 user-defined functions
• Initialize
• Iterate
• Merge
• Terminate
Example UDAF
-- pick any arbitrary value from input
delimiter //
create function any_init() returns int as begin return -1; end;//
create function any_iter(s int, v int) returns int as begin return v; end;//
create function any_merge(s1 int, s2 int) returns int as
begin
if s1 = -1 then return s2; else return s1; end if;
end;//
create function any_terminate(s int) returns int as begin return s; end;//
delimiter ;
create aggregate any_val(int)
returns int
with state int
initialize with any_init
iterate with any_iter
merge with any_merge
terminate with any_terminate;
UDAF Output
create table t(g int, x int);
insert into t values (100, 10), (100, 12), (100, 14), (200, 21), (200, 27);
memsql> select g, any_val(x) from t group by g;
+------+------------+
| g    | any_val(x) |
+------+------------+
|  100 |         10 |
|  200 |         27 |
+------+------------+
2 rows in set (0.00 sec)
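The four-function contract behind a UDAF (initialize, iterate, merge, terminate) can be modeled in Python. This sketch assumes the input is split into partitions that are aggregated independently and then merged, which is one plausible reason a merge function is needed; the driver `run_udaf` is hypothetical and is not MemSQL's executor.

```python
from functools import reduce


def run_udaf(partitions, init, iterate, merge, terminate):
    """Aggregate each partition independently, then merge the partial states."""
    partials = []
    for rows in partitions:
        state = init()
        for value in rows:
            state = iterate(state, value)
        partials.append(state)
    return terminate(reduce(merge, partials, init()))


# Python counterparts of the any_val functions defined in the example UDAF
def any_init():
    return -1


def any_iter(s, v):
    return v


def any_merge(s1, s2):
    return s2 if s1 == -1 else s1


def any_terminate(s):
    return s
```

Calling `run_udaf([[10, 12], [14]], any_init, any_iter, any_merge, any_terminate)` returns one of the input values. Which one depends on how rows are split and ordered, which matches the "any arbitrary value" intent.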
SCALAR (get a scalar query result)
create table t (i int);
insert into t values (1), (2), (3), (4), (5);
delimiter //
create or replace procedure scalar_basic() as
declare
  v query(i int) = select max(i) from t;
  s int = scalar(v);
begin
  call tracelog(s);
end //
delimiter ;
COLLECT
delimiter //
create or replace procedure p_coll() as
declare
  c array(record(v varchar(80)));
  t query(v varchar(80)) = select v from r order by v;
begin
  delete from proc_log;
  c = collect(t);
  for x in c loop
    call tracelog(x.v);
  end loop;
end //
delimiter ;
CALL and ECHO
▪ call sp_name(args)
• When no need to output rowset
▪ echo sp_name(args)
• Outputs rowset to client
▪ Exception handling supported
Performance
▪ Compiled to machine code using LLVM
▪ UDFs are inlined when appropriate
Distributed Execution
▪ SPs run on aggregator
▪ From SPs, parameters and variables are substituted as strings on aggregator before execution on leaves
▪ UDFs can run on any node
• Aggregators or leaves
• Multiple invocations can run in parallel within a query
Summary
▪ New, user-defined
• Scalar functions
• Stored procedures
• Table-valued functions
• Aggregate functions
▪ Friendly to experienced SQL developers
▪ Array and record types supported
▪ High-performance through compilation to machine code
Thank you
memsql.com

Contenu connexe

Tendances

ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataAltinity Ltd
 
Managing Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveManaging Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveTesora
 
Intro to databricks delta lake
 Intro to databricks delta lake Intro to databricks delta lake
Intro to databricks delta lakeMykola Zerniuk
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
Change Data Capture in Scylla
Change Data Capture in ScyllaChange Data Capture in Scylla
Change Data Capture in ScyllaScyllaDB
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseGrant Fritchey
 
GCP Data Engineer cheatsheet
GCP Data Engineer cheatsheetGCP Data Engineer cheatsheet
GCP Data Engineer cheatsheetGuang Xu
 
eBay Cloud CMS - QCon 2012 - http://yidb.org/
eBay Cloud CMS - QCon 2012 - http://yidb.org/eBay Cloud CMS - QCon 2012 - http://yidb.org/
eBay Cloud CMS - QCon 2012 - http://yidb.org/Xu Jiang
 
Workshop - How to benchmark your database
Workshop - How to benchmark your databaseWorkshop - How to benchmark your database
Workshop - How to benchmark your databaseScyllaDB
 
ETL Made Easy with Azure Data Factory and Azure Databricks
ETL Made Easy with Azure Data Factory and Azure DatabricksETL Made Easy with Azure Data Factory and Azure Databricks
ETL Made Easy with Azure Data Factory and Azure DatabricksDatabricks
 
Sql 2016 - What's New
Sql 2016 - What's NewSql 2016 - What's New
Sql 2016 - What's Newdpcobb
 
Under the hood: SkySQL monitoring
Under the hood: SkySQL monitoringUnder the hood: SkySQL monitoring
Under the hood: SkySQL monitoringMariaDB plc
 
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...DataStax
 
Expert summit SQL Server 2016
Expert summit   SQL Server 2016Expert summit   SQL Server 2016
Expert summit SQL Server 2016Łukasz Grala
 
Netflix's Big Leap from Oracle to Cassandra
Netflix's Big Leap from Oracle to CassandraNetflix's Big Leap from Oracle to Cassandra
Netflix's Big Leap from Oracle to CassandraRoopa Tangirala
 
eBay Cloud CMS based on NOSQL
eBay Cloud CMS based on NOSQLeBay Cloud CMS based on NOSQL
eBay Cloud CMS based on NOSQLXu Jiang
 
Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020ScyllaDB
 
Keeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatKeeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatScyllaDB
 

Tendances (20)

ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak DataClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
ClickHouse on Plug-n-Play Cloud, by Som Sikdar, Kodiak Data
 
Managing Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack TroveManaging Cassandra Databases with OpenStack Trove
Managing Cassandra Databases with OpenStack Trove
 
Intro to databricks delta lake
 Intro to databricks delta lake Intro to databricks delta lake
Intro to databricks delta lake
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Change Data Capture in Scylla
Change Data Capture in ScyllaChange Data Capture in Scylla
Change Data Capture in Scylla
 
Introducing Azure SQL Data Warehouse
Introducing Azure SQL Data WarehouseIntroducing Azure SQL Data Warehouse
Introducing Azure SQL Data Warehouse
 
Cassandra in e-commerce
Cassandra in e-commerceCassandra in e-commerce
Cassandra in e-commerce
 
GCP Data Engineer cheatsheet
GCP Data Engineer cheatsheetGCP Data Engineer cheatsheet
GCP Data Engineer cheatsheet
 
eBay Cloud CMS - QCon 2012 - http://yidb.org/
eBay Cloud CMS - QCon 2012 - http://yidb.org/eBay Cloud CMS - QCon 2012 - http://yidb.org/
eBay Cloud CMS - QCon 2012 - http://yidb.org/
 
Workshop - How to benchmark your database
Workshop - How to benchmark your databaseWorkshop - How to benchmark your database
Workshop - How to benchmark your database
 
ETL Made Easy with Azure Data Factory and Azure Databricks
ETL Made Easy with Azure Data Factory and Azure DatabricksETL Made Easy with Azure Data Factory and Azure Databricks
ETL Made Easy with Azure Data Factory and Azure Databricks
 
Sql 2016 - What's New
Sql 2016 - What's NewSql 2016 - What's New
Sql 2016 - What's New
 
Under the hood: SkySQL monitoring
Under the hood: SkySQL monitoringUnder the hood: SkySQL monitoring
Under the hood: SkySQL monitoring
 
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
Cassandra on Google Cloud Platform (Ravi Madasu, Google / Ben Lackey, DataSta...
 
Azure SQL Data Warehouse
Azure SQL Data Warehouse Azure SQL Data Warehouse
Azure SQL Data Warehouse
 
Expert summit SQL Server 2016
Expert summit   SQL Server 2016Expert summit   SQL Server 2016
Expert summit SQL Server 2016
 
Netflix's Big Leap from Oracle to Cassandra
Netflix's Big Leap from Oracle to CassandraNetflix's Big Leap from Oracle to Cassandra
Netflix's Big Leap from Oracle to Cassandra
 
eBay Cloud CMS based on NOSQL
eBay Cloud CMS based on NOSQLeBay Cloud CMS based on NOSQL
eBay Cloud CMS based on NOSQL
 
Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020
 
Keeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter whatKeeping your application’s latency SLAs no matter what
Keeping your application’s latency SLAs no matter what
 

Similaire à How Database Convergence Impacts the Coming Decades of Data Management

MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)Dave Stokes
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1MariaDB plc
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1MariaDB plc
 
Changing your huge table's data types in production
Changing your huge table's data types in productionChanging your huge table's data types in production
Changing your huge table's data types in productionJimmy Angelakos
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightDataStax Academy
 
Performance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingPerformance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingSveta Smirnova
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in ActionSveta Smirnova
 
Developers’ mDay 2019. - Bogdan Kecman, Oracle – MySQL 8.0 – why upgrade
Developers’ mDay 2019. - Bogdan Kecman, Oracle – MySQL 8.0 – why upgradeDevelopers’ mDay 2019. - Bogdan Kecman, Oracle – MySQL 8.0 – why upgrade
Developers’ mDay 2019. - Bogdan Kecman, Oracle – MySQL 8.0 – why upgrademCloud
 
Performance improvements in PostgreSQL 9.5 and beyond
Performance improvements in PostgreSQL 9.5 and beyondPerformance improvements in PostgreSQL 9.5 and beyond
Performance improvements in PostgreSQL 9.5 and beyondTomas Vondra
 
Big Data Analytics with MariaDB ColumnStore
Big Data Analytics with MariaDB ColumnStoreBig Data Analytics with MariaDB ColumnStore
Big Data Analytics with MariaDB ColumnStoreMariaDB plc
 
MariaDB ColumnStore
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStoreMariaDB plc
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftSnapLogic
 
MySQL 8.0 New Features -- September 27th presentation for Open Source Summit
MySQL 8.0 New Features -- September 27th presentation for Open Source SummitMySQL 8.0 New Features -- September 27th presentation for Open Source Summit
MySQL 8.0 New Features -- September 27th presentation for Open Source SummitDave Stokes
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevAltinity Ltd
 
MySQL 8.0 Featured for Developers
MySQL 8.0 Featured for DevelopersMySQL 8.0 Featured for Developers
MySQL 8.0 Featured for DevelopersDave Stokes
 
Query Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New TricksQuery Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New TricksMYXPLAIN
 
PostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesInMobi Technology
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentationMichael Keane
 
Oracle Query Optimizer - An Introduction
Oracle Query Optimizer - An IntroductionOracle Query Optimizer - An Introduction
Oracle Query Optimizer - An Introductionadryanbub
 

Similaire à How Database Convergence Impacts the Coming Decades of Data Management (20)

MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
MySQL 8 -- A new beginning : Sunshine PHP/PHP UK (updated)
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
 
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
What's New in MariaDB Server 10.2 and MariaDB MaxScale 2.1
 
Changing your huge table's data types in production
Changing your huge table's data types in productionChanging your huge table's data types in production
Changing your huge table's data types in production
 
Macy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-FlightMacy's: Changing Engines in Mid-Flight
Macy's: Changing Engines in Mid-Flight
 
Performance Schema for MySQL Troubleshooting
Performance Schema for MySQL TroubleshootingPerformance Schema for MySQL Troubleshooting
Performance Schema for MySQL Troubleshooting
 
MySQL Performance Schema in Action
MySQL Performance Schema in ActionMySQL Performance Schema in Action
MySQL Performance Schema in Action
 
Developers’ mDay 2019. - Bogdan Kecman, Oracle – MySQL 8.0 – why upgrade
Developers’ mDay 2019. - Bogdan Kecman, Oracle – MySQL 8.0 – why upgradeDevelopers’ mDay 2019. - Bogdan Kecman, Oracle – MySQL 8.0 – why upgrade
Developers’ mDay 2019. - Bogdan Kecman, Oracle – MySQL 8.0 – why upgrade
 
Performance improvements in PostgreSQL 9.5 and beyond
Performance improvements in PostgreSQL 9.5 and beyondPerformance improvements in PostgreSQL 9.5 and beyond
Performance improvements in PostgreSQL 9.5 and beyond
 
MySQL performance tuning
MySQL performance tuningMySQL performance tuning
MySQL performance tuning
 
Big Data Analytics with MariaDB ColumnStore
Big Data Analytics with MariaDB ColumnStoreBig Data Analytics with MariaDB ColumnStore
Big Data Analytics with MariaDB ColumnStore
 
MariaDB ColumnStore
MariaDB ColumnStoreMariaDB ColumnStore
MariaDB ColumnStore
 
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon RedshiftBest Practices for Supercharging Cloud Analytics on Amazon Redshift
Best Practices for Supercharging Cloud Analytics on Amazon Redshift
 
MySQL 8.0 New Features -- September 27th presentation for Open Source Summit
MySQL 8.0 New Features -- September 27th presentation for Open Source SummitMySQL 8.0 New Features -- September 27th presentation for Open Source Summit
MySQL 8.0 New Features -- September 27th presentation for Open Source Summit
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
MySQL 8.0 Featured for Developers
MySQL 8.0 Featured for DevelopersMySQL 8.0 Featured for Developers
MySQL 8.0 Featured for Developers
 
Query Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New TricksQuery Optimization with MySQL 5.6: Old and New Tricks
Query Optimization with MySQL 5.6: Old and New Tricks
 
PostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major FeaturesPostgreSQL 9.5 - Major Features
PostgreSQL 9.5 - Major Features
 
In memory databases presentation
In memory databases presentationIn memory databases presentation
In memory databases presentation
 
Oracle Query Optimizer - An Introduction
Oracle Query Optimizer - An IntroductionOracle Query Optimizer - An Introduction
Oracle Query Optimizer - An Introduction
 

Plus de SingleStore

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeSingleStore
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsSingleStore
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemSingleStore
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeSingleStore
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics SingleStore
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLSingleStore
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQLSingleStore
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureSingleStore
 
Stream Processing with Pipelines and Stored Procedures
Stream Processing with Pipelines  and Stored ProceduresStream Processing with Pipelines  and Stored Procedures
Stream Processing with Pipelines and Stored ProceduresSingleStore
 
Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017Curriculum Associates Strata NYC 2017
Curriculum Associates Strata NYC 2017SingleStore
 
Spark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSpark Summit Dublin 2017 - MemSQL - Real-Time Image Recognition
Spark Summit Dublin 2017 - MemSQL - Real-Time Image RecognitionSingleStore
 
The State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondThe State of the Data Warehouse in 2017 and Beyond
The State of the Data Warehouse in 2017 and BeyondSingleStore
 
Teaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AITeaching Databases to Learn in the World of AI
Teaching Databases to Learn in the World of AISingleStore
 
Gartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataGartner Catalyst 2017: Image Recognition on Streaming Data
Gartner Catalyst 2017: Image Recognition on Streaming DataSingleStore
 
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSpark Summit West 2017: Real-Time Image Recognition with MemSQL and Spark
Spark Summit West 2017: Real-Time Image Recognition with MemSQL and SparkSingleStore
 
Real-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleReal-Time Analytics at Uber Scale
Real-Time Analytics at Uber ScaleSingleStore
 
Machines and the Magic of Fast Learning
Machines and the Magic of Fast LearningMachines and the Magic of Fast Learning
Machines and the Magic of Fast LearningSingleStore
 
Machines and the Magic of Fast Learning - Strata Keynote
Machines and the Magic of Fast Learning - Strata KeynoteMachines and the Magic of Fast Learning - Strata Keynote
Machines and the Magic of Fast Learning - Strata KeynoteSingleStore
 
Enabling Real-Time Analytics for IoT
Enabling Real-Time Analytics for IoTEnabling Real-Time Analytics for IoT
Enabling Real-Time Analytics for IoTSingleStore
 
Real-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLReal-Time Analytics with Spark and MemSQL
Real-Time Analytics with Spark and MemSQLSingleStore
 

Plus de SingleStore (20)

Five ways database modernization simplifies your data life
Five ways database modernization simplifies your data lifeFive ways database modernization simplifies your data life
Five ways database modernization simplifies your data life
 
How Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and AnalyticsHow Kafka and Modern Databases Benefit Apps and Analytics
How Kafka and Modern Databases Benefit Apps and Analytics
 
Architecting Data in the AWS Ecosystem
Architecting Data in the AWS EcosystemArchitecting Data in the AWS Ecosystem
Architecting Data in the AWS Ecosystem
 
Building the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free LifeBuilding the Foundation for a Latency-Free Life
Building the Foundation for a Latency-Free Life
 
Converging Database Transactions and Analytics
Converging Database Transactions and Analytics Converging Database Transactions and Analytics
Converging Database Transactions and Analytics
 
Building a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQLBuilding a Machine Learning Recommendation Engine in SQL
Building a Machine Learning Recommendation Engine in SQL
 
Introduction to MemSQL
Introduction to MemSQLIntroduction to MemSQL
Introduction to MemSQL
 
Building a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed ArchitectureBuilding a Fault Tolerant Distributed Architecture
Building a Fault Tolerant Distributed Architecture
 
How Database Convergence Impacts the Coming Decades of Data Management

  • 1. How Database Convergence Impacts the Coming Decades of Data Management Nikita Shamgunov CEO and co-founder of MemSQL
  • 2. MemSQL at a Glance ▪ MISSION: Growth of digital business impacting data architectures; we make every company a real-time enterprise ▪ PRODUCT: Top Ranked Operational Data Warehouse; MemSQL provides you the ability to learn and react in real time ▪ ABOUT: Founders are former Facebook and SQL Server database engineers; $85m in funding from top-tier investors; enterprise customers
  • 3. Converge Transactions and Analytics in a Relational Database ● New breed of applications ○ Analytics as part of a transaction ○ Analytics when the data is born ○ In database AI/ML ● Scalable OLAP and OLTP in one system ○ Fewer systems to manage ○ Utility database consumption ○ Supports HTAP
  • 4. Traditional + Future Architecture [Diagram labels: IoT Data, Social Data, HTAP Apps, Analytic Apps, In-Memory Data Store (RAM), Transactions + Operational Analytics, Data Integration, DMSA, Traditional Reports and Analytics, Analytics, Historical Reporting and Data Discovery]
  • 5. The New Data Architecture without DMSA [Diagram labels: IoT Data, Social Data, HTAP Apps, Analytic Apps, In-Memory Data Store (RAM), Transactions + Operational Reports, Analytics, Historical Reporting and Data Discovery]
  • 6. The Enterprise Requires Performance ▪ FAST Data Loading • Stream data • Real-time loading • Full data access ▪ LOW Query Latency • Vectorized queries • Real-time dashboards • Live data access ▪ HIGH Concurrency • Multi-threaded processing • Transactions and Analytics • Scalable performance
  • 7. North Star. Build a New Category of Databases ● Focus on analytics and deliver a hybrid cloud data warehouse ○ Hybrid-cloud ○ Scalable with integration to data lakes ○ Real-time ○ Simplicity ● Converge transactions and analytics ○ Transaction support ○ Multi-cloud reliability ○ Application support
  • 8. Real-time and Query Performance Goals: Eliminate batching and deliver instant results to user or app ● Investments ○ Streaming ingest ■ Kafka ■ Kinesis ○ Transactional consistency ■ Ability to change data rapidly ■ Ability to scale analytics to millions of requests a second to enable self-service customer analytics ○ Query performance ■ Scale out ■ Vectorization ● Results ○ Dramatic query performance improvements for BI use cases ○ Adoption of PIPELINEs is growing
  • 9. Simplicity ● Goals ○ No knobs where you don’t need them ○ Data warehousing workloads work out of the box ○ No hints for queries ○ No scaling limits ● Investments ○ Query optimization and query execution ● Timelines ○ Several releases in 2017
  • 10. Access Methods ▪ Columnstore • On disk with working set in memory • Super fast scans • Supports analytical and data warehousing workloads • One index • Petabyte scale ▪ Rowstore • Fully in memory • Submillisecond point updates • Multiple indexes
  • 11. Scale out and Transactional ▪ Supports multi-statement transactions ▪ Supports MVCC ▪ Scalable on commodity hardware ▪ Data hash partitioned and stored in two copies
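
The "hash partitioned and stored in two copies" point can be illustrated with a small sketch. Everything here is an assumption made for illustration: the CRC32 hash, the function names `partition_for` and `placement`, and the "replica on the next node" rule are hypothetical and are not taken from MemSQL.

```python
import zlib
from typing import Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a shard-key value to a partition using a stable hash (CRC32, assumed)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def placement(partition: int, num_nodes: int) -> Tuple[int, int]:
    """Return (primary_node, replica_node) for a partition.

    Hypothetical rule: the second copy lives on the next node, so the two
    copies are on different nodes whenever num_nodes >= 2.
    """
    primary = partition % num_nodes
    replica = (primary + 1) % num_nodes
    return primary, replica
```

Because the hash is deterministic, every node computes the same partition for a given key, which is what lets a point lookup go straight to one partition.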
  • 13. Query Performance ▪ Group-By/Aggregate Performance • Operations on encoded data • Single-instruction multiple data (SIMD) ▪ Filter pushdown to column store ▪ Preference for dictionary compression MemSQL Confidential
  • 14. SIMD overview ▪ Intel AVX-2 ▪ 256-bit registers ▪ Pack multiple values per register ▪ Special instructions for SIMD register operations ▪ Arithmetic, logic, load, store etc. ▪ Allows multiple operations in 1 instruction ▪ Example: [1 2 3 4] + [1 1 1 1] = [2 3 4 5]
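
As a rough mental model of "multiple operations in 1 instruction", the sketch below emulates a lane-wise add in plain Python. It is not real SIMD: the 8 lanes of 32 bits (which fill a 256-bit register) and the helper names `simd_add` and `add_arrays` are assumptions chosen for illustration.

```python
LANES = 8  # a 256-bit register holds eight 32-bit values (assumed lane width)


def simd_add(a, b):
    """Emulate one lane-wise add of two 8-lane registers with 32-bit wraparound."""
    assert len(a) == len(b) == LANES
    return [(x + y) & 0xFFFFFFFF for x, y in zip(a, b)]


def add_arrays(xs, ys):
    """Add two equal-length arrays one register-sized chunk at a time."""
    assert len(xs) == len(ys)
    out = []
    for i in range(0, len(xs), LANES):
        a, b = xs[i:i + LANES], ys[i:i + LANES]
        pad = LANES - len(a)  # zero-pad the final partial chunk
        result = simd_add(a + [0] * pad, b + [0] * pad)
        out.extend(result[:len(a)])
    return out
```

Each call to `simd_add` stands in for a single vector instruction, so an array of n values needs about n/8 such calls instead of n scalar additions.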
  • 15. Filter pushdown to dictionary ▪ Example: • FactClick(id, region_id, …) • select region_id, count(*) from FactClick where region_id like '%east%' group by region_id • region_id has only a few dozen values • It is dictionary-encoded
  • 16. Segment-level filter pushdown ▪ E.g. 6 regions ▪ 1M rows per segment ▪ WHERE region_id like '%east%' ▪ 6 string comparisons/segment ▪ Cache lookup table L: [true, true, false, false, false, false] ▪ Output only rows where L[dictionary_id] = true ▪ Dictionary:
      dictionary_id  Region
      0              Northeast
      1              Southeast
      2              North-central
      3              South-central
      4              Northwest
      5              Southwest
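
The segment-level pushdown described on these two slides can be sketched as follows. This is a minimal Python illustration under stated assumptions, not MemSQL's implementation: the function names are hypothetical, and the LIKE '%east%' predicate is approximated by a substring test.

```python
# Dictionary from the slide: position in the list is the dictionary_id.
DICTIONARY = ["Northeast", "Southeast", "North-central",
              "South-central", "Northwest", "Southwest"]


def build_lookup(dictionary, predicate):
    """Evaluate the string predicate once per dictionary entry, not once per row."""
    return [predicate(value) for value in dictionary]


def filter_segment(encoded_rows, lookup):
    """Return positions of rows whose dictionary_id passes the cached lookup."""
    return [pos for pos, dict_id in enumerate(encoded_rows) if lookup[dict_id]]


def count_by_region(encoded_rows, dictionary, lookup):
    """Group and count on the encoded ids; decode only the final result."""
    counts = [0] * len(dictionary)
    for dict_id in encoded_rows:
        if lookup[dict_id]:
            counts[dict_id] += 1
    return {dictionary[i]: c for i, c in enumerate(counts) if c > 0}
```

With 6 dictionary entries and 1M rows per segment, the string comparison runs 6 times and the per-row work reduces to an integer-indexed table lookup.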
  • 17. Performance ▪ Group-By/Aggregate performance improved by up to 80X ▪ Columnstore string filter pushdown ▪ Improved sort performance (by as much as 2-3X) ▪ Unenforced uniqueness constraints with RELY option ▪ Query optimizer improvements ▪ Columnstore update ▪ Columnstore JSON
  • 18. Automatic Statistics ▪ Always-on cardinality statistics for every column ▪ For columnstore tables only ▪ On by default ▪ Results in better query plans, with less need for DBAs to run ANALYZE TABLE and tune queries
  • 19. Columnstore update performance ▪ Ability to update rows identified via columnstore sort key ▪ Uses in-memory index on row store segment [Diagram labels: Row Store Segment, Col Store Segment, Col Store Segment, Index on Sort Key, …, Seek]
  • 20. New Query Features ▪ Cross-database queries (joins, insert-select) ▪ UPDATE/DELETE with joins ▪ UPDATE with subselect in SET clause ▪ reference_table LEFT JOIN …; (select …) LEFT JOIN… now supported ▪ Window functions with complex frames • E.g. avg (a) over (order by b rows between 5 preceding and current row) ▪ New window functions • first_value, last_value, nth_value, percentile_cont, percentile_disc ▪ Unenforced unique constraint + RELY
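
To make the window frame in the example concrete, here is a minimal Python sketch of what `avg(a) over (order by b rows between 5 preceding and current row)` computes. The function name `moving_avg` is hypothetical, and NULL handling and ties in `b` are ignored for simplicity.

```python
def moving_avg(rows, preceding=5):
    """rows is a list of (a, b) pairs; returns (b, avg_a) pairs ordered by b.

    For each row, the frame is the current row plus up to `preceding`
    rows before it in b order, matching ROWS BETWEEN n PRECEDING AND CURRENT ROW.
    """
    ordered = sorted(rows, key=lambda r: r[1])
    result = []
    for idx, (_, b) in enumerate(ordered):
        start = max(0, idx - preceding)
        frame = [a for a, _ in ordered[start:idx + 1]]
        result.append((b, sum(frame) / len(frame)))
    return result
```

Early rows have shorter frames because fewer than `preceding` rows come before them, which is why the first value is simply the row's own `a`.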
  • 21. Extensibility Features ▪ Major, release-defining feature set ▪ User-defined • Stored procedures (SPs) • Scalar-valued functions (UDFs) • Table-valued functions (TVFs) • Aggregate functions (UDAFs) ▪ Highlights • SQL-developer friendly, clean syntax (no @, $ etc.) • Compiled to machine code for speed • Array and record support
  • 22. Example UDF: normalize_string()
      select normalize_string("     Abc    XYZ  ");
      abc xyz
  • 23. Implementation of normalize_string()
      delimiter //
      create or replace function normalize_string(str varchar(255))
      returns varchar(255) as
      declare
        r varchar(255) = "";
        i int;
        previousChar char;
        nextChar char;
        s varchar(255);
      begin
        -- lower-case the input and strip leading/trailing spaces
        s = lower(trim(str));
        if length(s) = 0 then
          return s;
        end if;
        previousChar = substr(s, 1, 1);
        r = concat(r, previousChar);
        i = 2;
        -- copy characters, skipping a space that follows another space
        while i <= length(s) loop
          nextChar = substr(s, i, 1);
          if not(previousChar = ' ' and nextChar = ' ') then
            r = concat(r, substr(s, i, 1));
          end if;
          previousChar = nextChar;
          i += 1;
        end loop;
        return r;
      end //
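
For readers who want to check the logic outside the database, the same algorithm can be mirrored in Python. This is an illustrative port, not part of the original deck; it assumes `trim` removes only spaces, so it uses `strip(" ")`.

```python
def normalize_string(text: str) -> str:
    """Lower-case, trim spaces, and collapse runs of spaces to a single space."""
    s = text.strip(" ").lower()
    if not s:
        return s
    out = [s[0]]
    previous = s[0]
    for ch in s[1:]:
        # Skip a space that directly follows another space.
        if not (previous == " " and ch == " "):
            out.append(ch)
        previous = ch
    return "".join(out)
```

Calling `normalize_string("     Abc    XYZ  ")` returns `"abc xyz"`, matching the result shown on the example slide.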
  • 24. Example SP: Move data more than 5 minutes old from t1 to t2
      create table t1(a int, ts datetime);
      create table t2(a int, ts datetime);
      …
      create or replace procedure myMove() as
      declare
        -- compute the cutoff once so insert and delete use the same boundary
        boundary datetime = date_add(now(), interval -5 minute);
      begin
        insert into t2 select * from t1 where ts < boundary;
        delete from t1 where ts < boundary;
      end;
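
The move pattern in this procedure can be tried locally with Python's built-in `sqlite3` module. This is a sketch under assumptions: SQLite stands in for MemSQL, timestamps are stored as text, and wrapping both statements in one transaction is a choice made here rather than something the slide states. The names `my_move` and `demo` are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta


def my_move(conn, now=None):
    """Move rows older than 5 minutes from t1 to t2 using a single cutoff."""
    now = now or datetime.now()
    boundary = (now - timedelta(minutes=5)).strftime("%Y-%m-%d %H:%M:%S")
    with conn:  # commits both statements together, or rolls both back on error
        conn.execute("INSERT INTO t2 SELECT * FROM t1 WHERE ts < ?", (boundary,))
        conn.execute("DELETE FROM t1 WHERE ts < ?", (boundary,))
    return boundary


def demo():
    """Build two small tables in memory, run the move, and return both tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1(a INTEGER, ts TEXT)")
    conn.execute("CREATE TABLE t2(a INTEGER, ts TEXT)")
    conn.executemany(
        "INSERT INTO t1 VALUES (?, ?)",
        [(1, "2017-01-01 11:50:00"), (2, "2017-01-01 11:58:00")],
    )
    conn.commit()
    my_move(conn, now=datetime(2017, 1, 1, 12, 0, 0))
    moved = conn.execute("SELECT a FROM t2").fetchall()
    kept = conn.execute("SELECT a FROM t1").fetchall()
    conn.close()
    return moved, kept
```

Computing the boundary once matters: if each statement called the clock separately, a row could cross the 5-minute line between the insert and the delete and be removed without having been copied.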
  • 25. Example TVF
      create table t (i int);
      insert into t values (1),(2),(3),(4),(5);
      create function basic(l int) returns table as
        return select * from t limit l;

      memsql> select * from basic(0);
      Empty set (0.00 sec)

      memsql> select * from basic(2);
      +------+
      | i    |
      +------+
      |    3 |
      |    2 |
      +------+
      2 rows in set (0.01 sec)
  • 26. User-Defined Aggregate Functions (UDAFs) ▪ Used like built-in aggregates such as SUM() ▪ Based on 4 user-defined functions • Initialize • Iterate • Merge • Terminate
  • 27. Example UDAF
      -- pick any arbitrary value from input
      delimiter //
      create function any_init() returns int as
      begin
        return -1;
      end;//
      create function any_iter(s int, v int) returns int as
      begin
        return v;
      end;//
      create function any_merge(s1 int, s2 int) returns int as
      begin
        if s1 = -1 then
          return s2;
        else
          return s1;
        end if;
      end;//
      create function any_terminate(s int) returns int as
      begin
        return s;
      end;//
      delimiter ;
      create aggregate any_val(int) returns int
        with state int
        initialize with any_init
        iterate with any_iter
        merge with any_merge
        terminate with any_terminate;
  • 28. UDAF Output
      create table t(g int, x int);
      insert into t values (100, 10), (100, 12), (100, 14), (200, 21), (200, 27);

      memsql> select g, any_val(x) from t group by g;
      +------+------------+
      | g    | any_val(x) |
      +------+------------+
      |  100 |         10 |
      |  200 |         27 |
      +------+------------+
      2 rows in set (0.00 sec)
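
The four-function UDAF contract (initialize, iterate, merge, terminate) can be modeled in a few lines of Python. This is an illustrative sketch: the driver `run_udaf` and the way rows are split into partitions are assumptions, since the slides do not describe how the engine distributes rows or orders merges.

```python
from functools import reduce


def any_init():
    return -1  # sentinel meaning "no value seen yet"


def any_iter(state, value):
    return value  # keep the most recent value seen in this partition


def any_merge(s1, s2):
    return s2 if s1 == -1 else s1


def any_terminate(state):
    return state


def run_udaf(partitions, init, iterate, merge, terminate):
    """Aggregate each partition independently, then merge the partial states."""
    partials = []
    for rows in partitions:
        state = init()
        for value in rows:
            state = iterate(state, value)
        partials.append(state)
    merged = reduce(merge, partials, init())
    return terminate(merged)
```

The result depends on how rows are partitioned, which is consistent with the aggregate's stated purpose of returning an arbitrary value. One caveat of the -1 sentinel is that a real input value of -1 cannot be told apart from an empty state.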
  • 29. SCALAR (get a scalar query result)
      create table t (i int);
      insert into t values (1), (2), (3), (4), (5);
      create or replace procedure scalar_basic() as
      declare
        v query(i int) = select max(i) from t;
        s int = scalar(v);
      begin
        call tracelog(s);
      end;
  • 30. COLLECT
      create or replace procedure p_coll() as
      declare
        c array(record(v varchar(80)));
        t query(v varchar(80)) = select v from r order by v;
      begin
        delete from proc_log;
        c = collect(t);
        for x in c loop
          call tracelog(x.v);
        end loop;
      end;
  • 31. CALL and ECHO ▪ call sp_name(args) • When no need to output rowset ▪ echo sp_name(args) • Outputs rowset to client ▪ Exception handling supported
  • 32. Performance ▪ Compiled to machine code using LLVM ▪ UDFs are inlined when appropriate
  • 33. Distributed Execution ▪ SPs run on aggregator ▪ From SPs, parameters and variables are substituted as strings on aggregator before execution on leaves ▪ UDFs can run on any node • Aggregators or leaves • Multiple invocations can run in parallel within a query
  • 34. Summary ▪ New, user-defined • Scalar functions • Stored procedures • Table-valued functions • Aggregate functions ▪ Friendly to experienced SQL developers ▪ Array and record types supported ▪ High-performance through compilation to machine code