In Memory Databases
An Overview
By John Sullivan
john@inmemory.net
Row Store
Features
• Data is stored sequentially by Row
• Essentially an Array / List Structure
• Easy to Add / Update / Insert /Delete
• Need to read entire Row to get to
one Column’s Data
Column Store
Features
• Data is stored by Column
• Faster to Read a few Columns
• Very Hard to Update / Insert
• Reading Data Sequentially from
Column, CPU Cache Friendly
Compressed Column Store
• Column Array is converted into 2 arrays
–One array contains a list of sorted
Unique Values
–Another array containing an integer
index to the values
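The two-array layout above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code of any product in this deck; the function names and the sample data are made up for the example.

```python
def dictionary_encode(column):
    """Split a column into (sorted unique values, one integer index per row)."""
    values = sorted(set(column))
    position = {value: i for i, value in enumerate(values)}
    indexes = [position[value] for value in column]
    return values, indexes


def dictionary_decode(values, indexes):
    """Rebuild the original column from the two arrays."""
    return [values[i] for i in indexes]


cities = ["Dublin", "Cork", "Dublin", "Galway", "Cork", "Dublin"]
values, indexes = dictionary_encode(cities)
print(values)   # sorted unique values
print(indexes)  # one small integer per row
```

Repeated strings are stored once, and each row only costs one small integer.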
SQLite
• In Memory Database opened by Special Filename :memory:
• Designed for a Single Process / File
• Great for embedded systems / mobile
devices, e.g. iOS Apps
• Row Store, No Column Store
• One Writer only. Not Server Based.
• Free & Open Source
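A minimal sketch of the :memory: filename, using Python's standard sqlite3 module. The table and its rows are invented for the example.

```python
import sqlite3

# ":memory:" asks SQLite for a private database that lives only in RAM
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, qty) VALUES (?, ?)",
    [("ALFKI", 3), ("BONAP", 5), ("ALFKI", 2)],
)
total_qty = conn.execute("SELECT SUM(qty) FROM orders").fetchone()[0]
print(total_qty)
conn.close()  # the database disappears when the connection is closed
```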
Excel
• Power Pivot, Introduced in Excel 2010
• Non SQL Query Language
• Data Analysis Expressions (DAX)
• Syntax similar to Excel Formulae
• Requires Pro version of Office or Excel
Tableau
• Primarily a Visualization Tool
• Tableau Data Extracts (TDE)
• Compressed Column Store
• Generates a single flat-table Extract from the Source
(which may involve joins)
• Uses ODBC / OLEDB For Extraction
• Only loads required columns from Extract
Qlik
• One of the Original Developers of Compressed
Columnar In Memory Analytics
• Nice Dashboards
• Incremental Updates
• Autojoins Fields based on Field Name
• Scripting Language for Generating QVD Files
Qlik Load Script Example
Companies:
LOAD id AS COMPANY_ID,
name as COMPANY_NAME,
postcode AS COMPANY_POSTCODE,
address AS COMPANY_ADDRESS,
If(id > 100, 1, 0) AS FLAG_NATIONAL;
SQL SELECT id, name, postcode, address
FROM database.Companies;
MonetDB
• Pioneer in Columnar Databases
• Research Focussed, out of the Netherlands
• Open Source
• Can Cache Expensive Computations and Reuse them
• Early versions were used by Data Distilleries,
which was bought by SPSS
• R Integration
SQL Server Enterprise
• ColumnStore Indexes
–Data is stored by column.
–Blocks of 1,048,576 Values
• In-Memory OLTP
–Add WITH (MEMORY_OPTIMIZED=ON) after CREATE TABLE
–Data/Delta files of 128 MB
Oracle
• TimesTen
– Works with Oracle Database as a Cache
– Telecoms and Financial Companies
• Oracle 12 Enterprise
– Row & Column Formats
– In Memory Columnstore
• Exalytics
SAP HANA
• Pure In Memory Database
• In Memory OLTP Rowstore
• In Memory Columnstore
– Up to 2^31 rows per block
• Cluster Large Fact tables across nodes
• HANA One Available on EC2 & IBM
SAP HANA Architecture
MemSQL
• Pure In Memory Database
• MySQL Wire Protocol Compatible
• Lock-free Linked Lists and Skiplists
• SQL Queries compiled into C++
• Split Large Tables Across Nodes
• Column Store Aimed at Analytics
• Apache Spark Integration
Skiplists
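For readers who have not met the structure, here is a minimal single-threaded skip list in Python. It is only meant to show the layered linked lists; it is not lock-free and does not claim to resemble MemSQL's implementation. The class names, the maximum height of 16 and the coin-flip probability of 0.5 are assumptions for the sketch.

```python
import random


class Node:
    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height  # next node on each level


class SkipList:
    MAX_HEIGHT = 16

    def __init__(self, seed=0):
        self.head = Node(None, self.MAX_HEIGHT)
        self.height = 1
        self.rng = random.Random(seed)

    def _random_height(self):
        # Each extra level is kept with probability 1/2
        height = 1
        while height < self.MAX_HEIGHT and self.rng.random() < 0.5:
            height += 1
        return height

    def insert(self, key):
        # update[level] is the last node before the insert position on that level
        update = [self.head] * self.MAX_HEIGHT
        node = self.head
        for level in range(self.height - 1, -1, -1):
            while node.forward[level] is not None and node.forward[level].key < key:
                node = node.forward[level]
            update[level] = node
        height = self._random_height()
        self.height = max(self.height, height)
        new_node = Node(key, height)
        for level in range(height):
            new_node.forward[level] = update[level].forward[level]
            update[level].forward[level] = new_node

    def contains(self, key):
        node = self.head
        for level in range(self.height - 1, -1, -1):
            while node.forward[level] is not None and node.forward[level].key < key:
                node = node.forward[level]
        node = node.forward[0]
        return node is not None and node.key == key

    def keys(self):
        # Level 0 is an ordinary sorted linked list of every key
        out = []
        node = self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out


sl = SkipList()
for k in [5, 1, 9, 3, 7]:
    sl.insert(k)
print(sl.keys())
```

Searches start on the sparsest level and drop down a level whenever the next key would overshoot, which gives expected logarithmic lookups without rebalancing.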
Clustered Databases
• Amazon Redshift
• EMC Greenplum
• IBM Netezza
• HP Vertica
• Teradata
Other In Memory Players
• Sisense: BI Focussed
• ParStream: Cisco Owned
• Domo: SaaS BI Company, Omniture Founder
• Iri
• InsightSquared: BI Focussed
• VoltDB: Java Stored Procedure is the Unit of Execution
• Infobright: Open Sourced, based on MySQL
• KDB: Focussed on HFT, Terse
InMemory.Net
public static void testDoublePerformance() {
    // Sum one billion loop counters into a double
    double total = 0;
    for (int kk = 0; kk < 1000000000; kk++) {
        total += kk;
    }
    Console.WriteLine(total);
}
Results
• Ran in about 2.5 seconds for a billion Rows
• 400 million Rows per second on a Single Core
• About 50% of the performance of a C++ Program
• 1.6 billion / second when running on 4 Cores
• 2.0 billion / second when running with Hyper-Threaded
Cores
Initial Version
InMemoryColumn<T> {
    Dictionary<T, int> initialValuesDict;
    List<int> initialIndexes;
    T[] finalValues;
    int[] finalIndexes;
}
Next Version
InMemoryColumn<T> {
    Dictionary<T, int> initialValuesDict;
    int[][] initialIndexes;
    T[] finalValues;
    int[] finalIndexes;
}
Final Version
InMemoryColumn<T> {
    Dictionary<T, int> initialValuesDict;
    byte/ushort/int[][] initialIndexes;
    T[] finalValues;
    byte/ushort/int[] finalIndexes;
}
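The byte/ushort/int choice in the final version can be illustrated with Python's array module. This is a hedged sketch of the idea only: the helper names are invented, and the cut-over points (256 and 65,536 unique values) are an assumption based on what fits in 8 and 16 unsigned bits.

```python
from array import array


def index_typecode(unique_count):
    """Pick the narrowest unsigned type that can hold indexes 0..unique_count-1."""
    if unique_count <= 1 << 8:
        return "B"   # unsigned char, 1 byte (like C# byte)
    if unique_count <= 1 << 16:
        return "H"   # unsigned short (like C# ushort)
    return "I"       # unsigned int; usually 4 bytes, platform dependent


def pack_indexes(indexes, unique_count):
    return array(index_typecode(unique_count), indexes)


small = pack_indexes([0, 1, 2, 1], unique_count=3)
wide = pack_indexes([0, 299, 150], unique_count=300)
print(small.typecode, small.itemsize)
print(wide.typecode, wide.itemsize)
```

A column with at most 256 unique values then costs one byte per row instead of four.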
ANTLR to Parse Queries
grammar Expr;
prog: (expr NEWLINE)* ;
expr: expr ('*'|'/') expr |
expr ('+'|'-') expr |
INT | '(' expr ')' ;
NEWLINE : [\r\n]+ ;
INT : [0-9]+ ;
Example Rule from Grammar
mainquery [ImpVars vars] returns [InMemoryQuery query ] :
{ $query = new InMemoryQuery(); }
SELECT1
(CACHE {$query.setCache();} )?
(NOCACHE {$query.setNoCache();} )?
(DISTINCT {$query.setDistinct();} )?
fieldclause [$query,$vars]
(
(INTO label { $query.setInto ($label.text2 ) ;})?
FROM tableclause [$query,$vars]
( (COMMA|CROSS JOIN ) tableclause [$query,$vars] ) *
(WHERE whereclause [$query,$vars])?
(GROUP BY groupclause [$query,$vars])?
(HAVING havingclause [$query,$vars])?
(ORDER BY orderclause [$query,$vars])?
(LIMIT limitclause [$query,$vars])?
)? ;
Code Generation
• Generate C# to Evaluate the Query
• Compiled Code undergoes JIT for fast execution
• Parameterize Constants
– Simplify complex Constant Expressions
• Generic Table / Column Naming
• Reuse Generated Code
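One way to picture "Parameterize Constants" plus "Reuse Generated Code" is a cache keyed by the query text with its literals replaced by placeholders. The sketch below is an assumption about how such a cache could work, not the deck's actual implementation. The regular expression, the PlanCache class and the stand-in compile function are all hypothetical.

```python
import re

# Matches single-quoted strings (with '' escapes) or standalone numbers
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")


def parameterize(sql):
    """Replace literals with '?' and return (template, extracted constants)."""
    params = []

    def replace(match):
        params.append(match.group(0))
        return "?"

    return _LITERAL.sub(replace, sql), params


class PlanCache:
    """Compile each query template once and reuse it for other constants."""

    def __init__(self, compile_fn):
        self.compile_fn = compile_fn
        self.plans = {}
        self.compilations = 0

    def get(self, sql):
        template, params = parameterize(sql)
        if template not in self.plans:
            self.plans[template] = self.compile_fn(template)
            self.compilations += 1
        return self.plans[template], params


# Stand-in for real code generation: just tags the template
cache = PlanCache(compile_fn=lambda template: ("compiled", template))
plan_a, params_a = cache.get("SELECT customer FROM orders WHERE employee = 1")
plan_b, params_b = cache.get("SELECT customer FROM orders WHERE employee = 42")
print(cache.compilations, params_a, params_b)
```

Both queries share one compiled plan because they differ only in a constant, which is passed in as a parameter at run time.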
Detail Queries
• Detail Query
–Initial List Algorithm
–Improved by using Arrays of Arrays
–Only one thread works on any one Array
SELECT customerid FROM Orders
for (int tab1_counter = rowStart; tab1_counter < rowEnd; tab1_counter++)
{
    // Split the output row number into a block number (high bits)
    // and an offset inside a 16,384-row block (low 14 bits)
    groupRowD1 = groupRowCount >> 14;
    groupRowD2 = groupRowCount & 16383;
    if (groupRowD2 == 0)
    {
        if (groupRowD1 > 0)
        {
            blockCounts[groupRowD1 - 1] = 16384;
        }
        // Claim the next free block number under a lock
        lock (lock_newBlockObject)
        {
            groupRowCount = nextRecordD1 << 14;
            nextRecordD1++;
        }
        groupRowD1 = groupRowCount >> 14;
        t_total0[groupRowD1] = new byte[16384];
        total0 = t_total0[groupRowD1];
    }
    // Copy this row's value into the current output block
    total0[groupRowD2] = val_t1_c1[tab1_counter];
    groupRowCount++;
    if ((groupRowCount & 16383) == 0)
    {
        blockCounts[groupRowD1] = 16384;
    }
}
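The shift-and-mask addressing in the generated loop can be shown on its own. The sketch below is a single-threaded Python illustration of an array of 16,384-entry arrays; it leaves out the per-thread block claiming and the lock, and the class name is invented.

```python
BLOCK_BITS = 14
BLOCK_SIZE = 1 << BLOCK_BITS   # 16384 rows per block
BLOCK_MASK = BLOCK_SIZE - 1    # 16383


class BlockedOutput:
    """Append-only byte column stored as an array of fixed-size arrays."""

    def __init__(self):
        self.blocks = []
        self.count = 0

    def append(self, value):
        block = self.count >> BLOCK_BITS    # which inner array
        offset = self.count & BLOCK_MASK    # position inside it
        if offset == 0:
            # The previous block is full (or this is the first row)
            self.blocks.append(bytearray(BLOCK_SIZE))
        self.blocks[block][offset] = value
        self.count += 1

    def get(self, row):
        return self.blocks[row >> BLOCK_BITS][row & BLOCK_MASK]


out = BlockedOutput()
for i in range(40000):
    out.append(i % 251)
print(len(out.blocks), out.get(16384))
```

Growing the result never copies existing rows, which is the advantage over a single resizable list.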
Aggregate Queries
• Group Cardinality = 1
• Group Cardinality < 500k
– Use Arrays of Arrays
– Lookup Key being Group Index
• Group Cardinality > 500k
– Use Dictionaries to Correlate Group Index -> Storage
– Arrays of Arrays
SELECT customer, SUM(1) FROM orders
WHERE employee=1 GROUP BY customer
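for (int tab1_counter = rowStart; tab1_counter < rowEnd;
     tab1_counter++, newRow = false) {
    // WHERE employee = 1, compared against the parameterized constant
    if ((val_t1_c2[tab1_counter] == const_0_t1_c2)) {
        // The customer value is used directly as the group's slot
        rowIndex = val_t1_c1[tab1_counter];
        if (groupRowExists[rowIndex] == 0) newRow = true;
        groupRowExists[rowIndex] = 1;
        // SUM(1)
        total1[rowIndex] += const_0;
        if (newRow) {
            // First row seen for this group: store the GROUP BY key
            total0[rowIndex] = val_t1_c1[tab1_counter];
        }
    }
}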
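The same low-cardinality strategy, with the group index used directly as an array slot, looks like this in Python. It is a simplified sketch that assumes both columns are already dictionary-encoded; the function name and the sample data are made up.

```python
def group_count(customer_idx, employee_idx, employee_const, n_customers):
    """SELECT customer, SUM(1) ... WHERE employee = const GROUP BY customer."""
    exists = bytearray(n_customers)   # 1 if the group has at least one row
    totals = [0] * n_customers        # SUM(1) per group
    for row in range(len(customer_idx)):
        if employee_idx[row] == employee_const:
            group = customer_idx[row]  # dictionary index doubles as array slot
            exists[group] = 1
            totals[group] += 1
    return [(group, totals[group]) for group in range(n_customers) if exists[group]]


customer_idx = [0, 1, 0, 2, 1, 0]
employee_idx = [1, 1, 0, 1, 0, 1]
result = group_count(customer_idx, employee_idx, employee_const=1, n_customers=3)
print(result)
```

No hashing is needed while the number of groups is small enough for dense arrays.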
COUNT DISTINCT
• Initial Algorithm used byte []
• Used lots of Memory on Machines with many Cores
• Upgraded to one [] shared across all Cores
• Interlocked.CompareExchange to set a Bit
• Hashmap for initial Values
• Then switch to byte []
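A bitmap-based COUNT DISTINCT over a dictionary-encoded column can be sketched as follows. This Python version is single-threaded, so a plain read-modify-write stands in for the atomic compare-and-exchange the slide mentions. The function name is an assumption.

```python
def count_distinct(indexes, unique_count):
    """Count distinct dictionary indexes using one bit per possible value."""
    bits = bytearray((unique_count + 7) // 8)
    distinct = 0
    for i in indexes:
        byte, mask = i >> 3, 1 << (i & 7)
        if not bits[byte] & mask:
            # First time this value is seen: set its bit and count it
            bits[byte] |= mask
            distinct += 1
    return distinct


print(count_distinct([3, 1, 3, 7, 1, 9], unique_count=10))
```

One bit per unique value keeps the shared structure eight times smaller than one byte per value.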
Subqueries
• Subquery in Table Clause can be materialized
into a Temp Table ( CACHE )
• Simplify Subquery ( NOCACHE )
– Only Fields the Parent SELECT Requires
– Pass Through the Parent WHERE Clause
JOINS
• LEFT & INNER JOIN Support
• Merge Parent & Child Column Values
• Parent Value -> Child Indexes
• One to One
– Join becomes an Array Lookup
• One to Many
– Join becomes a for Loop
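The "Parent Value -> Child Indexes" mapping can be sketched in Python. The helpers below are hypothetical and assume the parent join column is dictionary-encoded. A missing match is marked with -1, which an INNER JOIN skips and a LEFT JOIN would output as nulls.

```python
def one_to_one_map(parent_values, child_keys):
    """For each unique parent value, the matching child row (or -1)."""
    child_row = {key: row for row, key in enumerate(child_keys)}
    return [child_row.get(value, -1) for value in parent_values]


def one_to_many_map(parent_values, child_keys):
    """For each unique parent value, the list of matching child rows."""
    rows = {}
    for row, key in enumerate(child_keys):
        rows.setdefault(key, []).append(row)
    return [rows.get(value, []) for value in parent_values]


def inner_join_rows(parent_indexes, lookup):
    """One to one: the join is a single array lookup per parent row."""
    return [(p_row, lookup[v]) for p_row, v in enumerate(parent_indexes) if lookup[v] != -1]


parent_values = ["ALFKI", "BONAP", "CHOPS"]   # unique values of orders.customer
parent_indexes = [0, 1, 0, 2]                 # one index per order row
lookup = one_to_one_map(parent_values, ["BONAP", "ALFKI"])
joined = inner_join_rows(parent_indexes, lookup)
many = one_to_many_map(parent_values, ["ALFKI", "BONAP", "ALFKI"])
print(lookup, joined, many)
```

In the one-to-many case each parent row loops over its list of child rows instead of doing a single lookup.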
Query Simplification
• Rewrite Aggregate Queries with Expressions
SELECT SUM(1) / SUM(qty) FROM Orders
becomes
SELECT SUM(1) AS A, SUM(qty) AS B FROM Orders
SELECT A / B FROM TEMP_QUERY
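The equivalence of the two forms can be checked with SQLite from Python. The sample rows are invented. The `* 1.0` is added here only because SQLite would otherwise do integer division; it is not part of the original rewrite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (qty INTEGER)")
conn.executemany("INSERT INTO Orders VALUES (?)", [(2,), (3,), (5,)])

# Original form: an expression over two aggregates
direct = conn.execute(
    "SELECT SUM(1) * 1.0 / SUM(qty) FROM Orders"
).fetchone()[0]

# Rewritten form: plain aggregates first, then the expression on the result
rewritten = conn.execute(
    "SELECT A * 1.0 / B FROM "
    "(SELECT SUM(1) AS A, SUM(qty) AS B FROM Orders) AS TEMP_QUERY"
).fetchone()[0]

print(direct, rewritten)
conn.close()
```

The inner query only needs simple aggregates, so the aggregation loop stays small and the arithmetic runs once on a single result row.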
More Simplifications
• Group Expressions with 1 Database Field
e.g. GROUP BY Month ( OrderDate )
Inner Join OrderDate to a Table of its Unique
Values and Month ( OrderDate )
• Remove Redundant Group By Parts
GROUP BY OrderDate , Month ( OrderDate )
becomes GROUP BY OrderDate
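The first simplification amounts to evaluating the expression once per unique value instead of once per row. A small Python sketch, assuming OrderDate is dictionary-encoded (the function name and dates are made up):

```python
from datetime import date


def month_per_row(values, indexes):
    """Evaluate Month(OrderDate) per unique date, then look it up per row."""
    month_of_value = [d.month for d in values]   # one evaluation per unique value
    return [month_of_value[i] for i in indexes]  # cheap array lookup per row


order_dates = [date(2015, 1, 5), date(2015, 1, 20), date(2015, 2, 3)]  # unique values
row_indexes = [0, 2, 1, 0, 2]                                          # one per order
months = month_per_row(order_dates, row_indexes)
print(months)
```

With millions of rows but only a few thousand distinct dates, the function is called a few thousand times rather than millions.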
HAVING Clause
• Convert to two Queries
• One Query without Having Clause
• Having Clause becomes Where of Second
Query
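The HAVING rewrite can be checked against SQLite from Python. The table contents are invented for the example, and SQLite here only stands in for any SQL engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ALFKI", 3), ("BONAP", 5), ("ALFKI", 4), ("CHOPS", 1)],
)

# Single query with a HAVING clause
with_having = conn.execute(
    "SELECT customer, SUM(qty) AS total FROM orders "
    "GROUP BY customer HAVING SUM(qty) > 4 ORDER BY customer"
).fetchall()

# Two steps: aggregate without HAVING, then filter the result with WHERE
two_step = conn.execute(
    "SELECT customer, total FROM "
    "(SELECT customer, SUM(qty) AS total FROM orders GROUP BY customer) AS t "
    "WHERE total > 4 ORDER BY customer"
).fetchall()

print(with_having, two_step)
conn.close()
```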
Function List
String Functions
CAST | CAST_STR_AS_INT | CAST_STR_AS_DECIMAL | CHAR | CHARINDEX | COALESCE | CONCAT | CSTR | ENDSWITH | INSERT | ISNULL | ISNULLOREMPTY
| LEFT | LEN | LCASE | LTRIM | REMOVE | REPLACE | REVERSE | RIGHT | RTRIM | SUBSTRING | STARTSWITH | TRIM | UCASE
Date Functions
CDATE | DATEADD | DATEDIFF | DATEDIFFMILLISECOND | DATEPART | DATESERIAL | DAY | DAYOFWEEK | MONTH | TRUNC | YEAR
Math
ABS | CAST_NUM_AS_BYTE | CAST_NUM_AS_DECIMAL | CAST_NUM_AS_DOUBLE | CAST_NUM_AS_INT | CAST_NUM_AS_LONG | CAST_NUM_AS_SHORT
| CAST_NUM_AS_SINGLE | FLOOR | LOG | MAX | MAXLIST | MIN | MINLIST | POWER | RAND | ROUND | SIGN | SQRT
Trigonometric
ASIN | ACOS | ATAN | ATAN2 | COS | COSH | SIN | SINH | TAN | TANH
Aggregate Functions
MIN | MAX | COUNT | AVG | SUM | COUNT ( DISTINCT() ) | MINLIST | MAXLIST
Statistical Functions
STDEV | STDEVP | VAR | VARP
Special Cases
• SELECT COUNT ( DISTINCT CUSTOMER )
FROM ORDERS
• Answer is the Number of Unique Customer Values
• SELECT DISTINCT CUSTOMER FROM ORDERS
Answer is the List of Unique Customer Values
Importing Data
DATASOURCE a1=ODBC 'dsn=ir_northwind'
IMPORT Customers=a1.customers
IMPORT Products=a1.{SELECT * FROM Products}
IMPORT orders=a1.'somequery.sql'
SAVE
Importing Data II
• ODBC / OLEDB / DOT NET Providers
• Special ME Datasource
• Existing In Memory Databases
• UNION ALL Between Sources
• SLURP Command
• Variables, Expressions & IF
Interfacing to the Database
• Native Dot Net API
• Dot Net Data Provider
• COM/ ACTIVEX API
• ODBC Driver
– C / C++ IO
– Licensed ODBC Kit
– Parameterized Queries + Cursor Support
Hard Learned Lessons
• Allocate and Store Variables Relating to One
Thread Sequentially. Don’t intermix them with other Threads' Variables
• Xeon Servers with Maxed out Memory can
have slower Memory Access Speed
– 1 Rank 1,866 MHz
– 2 Ranks 1,600 MHz
– 3 Ranks 1,333 MHz
Bitcoin Mining / HFT
• CPUs
• GPUs
• FPGAs
• Dedicated Mining Chip
GPU & InMemory Databases
• GPUdb, MapD
– Good for Visualising Billions of Points
– GPUs can run thousands of Cores on Data
– GPU to Main Memory Bottleneck
– Potentially more Data Reduction
• Blazegraph, GraphSQL
– Fast Graph Databases that can use the GPU
FPGA Potential
• Field-Programmable Gate Array
– is an integrated circuit designed to be configured
by a customer or a designer after manufacturing
– Programmable Integrated Circuit
• Could be used to enhance In Memory DBs
• Intel agreed to buy Altera in June 2015
– Will roll technology out into Data Center
Hardware Transactional Memory
• Simplifies Concurrent Programming
– Group of Load & Store Instructions
– Can Execute Atomically
• Hardware or Software Transactional Memory
• Intel TSX
– Transactional Synchronization Extensions
– Available in some Skylake Processors
– Added to Haswell/Broadwell but Disabled
3D XPoint Memory
• Announced by Intel & Micron in July 2015
• 1000 times more Durable than Flash
• Like DRAM that has Permanence
• Latency 10 times faster than NAND SSD
• 4-6 Times slower than DRAM
Thanks for help with Market Research
• Dan Khasis
• Niall Dalton
• Jeff Cordova – Wavefront
• SapHanaTutorial.com

Digital Transformation Playbook by Graham Ware
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24  Building Real-Time Pipelines With FLaNKDATA SUMMIT 24  Building Real-Time Pipelines With FLaNK
DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
 
Harnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptxHarnessing the Power of GenAI for BI and Reporting.pptx
Harnessing the Power of GenAI for BI and Reporting.pptx
 
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
Dubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls DubaiDubai Call Girls Peeing O525547819 Call Girls Dubai
Dubai Call Girls Peeing O525547819 Call Girls Dubai
 
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
怎样办理伦敦大学毕业证(UoL毕业证书)成绩单学校原版复制
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 

In memory databases presentation

  • 1. In Memory Databases An Overview By John Sullivan john@inmemory.net
  • 3. Features • Data is stored sequentially by Row • Essentially an Array / List Structure • Easy to Add / Update / Insert /Delete • Need to read entire Row to get to one Column’s Data
  • 5. Features • Data is stored by Column • Faster to Read a few Columns • Very Hard to Update / Insert • Reading Data Sequentially from Column, CPU Cache Friendly
  • 7. Compressed Column Store • Column Array is converted into 2 arrays –One array contains a list of sorted Unique Values –Another array containing an integer index to the values
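The two-array layout on this slide can be sketched in C# as below. This is a minimal illustration rather than the code of any product in the deck; the class name `DictionaryEncoding` and the methods `Encode` and `Decode` are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryEncoding
{
    // Splits a raw column into a sorted array of unique values and,
    // for every row, an integer index into that array.
    public static (T[] Values, int[] Indexes) Encode<T>(T[] column)
        where T : IComparable<T>
    {
        T[] values = column.Distinct().OrderBy(v => v).ToArray();

        // Map each unique value to its position in the sorted array.
        var position = new Dictionary<T, int>(values.Length);
        for (int i = 0; i < values.Length; i++)
        {
            position[values[i]] = i;
        }

        int[] indexes = new int[column.Length];
        for (int row = 0; row < column.Length; row++)
        {
            indexes[row] = position[column[row]];
        }
        return (values, indexes);
    }

    // Recovers the original value of one row from the two arrays.
    public static T Decode<T>(T[] values, int[] indexes, int row)
    {
        return values[indexes[row]];
    }
}
```

For a column such as `{ "Dublin", "Cork", "Dublin" }` the values array holds each city once, so repeated strings cost only one integer per row.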
  • 8. Sqlite • Opened by Special Filename :memory: • Designed for Single Process / File • Great for embedded systems/ mobile devices. E.g. IOS Apps • Row Store , No Column Store • One Writer only. Non Server Based. • Free & Open Source
  • 9. Excel • Power Pivot, Introduced in Excel 2010 • Non SQL Query Language • Data Analysis Expressions (DAX) • Syntax similar to Excel Formulae • Requires Pro version of Office or Excel
  • 10. Tableau • Primarily a Visualization Tool • Tableau Data Extracts (TDE) • Compressed Column Store • Generates one table flat Extract from Source ( that may involve joins ) • Uses ODBC / OLEDB For Extraction • Only loads required columns from Extract
  • 11. Qlik • One of the Original Developers in Compressed Columnar In Memory Analytics • Nice Dashboards • Incremental Updates • Autojoins Fields based on Field Name • Scripting Language for Generating QVD Files
  • 12. Qlik Load Script Example Companies: LOAD id AS COMPANY_ID, name as COMPANY_NAME, postcode AS COMPANY_POSTCODE, address AS COMPANY_ADDRESS, If(id > 100, 1, 0) AS FLAG_NATIONAL; SQL SELECT id, name, postcode, address FROM database.Companies;
  • 13. MonetDB • Pioneer in Columnar Databases • Research Focussed out of the Netherlands • Open Source • Can Cache Expensive Computations and Reuse • Early versions were used by Data Distilleries, which got bought out by SPSS • R Integration
  • 14. SQL Server Enterprise • ColumnStore Indexes –Data is stored by column. –Blocks of 1,048,576 Values • InMemory OLTP (MEMORY_OPTIMIZED=ON) after Create Table Data/Delta files of 128 MB
  • 15. Oracle • TimesTen – Works with Oracle Database as a Cache – Telecoms and Financial Companies • Oracle 12 Enterprise – Row & Column Formats – In Memory Columnstore • Exalytics
  • 16. SAP Hana • Pure In Memory Database • In Memory OLTP Rowstore • In Memory Columnstore – Up to 2^31 rows per block • Cluster Large Fact tables across nodes • Hana One Available on EC2 & IBM
  • 18. Memsql • Pure In Memory Database • Mysql Wire Protocol Compatible • Lockfree Linked Lists and Skiplists • SQL Queries compiled into C++ • Split Large Tables Across Nodes • Column Store Aimed at Analytics • Apache Spark Integration
  • 20. Clustered Databases • Amazon Redshift • EMC Greenplum • IBM Netezza • HP Vertica • Teradata
  • 21. Other In Memory Players • Sisense BI Focussed • Parstream Cisco Owned • Domo SAAS BI Company. Omniture Founder • Iri • InsightSquared BI Focussed • VoltDB Java Stored Procedure Unit of Exec • Infobright Open Sourced based on Mysql • KDB Focussed on HFT / Terse
  • 22. InMemory.Net public static void testDoublePerformance() { double total = 0; for (int kk = 0; kk < 1000000000; kk++) { total += kk; } Console.WriteLine(total); }
  • 23. Results • Ran in about 2.5 seconds for a billion Rows • 400 million rows per second on Single Core • About 50% of performance of C++ Prog. • 1.6 billion / second when running using 4 Cores • 2.0 billion / second when running with HT Cores
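The multi-core figures suggest the loop was split across cores. One way that could be done is sketched below; this is an assumption, not the benchmark code behind the slide. `ParallelSum.Sum` and its `workers` parameter are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelSum
{
    // Sums 0 .. rowCount-1 by giving each worker one contiguous range.
    public static double Sum(int rowCount, int workers)
    {
        double[] partial = new double[workers];
        int chunk = (rowCount + workers - 1) / workers;   // ceiling division

        Parallel.For(0, workers, w =>
        {
            int start = w * chunk;
            int end = Math.Min(start + chunk, rowCount);
            double local = 0;                 // hot accumulator stays in a local
            for (int kk = start; kk < end; kk++)
            {
                local += kk;
            }
            partial[w] = local;               // a single shared write per worker
        });

        double total = 0;
        foreach (double p in partial)
        {
            total += p;
        }
        return total;
    }
}
```

Calling `ParallelSum.Sum(1000000000, Environment.ProcessorCount)` would mirror the slide's billion-row loop. Accumulating into a local and writing once per worker keeps threads from contending on shared memory inside the loop.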
  • 24. Initial Version • InMemoryColumn<T> { Dictionary <T,int> initialValuesDict; List <int> initialIndexes; T [] finalValues; int [] finalIndexes; • }
  • 25. Next Version • InMemoryColumn<T> { Dictionary <T,int> initialValuesDict; int [][] initialIndexes; T [] finalValues; int [] finalIndexes; • }
  • 26. Final Version • InMemoryColumn<T> { Dictionary <T,int> initialValuesDict; byte/ushort/int [][] initialIndexes; T [] finalValues; byte/ushort/int [] finalIndexes; • }
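The move from `int` indexes to `byte/ushort/int` in the final version presumably picks the narrowest integer type that can address every unique value. A rough sketch of that selection follows, with `IndexWidth`, `Pack` and `BytesPerRow` as hypothetical names that do not come from the slides.

```csharp
using System;

public static class IndexWidth
{
    // Returns the index array in the narrowest element type that can
    // still address uniqueCount distinct values.
    public static Array Pack(int[] indexes, int uniqueCount)
    {
        if (uniqueCount <= 256)
        {
            return Array.ConvertAll(indexes, i => (byte)i);
        }
        if (uniqueCount <= 65536)
        {
            return Array.ConvertAll(indexes, i => (ushort)i);
        }
        return (int[])indexes.Clone();
    }

    // Storage cost per row for a given number of unique values.
    public static int BytesPerRow(int uniqueCount)
    {
        return uniqueCount <= 256 ? 1 : uniqueCount <= 65536 ? 2 : 4;
    }
}
```

A column with at most 256 distinct values then needs one byte per row instead of four, which is where much of the compression in a dictionary-encoded column comes from.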
  • 27. ANTLR to Parse Queries grammar Expr; prog: (expr NEWLINE)* ; expr: expr ('*'|'/') expr | expr ('+'|'-') expr | INT | '(' expr ')' ; NEWLINE : [\r\n]+ ; INT : [0-9]+ ;
  • 28. Example Rule from Grammar mainquery [ImpVars vars] returns [InMemoryQuery query ] : { $query = new InMemoryQuery(); } SELECT1 (CACHE {$query.setCache();} )? (NOCACHE {$query.setNoCache();} )? (DISTINCT {$query.setDistinct();} )? fieldclause [$query,$vars] ( (INTO label { $query.setInto ($label.text2 ) ;})? FROM tableclause [$query,$vars] ( (COMMA|CROSS JOIN ) tableclause [$query,$vars] ) * (WHERE whereclause [$query,$vars])? (GROUP BY groupclause [$query,$vars])? (HAVING havingclause [$query,$vars])? (ORDER BY orderclause [$query,$vars])? (LIMIT limitclause [$query,$vars])? )? ;
  • 29. Code Generation • Generate C# To Evaluate Query • Compiled Code undergoes JIT for fast exec • Parameterize Constants – Simplify complex Constant Expressions • Generic Table / Column Naming • Reuse Generated Code
  • 30. Detail Queries • Detail Query –Initial List Algorithm –Improved by using Arrays of Arrays –Only one thread works on one Array
  • 31. SELECT customerid FROM Orders for (int tab1_counter = rowStart; tab1_counter < rowEnd; tab1_counter++) { groupRowD1 = groupRowCount >> 14; groupRowD2 = groupRowCount & 16383; if (groupRowD2 == 0) { if (groupRowD1 > 0) { blockCounts[groupRowD1 - 1] = 16384; } lock (lock_newBlockObject) { groupRowCount = nextRecordD1 << 14; nextRecordD1++; } groupRowD1 = groupRowCount >> 14; t_total0[groupRowD1] = new byte[16384]; total0 = t_total0[groupRowD1]; }; total0[groupRowD2] = val_t1_c1[tab1_counter]; groupRowCount++; if ((groupRowCount & 16383) == 0) { blockCounts[groupRowD1] = 16384; } }
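The `>> 14` and `& 16383` expressions in the generated loop split a row number into a block number and an offset inside a 16,384-entry block. A single-threaded sketch of that arrays-of-arrays addressing is given below. `BlockedColumn` is a hypothetical class, and it leaves out the lock that the generated code uses to hand each thread its own block.

```csharp
using System;

public sealed class BlockedColumn
{
    private const int BlockShift = 14;
    private const int BlockSize = 1 << BlockShift;   // 16384 rows per block
    private const int BlockMask = BlockSize - 1;     // 16383

    private byte[][] blocks = new byte[4][];

    public int Count { get; private set; }

    // Appends a value, allocating a new block whenever the previous one is full.
    public void Add(byte value)
    {
        int d1 = Count >> BlockShift;   // which block
        int d2 = Count & BlockMask;     // offset inside that block
        if (d2 == 0)
        {
            if (d1 == blocks.Length)
            {
                // Only the small outer array grows; existing blocks are never copied.
                Array.Resize(ref blocks, blocks.Length * 2);
            }
            blocks[d1] = new byte[BlockSize];
        }
        blocks[d1][d2] = value;
        Count++;
    }

    public byte this[int row]
    {
        get { return blocks[row >> BlockShift][row & BlockMask]; }
    }
}
```

Compared with a `List<T>`, growing the result never copies existing rows, and distinct blocks can be filled by different threads without touching each other's memory.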
  • 32. Aggregative Queries • Group Cardinality =1 • Group Cardinality < 500k – Use Arrays of Arrays, – Lookup Key being Group Index • Group Cardinality > 500k – Use Dictionaries to Correlate Group Index -> Storage – Arrays of Arrays
  • 33. SELECT customer, SUM(1) FROM orders WHERE employee=1 GROUP BY customer for (int tab1_counter = rowStart; tab1_counter < rowEnd; tab1_counter++, newRow = false) { if ((val_t1_c2[tab1_counter] == const_0_t1_c2)) { rowIndex = val_t1_c1[tab1_counter]; if (groupRowExists[rowIndex] == 0) newRow = true; groupRowExists[rowIndex] = 1; total1[rowIndex] += const_0; if (newRow) { total0[rowIndex]=val_t1_c1[tab1_counter]; } } }
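Because the columns are dictionary-encoded, the customer's value index can serve directly as the group key. The sketch below contrasts the array path for low cardinality with a dictionary path for high cardinality. It assumes a single thread and a flat array rather than arrays of arrays, and `GroupCount` and its methods are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class GroupCount
{
    // Low cardinality: one slot per possible group, indexed by the value index.
    public static long[] CountDense(int[] customerIdx, int[] employeeIdx,
                                    int employeeFilter, int customerCardinality)
    {
        long[] totals = new long[customerCardinality];
        for (int row = 0; row < customerIdx.Length; row++)
        {
            if (employeeIdx[row] == employeeFilter)   // WHERE employee = ...
            {
                totals[customerIdx[row]] += 1;        // SUM(1)
            }
        }
        return totals;
    }

    // High cardinality: only groups that actually occur get storage.
    public static Dictionary<int, long> CountSparse(int[] customerIdx, int[] employeeIdx,
                                                    int employeeFilter)
    {
        var totals = new Dictionary<int, long>();
        for (int row = 0; row < customerIdx.Length; row++)
        {
            if (employeeIdx[row] == employeeFilter)
            {
                long current;
                totals.TryGetValue(customerIdx[row], out current);  // 0 when absent
                totals[customerIdx[row]] = current + 1;
            }
        }
        return totals;
    }
}
```

The dense form avoids hashing entirely, which is why it suits small group counts; the sparse form trades a hash lookup per row for not allocating a slot for every possible group.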
  • 34. COUNT DISTINCT • Initial Algorithm used Byte [] • Used lots of Memory on Large Cores • Upgraded to 1 [] across all Cores • Interlocked.CompareExchange to set Bit • Hashmap for initial Values • Then switch to byte []
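A shared bitmap updated with `Interlocked.CompareExchange` could look roughly like the sketch below. It assumes the column is dictionary-encoded so each value index maps to one bit, and it omits the initial hashmap stage from the slide. `CountDistinct.Count` is a hypothetical name.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CountDistinct
{
    // Counts distinct value indexes using one bitmap shared by all threads.
    public static int Count(int[] valueIdx, int cardinality)
    {
        int[] bits = new int[(cardinality + 31) / 32];   // one bit per possible value
        int distinct = 0;

        Parallel.For(0, valueIdx.Length, row =>
        {
            int v = valueIdx[row];
            int word = v >> 5;            // which 32-bit word holds the bit
            int mask = 1 << (v & 31);     // which bit inside that word

            while (true)
            {
                int seen = Volatile.Read(ref bits[word]);
                if ((seen & mask) != 0)
                {
                    return;               // some thread already counted this value
                }
                // Set the bit only if the word is unchanged since it was read.
                if (Interlocked.CompareExchange(ref bits[word], seen | mask, seen) == seen)
                {
                    Interlocked.Increment(ref distinct);
                    return;
                }
                // Another thread changed the word in between: re-read and retry.
            }
        });

        return distinct;
    }
}
```

Only the thread whose compare-and-swap succeeds increments the counter, so each value is counted once however many threads encounter it.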
  • 35. Subqueries • Subquery in Table clause can be materialized into temp table ( CACHE ) • Simplify Subquery ( NOCACHE) Only Fields Parent SELECT Requires Pass Through Parent WHERE Clause
  • 36. JOINS • LEFT & INNER JOIN SUPPORT • Merge Parent & Child Column Values • Parent Value -> Child Indexes • ONE to ONE – Join becomes an Array Lookup • ONE to Many – Join Becomes for Loop
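One reading of "Parent Value -> Child Indexes" is a lookup from each merged key value to the child rows that hold it. The sketch below follows that reading and assumes both key columns share one dictionary of values. `ArrayJoin`, `BuildChildRows` and `InnerJoin` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayJoin
{
    // For every key value (by shared value index), the child rows that contain it.
    public static int[][] BuildChildRows(int[] childKeyIdx, int cardinality)
    {
        var lists = new List<int>[cardinality];
        for (int row = 0; row < childKeyIdx.Length; row++)
        {
            int key = childKeyIdx[row];
            if (lists[key] == null)
            {
                lists[key] = new List<int>();
            }
            lists[key].Add(row);
        }

        var result = new int[cardinality][];
        for (int k = 0; k < cardinality; k++)
        {
            result[k] = lists[k] == null ? Array.Empty<int>() : lists[k].ToArray();
        }
        return result;
    }

    // INNER JOIN: an array lookup per parent row, then a loop over the matches.
    public static List<(int ParentRow, int ChildRow)> InnerJoin(int[] parentKeyIdx,
                                                                int[][] childRows)
    {
        var output = new List<(int ParentRow, int ChildRow)>();
        for (int p = 0; p < parentKeyIdx.Length; p++)
        {
            foreach (int c in childRows[parentKeyIdx[p]])
            {
                output.Add((p, c));
            }
        }
        return output;
    }
}
```

In the one-to-one case every inner array has at most one element, so the inner loop collapses to a single array lookup. A LEFT JOIN would additionally emit the parent row with a null child when the inner array is empty.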
  • 37. Query Simplification • Rewrite Aggregate Queries with Expressions SELECT SUM(1) / SUM (qty ) FROM Orders SELECT SUM(1) as A, SUM(QTY) as B from Orders SELECT A/B FROM TEMP_QUERY
  • 38. More Simplifications • Group Expressions with 1 Database Field e.g. Group by Month ( OrderDate ) Inner Join OrderDate to Table of Its Unique Values and Month ( OrderDate ) • Remove Redundant Group By Parts Group BY OrderDate , Month ( Orderdate ) becomes Group BY OrderDate
  • 39. HAVING Clause • Convert to two Queries • One Query without Having Clause • Having Clause becomes Where of Second Query
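The two-query rewrite can be illustrated with LINQ standing in for the engine. This is only an analogy for the transformation, not how InMemory.Net executes it; `HavingRewrite.Run` and the `minOrders` parameter are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HavingRewrite
{
    // Emulates: SELECT customer, COUNT(*) FROM Orders
    //           GROUP BY customer HAVING COUNT(*) >= minOrders
    public static List<(string Customer, int Orders)> Run(IEnumerable<string> orderCustomers,
                                                          int minOrders)
    {
        // Query 1: the aggregate query with the HAVING clause removed.
        List<(string Customer, int Orders)> grouped = orderCustomers
            .GroupBy(c => c)
            .Select(g => (Customer: g.Key, Orders: g.Count()))
            .ToList();

        // Query 2: the HAVING predicate becomes a plain WHERE over query 1's result.
        return grouped.Where(r => r.Orders >= minOrders).ToList();
    }
}
```

Splitting the work this way means the second query only ever needs an ordinary row filter, so no separate HAVING evaluation path is required.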
  • 40. Function List String Functions CAST | CAST_STR_AS_INT | CAST_STR_AS_DECIMAL | CHAR | CHARINDEX | COALESCE | CONCAT | CSTR | ENDSWITH | INSERT | ISNULL | ISNULLOREMPTY | LEFT | LEN | LCASE | LTRIM | REMOVE | REPLACE | REVERSE | RIGHT | RTRIM | SUBSTRING | STARTSWITH | TRIM | UCASE Date Functions CDATE | DATEADD | DATEDIFF | DATEDIFFMILLISECOND | DATEPART | DATESERIAL | DAY | DAYOFWEEK | MONTH | TRUNC | YEAR Math ABS | CAST_NUM_AS_BYTE | CAST_NUM_AS_DECIMAL | CAST_NUM_AS_DOUBLE | CAST_NUM_AS_INT | CAST_NUM_AS_LONG | CAST_NUM_AS_SHORT | CAST_NUM_AS_SINGLE | FLOOR | LOG | MAX | MAXLIST | MIN | MINLIST | POWER | RAND | ROUND | SIGN | SQRT Trigonometric ASIN | ACOS | ATAN | ATAN2 | COS | COSH | SIN | SINH | TAN | TANH Aggregate Functions MIN | MAX | COUNT | AVG | SUM | COUNT ( DISTINCT() ) | MINLIST | MAXLIST Statistical Functions STDEV| STDEVP | VAR | VARP
  • 41. Special Cases • SELECT COUNT ( DISTINCT CUSTOMER ) FROM ORDERS • Answer is No of Customer Values • SELECT DISTINCT CUSTOMER FROM ORDERS Answer is List of Customer Unique Values
  • 42. Importing Data DATASOURCE a1=ODBC 'dsn=ir_northwind' IMPORT Customers=a1.customers IMPORT Products=a1.{SELECT * FROM Products} IMPORT orders=a1.'somequery.sql' SAVE
  • 43. Importing Data II • ODBC / OLEDB / DOT NET Providers • Special ME Datasource • Existing In Memory Databases • UNION ALL Between Sources • SLURP Command • Variables, Expressions & IF
  • 44. Interfacing to the Database • Native Dot Net API • Dot Net Data Provider • COM/ ACTIVEX API • ODBC Driver C / C++ IO Licensed ODBC Kit Parameterized Queries + Cursor Support
  • 45. Hard Learned Lessons • Allocate and Store Variables Relating to One Thread Sequentially. Don’t intermix • Xeon Servers with Maxed out memory can have slower memory access speed – 1 Rank 1,866 MHz – 2 Ranks 1,600 MHz – 3 Ranks 1,333 MHz
  • 46. Bitcoin Mining / HFT • CPUS • GPUs • FPGAs • Dedicated Mining Chip
  • 47. GPU & InMemory Databases • GPUDB, MAPD – Good for Visualising Billions of Points – GPUs can run thousands of Cores on Data – GPU to Main Memory Bottleneck – Potentially more Data Reduction • Blazegraph, Graphsql Fast Graph Database that can use GPU
  • 48. FPGA Potential • Field-Programmable Gate Array – is an integrated circuit designed to be configured by a customer or a designer after manufacturing – Programmable Integrated Circuit • Could be used to enhance In Memory DBs • Intel bought Altera back in June 2015 – Will roll technology out into Data Center
  • 49. Hardware Transactional Memory • Simplifies Concurrent Programming – Group of Load & Store Instructions – Can Execute Atomically • Hardware or Software Transactional Memory • Intel TSX – Transactional Synchronization Extensions – Available in some Skylake Processors – Added to Haswell/Broadwell but Disabled
  • 50. 3D XPoint Memory • Announced by Intel & Micron July 2015 • 1000 times more Durable than Flash • Like DRAM that has Permanence • Latency 10 times faster than NAND SSD • 4-6 Times slower than DRAM
  • 51. Thanks for help with Market Research • Dan Khasis • Niall Dalton • Jeff Cordova – Wavefront • SapHanaTutorial.com