Lightning Talk

In-Memory Data Management

Trends & Techniques
GREG LUCK
CTO, HAZELCAST
In-Memory Hardware Trends
How to Use It
Von Neumann Architecture
Hardware Trends
Commodity Multi-core Servers
[Chart: cores per CPU, axis from 0 to 20]
UMA -> NUMA
Commodity 64 bit servers
32 bit: 4GB addressable
64 bit: 18EB addressable
50 Years of RAM Prices
Historical and Projected
50 Years of Disk Prices
SSD Prices
Average Price: $1/GB
Cost Comparison: USD/GB 2012
Medium   USD/GB   Multiple of disk   Cost of 100TB
Disk     $0.04    1x                 $4k
SSD      $1       25x                $100k
DRAM     $21      525x               $2.1m
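The arithmetic behind the cost comparison can be reproduced with a short sketch. The prices are the 2012 USD/GB figures from the slide; the function names `cost_usd` and `multiple_of_disk` are illustrative, and 1 TB is assumed to be 1,000 GB.

```python
# Prices in USD per GB, taken from the 2012 figures on the slide.
PRICE_PER_GB = {"Disk": 0.04, "SSD": 1.00, "DRAM": 21.00}


def cost_usd(medium: str, terabytes: float) -> float:
    """Cost of storing `terabytes` on `medium`, assuming 1 TB = 1,000 GB."""
    return PRICE_PER_GB[medium] * terabytes * 1000


def multiple_of_disk(medium: str) -> float:
    """How many times more expensive per GB `medium` is than disk."""
    return PRICE_PER_GB[medium] / PRICE_PER_GB["Disk"]


if __name__ == "__main__":
    for medium in PRICE_PER_GB:
        print(
            f"{medium}: ${cost_usd(medium, 100):,.0f} for 100TB, "
            f"{multiple_of_disk(medium):.0f}x disk"
        )
```

Rounded, this gives the slide's $4k, $100k and $2.1m for 100TB, and the 25x and 525x multiples.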
Max RAM Per Commodity Server
[Chart: maximum RAM per commodity server in TB (axis from 0 to 9), 2010 to 2013]
Latency across the network
[Chart: latency across the network in µs, axis from 0 to 70]
Access Times & Sizes
Level                      RR Latency   Typical Size   Technology                 Managed By
Registers                  <1 ns        1 KB           Custom CMOS                Compiler
L1 Cache                   1 ns         8 – 128 KB     SRAM                       Hardware
L2 Cache                   3 ns         .5 – 8 MB      SRAM                       Hardware
L3 Cache (oc)              10-15 ns     4 – 30 MB      SRAM                       Hardware
Main Memory                60 ns        16GB – TB      DRAM                       OS/App
SSD                        50-100 µs    400GB – 6TB    Flash Memory               OS/App
Main Memory over Network   2-100 µs     Unbounded      DRAM/Ethernet/InfiniBand   OS/App
Disk                       4-7 ms       Multiple TBs   Magnetic Rotational Disk   OS/App
Disk over Network          6-10 ms      Unbounded      Disk/Ethernet/InfiniBand   OS/App
Cache up to 30 times faster than memory.
Memory 10^6 times faster than disk.
Network memory 10^3 times faster than disk.
SSD 10^2 times faster than disk.
Techniques
Exploit Data Locality
Data is more likely to be read if:
•  It was recently read (temporal locality)
•  It is adjacent to other data, e.g. arrays, fields in an object (spatial locality)
•  It is part of a pattern (e.g. looping, relations)
•  It belongs to a naturally hot subset: some data is accessed more frequently, e.g. a Pareto distribution
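Temporal locality is what makes a small cache effective. The sketch below is a minimal least-recently-used (LRU) cache with hit counting; `LruCache`, `hit_rate` and `skewed_workload` are illustrative names, and the workload (two hot keys receiving 8 of every 10 reads) is an assumption chosen to mimic a Pareto-style skew, not data from the talk.

```python
from collections import OrderedDict


class LruCache:
    """Minimal LRU cache that counts hits and misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key, load):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        value = load(key)  # simulate the slow path (memory, disk, network)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used
        return value


def hit_rate(keys, capacity):
    """Fraction of reads served from the cache for a sequence of keys."""
    cache = LruCache(capacity)
    for key in keys:
        cache.get(key, load=lambda k: f"value-{k}")
    return cache.hits / (cache.hits + cache.misses)


def skewed_workload(rounds):
    """Keys 0 and 1 are hot; each round also reads two never-repeated keys."""
    keys = []
    for r in range(rounds):
        keys += [0, 1] * 4 + [100 + 2 * r, 101 + 2 * r]
    return keys


if __name__ == "__main__":
    print("skewed reads :", hit_rate(skewed_workload(10), capacity=4))
    print("cyclic reads :", hit_rate(list(range(10)) * 10, capacity=4))
```

With four cache slots the skewed workload is served mostly from the cache, while a cyclic scan over ten keys never hits, because each key is evicted before it is read again.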
Working with the CPU’s Cache Hierarchy
•  Memory up to 30x slower than cache
•  Alleviated somewhat by NUMA, wide channel, multi-channel/large cache
•  Vector instructions
•  Work with cache lines
•  Work with memory pages (TLBs)
•  Work with prefetching
•  Exploit NUMA with CPU affinity:
   numactl --physcpubind=0 --localalloc java …
•  Exploit natural data locality
Data Locality Effects – intra machine
[Chart: Linear, Random - Page and Random - Heap access compared on Intel U4100, i7-860 and i7-2760QM; vertical axis from 0 to 160]
Tiered Storage
Storage           Tier                                          Speed (TPS)   Size (GB)
Local Storage     Heap Store                                    5,000,000+    10
Local Storage     Off-Heap Store                                1,000,000     1,000+
Local Storage     Local Disk, SSD and Rotational (Restartable)  100,000       2,000+
Network Storage   Network Accessible Memory                     10,000s       100,000+
Data Locality Effects – inter machine
Compared with hybrid in-process and distributed cache:

Latency = L1 speed * proportion + L2 speed * proportion
L1 = 0 ms (5 µs) for on-heap and 50-100 µs off-heap
L2 = 1 ms

80% L1 Pareto Model:
latency = 0 * .8 + 1 * .2 = .2 ms

90% L1 Pareto Model:
latency = 0 * .9 + 1 * .1 = .1 ms
Columnar Storage
•  Manipulate data locality
•  Sorted dictionary compression for finite values
•  Allows values to be held in cache for SSE instructions
•  Better cache line effectiveness
•  Fewer CPU cache misses for aggregate calculations
•  Cross-over point is around a few dozen columns
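Sorted dictionary compression can be illustrated with a small sketch: each distinct value in a column is replaced by its index in a sorted dictionary, so the column becomes a dense run of small integers. Because the dictionary is sorted, equality and range predicates can be evaluated on the codes alone. The function names and the sample column are illustrative assumptions, not taken from any of the products discussed.

```python
from bisect import bisect_left


def dictionary_encode(values):
    """Replace each value with its index in a sorted dictionary of distinct values."""
    dictionary = sorted(set(values))
    codes = [bisect_left(dictionary, v) for v in values]
    return dictionary, codes


def count_equal(dictionary, codes, target):
    """Count rows equal to `target` by comparing integer codes only."""
    i = bisect_left(dictionary, target)
    if i == len(dictionary) or dictionary[i] != target:
        return 0  # value does not occur in the column at all
    return sum(1 for c in codes if c == i)


def count_in_range(dictionary, codes, low, high):
    """Count rows with low <= value < high; valid because the dictionary is sorted."""
    lo = bisect_left(dictionary, low)
    hi = bisect_left(dictionary, high)
    return sum(1 for c in codes if lo <= c < hi)


if __name__ == "__main__":
    country = ["DE", "AU", "US", "AU", "DE", "AU"]  # one column of a table
    dictionary, codes = dictionary_encode(country)
    print(dictionary)  # ['AU', 'DE', 'US']
    print(codes)       # [1, 0, 2, 0, 1, 0]
    print(count_equal(dictionary, codes, "AU"))           # 3
    print(count_in_range(dictionary, codes, "AU", "US"))  # 5
```

In a real column store the codes would be packed into a contiguous array of narrow integers, which is what lets many values share a cache line and be compared with vector instructions.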
Parallelism
•  Multi-threading
•  Avoid synchronized: use CAS (compare-and-swap)
•  Query using a scatter-gather pattern
•  Map/Reduce, e.g. Hazelcast Map/Reduce
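The scatter-gather pattern can be sketched with a thread pool: the same query is sent to every partition in parallel (scatter) and the partial results are combined (gather). The `scatter_gather` function and the in-memory list partitions are illustrative assumptions; in a data grid each partition would live on a different node, and in CPython threads show the shape of the pattern rather than true CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(partitions, query, combine):
    """Run `query` on every partition in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(query, partitions))  # scatter
    return combine(partials)                          # gather


if __name__ == "__main__":
    partitions = [[3, 8, 1], [9, 4], [7, 2, 6]]
    print(scatter_gather(partitions, query=sum, combine=sum))  # 40
    print(scatter_gather(partitions, query=max, combine=max))  # 9
```

Only queries whose partial results can be merged, such as sums, counts, minima and maxima, fit this shape directly.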
Java: Will it make the cut?
Garbage Collection limits heap usage. G1 and Balanced aim for 100ms pauses at 10GB.
[Diagram: “Java Apps Memory Bound”. Labels: available memory 64GB, heap 4GB, GC pause time 4s, unused memory]
Off-Heap Storage
No low-level CPU access
Java is challenged as an infrastructure language despite its newly popular usage for this
CEP/Stream Processing
•  Don’t let data pool up and then process it with “pull queries”.
•  Invert that and process it as it streams in, with “push queries”.
•  Queries execute against “tables” that break the stream up into a current time window.
•  Hold the window and intermediate results in memory.

Results are in real-time
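A push query over a time window can be sketched as follows. Each arriving event updates a running total held in memory, expired events are dropped from the window, and the current result is returned immediately. The class name `SlidingWindowAverage` is illustrative, and the sketch assumes events arrive in non-decreasing timestamp order.

```python
from collections import deque


class SlidingWindowAverage:
    """Continuously maintained average over the last `window_seconds` of events."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs inside the window
        self.total = 0.0       # intermediate result kept in memory

    def on_event(self, timestamp, value):
        """Push one event in and return the up-to-date window average."""
        self.events.append((timestamp, value))
        self.total += value
        cutoff = timestamp - self.window_seconds
        # Drop events that have fallen out of the window (timestamp <= cutoff).
        while self.events and self.events[0][0] <= cutoff:
            _, expired = self.events.popleft()
            self.total -= expired
        return self.total / len(self.events)


if __name__ == "__main__":
    window = SlidingWindowAverage(window_seconds=10)
    for ts, price in [(0, 10), (5, 20), (12, 30), (16, 40)]:
        print(ts, window.on_event(ts, price))
    # prints averages 10.0, 15.0, 25.0, 35.0
```

Each event costs a constant amount of work on average, because the total is adjusted incrementally instead of being recomputed from stored data.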
In-Situ Processing
Rather than moving the data to be processed, you process it in-situ.

Examples:
- HANA Calculation Engine
- Google BigQuery
- Exadata Storage Servers
- Hazelcast EntryProcessor and Distributed Executor Service
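The difference between moving data and in-situ processing can be sketched with a toy partition. Everything here is hypothetical: `Partition`, `execute_on_key` and `add_order` are stand-ins invented for illustration and are not the Hazelcast API. The counter `values_transferred` models how many values would cross the network.

```python
class Partition:
    """Hypothetical stand-in for one node's share of a distributed map."""

    def __init__(self):
        self.data = {}
        self.values_transferred = 0  # values that would cross the network

    def get(self, key):
        self.values_transferred += 1  # value travels to the caller
        return self.data[key]

    def put(self, key, value):
        self.values_transferred += 1  # value travels back to the node
        self.data[key] = value

    def execute_on_key(self, key, processor):
        """Run `processor` next to the data; only its small result is returned."""
        new_value, result = processor(self.data[key])
        self.data[key] = new_value
        return result


def add_order(amount):
    """Build a processor that appends an order and reports the new order count."""
    def processor(orders):
        updated = orders + [amount]
        return updated, len(updated)
    return processor


if __name__ == "__main__":
    partition = Partition()
    partition.data["alice"] = [10, 15]

    # In-situ: the function is shipped to the data.
    print(partition.execute_on_key("alice", add_order(25)))  # 3
    print(partition.values_transferred)                      # 0

    # Conventional: the data is shipped to the function and back.
    orders = partition.get("alice")
    partition.put("alice", orders + [5])
    print(partition.values_transferred)                      # 2
```

The larger the stored value relative to the function and its result, the more the in-situ approach saves.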
Souped-Up Von Neumann Architecture
[Diagram: Von Neumann architecture annotated with: Multi-processor; Multi-core/Compression; 64 bit; Vector/AES etc; DRAM; More Cache, NUMA, Wide/Multi channel, Locality; PCI Flash; SSD (Flash and RAM); Memory Over The Network]
The Data Management Landscape
The new data management world
Data Grid: Terracotta, Coherence, Gemfire …
SAP HANA
Relational | Analytical
•  “Appliance”
•  Aggressive Intel 64 (x86-64) optimisations
•  ACID, SQL and MDX
•  In-memory, SSD and disk
•  Row and column based storage
•  Fast aggregation on column store
•  Single instance 1TB limit
•  Uses compression (est. 5x size reduction)
•  Parallel DB: round-robin, hash, or range partitioning of a table with shared storage
•  Updates as delta inserts
•  Data is fed from source systems in near real-time, real-time or batch
VoltDB
Relational | NewSQL | Operational | Analytical
•  An all in-memory design
•  Full SQL and full ACID
•  Partitioned per core so that one thread owns its partition, which avoids locking and latching
•  Redundancy provided by multiple instances with writes being replicated
•  Claims to be 45x faster
Oracle Exadata
Relational | Operational | Analytical | Appliance
•  Combines Oracle RAC with “Storage Servers”
•  Connected within the box with InfiniBand QDR
•  Storage Servers use PCI Flash (not SSD) for a 22 TB hardware cache
•  In-situ computation on the Storage Servers with “Smart Scan”
•  Uses “Hybrid Columnar Compression”, a compromise of row and column storage
[Image: PCI Flash Card]
Terracotta BigMemory
Key-Value | Operational | Data Grid
•  In-memory
•  Key-value with the Ehcache and soon javax.cache APIs
•  In-process (L1) and server storage (L2)
•  Persistence via log-forward Fast Restart Store: SSD or disk
•  Tiered storage: local on-heap, local off-heap, server on-heap, server off-heap
•  Partitions with consistent hashing
•  Search with parallel in-situ execution
•  Off-heap allows 2TB uncompressed in each app server Java process and on each server partition
•  Compression
•  Speed ranging from 1µs to a few ms.
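Partitioning with consistent hashing can be sketched as a hash ring. This is a generic textbook version, not the implementation used by Terracotta or Hazelcast: the class name `ConsistentHashRing`, the use of SHA-256 and the choice of 100 virtual points per node are all assumptions made for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Generic hash ring: each key is owned by the next node point clockwise."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node, to even out load
        self._ring = []           # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(text):
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def node_for(self, key):
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First point at or after the key's hash, wrapping around at the end.
        index = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if index == len(self._ring):
            index = 0
        return self._ring[index][1]


if __name__ == "__main__":
    ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
    keys = [f"key-{i}" for i in range(1000)]
    before = {k: ring.node_for(k) for k in keys}
    ring.add_node("node-d")
    moved = sum(1 for k in keys if ring.node_for(k) != before[k])
    print(f"{moved} of {len(keys)} keys moved after adding a node")
```

The property that matters for a data grid is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes, so most of the data stays where it is.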
Hazelcast
Key-Value | Operational | Data Grid
•  In-memory
•  Key-value Map API and javax.cache API
•  Near cache and server data storage
•  Tiered storage: local on-heap, local off-heap, server on-heap, server off-heap
•  Partitions with consistent hashing
•  Search with parallel in-situ execution
•  In-situ processing with Entry Processors and Distributed Executors
•  Speed ranging from 1µs to a few ms.
Disk is the new tape
SSD is the new disk
Memory is the new operational store

More Related Content

What's hot

Rigorous and Multi-tenant HBase Performance
Rigorous and Multi-tenant HBase PerformanceRigorous and Multi-tenant HBase Performance
Rigorous and Multi-tenant HBase PerformanceCloudera, Inc.
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path HBaseCon
 
Designing your SaaS Database for Scale with Postgres
Designing your SaaS Database for Scale with PostgresDesigning your SaaS Database for Scale with Postgres
Designing your SaaS Database for Scale with PostgresOzgun Erdogan
 
Interactive Hadoop via Flash and Memory
Interactive Hadoop via Flash and MemoryInteractive Hadoop via Flash and Memory
Interactive Hadoop via Flash and MemoryChris Nauroth
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0J.B. Langston
 
SparkSpark in the Big Data dark by Sergey Levandovskiy
SparkSpark in the Big Data dark by Sergey Levandovskiy  SparkSpark in the Big Data dark by Sergey Levandovskiy
SparkSpark in the Big Data dark by Sergey Levandovskiy Lohika_Odessa_TechTalks
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisMike Pittaro
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisMike Pittaro
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance ImprovementBiju Nair
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodesaaronmorton
 
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...DataStax
 
Apache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerApache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerHBaseCon
 
Intro to big data choco devday - 23-01-2014
Intro to big data   choco devday - 23-01-2014Intro to big data   choco devday - 23-01-2014
Intro to big data choco devday - 23-01-2014Hassan Islamov
 
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBBenchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBAthiq Ahamed
 
HBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBaseHBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBaseHBaseCon
 
August 2013 HUG: Removing the NameNode's memory limitation
August 2013 HUG: Removing the NameNode's memory limitation August 2013 HUG: Removing the NameNode's memory limitation
August 2013 HUG: Removing the NameNode's memory limitation Yahoo Developer Network
 

What's hot (20)

Rigorous and Multi-tenant HBase Performance
Rigorous and Multi-tenant HBase PerformanceRigorous and Multi-tenant HBase Performance
Rigorous and Multi-tenant HBase Performance
 
Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path Off-heaping the Apache HBase Read Path
Off-heaping the Apache HBase Read Path
 
Designing your SaaS Database for Scale with Postgres
Designing your SaaS Database for Scale with PostgresDesigning your SaaS Database for Scale with Postgres
Designing your SaaS Database for Scale with Postgres
 
Interactive Hadoop via Flash and Memory
Interactive Hadoop via Flash and MemoryInteractive Hadoop via Flash and Memory
Interactive Hadoop via Flash and Memory
 
Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0Cassandra Troubleshooting 3.0
Cassandra Troubleshooting 3.0
 
HBase Low Latency
HBase Low LatencyHBase Low Latency
HBase Low Latency
 
Spark in the BigData dark
Spark in the BigData darkSpark in the BigData dark
Spark in the BigData dark
 
SparkSpark in the Big Data dark by Sergey Levandovskiy
SparkSpark in the Big Data dark by Sergey Levandovskiy  SparkSpark in the Big Data dark by Sergey Levandovskiy
SparkSpark in the Big Data dark by Sergey Levandovskiy
 
Hbase: an introduction
Hbase: an introductionHbase: an introduction
Hbase: an introduction
 
NoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBaseNoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBase
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 
High Performance Hardware for Data Analysis
High Performance Hardware for Data AnalysisHigh Performance Hardware for Data Analysis
High Performance Hardware for Data Analysis
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
 
Cassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large NodesCassandra TK 2014 - Large Nodes
Cassandra TK 2014 - Large Nodes
 
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
Lessons Learned From Running 1800 Clusters (Brooke Jensen, Instaclustr) | Cas...
 
Apache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at CernerApache HBase in the Enterprise Data Hub at Cerner
Apache HBase in the Enterprise Data Hub at Cerner
 
Intro to big data choco devday - 23-01-2014
Intro to big data   choco devday - 23-01-2014Intro to big data   choco devday - 23-01-2014
Intro to big data choco devday - 23-01-2014
 
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDBBenchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
Benchmarking Top NoSQL Databases: Apache Cassandra, Apache HBase and MongoDB
 
HBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBaseHBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBase
 
August 2013 HUG: Removing the NameNode's memory limitation
August 2013 HUG: Removing the NameNode's memory limitation August 2013 HUG: Removing the NameNode's memory limitation
August 2013 HUG: Removing the NameNode's memory limitation
 

Similar to In-memory Data Management Trends & Techniques

Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheNicolas Poggi
 
Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration ...Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration ...Alluxio, Inc.
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis PyData
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectureshypertable
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentationEdward Capriolo
 
SUSE Storage: Sizing and Performance (Ceph)
SUSE Storage: Sizing and Performance (Ceph)SUSE Storage: Sizing and Performance (Ceph)
SUSE Storage: Sizing and Performance (Ceph)Lars Marowsky-Brée
 
Responding rapidly when you have 100+ GB data sets in Java
Responding rapidly when you have 100+ GB data sets in JavaResponding rapidly when you have 100+ GB data sets in Java
Responding rapidly when you have 100+ GB data sets in JavaPeter Lawrey
 
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)Lars Marowsky-Brée
 
Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014marvin herrera
 
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Databricks
 
Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsJava one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsSpeedment, Inc.
 
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters Ceph Community
 
Spark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni SchieferSpark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni SchieferSpark Summit
 
505 kobal exadata
505 kobal exadata505 kobal exadata
505 kobal exadataKam Chan
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchJoe Alex
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performancePostgreSQL-Consulting
 
Accelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cacheAccelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cacheDavid Grier
 
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)srisatish ambati
 

Similar to In-memory Data Management Trends & Techniques (20)

Accelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket CacheAccelerating HBase with NVMe and Bucket Cache
Accelerating HBase with NVMe and Bucket Cache
 
Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration ...Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration ...
Optimizing Latency-sensitive queries for Presto at Facebook: A Collaboration ...
 
Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis Mike Pittaro - High Performance Hardware for Data Analysis
Mike Pittaro - High Performance Hardware for Data Analysis
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
 
Kinetic basho public
Kinetic basho publicKinetic basho public
Kinetic basho public
 
Dissecting Scalable Database Architectures
Dissecting Scalable Database ArchitecturesDissecting Scalable Database Architectures
Dissecting Scalable Database Architectures
 
M6d cassandrapresentation
M6d cassandrapresentationM6d cassandrapresentation
M6d cassandrapresentation
 
SUSE Storage: Sizing and Performance (Ceph)
SUSE Storage: Sizing and Performance (Ceph)SUSE Storage: Sizing and Performance (Ceph)
SUSE Storage: Sizing and Performance (Ceph)
 
Responding rapidly when you have 100+ GB data sets in Java
Responding rapidly when you have 100+ GB data sets in JavaResponding rapidly when you have 100+ GB data sets in Java
Responding rapidly when you have 100+ GB data sets in Java
 
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015)
 
Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014Colvin exadata mistakes_ioug_2014
Colvin exadata mistakes_ioug_2014
 
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
Optimizing Performance and Computing Resource Efficiency of In-Memory Big Dat...
 
Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsJava one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
 
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
Ceph Day Amsterdam 2015: Measuring and predicting performance of Ceph clusters
 
Spark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni SchieferSpark Summit EU talk by Berni Schiefer
Spark Summit EU talk by Berni Schiefer
 
505 kobal exadata
505 kobal exadata505 kobal exadata
505 kobal exadata
 
Managing Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using ElasticsearchManaging Security At 1M Events a Second using Elasticsearch
Managing Security At 1M Events a Second using Elasticsearch
 
Linux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performanceLinux tuning to improve PostgreSQL performance
Linux tuning to improve PostgreSQL performance
 
Accelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cacheAccelerating hbase with nvme and bucket cache
Accelerating hbase with nvme and bucket cache
 
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
ApacheCon2010: Cache & Concurrency Considerations in Cassandra (& limits of JVM)
 

More from Hazelcast

Hazelcast 3.6 Roadmap Preview
Hazelcast 3.6 Roadmap PreviewHazelcast 3.6 Roadmap Preview
Hazelcast 3.6 Roadmap PreviewHazelcast
 
Time to Make the Move to In-Memory Data Grids
Time to Make the Move to In-Memory Data GridsTime to Make the Move to In-Memory Data Grids
Time to Make the Move to In-Memory Data GridsHazelcast
 
The Power of the JVM: Applied Polyglot Projects with Java and JavaScript
The Power of the JVM: Applied Polyglot Projects with Java and JavaScriptThe Power of the JVM: Applied Polyglot Projects with Java and JavaScript
The Power of the JVM: Applied Polyglot Projects with Java and JavaScriptHazelcast
 
JCache - It's finally here
JCache -  It's finally hereJCache -  It's finally here
JCache - It's finally hereHazelcast
 
Speed Up Your Existing Relational Databases with Hazelcast and Speedment
Speed Up Your Existing Relational Databases with Hazelcast and SpeedmentSpeed Up Your Existing Relational Databases with Hazelcast and Speedment
Speed Up Your Existing Relational Databases with Hazelcast and SpeedmentHazelcast
 
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorganShared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorganHazelcast
 
Applying Real-time SQL Changes in your Hazelcast Data Grid
Applying Real-time SQL Changes in your Hazelcast Data GridApplying Real-time SQL Changes in your Hazelcast Data Grid
Applying Real-time SQL Changes in your Hazelcast Data GridHazelcast
 
WAN Replication: Hazelcast Enterprise Lightning Talk
WAN Replication: Hazelcast Enterprise Lightning TalkWAN Replication: Hazelcast Enterprise Lightning Talk
WAN Replication: Hazelcast Enterprise Lightning TalkHazelcast
 
JAAS Security Suite: Hazelcast Enterprise Lightning Talk
JAAS Security Suite: Hazelcast Enterprise Lightning TalkJAAS Security Suite: Hazelcast Enterprise Lightning Talk
JAAS Security Suite: Hazelcast Enterprise Lightning TalkHazelcast
 
Hazelcast for Terracotta Users
Hazelcast for Terracotta UsersHazelcast for Terracotta Users
Hazelcast for Terracotta UsersHazelcast
 
Extreme Network Performance with Hazelcast on Torusware
Extreme Network Performance with Hazelcast on ToruswareExtreme Network Performance with Hazelcast on Torusware
Extreme Network Performance with Hazelcast on ToruswareHazelcast
 
Big Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of HadoopBig Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of HadoopHazelcast
 
JAXLondon - Squeezing Performance of IMDGs
JAXLondon - Squeezing Performance of IMDGsJAXLondon - Squeezing Performance of IMDGs
JAXLondon - Squeezing Performance of IMDGsHazelcast
 
OrientDB & Hazelcast: In-Memory Distributed Graph Database
 OrientDB & Hazelcast: In-Memory Distributed Graph Database OrientDB & Hazelcast: In-Memory Distributed Graph Database
OrientDB & Hazelcast: In-Memory Distributed Graph DatabaseHazelcast
 
How to Use HazelcastMQ for Flexible Messaging and More
 How to Use HazelcastMQ for Flexible Messaging and More How to Use HazelcastMQ for Flexible Messaging and More
How to Use HazelcastMQ for Flexible Messaging and MoreHazelcast
 
Devoxx UK 2014 High Performance In-Memory Java with Open Source
Devoxx UK 2014   High Performance In-Memory Java with Open SourceDevoxx UK 2014   High Performance In-Memory Java with Open Source
Devoxx UK 2014 High Performance In-Memory Java with Open SourceHazelcast
 
JSR107 State of the Union JavaOne 2013
JSR107  State of the Union JavaOne 2013JSR107  State of the Union JavaOne 2013
JSR107 State of the Union JavaOne 2013Hazelcast
 
Jfokus - Hazlecast
Jfokus - HazlecastJfokus - Hazlecast
Jfokus - HazlecastHazelcast
 
In-memory No SQL- GIDS2014
In-memory No SQL- GIDS2014In-memory No SQL- GIDS2014
In-memory No SQL- GIDS2014Hazelcast
 
How to Speed up your Database
How to Speed up your DatabaseHow to Speed up your Database
How to Speed up your DatabaseHazelcast
 

More from Hazelcast (20)

Hazelcast 3.6 Roadmap Preview
Hazelcast 3.6 Roadmap PreviewHazelcast 3.6 Roadmap Preview
Hazelcast 3.6 Roadmap Preview
 
Time to Make the Move to In-Memory Data Grids
Time to Make the Move to In-Memory Data GridsTime to Make the Move to In-Memory Data Grids
Time to Make the Move to In-Memory Data Grids
 
The Power of the JVM: Applied Polyglot Projects with Java and JavaScript
The Power of the JVM: Applied Polyglot Projects with Java and JavaScriptThe Power of the JVM: Applied Polyglot Projects with Java and JavaScript
The Power of the JVM: Applied Polyglot Projects with Java and JavaScript
 
JCache - It's finally here
JCache -  It's finally hereJCache -  It's finally here
JCache - It's finally here
 
Speed Up Your Existing Relational Databases with Hazelcast and Speedment
Speed Up Your Existing Relational Databases with Hazelcast and SpeedmentSpeed Up Your Existing Relational Databases with Hazelcast and Speedment
Speed Up Your Existing Relational Databases with Hazelcast and Speedment
 
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorganShared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
Shared Memory Performance: Beyond TCP/IP with Ben Cotton, JPMorgan
 
Applying Real-time SQL Changes in your Hazelcast Data Grid
Applying Real-time SQL Changes in your Hazelcast Data GridApplying Real-time SQL Changes in your Hazelcast Data Grid
Applying Real-time SQL Changes in your Hazelcast Data Grid
 
WAN Replication: Hazelcast Enterprise Lightning Talk
WAN Replication: Hazelcast Enterprise Lightning TalkWAN Replication: Hazelcast Enterprise Lightning Talk
WAN Replication: Hazelcast Enterprise Lightning Talk
 
JAAS Security Suite: Hazelcast Enterprise Lightning Talk
JAAS Security Suite: Hazelcast Enterprise Lightning TalkJAAS Security Suite: Hazelcast Enterprise Lightning Talk
JAAS Security Suite: Hazelcast Enterprise Lightning Talk
 
Hazelcast for Terracotta Users
Hazelcast for Terracotta UsersHazelcast for Terracotta Users
Hazelcast for Terracotta Users
 
Extreme Network Performance with Hazelcast on Torusware
Extreme Network Performance with Hazelcast on ToruswareExtreme Network Performance with Hazelcast on Torusware
Extreme Network Performance with Hazelcast on Torusware
 
Big Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of HadoopBig Data, Simple and Fast: Addressing the Shortcomings of Hadoop
Big Data, Simple and Fast: Addressing the Shortcomings of Hadoop
 
JAXLondon - Squeezing Performance of IMDGs
JAXLondon - Squeezing Performance of IMDGsJAXLondon - Squeezing Performance of IMDGs
JAXLondon - Squeezing Performance of IMDGs
 
OrientDB & Hazelcast: In-Memory Distributed Graph Database
 OrientDB & Hazelcast: In-Memory Distributed Graph Database OrientDB & Hazelcast: In-Memory Distributed Graph Database
OrientDB & Hazelcast: In-Memory Distributed Graph Database
 
How to Use HazelcastMQ for Flexible Messaging and More
 How to Use HazelcastMQ for Flexible Messaging and More How to Use HazelcastMQ for Flexible Messaging and More
How to Use HazelcastMQ for Flexible Messaging and More
 
Devoxx UK 2014 High Performance In-Memory Java with Open Source
Devoxx UK 2014   High Performance In-Memory Java with Open SourceDevoxx UK 2014   High Performance In-Memory Java with Open Source
Devoxx UK 2014 High Performance In-Memory Java with Open Source
 
JSR107 State of the Union JavaOne 2013
JSR107  State of the Union JavaOne 2013JSR107  State of the Union JavaOne 2013
JSR107 State of the Union JavaOne 2013
 
Jfokus - Hazlecast
Jfokus - HazlecastJfokus - Hazlecast
Jfokus - Hazlecast
 
In-memory No SQL- GIDS2014
In-memory No SQL- GIDS2014In-memory No SQL- GIDS2014
In-memory No SQL- GIDS2014
 
How to Speed up your Database
How to Speed up your DatabaseHow to Speed up your Database
How to Speed up your Database
 

In-memory Data Management Trends & Techniques

  • 1. Lightning Talk: In-Memory Data Management Trends & Techniques. Greg Luck, CTO, Hazelcast
  • 7. Commodity 64-bit servers: 4 GB addressable with 32 bits, 18 EB with 64 bits
  • 8. 50 Years of RAM Prices, Historical and Projected [chart]
  • 9. 50 Years of Disk Prices [chart]
  • 11. Cost Comparison: USD/GB 2012
      Medium | USD/GB | vs. disk | Cost of 100 TB
      Disk   | $0.04  | 1x       | $4k
      SSD    | $1     | 25x      | $100k
      DRAM   | $21    | 525x     | $2.1m
  • 12. Max RAM Per Commodity Server [chart: TB by year, 2010 to 2013, axis 0 to 9 TB]
  • 13. Latency across the network [chart: latency in µs, axis 0 to 70]
  • 14. Access Times & Sizes
      Level                    | RR Latency | Typical Size  | Technology                | Managed By
      Registers                | <1 ns      | 1 KB          | Custom CMOS               | Compiler
      L1 Cache                 | 1 ns       | 8 to 128 KB   | SRAM                      | Hardware
      L2 Cache                 | 3 ns       | 0.5 to 8 MB   | SRAM                      | Hardware
      L3 Cache (oc)            | 10-15 ns   | 4 to 30 MB    | SRAM                      | Hardware
      Main Memory              | 60 ns      | 16 GB to TB   | DRAM                      | OS/App
      SSD                      | 50-100 µs  | 400 GB to 6 TB| Flash Memory              | OS/App
      Main Memory over Network | 2-100 µs   | Unbounded     | DRAM/Ethernet/InfiniBand  | OS/App
      Disk                     | 4-7 ms     | Multiple TBs  | Magnetic Rotational Disk  | OS/App
      Disk over Network        | 6-10 ms    | Unbounded     | Disk/Ethernet/InfiniBand  | OS/App
  • 15. Access Times & Sizes (same table as slide 14)
      • Cache is up to 30 times faster than memory.
      • Memory is 10^6 times faster than disk.
      • Network memory is 10^3 times faster than disk.
      • SSD is 10^2 times faster than disk.
  • 17. Exploit Data Locality
      Data is more likely to be read if:
      • It was recently read (temporal locality)
      • It is adjacent to other data (e.g. arrays, fields in an object)
      • It is part of a pattern (e.g. looping, relations)
      • It is naturally accessed more frequently than other data (e.g. a Pareto distribution)
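Temporal locality is what a least-recently-used cache relies on. The following is a minimal sketch, not taken from the talk: the class name `LruCache` is hypothetical, and it leans on the access-order mode of the JDK's `LinkedHashMap`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** A tiny LRU cache: recently read entries stay, the least recently read one is evicted. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true makes get() move an entry to the "most recent" end.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every put; returning true drops the least recently used entry.
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");      // "a" is now the most recently read entry
        cache.put("c", 3);   // evicts "b", the least recently read
        System.out.println(cache.keySet()); // [a, c]
    }
}
```

If reads follow a Pareto-like distribution, a small capacity can still serve most of them.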
  • 18. Working with the CPU's Cache Hierarchy
      • Memory is up to 30x slower than cache
      • Alleviated somewhat by NUMA, wide channel, multi-channel/large cache
      • Vector instructions
      • Work with cache lines
      • Work with memory pages (TLBs)
      • Work with prefetching
      • Exploit NUMA with CPU affinity: numactl --physcpubind=0 --localalloc java …
      • Exploit natural data locality
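Working with cache lines and prefetching mostly comes down to touching memory in the order it is laid out. As an illustrative sketch (the class and method names are hypothetical, and timings depend heavily on the machine and JVM), the two walks below compute the same sum but differ in access order:

```java
final class TraversalOrder {
    /** Row-major walk: consecutive reads touch adjacent elements of the same row array. */
    static long sumRowMajor(int[][] grid) {
        long sum = 0;
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                sum += grid[r][c];
            }
        }
        return sum;
    }

    /** Column-major walk: each read jumps to a different row array. Assumes a rectangular grid. */
    static long sumColumnMajor(int[][] grid) {
        long sum = 0;
        int cols = grid.length == 0 ? 0 : grid[0].length;
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < grid.length; r++) {
                sum += grid[r][c];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 2048;
        int[][] grid = new int[n][n];
        for (int[] row : grid) java.util.Arrays.fill(row, 1);
        long t0 = System.nanoTime();
        long a = sumRowMajor(grid);
        long t1 = System.nanoTime();
        long b = sumColumnMajor(grid);
        long t2 = System.nanoTime();
        // Rough timing only; a proper comparison would use a benchmark harness with warm-up.
        System.out.printf("row-major %d in %d us, column-major %d in %d us%n",
                a, (t1 - t0) / 1000, b, (t2 - t1) / 1000);
    }
}
```

On typical hardware the row-major walk tends to be faster for large grids, though a single unwarmed run is only indicative.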
  • 19. Data Locality Effects, intra machine [chart: Linear, Random - Page and Random - Heap access on Intel U4100, i7-860 and i7-2760QM; axis 0 to 160]
  • 20. Tiered Storage
      Tier                                          | Speed (TPS) | Size (GB)
      Local storage: heap store                     | 5,000,000+  | 10
      Local storage: off-heap store                 | 1,000,000   | 1,000+
      Local disk: SSD and rotational (restartable)  | 100,000     | 2,000+
      Network storage: network-accessible memory    | 10,000s     | 100,000+
  • 21. Data Locality Effects, inter machine
      Compared with a hybrid in-process and distributed cache:
      Latency = L1 speed * proportion + L2 speed * proportion
      L1 = 0 ms (5 µs) for on-heap and 50-100 µs for off-heap
      L2 = 1 ms
      80% L1 Pareto model: latency = 0 * .8 + 1 * .2 = .2 ms
      90% L1 Pareto model: latency = 0 * .9 + 1 * .1 = .1 ms
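The weighted-latency formula on this slide can be written as a small function. This is a sketch under the slide's own assumptions (two tiers, a fixed hit ratio); the class and parameter names are hypothetical.

```java
final class HybridLatency {
    /** Expected latency in ms, given the fraction of reads served by the L1 (in-process) tier. */
    static double expectedLatencyMs(double l1HitRatio, double l1Ms, double l2Ms) {
        if (l1HitRatio < 0.0 || l1HitRatio > 1.0) {
            throw new IllegalArgumentException("hit ratio must be between 0 and 1");
        }
        return l1HitRatio * l1Ms + (1.0 - l1HitRatio) * l2Ms;
    }

    public static void main(String[] args) {
        // The slide rounds on-heap L1 latency to 0 ms and uses 1 ms for L2.
        System.out.println(expectedLatencyMs(0.8, 0.0, 1.0)); // roughly 0.2 ms
        System.out.println(expectedLatencyMs(0.9, 0.0, 1.0)); // roughly 0.1 ms
        // Using the 5 µs on-heap figure instead of 0 barely changes the result.
        System.out.println(expectedLatencyMs(0.8, 0.005, 1.0));
    }
}
```

The L2 term dominates, so raising the L1 hit ratio matters far more than shaving the L1 latency.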
  • 22. Columnar Storage
      • Manipulate data locality
      • Sorted dictionary compression for finite values
      • Allows values to be held in cache for SSE instructions
      • Better cache line effectiveness
      • Fewer CPU cache misses for aggregate calculations
      • Cross-over point is around a few dozen columns
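Sorted dictionary compression can be illustrated in a few lines. The sketch below is not any particular product's implementation; `DictionaryColumn` is a hypothetical name, and it assumes non-null string values with a small number of distinct entries.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** One column stored as a sorted dictionary plus a compact int code per row. */
final class DictionaryColumn {
    final String[] dictionary; // sorted distinct values
    final int[] codes;         // one dictionary index per row

    DictionaryColumn(String[] values) {
        dictionary = new TreeSet<>(Arrays.asList(values)).toArray(new String[0]);
        codes = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            codes[i] = Arrays.binarySearch(dictionary, values[i]);
        }
    }

    /** Aggregate by scanning the contiguous int array rather than comparing strings. */
    int count(String value) {
        int code = Arrays.binarySearch(dictionary, value);
        if (code < 0) {
            return 0; // value never occurs in this column
        }
        int n = 0;
        for (int c : codes) {
            if (c == code) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        DictionaryColumn country =
                new DictionaryColumn(new String[] {"NZ", "AU", "NZ", "US", "AU", "NZ"});
        System.out.println(Arrays.toString(country.dictionary)); // [AU, NZ, US]
        System.out.println(country.count("NZ"));                 // 3
    }
}
```

The scan in `count` touches only a dense `int[]`, which is the cache-line benefit the slide refers to.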
  • 23. Parallelism
      • Multi-threading
      • Avoid synchronized: use CAS
      • Query using a scatter-gather pattern
      • Map/Reduce, e.g. Hazelcast Map/Reduce
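To make "avoid synchronized: use CAS" concrete, here is a minimal sketch of a lock-free running maximum built on `AtomicLong.compareAndSet`. The class name `CasMax` is hypothetical; the parallel stream in `main` is only there to generate contention.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.LongStream;

/** Tracks the largest value seen so far without taking a lock. */
final class CasMax {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    void offer(long candidate) {
        long current = max.get();
        // Retry only while the candidate would still raise the maximum and another
        // thread changed the value between our read and our compare-and-set.
        while (candidate > current && !max.compareAndSet(current, candidate)) {
            current = max.get();
        }
    }

    long get() {
        return max.get();
    }

    public static void main(String[] args) {
        CasMax m = new CasMax();
        LongStream.rangeClosed(1, 100_000).parallel().forEach(m::offer);
        System.out.println(m.get()); // 100000
    }
}
```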
  • 24. Java: Will it make the cut?
      • Garbage collection limits heap usage. G1 and Balanced aim for 100 ms at 10 GB.
      • No low-level CPU access
      • Java is challenged as an infrastructure language despite its newly popular usage for this
      [diagram: 64 GB available memory, of which a 4 GB heap with a 4 s GC pause time; Java apps are memory bound; the unused memory can hold GC-free off-heap storage]
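Off-heap storage, as mentioned on this slide, keeps data out of the garbage collector's reach. One way to sketch the idea with only the JDK is a direct `ByteBuffer`; the class name `OffHeapLongStore` is hypothetical, and products such as BigMemory use their own, more elaborate stores.

```java
import java.nio.ByteBuffer;

/** Fixed-size array of longs held in a direct buffer, outside the garbage-collected heap. */
final class OffHeapLongStore {
    private final ByteBuffer buffer;

    OffHeapLongStore(int slots) {
        // allocateDirect asks for memory outside the Java heap; contents start as zero.
        buffer = ByteBuffer.allocateDirect(Math.multiplyExact(slots, Long.BYTES));
    }

    void put(int slot, long value) {
        buffer.putLong(slot * Long.BYTES, value); // absolute write, no position change
    }

    long get(int slot) {
        return buffer.getLong(slot * Long.BYTES);
    }

    boolean isOffHeap() {
        return buffer.isDirect();
    }

    public static void main(String[] args) {
        OffHeapLongStore store = new OffHeapLongStore(1024);
        store.put(3, 42L);
        System.out.println(store.get(3));      // 42
        System.out.println(store.isOffHeap()); // true
    }
}
```

The garbage collector sees only the small `ByteBuffer` object, not the bytes behind it, so the stored data adds nothing to pause times.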
  • 25. CEP/Stream Processing
      • Don't let data pool up and then process it with "pull queries".
      • Invert that and process it as it streams in, with "push queries".
      • Queries execute against "tables" that break the stream up into a current time window.
      • Hold the window and intermediate results in memory.
      • Results are in real time.
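A "push query" over a time window can be sketched as an object that updates its result on every incoming event. The example below is illustrative only: `SlidingWindowAverage` is a hypothetical name, and it assumes events arrive with non-decreasing timestamps.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Maintains the average of the values seen in the last windowMillis milliseconds. */
final class SlidingWindowAverage {
    private final long windowMillis;
    private final Deque<long[]> events = new ArrayDeque<>(); // each entry is {timestamp, value}
    private long sum; // intermediate result held in memory

    SlidingWindowAverage(long windowMillis) {
        if (windowMillis <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.windowMillis = windowMillis;
    }

    /** Push one event in and get the refreshed average for the current window back. */
    double onEvent(long timestampMillis, long value) {
        events.addLast(new long[] {timestampMillis, value});
        sum += value;
        // Evict events that have fallen out of the window ending at this timestamp.
        // The event just added is always inside the window, so the deque never empties.
        while (events.peekFirst()[0] <= timestampMillis - windowMillis) {
            sum -= events.removeFirst()[1];
        }
        return (double) sum / events.size();
    }

    public static void main(String[] args) {
        SlidingWindowAverage avg = new SlidingWindowAverage(1000);
        System.out.println(avg.onEvent(0, 10));    // 10.0
        System.out.println(avg.onEvent(500, 20));  // 15.0
        System.out.println(avg.onEvent(1200, 30)); // 25.0, the event at t=0 has expired
    }
}
```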
  • 26. In-Situ Processing
      Rather than moving the data to where it is processed, you process it in situ. Examples:
      • HANA Calculation Engine
      • Google BigQuery
      • Exadata Storage Servers
      • Hazelcast EntryProcessor and Distributed Executor Service
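The idea of sending the function to the data can be shown with the JDK alone. This sketch is an analogy, not the Hazelcast EntryProcessor API: `InSituStore` and `process` are hypothetical names, and `ConcurrentHashMap.compute` stands in for the node that owns the entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongUnaryOperator;

/** Applies a caller-supplied function where the entry lives instead of get-modify-put. */
final class InSituStore {
    private final Map<String, Long> values = new ConcurrentHashMap<>();

    void put(String key, long value) {
        values.put(key, value);
    }

    /**
     * Runs the processor atomically against the stored value and returns only the result.
     * A missing key is treated as 0 (an assumption made for this sketch).
     */
    long process(String key, LongUnaryOperator processor) {
        return values.compute(key, (k, v) -> processor.applyAsLong(v == null ? 0L : v));
    }

    public static void main(String[] args) {
        InSituStore store = new InSituStore();
        store.put("account-1", 100L);
        System.out.println(store.process("account-1", v -> v + 50)); // 150
    }
}
```

In a distributed grid the same shape saves two network transfers of the value, since only the function and its result cross the wire.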
  • 27. Souped-Up Von Neumann Architecture [diagram labels: memory over the network; SSD (flash and RAM); PCI flash; multi-processor, multi-core; compression; 64-bit; DRAM; more cache, NUMA, wide/multi-channel, locality; vector/AES etc.]
  • 28. The Data Management Landscape
  • 29. The new data management world. Data Grid: Terracotta, Coherence, Gemfire, …
  • 30. SAP HANA: Relational | Analytical
      • "Appliance"
      • Aggressive IA64 optimisations
      • ACID, SQL and MDX
      • In-memory, SSD and disk
      • Row- and column-based storage
      • Fast aggregation on column store
      • Single instance 1 TB limit
      • Uses compression (est. 5x size)
      • Parallel DB: round-robin, hash, or range partitioning of a table with shared storage
      • Updates as delta inserts
      • Data is fed from source systems near real-time, real-time or batch
  • 31. VoltDB: Relational | New SQL | Operational | Analytical
      • An all in-memory design
      • Full SQL and full ACID
      • Partitioned per core so that one thread owns its partition, which avoids locking and latching
      • Redundancy provided by multiple instances with writes being replicated
      • Claims to be 45x faster
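The "one thread owns its partition" design can be sketched with one single-threaded executor per partition. This is an illustration of the general technique, not VoltDB code; `PartitionedCounters` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Each partition is a plain HashMap that is only ever touched by its own thread. */
final class PartitionedCounters implements AutoCloseable {
    private final List<ExecutorService> owners = new ArrayList<>();
    private final List<Map<String, Long>> partitions = new ArrayList<>();

    PartitionedCounters(int partitionCount) {
        for (int i = 0; i < partitionCount; i++) {
            owners.add(Executors.newSingleThreadExecutor());
            partitions.add(new HashMap<>());
        }
    }

    /** Routes the update to the owning thread, so no lock or latch is needed on the map. */
    CompletableFuture<Long> increment(String key) {
        int p = Math.floorMod(key.hashCode(), owners.size());
        Map<String, Long> partition = partitions.get(p);
        return CompletableFuture.supplyAsync(() -> partition.merge(key, 1L, Long::sum), owners.get(p));
    }

    @Override
    public void close() {
        owners.forEach(ExecutorService::shutdown);
    }

    public static void main(String[] args) {
        try (PartitionedCounters counters = new PartitionedCounters(4)) {
            counters.increment("orders").join();
            System.out.println(counters.increment("orders").join()); // 2
        }
    }
}
```

Work for different partitions runs in parallel, while work for the same partition is serialised by its owner thread.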
  • 32. Oracle Exadata: Relational | Operational | Analytical | Appliance
      • Combines Oracle RAC with "Storage Servers"
      • Connected within the box with InfiniBand QDR
      • Storage Servers use PCI flash (not SSD) for a 22 TB hardware cache
      • In-situ computation on the Storage Servers with "Smart Scan"
      • Uses "Hybrid Columnar Compression", a compromise between row and column storage
      [image: PCI flash card]
  • 33. Terracotta BigMemory: Key-Value | Operational | Data Grid
      • In-memory
      • Key-value with the Ehcache and soon javax.cache APIs
      • In-process (L1) and server storage (L2)
      • Persistence via log-forward Fast Restart Store: SSD or disk
      • Tiered storage: local on-heap, local off-heap, server on-heap, server off-heap
      • Partitions with consistent hashing
      • Search with parallel in-situ execution
      • Off-heap allows 2 TB uncompressed in each app server Java process and on each server partition
      • Compression
      • Speed ranging from 1 µs to a few ms
  • 34. Hazelcast: Key-Value | Operational | Data Grid
      • In-memory
      • Key-value Map API and javax.cache API
      • Near cache and server data storage
      • Tiered storage: local on-heap, local off-heap, server on-heap, server off-heap
      • Partitions with consistent hashing
      • Search with parallel in-situ execution
      • In-situ processing with Entry Processors and Distributed Executors
      • Speed ranging from 1 µs to a few ms
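Both data grids above partition data with consistent hashing. The following is a generic textbook-style hash ring, offered as a sketch only: it is not how Terracotta or Hazelcast implement partitioning, and `HashRing`, the CRC32 hash and the virtual-node count are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Maps keys to nodes so that removing a node only moves the keys that node owned. */
final class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32(); // simple stand-in; a real ring would use a stronger hash
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(node + "#" + i), node); // only removes points this node holds
        }
    }

    /** The owner is the first ring point at or after the key's hash, wrapping around. */
    String ownerOf(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing(64);
        ring.addNode("node-1");
        ring.addNode("node-2");
        ring.addNode("node-3");
        System.out.println("customer:42 lives on " + ring.ownerOf("customer:42"));
    }
}
```

Because a key's owner depends only on the nearest ring point, taking a node out leaves every key owned by the remaining nodes where it was.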
  • 35. Disk is the new tape. SSD is the new disk. Memory is the new operational store.