Infinispan
Distributed in-memory key/value data
grid and cache
@infinispan
Agenda
• Introduction
• Part 1
• Hash Tables
• Distributed Hash Tables
• Consistent Hashing
• Chord Lookup Protocol
• Part 2
• Data Grids
• Infinispan
• Architecture
• Consistent Hashing / Split Clusters
• Other features
Part I – A (very) short introduction to
distributed hash tables
Hash Tables
Source: Wikipedia
http://commons.wikimedia.org/wiki/File:Hash_table_5_0_1_1_1_1_1_LL.svg#/media/File:Hash_table_5_0_1_1_1_1_1_LL.svg
Distributed Hash Tables (DHT)
Source: Wikipedia - http://commons.wikimedia.org/wiki/File:DHT_en.svg#/media/File:DHT_en.svg
• Decentralized Hash Table functionality
• Interface
• put(K,V)
• get(K) -> V
• Nodes can fail, join and leave
• The system has to scale
Simple solutions
• Flooding in N nodes
• put() – store in any node O(1)
• get() – send query to all nodes O(N)
• Full replication in N nodes
• put() – store in all nodes O(N)
• get() – check any node O(1)
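The trade-off between the two simple solutions can be sketched as a toy model that counts how many nodes each operation touches. This is only an illustration: the class name NaiveDht and its methods are hypothetical, and each "node" is just an in-process map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy comparison of the two naive strategies; counts node contacts per operation. */
class NaiveDht {
    final List<Map<String, String>> nodes = new ArrayList<>();
    int contacts = 0; // number of nodes touched by the last operation

    NaiveDht(int n) {
        for (int i = 0; i < n; i++) nodes.add(new HashMap<>());
    }

    // Flooding: write to a single node (any would do; node 0 is used here).
    void floodPut(String k, String v) {
        contacts = 1;
        nodes.get(0).put(k, v);
    }

    // Flooding: the reader does not know where the key lives, so it asks every node.
    String floodGet(String k) {
        contacts = 0;
        String found = null;
        for (Map<String, String> node : nodes) {
            contacts++;
            if (node.containsKey(k)) found = node.get(k);
        }
        return found;
    }

    // Full replication: write to every node.
    void replicatedPut(String k, String v) {
        contacts = 0;
        for (Map<String, String> node : nodes) {
            contacts++;
            node.put(k, v);
        }
    }

    // Full replication: any single node can answer (the last one is used here).
    String replicatedGet(String k) {
        contacts = 1;
        return nodes.get(nodes.size() - 1).get(k);
    }
}
```

With 5 nodes, floodGet touches 5 nodes and replicatedPut touches 5 nodes, while floodPut and replicatedGet each touch one.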
Fixed Hashing
NodeID = hash(key) % TotalNodes.
Fixed Hashing with High Availability
NodeID = hash(key) % TotalNodes.
Fixed Hashing and Scalability
NodeID = hash(key) % (TotalNodes + 1).
Key Space = {0,1,2,3,4,5}
2 nodes: NodeID = hash(key) % 2
  N0 (key mod 2 = 0): 0, 2, 4
  N1 (key mod 2 = 1): 1, 3, 5
3 nodes: NodeID = hash(key) % 3
  N0 (key mod 3 = 0): 0, 3
  N1 (key mod 3 = 1): 1, 4
  N2 (key mod 3 = 2): 2, 5
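To make the scalability problem concrete, the following sketch lists which keys change owner when the cluster grows from 2 to 3 nodes under modulo placement. It assumes an identity hash over the key space {0..5}, as in the example above; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Shows which keys change owner under modulo placement when the node count changes. */
class FixedHashingDemo {
    // Identity hash: the key itself is used as the hash value.
    static int nodeFor(int key, int totalNodes) {
        return Math.floorMod(key, totalNodes);
    }

    // Keys in [0, keySpace) whose owner differs between the two cluster sizes.
    static List<Integer> movedKeys(int keySpace, int oldNodes, int newNodes) {
        List<Integer> moved = new ArrayList<>();
        for (int key = 0; key < keySpace; key++) {
            if (nodeFor(key, oldNodes) != nodeFor(key, newNodes)) {
                moved.add(key);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        // Growing from 2 to 3 nodes over keys 0..5
        System.out.println(movedKeys(6, 2, 3)); // prints [2, 3, 4, 5]
    }
}
```

Four of the six keys have to move for a single added node, which is the cost that consistent hashing is meant to reduce.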
Consistent Hashing
Consistent Hashing – The Hash Ring
[Figure: a hash ring starting at position 0, with nodes N0, N1, N2 and keys K1 to K6 placed around it]
Consistent Hashing – Nodes Joining, Leaving
Source: http://www.griddynamics.com/distributed-algorithms-in-nosql-databases/
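A hash ring can be sketched with a sorted map: each key belongs to the first node at or after its position, wrapping around at the end of the ring. This is a minimal illustration without virtual nodes or replication; the class name HashRing and the ring positions used in main are assumptions chosen for the example.

```java
import java.util.Map;
import java.util.TreeMap;

/** Minimal hash ring: a key is owned by the first node at or after its position. */
class HashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(int position, String name) {
        ring.put(position, name);
    }

    void removeNode(int position) {
        ring.remove(position);
    }

    String ownerOf(int keyPosition) {
        if (ring.isEmpty()) throw new IllegalStateException("no nodes on the ring");
        Map.Entry<Integer, String> e = ring.ceilingEntry(keyPosition);
        // Past the last node: wrap around to the first node on the ring.
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        HashRing r = new HashRing();
        r.addNode(10, "N0");
        r.addNode(40, "N1");
        r.addNode(80, "N2");
        System.out.println(r.ownerOf(25)); // N1
        System.out.println(r.ownerOf(90)); // N0 (wraps around)
        r.addNode(30, "N3");               // a node joins between N0 and N1
        System.out.println(r.ownerOf(25)); // N3: only keys in (10, 30] move
        System.out.println(r.ownerOf(35)); // N1: unaffected
    }
}
```

When a node joins or leaves, only the keys between it and its predecessor change owner; all other keys stay where they are.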
Chord: Peer-to-peer Lookup Protocol
• Load Balance – distributed hash function, spreading
keys evenly over nodes
• Decentralization – fully distributed, no single point of failure (SPOF)
• Scalability – logarithmic growth of lookup cost with
the number of nodes, large systems are feasible
• Availability – automatically adjusts its internal tables
to ensure the node responsible for a key is always
found
• Flexible naming – key-space is flat (flexibility in how
to map names to keys)
Chord – Lookup O(N)
Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan
Chord – Lookup O(logN)
Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan
• K = 6, so identifiers range over $[0, 2^6 - 1]$
• Finger[i] = first node that succeeds $(N + 2^{i-1}) \bmod 2^K$, where $1 \le i \le K$
• Successor/Predecessor – the next/previous node on circle
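The finger-table rule above can be worked through in code. The sketch below computes Finger[1..K] for one node on a 2^K identifier circle; the set of node identifiers in main is an illustrative assumption, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Computes Chord finger-table entries for one node on a 2^K identifier circle. */
class ChordFingers {
    // First node whose id is >= target, wrapping around to the smallest id.
    static int successor(TreeSet<Integer> nodes, int target) {
        Integer s = nodes.ceiling(target);
        return s != null ? s : nodes.first();
    }

    // finger[i] = successor((n + 2^(i-1)) mod 2^K) for 1 <= i <= K; index 0 is unused.
    static int[] fingers(TreeSet<Integer> nodes, int n, int k) {
        int size = 1 << k;
        int[] finger = new int[k + 1];
        for (int i = 1; i <= k; i++) {
            int start = (n + (1 << (i - 1))) % size;
            finger[i] = successor(nodes, start);
        }
        return finger;
    }

    public static void main(String[] args) {
        // Assumed example ring with K = 6 (identifiers 0..63).
        TreeSet<Integer> nodes =
            new TreeSet<>(Arrays.asList(1, 8, 14, 21, 32, 38, 42, 48, 51, 56));
        System.out.println(Arrays.toString(fingers(nodes, 8, 6)));
        // prints [0, 14, 14, 14, 21, 32, 42]
    }
}
```

Because each successive finger covers twice the distance of the previous one, a lookup can at least halve the remaining distance to the target at every hop, which is where the O(log N) bound comes from.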
Chord – Node Join
Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan
• Node 26 joins the system between nodes 21 and 32.
• (a) Initial state: node 21 points to node 32;
• (b) node 26 finds its successor (i.e., node 32) and points to it;
• (c) node 26 copies all keys less than 26 from node 32;
• (d) the stabilize procedure updates the successor of node 21
to node 26.
The world of DHTs …
• CAN (Hypercube), Chord (Ring), Pastry (Tree+Ring),
Tapestry (Tree+Ring), Viceroy, Kademlia, Skipnet,
Symphony (Ring), Koorde, Apocrypha, Land,
Bamboo, ORDI …
Part II – A short introduction to
Infinispan
Where do we store data?
One size does not fit all...
Infinispan – History
• 2002 – JBoss App Server needed a clustered solution for
HTTP and EJB session state replication in HA clusters.
JGroups (an open source group communication suite) had a
replicated map demo, which was expanded into a tree data
structure with eviction and JTA transactions added.
• 2003 – this was moved into the JBoss AS code base
• 2005 – JBoss Cache was extracted and became a standalone
project
… JBoss Cache evolved into Infinispan, with core parts redesigned
• 2009 – JBoss Cache 3.2 and Infinispan 4.0.0.ALPHA1 were
released
• 2015 – Infinispan 7.2.0.Alpha1
• Check the Infinispan RoadMap for more details
Code?
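<dependency>
    <groupId>org.infinispan</groupId>
    <artifactId>infinispan-embedded</artifactId>
    <version>7.1.0.Final</version>
</dependency>

import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;
import org.infinispan.manager.EmbeddedCacheManager;

// Start a cache manager with default settings and use its default cache.
EmbeddedCacheManager cacheManager = new DefaultCacheManager();
Cache<String, String> cache = cacheManager.getCache();
cache.put("Hello", "World!");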
Usage Modes
• Embedded / library mode
• clustering for apps and frameworks (e.g. JBoss
session replication)
• Local mode single cache
• JSR 107: JCACHE - Java Temporary Caching API
• Transactional local cache
• Eviction, expiration, write through, write
behind, preloading, notifications, statistics
• Cluster of caches
• Invalidation, Hibernate 2nd level cache
• Server mode – remote data store
• REST, MemCached, HotRod, WebSocket (*)
Code?
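import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.eviction.EvictionStrategy;

// Distributed synchronous cache: 100 hash segments, 3 owners per entry,
// L1 (near-cache) entries live for 25 seconds.
Configuration distConfig = new ConfigurationBuilder()
    .clustering()
        .cacheMode(CacheMode.DIST_SYNC)
        .sync()
        .l1().lifespan(25000L)
        .hash().numSegments(100).numOwners(3)
    .build();

// Keep at most 20,000 entries (LRU eviction); entries idle for 2 minutes
// expire, and the expiration thread wakes up every 5 seconds.
Configuration evictionConfig = new ConfigurationBuilder()
    .eviction()
        .maxEntries(20000).strategy(EvictionStrategy.LRU)
    .expiration()
        .wakeUpInterval(5000L)
        .maxIdle(120000L)
    .build();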
Infinispan – Core Architecture
[Figure: remote applications (Remote App 1 in C++, Remote App 2 in Java, Remote App 3 in .NET) connect over the network (TCP) to the cluster nodes. Each node (JVM) exposes MemCached, HotRod, REST and WebSocket (*) endpoints, can host an embedded Java application, and contains the layers Transport (JGroups), Notification, Transactions / XA, Query, Map / Reduce, Monitoring and Storage Engine (RAM + Overflow). Nodes talk to each other over TCP/UDP.]
Infinispan Clustering and Consistent Hashing
• JGroups Views
• Each node has a unique address
• View changes when nodes join, leave
• Keys are hashed using the MurmurHash3
algorithm
• The hash space is divided into segments
• Key → Segment → Owners
• Primary and Backup Owners
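The key → segment → owners chain can be sketched as follows. This is not Infinispan's actual consistent-hash implementation: String.hashCode() stands in for MurmurHash3, the round-robin owner assignment is a hypothetical placeholder, and the class name SegmentOwnership is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative key -> segment -> owners mapping (not Infinispan's real code). */
class SegmentOwnership {
    final int numSegments;
    final int numOwners;
    final List<String> members; // node addresses from the current cluster view

    SegmentOwnership(int numSegments, int numOwners, List<String> members) {
        this.numSegments = numSegments;
        // There cannot be more owners than cluster members.
        this.numOwners = Math.min(numOwners, members.size());
        this.members = members;
    }

    // Stand-in hash: String.hashCode() instead of MurmurHash3.
    int segmentOf(String key) {
        return Math.floorMod(key.hashCode(), numSegments);
    }

    // Hypothetical assignment: consecutive members own a segment;
    // the first element is the primary owner, the rest are backups.
    List<String> ownersOf(int segment) {
        List<String> owners = new ArrayList<>();
        for (int i = 0; i < numOwners; i++) {
            owners.add(members.get((segment + i) % members.size()));
        }
        return owners;
    }
}
```

For example, with members A, B, C and numOwners = 2, segment 4 is owned by B (primary) and C (backup) under this placeholder assignment. Because ownership is tracked per segment rather than per key, a view change only requires recomputing and transferring whole segments.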
Does it scale?
• 320 nodes, 3000 caches, 20 TB RAM
• Largest cluster formed: 1000 nodes
Empty Cluster
[Figure: a cluster of nodes holding no entries]
Add 1 Entry
[Figure: K1 stored on one node]
Primary and Backup
[Figure: K1 stored on two nodes]
Add another one
[Figure: K2 added on one node]
Primary And Backup
[Figure: K1 and K2 each stored on two nodes]
A cluster with more keys
[Figure: K1 to K5, each stored on two nodes]
A node dies…
[Figure: one node of the cluster fails]
The cluster heals
[Figure: K1 to K5 are again each stored on two nodes]
If multiple nodes fail…
• CAP Theorem to the rescue:
• Formulated by Eric Brewer in 1998
• C - Consistency
• A - High Availability
• P - Tolerance to Network Partitions
• Can only satisfy 2 at the same time:
• Consistency + Availability: The Ideal World where
network partitions do not exist
• Partitioning + Availability: Data might be different
between partitions
• Partitioning + Consistency: Do not corrupt data!
Infinispan Partition Handling Strategies
• In the presence of network partitions
• Prefer availability (partition handling DISABLED)
• Prefer consistency (partition handling ENABLED)
• Split Detection with partition handling ENABLED:
• Ensure stable topology
• LOST ≥ numOwners OR no simple majority
• Check segment ownership
• Mark partition as Available / Degraded
• Send PartitionStatusChangedEvent to listeners
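The split-detection check can be sketched as a small decision function. This is an assumption-laden illustration, not Infinispan code: it treats the loss of numOwners or more nodes as possibly leaving a segment without any owner, and defines a simple majority as floor(n/2) + 1 of the last stable membership. Whether the comparison is strict should be checked against the documentation of the Infinispan version in use. The class, enum and parameter names are hypothetical.

```java
/** Sketch of the availability decision a partition makes after a split is detected. */
class PartitionCheck {
    enum Mode { AVAILABLE, DEGRADED }

    static Mode decide(int stableMembers, int remainingMembers, int numOwners) {
        int lost = stableMembers - remainingMembers;
        // Assumption: losing numOwners or more nodes may leave a segment with no owner.
        boolean mayHaveLostSegment = lost >= numOwners;
        // Simple majority of the last stable topology: floor(n / 2) + 1.
        boolean hasMajority = remainingMembers >= stableMembers / 2 + 1;
        return (mayHaveLostSegment || !hasMajority) ? Mode.DEGRADED : Mode.AVAILABLE;
    }
}
```

With 5 stable members and numOwners = 2, a partition that still sees 4 members stays available, while one that sees only 3 becomes degraded because two nodes are missing.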
Cluster Partitioning – No data lost
[Figure: the cluster holding K1 to K5, two copies each, split into Partition1 and Partition2]
Cluster Partitioning – Lost data
[Figure: the cluster holding K1 to K5, two copies each, split into Partition1 and Partition2]
Merging Split Clusters
• Split Clusters see each other again
• Step 1: Ensure stable topology
• Step 2: Automatic: based on partition state
• 1 Available -> attempt merge
• All Degraded -> attempt merge
• Step 3: Manual
• Data was lost
• Custom listener on Merge
• Application decides
Querying Infinispan
• Apache Lucene Index
• Native Query API (Query DSL)
• Hibernate Search and Apache Lucene to index and
search
• Native Map/Reduce
• Index-less
• Distributed Execution Framework
• Hadoop Integration (WIP)
• Run existing map/reduce jobs on Infinispan data
Map Reduce:
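// wordCache is a Cache<String, String>; WordCountMapper and WordCountReducer
// are application classes implementing Mapper and Reducer.
MapReduceTask<String, String, String, Integer> mapReduceTask =
    new MapReduceTask<>(wordCache);
mapReduceTask
    .mappedWith(new WordCountMapper())
    .reducedWith(new WordCountReducer());
Map<String, Integer> wordCountMap = mapReduceTask.execute();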
Query DSL:
// Parse a Lucene query string and run it against the indexed cache.
QueryParser qp = new QueryParser("default", new StandardAnalyzer());
Query luceneQ = qp.parse(
    "+station.name:airport +year:2014 +month:12 +(avgTemp < 0)");
CacheQuery cq = Search.getSearchManager(cache)
    .getQuery(luceneQ, DaySummary.class);
List<Object> results = cq.list();
Other features
• JMX Management
• RHQ (JBoss Enterprise Management Solution)
• CDI Support
• JSR 107 (JCACHE) integration
• Custom interceptors
• Runs on Amazon Web Services Platform
• Command line client
• JTA with JBoss TM, Bitronix, Atomikos
• GridFS (experimental API), CloudTM, Cross Site
Replication
DEMO
Q & A
Thank you!
Resources:
http://www.griddynamics.com/distributed-algorithms-in-nosql-databases/
http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf
http://pdos.csail.mit.edu/papers/ton:chord/paper-ton.pdf
http://www.martinbroadhurst.com/Consistent-Hash-Ring.html
http://infinispan.org/docs/7.2.x/user_guide/user_guide.html
https://github.com/infinispan/infinispan/wiki

Infinispan, a distributed in-memory key/value data grid and cache

  • 1. Infinispan Distributed in-memory key/value data grid and cache @infinispan
  • 2. Agenda • Introduction • Part 1 • Hash Tables • Distributed Hash Tables • Consistent Hashing • Chord Lookup Protocol • Part 2 • Data Grids • Infinispan • Architecture • Consistent Hashing / Split Clusters • Other features
  • 3. Part I – A (very) short introduction to distributed hash tables
  • 5. Distributed Hash Tables (DHT) Source: Wikipedia - http://commons.wikimedia.org/wiki/File:DHT_en.svg#/media/File:DHT_en.svg
  • 6. • Decentralized Hash Table functionality • Interface • put(K,V) • get(K) -> V • Nodes can fail, join and leave • The system has to scale Distributed Hash Tables (DHT)
  • 7. • Flooding in N nodes • put() – store in any node O(1) • get() – send query to all nodes O(N) • Full replication in N nodes • put() – store in all nodes O(N) • get() – check any node O(1) Simple solutions
  • 8. Fixed Hashing NodeID = hash(key) % TotalNodes.
  • 9. Fixed Hashing with High Availability NodeID = hash(key) % TotalNodes.
  • 10. Fixed Hashing and Scalability NodeID = hash(key) % TotalNodes+1.
  • 11. 2 Nodes, Key Space={0,1,2,3,4,5} NodeID = hash(key) % 2. NodeID = hash(key) % 3. N0 (key mod 2 = 0) N1 (key mod 2 = 1) 0,2,4 1,3,5 N0 (key mod 3 = 0) N1 (key mod 3 = 1) N2 (key mod 3 = 2) 0,3 1,4 2,5
  • 13. Consistent Hashing – The Hash Ring 0 N0 N1 N2 K1 K2 K3 K4 K5 K6
  • 14. Consistent Hashing – Nodes Joining, Leaving Source: http://www.griddynamics.com/distributed-algorithms-in-nosql-databases/
  • 15. Chord: Peer-to-peer Lookup Protocol • Load Balance – distributed hash function, spreading keys evenly over nodes • Decentralization – fully distributed no SPOF • Scalability – logarithmic growth of lookup cost with the number of nodes, large systems are feasible • Availability – automatically adjusts its internal tables to ensure the node responsible for a key is always found • Flexible naming – key-space is flat (flexibility in how to map names to keys)
  • 16. Chord – Lookup O(N) Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Ion Stoica , Robert Morrisz, David Liben-Nowellz, David R. Kargerz, M. Frans Kaashoekz, Frank Dabekz, Hari Balakrishnanz
  • 17. Chord – Lookup O(logN) Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Ion Stoica , Robert Morrisz, David Liben-Nowellz, David R. Kargerz, M. Frans Kaashoekz, Frank Dabekz, Hari Balakrishnanz • K=6 (0, 26−1) • Finger[i] = first node that succeeds (N+ 2𝑖−1 ) mod 2K , where 1 ≤ 𝑖 ≤ 𝐾 • Successor/Predecessor – the next/previous node on circle
  • 18. Chord – Node Join Source: Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Ion Stoica , Robert Morrisz, David Liben-Nowellz, David R. Kargerz, M. Frans Kaashoekz, Frank Dabekz, Hari Balakrishnanz • Node 26 joins the system between nodes 21 and 32. • (a) Initial state: node 21 points to node 32; • (b) node 26 finds its successor (i.e., node 32) and points to it; • (c) node 26 copies all keys less than 26 from node 32; • (d) the stabilize procedure updates the successor of node 21 to node 26.
  • 19. • CAN (Hypercube), Chord (Ring), Pastry (Tree+Ring), Tapestry (Tree+Ring), Viceroy, Kademlia, Skipnet, Symphony (Ring), Koorde, Apocrypha, Land, Bamboo, ORDI … The world of DHTs …
  • 20. Part II – A short introduction to Infinispan
  • 21. Where do we store data? One size does not fit all...
  • 22.
  • 23. Infinispan – History • 2002 – JBoss App Server needed a clustered solution for HTTP and EJB session state replication for HA clusters. JGroups (open source group communication suite) had a replicated map demo, expanded to a tree data structure, added eviction and JTA transactions. • 2003 – this was moved to JBoss AS code base • 2005 – JBoss Cache was extracted and became a standalone project … JBoss Cache evolved into Infinispan, core parts redesigned • 2009 – JBoss Cache 3.2 and Infinispan 4.0.0.ALPHA1 was released • 2015 - 7.2.0.Alpha1 • Check the Infinispan RoadMap for more details
  • 25. Usage Modes • Embedded / library mode • clustering for apps and frameworks (e.g. JBoss session replication) • Local mode single cache • JSR 107: JCACHE - Java Temporary Caching API • Transactional local cache • Eviction, expiration, write through, write behind, preloading, notifications, statistics • Cluster of caches • Invalidation, Hibernate 2nd level cache • Server mode – remote data store • REST, MemCached, HotRod, WebSocket (*)
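  • 26. Code?
      // Distributed, synchronous cache: L1 entries live for 25 s,
      // the hash space has 100 segments, and each segment has 3 owners.
      Configuration distConfig = new ConfigurationBuilder()
          .clustering()
              .cacheMode(CacheMode.DIST_SYNC)
              .sync()
              .l1().lifespan(25000L)
              .hash().numSegments(100).numOwners(3)
          .build();

      // Eviction and expiration: keep at most 20000 entries (LRU),
      // run the expiration reaper every 5 s, expire entries idle for 120 s.
      Configuration evictionConfig = new ConfigurationBuilder()
          .eviction()
              .maxEntries(20000).strategy(EvictionStrategy.LRU)
          .expiration()
              .wakeUpInterval(5000L)
              .maxIdle(120000L)
          .build();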
  • 27. Infinispan – Core Architecture • [Diagram] Remote App 1 (C++), Remote App 2 (Java) and Remote App 3 (.NET) connect over the network (TCP) to the nodes. • Each node (JVM) exposes MemCached, HotRod, REST and WebSocket (*) endpoints, can host an embedded app (Java), and contains: Transport (JGroups), Notification, Transactions / XA, Query, Map / Reduce, Monitoring, Storage Engine (RAM + Overflow). • Nodes talk to each other over TCP/UDP.
  • 28. Infinispan Clustering and Consistent Hashing • JGroups Views • Each node has a unique address • View changes when nodes join, leave • Keys are hashed using MurmurHash3 algorithm • Hash Space is divided into segments • Key > Segment > Owners • Primary and Backup Owners
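The Key > Segment > Owners chain can be illustrated with a small sketch. It is not Infinispan's implementation: `hashCode()` stands in for MurmurHash3, the modulo mapping of hash to segment is a simplification, and the round-robin owner assignment is purely hypothetical (the real consistent hash balances segments differently). All names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SegmentRouting {
    // Key -> segment. hashCode() is a stand-in for MurmurHash3 (assumption).
    static int segmentOf(Object key, int numSegments) {
        int h = key.hashCode() & Integer.MAX_VALUE; // force a non-negative hash
        return h % numSegments;
    }

    // Segment -> owners. The first entry is the primary owner, the rest are backups.
    // Round-robin placement is an illustrative assumption.
    static List<String> ownersOf(int segment, List<String> members, int numOwners) {
        int n = Math.min(numOwners, members.size());
        List<String> owners = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            owners.add(members.get((segment + i) % members.size()));
        }
        return owners;
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("NodeA", "NodeB", "NodeC", "NodeD");
        int segment = segmentOf("K1", 100);
        System.out.println("K1 -> segment " + segment
            + " -> owners " + ownersOf(segment, members, 3));
    }
}
```

The point of the extra level of indirection is that a view change only reassigns segments to owners; the key-to-segment mapping stays fixed as long as `numSegments` does.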
  • 29. Does it scale? • 320 nodes, 3000 caches, 20 TB RAM • Largest cluster formed: 1000 nodes
  • 35. A cluster with more keys • [Diagram: a cluster holding keys K1 to K5, each key shown twice]
  • 38. If multiple nodes fail… • CAP Theorem to the rescue: • Formulated by Eric Brewer in 1998 • C - Consistency • A - High Availability • P - Tolerance to Network Partitions • Can only satisfy 2 at the same time: • Consistency + Availability: The Ideal World where network partitions do not exist • Partitioning + Availability: Data might be different between partitions • Partitioning + Consistency: Do not corrupt data!
  • 39. Infinispan Partition Handling Strategies • In the presence of network partitions • Prefer availability (partition handling DISABLED) • Prefer consistency (partition handling ENABLED) • Split Detection with partition handling ENABLED: • Ensure stable topology • LOST > numOwners OR no simple majority • Check segment ownership • Mark partition as Available / Degraded • Send PartitionStatusChangedEvent to listeners
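A simplified version of the split-detection decision might look like the sketch below. It is an assumption-laden illustration, not Infinispan's code: a partition is treated as Degraded if it lacks a simple majority of the last stable members, or if some segment has none of its owners inside the partition. The class, enum, and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PartitionCheck {
    enum Status { AVAILABLE, DEGRADED }

    static Status evaluate(Set<String> stableMembers,
                           Set<String> partition,
                           List<List<String>> segmentOwners) {
        // No simple majority of the last stable topology -> degraded.
        if (partition.size() * 2 <= stableMembers.size()) {
            return Status.DEGRADED;
        }
        // Check segment ownership: a segment with no surviving owner means lost data.
        for (List<String> owners : segmentOwners) {
            boolean anyOwnerPresent = false;
            for (String owner : owners) {
                if (partition.contains(owner)) {
                    anyOwnerPresent = true;
                    break;
                }
            }
            if (!anyOwnerPresent) {
                return Status.DEGRADED;
            }
        }
        return Status.AVAILABLE;
    }

    public static void main(String[] args) {
        Set<String> stable = new HashSet<>(Arrays.asList("A", "B", "C", "D", "E"));
        // Five segments with numOwners = 2 (illustrative layout).
        List<List<String>> owners = Arrays.asList(
            Arrays.asList("A", "B"), Arrays.asList("B", "C"), Arrays.asList("C", "D"),
            Arrays.asList("D", "E"), Arrays.asList("E", "A"));

        System.out.println(evaluate(stable, new HashSet<>(Arrays.asList("A", "B", "C", "D")), owners));
        System.out.println(evaluate(stable, new HashSet<>(Arrays.asList("A", "B", "C")), owners));
        System.out.println(evaluate(stable, new HashSet<>(Arrays.asList("E")), owners));
    }
}
```

In this layout the four-node partition stays Available, the three-node partition has a majority but has lost both owners of one segment, and the single-node partition fails the majority test.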
  • 40. Cluster Partitioning – No data lost • [Diagram: keys K1 to K5, each shown twice, with the cluster split into Partition1 and Partition2]
  • 41. Cluster Partitioning – Lost data • [Diagram: keys K1 to K5, each shown twice, with the cluster split into Partition1 and Partition2]
  • 42. Merging Split Clusters • Split Clusters see each other again • Step1: Ensure stable topology • Step2: Automatic: based on partition state • 1 Available -> attempt merge • All Degraded -> attempt merge • Step3: Manual • Data was lost • Custom listener on Merge • Application decides
  • 43. Querying Infinispan • Apache Lucene Index • Native Query API (Query DSL) • Hibernate Search and Apache Lucene to index and search • Native Map/Reduce • Index-less • Distributed Execution Framework • Hadoop Integration (WIP) • Run existing map/reduce jobs on Infinispan data
  • 44. Map Reduce:
      // Word count over the entries of wordCache
      MapReduceTask<String, String, String, Integer> mapReduceTask =
          new MapReduceTask<>(wordCache);
      mapReduceTask
          .mappedWith(new WordCountMapper())
          .reducedWith(new WordCountReducer());
      Map<String, Integer> wordCountMap = mapReduceTask.execute();
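The slide does not show `WordCountMapper` or `WordCountReducer`. As a rough idea of what the two phases compute, here is a plain-Java sketch that deliberately avoids the Infinispan API: a `HashMap` stands in for the cache, and every name in it is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordCountSketch {
    // Map phase: emit one (word, 1) pair for each word in a cache value.
    static void map(String value, Map<String, List<Integer>> emitted) {
        for (String word : value.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                emitted.computeIfAbsent(word, w -> new ArrayList<>()).add(1);
            }
        }
    }

    // Reduce phase: sum all counts emitted for one word.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Runs both phases over every value in the (stand-in) cache.
    static Map<String, Integer> wordCount(Map<String, String> cache) {
        Map<String, List<Integer>> emitted = new HashMap<>();
        for (String value : cache.values()) {
            map(value, emitted);
        }
        Map<String, Integer> result = new HashMap<>();
        for (Map.Entry<String, List<Integer>> e : emitted.entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("1", "hello world");
        cache.put("2", "hello infinispan");
        System.out.println(wordCount(cache).get("hello")); // 2
    }
}
```

In a data grid the map phase would run on each node against the entries it owns, and only the intermediate pairs would travel over the network to be reduced.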
  • 45. Query DSL:
      // Build a Lucene query; parse() can throw ParseException
      QueryParser qp = new QueryParser("default", new StandardAnalyzer());
      Query luceneQ = qp
          .parse("+station.name:airport +year:2014 +month:12 +(avgTemp < 0)");
      // Wrap it in a cache query restricted to DaySummary entries
      CacheQuery cq = Search.getSearchManager(cache)
          .getQuery(luceneQ, DaySummary.class);
      List<Object> results = cq.list();
  • 46. Other features • JMX Management • RHQ (JBoss Enterprise Management Solution) • CDI Support • JSR 107 (JCACHE) integration • Custom interceptors • Runs on Amazon Web Services Platform • Command line client • JTA with JBoss TM, Bitronix, Atomikos • GridFS (experimental API), CloudTM, Cross Site Replication