2. Infinispan
In-memory data grid meets NoSQL
Manik Surtani
Founder and project lead, Infinispan
Senior Principal Engineer
Red Hat, Inc.
manik@jboss.org | http://twitter.com/maniksurtani | http://blog.infinispan.org
3. “Could data storage be the one
thing that hampers true cloud
scalability and elasticity?”
4. Who is Manik?
• R&D Engineer, Red Hat Inc.
• Founder and project lead, Infinispan
• Project lead, JBoss Cache
• Frequent speaker on cloud computing and
cloud data storage
http://twitter.com/maniksurtani
http://blog.infinispan.org
5. Agenda
• What is Data-as-a-Service?
• Introducing Infinispan
• Implementing Data-as-a-Service with data grids
• Data grids vs. NoSQL
6. Traditional 3-tier App
12. Virtualizing Data
• Some public services do exist
• Amazon RDS and SimpleDB
• FathomDB
• Cloudant
• MongoHQ
• etc.
13. What about private clouds?
• Not all cloud deployments
are public!
• Private cloud is very
important
• How can you build DaaS
yourself?
14. Characteristics of DaaS
• Elastic data
• Needs to scale with other tiers
• Response times should be linear
• Needs to be highly available
• Nodes will die! The service
shouldn’t.
15. “Traditional” RDBMSs
• Lack of distribution hampers elasticity and HA
• These limitations can be worked around
• ... but this isn’t trivial
• ... or cheap
17. Distributed Data Grid
• Far better suited to elastic data
• Distributed by nature
• Highly available by nature
• A good building block for your data service
18. API is king
• Apps should use their native data storage APIs
• E.g., JPA (Java EE), ActiveRecord (Ruby), etc.
• Key/value too low level
• Akin to direct JDBC calls!
20. What is Infinispan?
• Open source (LGPL) in-memory Data Grid
• Some concepts from Amazon Dynamo
2 usage modes
• Embedded
• Client-server: memcached, Hot Rod, REST
23. API
• Map-like key/value store
• Upcoming JPA-like layer
• Other high-level APIs being discussed in the
community e.g., ActiveRecord
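To make the gap between a map-like store and a higher-level API concrete, here is a minimal sketch of an entity layer on top of a plain key/value map. It is not the planned JPA-like layer: `Person`, `Repository` and the `Type:id` key scheme are hypothetical names chosen for illustration, and an ordinary dict stands in for the grid.

```python
from dataclasses import dataclass, asdict


@dataclass
class Person:
    id: str
    name: str
    email: str


class Repository:
    """Toy entity layer: persists dataclass instances in a map-like store."""

    def __init__(self, store, entity_type):
        self._store = store                  # any dict-like key/value store
        self._type = entity_type
        self._prefix = entity_type.__name__  # namespace keys per entity type

    def save(self, entity):
        self._store[f"{self._prefix}:{entity.id}"] = asdict(entity)

    def find(self, entity_id):
        raw = self._store.get(f"{self._prefix}:{entity_id}")
        return self._type(**raw) if raw is not None else None


grid = {}                                    # stand-in for the data grid
people = Repository(grid, Person)
people.save(Person(id="42", name="Ada", email="ada@example.org"))
found = people.find("42")
```

The application works with `Person` objects; only the repository knows that they end up as key/value entries.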
24. Consistent hash based distribution
• Self healing
• No single point of failure
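A minimal sketch of the idea behind consistent-hash distribution, assuming a single owner per key. This is a generic textbook ring, not Infinispan's implementation; `HashRing`, the node names and the choice of MD5 are illustrative assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring using MD5 (stable across runs)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring: each key is owned by the next node clockwise."""

    def __init__(self, nodes=()):
        self._points = []   # sorted hash positions of the nodes
        self._owners = {}   # hash position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        point = _hash(node)
        bisect.insort(self._points, point)
        self._owners[point] = node

    def remove(self, node: str) -> None:
        point = _hash(node)
        self._points.remove(point)
        del self._owners[point]

    def owner(self, key: str) -> str:
        # First node position at or after the key's hash, wrapping around.
        idx = bisect.bisect_left(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}

ring.remove("node-b")          # simulate a node dying
after = {k: ring.owner(k) for k in keys}

# Only keys that lived on the dead node need a new home.
moved = [k for k in keys if before[k] != after[k]]
```

Keys owned by the surviving nodes stay where they are, which is what lets a grid heal without reshuffling all of its data.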
Highly concurrent
• MVCC locking
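A minimal sketch of the multi-version idea, assuming readers pin a snapshot version and never take a lock. It is a generic illustration rather than Infinispan's locking code; `VersionedStore` and its methods are hypothetical names.

```python
import threading


class VersionedStore:
    """Toy multi-version store: writers append versions, readers pin a snapshot."""

    def __init__(self):
        self._versions = {}                    # key -> list of (version, value)
        self._clock = 0                        # last committed version
        self._write_lock = threading.Lock()    # serialises writers only

    def put(self, key, value):
        with self._write_lock:
            self._clock += 1
            self._versions.setdefault(key, []).append((self._clock, value))
            return self._clock

    def snapshot(self):
        """Return the version a reader should pin for repeatable reads."""
        return self._clock

    def get(self, key, at_version=None):
        """Read the newest value committed at or before `at_version`; takes no lock."""
        limit = self._clock if at_version is None else at_version
        for version, value in reversed(self._versions.get(key, [])):
            if version <= limit:
                return value
        return None


store = VersionedStore()
store.put("price", 100)
snap = store.snapshot()               # reader starts here
store.put("price", 120)               # concurrent writer commits a new version
old_view = store.get("price", snap)   # reader still sees its snapshot
latest = store.get("price")           # a fresh read sees the new value
```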
25. Persistence
• Not just in memory!
• Write through and write behind
• Pluggable “drivers”
Eviction and expiry
• Efficient, adaptive algorithms
• Addresses shortcomings of LRU & FIFO
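The difference between write-through and write-behind can be sketched as follows. This is a simplified model, not an Infinispan cache store: `CacheWithStore` is a hypothetical class, a dict plays the role of the persistent store, and `flush()` stands in for the background thread a real write-behind implementation would use.

```python
from collections import deque


class CacheWithStore:
    """Toy in-memory cache backed by a slower persistent store (a dict here)."""

    def __init__(self, store, write_behind=False):
        self.memory = {}
        self.store = store
        self.write_behind = write_behind
        self._pending = deque()                  # queued (key, value) writes

    def put(self, key, value):
        self.memory[key] = value
        if self.write_behind:
            self._pending.append((key, value))   # persist later
        else:
            self.store[key] = value              # persist before returning

    def flush(self):
        """Drain queued writes to the store."""
        while self._pending:
            key, value = self._pending.popleft()
            self.store[key] = value

    def get(self, key):
        if key not in self.memory and key in self.store:
            self.memory[key] = self.store[key]   # load on a cache miss
        return self.memory.get(key)


disk_a = {}
through = CacheWithStore(disk_a)
through.put("k", "v")                  # disk_a is updated immediately

disk_b = {}
behind = CacheWithStore(disk_b, write_behind=True)
behind.put("k", "v")                   # disk_b is still empty here
pending_before_flush = dict(disk_b)
behind.flush()                         # now the write reaches disk_b
```

Write-through keeps the store in step with memory at the cost of slower writes; write-behind returns sooner but leaves a window in which queued writes can be lost.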
27. Map/Reduce
• In a pre-release state right now
• Please try out 5.0.0.ALPHA3 with these APIs!
Querying
• Using Lucene and Hibernate Search to index
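As a rough illustration of map/reduce over a grid, the sketch below counts words across data that is split between nodes. It does not use the Infinispan 5.0 APIs; the `partitions` layout and the function names are assumptions made for the example.

```python
from collections import Counter
from functools import reduce

# Hypothetical layout: each "node" holds its own slice of the key/value data.
partitions = {
    "node-a": {"doc1": "data grid meets nosql"},
    "node-b": {"doc2": "elastic data grid", "doc3": "nosql data"},
}


def map_phase(entries):
    """Runs where the data lives: count words in one node's local entries."""
    counts = Counter()
    for text in entries.values():
        counts.update(text.split())
    return counts


def reduce_phase(partials):
    """Runs on the caller: merge the per-node partial counts."""
    return reduce(lambda left, right: left + right, partials, Counter())


partial_counts = [map_phase(entries) for entries in partitions.values()]
word_counts = reduce_phase(partial_counts)
```

The point of the pattern is that only the small partial results travel over the network, not the data itself.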
29. Client/Server Architecture
Supported Protocols
• REST
• Memcached
• Hot Rod
30. WTF is Hot Rod?
• Wire protocol for client/server communications
• Open
• Language independent
• Built-in failover and load balancing
• Smart routing
31. Server Endpoint Comparison
Endpoint   | Protocol | Client Libraries    | Clustered? | Smart Routing | Load Balancing/Failover
REST       | Text     | N/A                 | Yes        | No            | Any HTTP load balancer
Memcached  | Text     | Plenty              | Yes        | No            | Only with predefined server list
Hot Rod    | Binary   | Currently only Java | Yes        | Yes           | Dynamic
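The load balancing and failover column can be illustrated with a small client-side sketch. It models the behaviour only and is not the Hot Rod wire protocol: `RoundRobinClient`, `fake_send` and the server names are hypothetical, and `update_topology` stands in for a client that learns the member list dynamically, whereas a client with a predefined server list would never call it.

```python
import itertools


class RoundRobinClient:
    """Toy client: spreads requests over servers and skips ones that fail."""

    def __init__(self, servers, send):
        self.servers = list(servers)
        self._send = send                     # callable(server, payload) -> reply
        self._counter = itertools.count()

    def update_topology(self, servers):
        """Replace the server list, e.g. after the cluster membership changes."""
        self.servers = list(servers)

    def request(self, payload):
        start = next(self._counter)
        for offset in range(len(self.servers)):
            server = self.servers[(start + offset) % len(self.servers)]
            try:
                return self._send(server, payload)
            except ConnectionError:
                continue                      # fail over to the next server
        raise ConnectionError("no server reachable")


down = {"s2"}


def fake_send(server, payload):
    """Stand-in transport: servers listed in `down` refuse connections."""
    if server in down:
        raise ConnectionError(server)
    return f"{server} handled {payload}"


client = RoundRobinClient(["s1", "s2", "s3"], fake_send)
replies = [client.request(f"req{i}") for i in range(3)]
```

The second request would have gone to `s2`; because that server is down, the client retries on `s3` and the caller never sees the failure.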
32. So is Infinispan
a data grid?
• In-memory
• P2P, distributed
• Low-latency, fast key/value store
33. So is Infinispan
a data grid?
• In-memory
• P2P, distributed
• Low-latency, fast key/value store
... or is it a NoSQL database?
• Persistence
• Map/Reduce
• Client/Server mode
34. So is Infinispan
a data grid?
• In-memory
• P2P, distributed
• Low-latency, fast key/value store
... or is it a NoSQL database?
• Persistence
• Map/Reduce
• Client/Server mode
... or something else?
• Querying support
• Transactional
35. DaaS with
Infinispan
36. Architecture
Manage and Monitor
37. Data Grids
vs.
NoSQL
38. Data Grids vs. NoSQL
• Data grids/distributed caches as proto-NoSQL?
• Been around since the early 2000s
• Used as database offload
• Bottleneck removal
• Primary K/V stores for certain types of data
• Low latency data access
41. NoSQL standards
• APIs - basic interactions and Map/Reduce
• JCP for Java + others
• http://bit.ly/data_grid_jsr
• Wire protocol
• Remote communications.
• Hot Rod?
• Query language
• Without relational presumptions
42. Summing things up
• Elastic data is hard
• Public data services not always suitable
• Data grids make elastic storage easy
• Infinispan server endpoints help build
elastic data tiers
• Discussed data grids versus NoSQL
44. Starting an Infinispan Server
• Hot Rod or memcached server endpoint
$ bin/startServer.sh -r hotrod -c infinispan.xml
$ bin/startServer.sh -r memcached -c infinispan.xml
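Once the memcached endpoint is up, any memcached client can talk to it. The sketch below shows the standard memcached text protocol by hand; the host, the port 11211 and the key name are assumptions, and `roundtrip` is defined but not called so the example runs without a server.

```python
import socket


def encode_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Build a memcached text-protocol `set` command."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def encode_get(key: str) -> bytes:
    """Build a memcached text-protocol `get` command."""
    return f"get {key}\r\n".encode("ascii")


def parse_get(response: bytes):
    """Extract the value from a single-key `get` reply, or None on a miss."""
    if response.startswith(b"END"):
        return None
    header, rest = response.split(b"\r\n", 1)
    _, _key, _flags, size = header.split(b" ")
    return rest[: int(size)]


def roundtrip(command: bytes, host: str = "localhost", port: int = 11211) -> bytes:
    """Send one command and return the first chunk of the reply (assumed host/port)."""
    with socket.create_connection((host, port), timeout=2) as conn:
        conn.sendall(command)
        return conn.recv(4096)


set_cmd = encode_set("greeting", b"hello")
get_cmd = encode_get("greeting")
sample_reply = b"VALUE greeting 0 5\r\nhello\r\nEND\r\n"   # what a hit looks like
value = parse_get(sample_reply)
```

Against a running endpoint, `roundtrip(set_cmd)` followed by `parse_get(roundtrip(get_cmd))` would store and read back the value.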
45. Starting an Infinispan Server
• REST endpoint
• Deploy infinispan-server-rest.war in your
favorite servlet container.
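With the WAR deployed, cache entries can be addressed as HTTP resources. The sketch below only builds the requests and does not send them. The base URL, the `<base>/<cache>/<key>` path layout and the cache name `users` are assumptions; check them against your deployment and Infinispan version.

```python
import urllib.request
from urllib.parse import quote

# Assumed deployment URL; the context path depends on how the WAR is deployed.
BASE_URL = "http://localhost:8080/infinispan/rest"


def entry_url(cache: str, key: str) -> str:
    """Address one cache entry as a REST resource: <base>/<cache>/<key>."""
    return f"{BASE_URL}/{quote(cache, safe='')}/{quote(key, safe='')}"


def put_request(cache: str, key: str, body: bytes, content_type: str = "text/plain"):
    """Build (but do not send) an HTTP PUT that stores `body` under `key`."""
    return urllib.request.Request(
        entry_url(cache, key),
        data=body,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def get_request(cache: str, key: str):
    """Build (but do not send) an HTTP GET that reads the entry back."""
    return urllib.request.Request(entry_url(cache, key), method="GET")


put_req = put_request("users", "user 42", b"Ada")   # hypothetical cache and key
get_req = get_request("users", "user 42")
# To talk to a running server: urllib.request.urlopen(put_req)
```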
46. Roadmap
4.0.0 Starobrno
• Map-like API
• Async API
• Consistent Hash based distribution
• Write-through, write-behind
• Eviction, expiration
• Management tooling
• REST API
• Hibernate 2nd Level Cache
• Released Feb 2010
47. Roadmap
4.1.0 Radegast
• Deadlock detection
• Client/Server protocols
• Memcached
• Hot Rod
• Smart clients using Hot Rod
• Lucene Directory implementation
• LIRS: adaptive, recency-based eviction policies
• Released August 2010
48. Roadmap
5.0.0 Pagoa
• JPA-like API
• Fine-grained replication
• Distributed code execution
• Map/reduce
• Virtual nodes for more even distribution
• In active development
5.1.0 and beyond
• Dynamic provisioning based on SLAs
• Complex event processing features