2. Infinispan: New Kid on the NoSQL Block
Galder Zamarreño
Senior Engineer, Red Hat
14th October 2010, Lausanne JUG
galder@jboss.org | twitter.com/galderz | zamarreno.com
Monday, October 18, 2010
3. “There is a need for a viable cloud-ready data store. People need to rethink the way they organize, store and access data.”
4. Who is Galder?
• R&D engineer (Red Hat Inc):
• Infinispan developer
• JBoss Cache developer
• Contributor and committer:
• JBoss AS, Hibernate, JGroups, JBoss Portal, etc.
• Blog: zamarreno.com
• Twitter: @galderz
5. Agenda
• Cloud computing and data storage
• And why you should care!
• Data grids and cloud storage
• Introducing Infinispan
6. Clouds are today!
• Clouds are happening
• *aaS
• You cannot escape them!
• Public: Amazon, Google, Rackspace, ...
• Private: Red Hat, Oracle, VMware, ...
• Clouds will become mainstream
• Traditional data centers become marginalized
7. Why are clouds popular?
• Piecemeal costs, perfect utilization
• Pay for what you use, no more!
• Massive economies of scale
• High availability = Implicit backups!
• Very fast provisioning -> Elasticity
• Familiar charging model, controllable costs
• Operational expenditure versus capital expenditure
8. Why should I care?
• My favorite platform is still relevant
• Java, Java EE
• Python, Ruby, .NET,... whatever!
• My favorite OS is still relevant:
• Linux
• Solaris, etc.
9. Data Storage
• Databases on clouds:
• not a match made in heaven!
• Traditional modes of data storage won't work
• Clouds are inherently stateless, ephemeral
• Cloud deployments should scale
• ... but databases are still a bottleneck
• ... and a single point of failure!
10. RDBMS on clouds: your options
• Non-ephemeral storage
• Restrictive
• Highly specialized hardware
• E.g., a SAN for Oracle RAC, ExaLogic?
• Hardly commodity hardware!
• Native database clustering
• Unreliable, expensive
11. Another solution: Data Grids!
• Data grids are perfect for clouds
• Highly scalable
• No single point of failure
• Works with ephemeral cloud nodes
• Very low latency
12. Data Grids and other vendors
• Data grids
• Amazon SimpleDB uses Dynamo
• Google BigTable
• Infinispan
• Many other commercial and OSS offerings
13. In-Memory Data Grids - Speed!
• Low latency
• minimal disk lookup
• Memory is at least 2 orders of magnitude faster than disk
• especially for frequently used data
• Concurrency, hardware threads
• Disk IO is always a concurrency bottleneck
• Memory offers far greater concurrency
15. Introducing Infinispan
• Scalable data grid platform
• open source - LGPL
• based on some JBoss Cache code ... but mostly all-new
• JBoss Cache...
• ... is a clustered caching library
• ... exposes a tree-structured API
• Infinispan has a Map-like API - (JSR-107 JCACHE)
• ... so, primarily key/value NoSQL
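The Map-like API above can be pictured with plain JDK types. Infinispan's `Cache` interface extends `java.util.concurrent.ConcurrentMap`, so the calls below are the same ones a cache client would make. This is only a sketch: it uses a `ConcurrentHashMap` as a local stand-in so that it runs without Infinispan on the classpath, and the class name `MapLikeApiSketch` and method `newCache` are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class MapLikeApiSketch {
    /** Local stand-in for a cache (assumption: a real deployment would obtain an Infinispan Cache here). */
    static ConcurrentMap<String, String> newCache() {
        return new ConcurrentHashMap<>();
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> cache = newCache();

        cache.put("jug", "Lausanne");                      // plain key/value write
        cache.putIfAbsent("jug", "Geneva");                // no effect, the key already exists
        cache.replace("jug", "Lausanne", "Lausanne JUG");  // atomic compare-and-replace

        System.out.println(cache.get("jug"));
        cache.remove("jug");
        System.out.println(cache.containsKey("jug"));
    }
}
```

Because the contract is just key/value operations on a map, there is no tree to navigate as there was in JBoss Cache.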
17. Infinispan != JBoss Cache 4
• New architecture
• Brand new data container design
• Cutting edge algorithms
• New, completely different, APIs
• Not backward-compatible
• Although a code-level compatibility layer is available
• New expectations
• Designed for a far wider range of uses
18. More scalable than JBC
• Internal structures more memory-efficient
• Data organised in Map-like dictionaries
• As opposed to a tree
• Making better use of CAS
• Minimizing synchronized blocks, mutexes
• Highly precise and low overhead data eviction
• Uses JBoss Marshalling
• smaller payloads + poolable streams = faster RPC
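To illustrate what "making better use of CAS" and "minimizing synchronized blocks" mean in practice, here is a minimal sketch of a lock-free counter built on compare-and-set. It is not Infinispan code; the class `CasCounter` is hypothetical and only shows the retry-loop pattern that replaces a `synchronized` block.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CasCounter {
    private final AtomicLong hits = new AtomicLong();

    /** Lock-free increment: read, compute, then publish only if nobody else changed the value. */
    long increment() {
        while (true) {
            long current = hits.get();
            long next = current + 1;
            if (hits.compareAndSet(current, next)) {
                return next;
            }
            // Another thread won the race; loop to re-read and retry.
        }
    }

    long value() {
        return hits.get();
    }
}
```

No thread ever blocks while holding a lock; under contention a thread simply retries, which tends to scale better on many hardware threads than a mutex.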
19. “Borrowed” from JBoss Cache
• JTA transactions
• Replicated data structure
• Fine-grained replication
• Eviction, cache persistence
• Notifications and eventing API
• JMX reporting and Query API
• MVCC locking
• Non-blocking state transfer techniques
20. … and new features!
• Consistent hash based data distribution
• Much simpler Map API (JSR-107 compliant)
• Ability to be consumed by non-JVM platforms
• Client/server module
• Memcached compatibility
• HotRod - binary protocol supporting “smart clients”
• JavaScript access via WebSocket server
• REST API
21. … and new features!
• JOPR based GUI management console
• JPA-like API
• Distributed execution
• Map/reduce made easy!
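The idea behind distributed execution and map/reduce can be sketched with the JDK alone: send a task to where each slice of data lives, then combine the partial results. In the sketch below each `Map` in the list stands in for the data held by one grid node, and a thread pool stands in for the cluster. The class `MapReduceSketch` and method `countWords` are hypothetical names, not the API planned for Infinispan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class MapReduceSketch {
    /** Counts words across all partitions; each partition plays the role of one node's local data. */
    static int countWords(List<Map<String, String>> partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            List<Future<Integer>> partials = new ArrayList<>();
            for (Map<String, String> partition : partitions) {
                // Map phase: the task only touches the data of its own partition.
                Callable<Integer> task = () -> {
                    int words = 0;
                    for (String text : partition.values()) {
                        String trimmed = text.trim();
                        if (!trimmed.isEmpty()) {
                            words += trimmed.split("\\s+").length;
                        }
                    }
                    return words;
                };
                partials.add(pool.submit(task));
            }
            // Reduce phase: combine the per-partition results into one answer.
            int total = 0;
            for (Future<Integer> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("map/reduce sketch failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the model is that only small partial results travel over the network, not the data itself.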
22. Data distribution
• Consistent hash based data distribution
• Locating entries is very efficient
• No network calls, no need for metadata
• Will allow us to scale to bigger clusters
• Goal of efficient scaling to thousands of nodes
• Lightweight, “L1” cache for efficient reads
• On writes, “L1” gets invalidated
• Dynamic rebalancing
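A rough sketch of consistent-hash-based distribution shows why locating an entry needs no network call or metadata lookup: every node can compute the owners of a key locally from the list of cluster members. The code below is an illustrative hash ring, not Infinispan's actual implementation; the class name `ConsistentHashRing`, the number of virtual nodes, and the hash function are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

final class ConsistentHashRing {
    private static final int VIRTUAL_NODES = 64; // assumed value; more points give a smoother spread
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    /** Returns up to numOwners distinct nodes, walking clockwise from the key's position on the ring. */
    List<String> locate(Object key, int numOwners) {
        Set<String> owners = new LinkedHashSet<>();
        int h = hash(String.valueOf(key));
        collect(ring.tailMap(h, true).values(), owners, numOwners);  // from the key to the end of the ring
        collect(ring.headMap(h, false).values(), owners, numOwners); // wrap around to the start
        return new ArrayList<>(owners);
    }

    private static void collect(Iterable<String> nodes, Set<String> owners, int limit) {
        for (String node : nodes) {
            if (owners.size() >= limit) {
                return;
            }
            owners.add(node);
        }
    }

    /** Bit-mixing of String.hashCode (assumption: any well-spread hash would do here). */
    static int hash(String s) {
        int h = s.hashCode();
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

When a node joins or leaves, only the keys whose ring segment changed need to move, which is what makes dynamic rebalancing affordable.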
23. JPA-like API, fine-grained replication
• Successor to POJO Cache
• JPA-like interface: persist, find, remove...
• Will not rely on AOP, javassist, etc
• More robust and easier to use/debug
• Familiar JPA-like interface
• Easy migration from existing, “traditional” data stores!
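As a rough picture of what a JPA-like interface over a key/value store could look like, the sketch below layers `persist`, `find` and `remove` on top of a map, with no AOP or bytecode manipulation involved. Everything here is hypothetical: the class `EntityStoreSketch`, the `Identifiable` interface and the key format are invented for illustration, the planned Infinispan API is not shown on the slide, and the sketch does not attempt fine-grained replication.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class EntityStoreSketch {
    /** Hypothetical contract: an entity exposes its own identifier. */
    interface Identifiable {
        Object id();
    }

    /** Example entity used only for this sketch. */
    static final class Person implements Identifiable {
        final long id;
        final String name;

        Person(long id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public Object id() {
            return id;
        }
    }

    // Stand-in for a cache; a ConcurrentHashMap keeps the sketch runnable on a plain JDK.
    private final ConcurrentMap<String, Object> cache = new ConcurrentHashMap<>();

    private static String key(Class<?> type, Object id) {
        return type.getName() + "#" + id;
    }

    void persist(Identifiable entity) {
        cache.put(key(entity.getClass(), entity.id()), entity);
    }

    /** Returns the stored entity, or null if none exists for this type and id. */
    <T> T find(Class<T> type, Object id) {
        return type.cast(cache.get(key(type, id)));
    }

    void remove(Identifiable entity) {
        cache.remove(key(entity.getClass(), entity.id()));
    }
}
```

Code written against calls like these reads much like code written against an `EntityManager`, which is the migration argument the slide makes.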
24. Management
• Uses JOPR, a rich web-based GUI
• Simple WAR file
• Open Source (LGPL)
• Infinispan exposes data, operations in JMX
• Infinispan-JOPR plugin represents this graphically
• Other plugins can be built for other tools
• HP OpenView, Hyperic, etc.
25. So why is Infinispan sexy?
26. Why is Infinispan sexy?
• Transparent horizontal scalability
• Elastic in both directions
• Fast, low latency data access
• Ability to address a very large heap
• Cloud-ready datastore
• Not just for Java
• Free and doesn't suck!
28. Roadmap
• Infinispan 4.0.0 Starobrno (Released Feb 2010)
• New Map API
• Async API
• Distributed cache mode
• Management tooling
• REST API
• Hibernate 2nd level cache
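The value of an async API is that a write returns a future immediately, so the caller can overlap several remote operations and wait only when it needs the results. The sketch below shows that shape with JDK classes only. The class `AsyncCacheSketch` and its `putAsync` method are hypothetical stand-ins written for this example; the real Infinispan signatures are not given on the slide.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class AsyncCacheSketch {
    // Local stand-in for a remote or clustered cache.
    private final ConcurrentMap<String, String> data = new ConcurrentHashMap<>();

    /** Starts the write on another thread and returns at once; the future yields the previous value (or null). */
    CompletableFuture<String> putAsync(String key, String value) {
        return CompletableFuture.supplyAsync(() -> data.put(key, value));
    }

    String get(String key) {
        return data.get(key);
    }

    public static void main(String[] args) {
        AsyncCacheSketch cache = new AsyncCacheSketch();
        // Fire both writes without waiting, then block once for both to finish.
        CompletableFuture<String> first = cache.putAsync("a", "1");
        CompletableFuture<String> second = cache.putAsync("b", "2");
        CompletableFuture.allOf(first, second).join();
        System.out.println(cache.get("a") + " " + cache.get("b"));
    }
}
```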
29. Roadmap
• Infinispan 4.1.0 Radegast (Released August 2010)
• Client/server
• Memcached protocol
• Hot Rod protocol
• Smart clients using HotRod
• Websocket server
• Lucene Directory
• LIRS adaptive, recency-based eviction
30. Roadmap
• Infinispan 4.2.0 Ursus
• Collocated nodes in DIST
• Cassandra based cache store
• Infinispan 5.0.0 Pagoa
• JPA-like API + fine-grained replication
• Distributed executors
• Map/reduce programming model
31. To sum it up
• Clouds are becoming mainstream
• Need to think about challenges
• DBs and clouds pose many challenges
• Data grids offer a good alternative
• Infinispan, a new open source data grid
• Viable cloud data store but not just for clouds
• Removes bottlenecks, single points of failure in non-cloud architectures too
32. How can YOU participate?
• Download and try it out!
• Report bugs in code, even docs, wikis, etc.
• Suggest new features!
• Test with your own use cases and tell us how you use it!!
• Lend a hand with development
• Open and democratic dev process
• Helps prioritize features you want!
• Several non-Red Hat core committers already!