A stable data model provides numerous advantages in developing Big Data and NoSQL systems, especially when sharing data over the cloud. Adopting an immutable model eases much of the pain of achieving consistency, especially at great scale. There are trade-offs, however, that you need to be aware of.
This webinar will present details and examples of immutable data models as applied to various NoSQL systems, including MongoDB, Cloudant, Riak, and Cassandra. The emphasis will be on the impact on application designers and architects, as well as the technical trade-offs and advantages. The discussion will be grounded in real-world examples from within and beyond the enterprise.
5. 2014-06-12
This is your problem when…
… data doesn’t fit on one server.
… data is replicated between servers (e.g. read slaves).
… data is spread between data centers.
… state is spread across more than one device (mobile!).
… you run mixed workloads with concurrency.
… state is spread across more than one process.
17. 2014-06-12
Google File System (2003)
http://research.google.com/archive/gfs.html
Google MapReduce (2004)
http://research.google.com/archive/mapreduce.html
Google BigTable (2006)
http://research.google.com/archive/bigtable.html
Amazon’s Dynamo (2007)
http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
34. 2014-06-12
Immutability isn’t new
‣ “Accountants don’t use erasers”
‣ Functional, concurrent, distributed languages (e.g. Erlang)
‣ File systems (e.g. ZFS)
‣ Storage engines (e.g. LevelDB) & Databases (CouchDB, Datomic, …)
‣ Data model
35. 2014-06-12
{Strategy 1: ‘Write Only’}
‣ Don’t update in place
‣ Keep old versions
‣ Query for newest version
‣ Even works for deletions (write a “tombstone”)
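The ‘Write Only’ strategy can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the API of any of the databases named in the talk: the `Version` record, the `AppendOnlyStore` class, and the use of a global sequence number to decide which version is newest are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class Version:
    """One immutable version of a record (hypothetical layout)."""
    key: str
    seq: int               # monotonically increasing version number
    value: Optional[Any]   # None when this version is a tombstone
    deleted: bool = False


class AppendOnlyStore:
    """Illustrative store: every write appends, nothing is overwritten."""

    def __init__(self) -> None:
        self._log: List[Version] = []
        self._next_seq = 1

    def _append(self, key: str, value: Optional[Any], deleted: bool) -> Version:
        version = Version(key, self._next_seq, value, deleted)
        self._next_seq += 1
        self._log.append(version)
        return version

    def put(self, key: str, value: Any) -> Version:
        # An "update" is a new version; older versions stay in the log.
        return self._append(key, value, deleted=False)

    def delete(self, key: str) -> Version:
        # A deletion is just another write: a tombstone version.
        return self._append(key, None, deleted=True)

    def history(self, key: str) -> List[Version]:
        # All versions of a key, oldest first.
        return [v for v in self._log if v.key == key]

    def get(self, key: str) -> Optional[Any]:
        # Query for the newest version; a tombstone reads as "not found".
        versions = self.history(key)
        if not versions:
            return None
        newest = max(versions, key=lambda v: v.seq)
        return None if newest.deleted else newest.value


store = AppendOnlyStore()
store.put("user:1", {"name": "Ada"})
store.put("user:1", {"name": "Ada Lovelace"})
store.delete("user:1")
```

After these three writes, `store.get("user:1")` returns `None`, while `store.history("user:1")` still holds both earlier versions plus the tombstone. A real system would need a policy for compacting old versions and for ordering writes made on different nodes; both are left out of this sketch.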
36. 2014-06-12
{Strategy 2: ‘Minimize Contention’}
‣ Break out one-to-many and many-to-many relationships using foreign keys and links.
‣ Normalize!
‣ Learn your indexing options!
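As one way to picture the ‘Minimize Contention’ strategy, consider a blog post and its comments. Embedding the comments in the post document would make every commenter update the same document. The sketch below instead stores each comment as its own document that links back to the post through a `post_id` field, and uses a secondary index to find them. The `DocumentStore` class, the field names, and the blog scenario are hypothetical and stand in for whatever store and indexing mechanism you actually use.

```python
import uuid
from collections import defaultdict
from typing import Dict, List, Optional


class DocumentStore:
    """Hypothetical in-memory document store with one secondary index."""

    def __init__(self) -> None:
        self._docs: Dict[str, dict] = {}
        # Secondary index: post_id -> ids of the comment documents.
        self._by_post: Dict[str, List[str]] = defaultdict(list)

    def insert(self, doc: dict) -> str:
        # Insert-only: an existing document is never modified.
        doc_id = doc.get("_id") or uuid.uuid4().hex
        if doc_id in self._docs:
            raise KeyError(f"document {doc_id} already exists")
        self._docs[doc_id] = dict(doc, _id=doc_id)
        if "post_id" in doc:
            self._by_post[doc["post_id"]].append(doc_id)
        return doc_id

    def get(self, doc_id: str) -> Optional[dict]:
        return self._docs.get(doc_id)

    def comments_for(self, post_id: str) -> List[dict]:
        # Follow the foreign key through the index instead of reading
        # an embedded array on the post document.
        return [self._docs[i] for i in self._by_post.get(post_id, [])]


blog = DocumentStore()
blog.insert({"_id": "post-1", "type": "post", "title": "Immutable data"})
# Two writers add comments without touching the post document.
blog.insert({"type": "comment", "post_id": "post-1", "author": "alice"})
blog.insert({"type": "comment", "post_id": "post-1", "author": "bob"})
```

Because each comment is a separate insert, the two writers never contend for the same document, and the post itself is written exactly once. The cost is an extra indexed lookup at read time, which is why knowing the indexing options of your store matters.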
37. 2014-06-12
{Strategy 3: ‘Think Commutative’}
‣ Store “deltas”, just like your checkbook
‣ Compute the account value via a materialized view
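The checkbook idea can be illustrated with a small Python sketch. Each transaction is stored as an immutable delta, and the account value is derived by summing them. Because addition is commutative, the order in which deltas arrive does not change the result. The `Delta` record, the `balance` function, and the `BalanceView` class are hypothetical names chosen for this example, and `BalanceView` is only a toy stand-in for a database-maintained materialized view.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, Iterable


@dataclass(frozen=True)
class Delta:
    """One immutable ledger entry (hypothetical layout)."""
    account: str
    amount: Decimal   # positive = deposit, negative = withdrawal


def balance(deltas: Iterable[Delta], account: str) -> Decimal:
    # Recompute the account value from scratch by summing its deltas.
    return sum((d.amount for d in deltas if d.account == account), Decimal("0"))


class BalanceView:
    """Toy materialized view: running totals updated as deltas arrive."""

    def __init__(self) -> None:
        self._totals: Dict[str, Decimal] = {}

    def apply(self, delta: Delta) -> None:
        current = self._totals.get(delta.account, Decimal("0"))
        self._totals[delta.account] = current + delta.amount

    def value(self, account: str) -> Decimal:
        return self._totals.get(account, Decimal("0"))


ledger = [
    Delta("checking", Decimal("100.00")),
    Delta("checking", Decimal("-30.25")),
    Delta("checking", Decimal("12.50")),
]

# Apply the same deltas in reverse order: the view ends at the same value.
view = BalanceView()
for delta in reversed(ledger):
    view.apply(delta)

assert balance(ledger, "checking") == view.value("checking")
```

Note that addition is commutative but not idempotent: applying the same delta twice would double-count it, so a real system would also need a way to deduplicate replayed writes. That is outside the scope of this sketch.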
39. 2014-06-12
Future Work
‣ Additional explicit data modeling examples
‣ Advanced reasoning for “AP” systems
‣ CRDTs
‣ Secondary index consistency, maintenance (RAMP)
‣ “New” transactional systems (HAT, Google Spanner)
‣ “Call me maybe”:
• (http://aphyr.com/posts/281-call-me-maybe-carly-rae-jepsen-and-the-perils-of-network-partitions)