ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
Prochain SlideShare
Loading in...5
×

Vous aimez ? Partagez donc ce contenu avec votre réseau

Partager

ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra

  • 5,950 vues
Uploaded on

 

  • Full Name Full Name Comment goes here.
    Êtes-vous sûr de vouloir
    Votre message apparaîtra ici
    Be the first to comment
No Downloads

Vues

Total des vues
5,950
Sur Slideshare
5,846
From Embeds
104
Nombre d'ajouts
2

Actions

Partages
Téléchargements
65
Commentaires
0
J'aime
8

Ajouts 104

https://twitter.com 103
https://si0.twimg.com 1

Signaler un contenu

Signalé comme inapproprié Signaler comme inapproprié
Signaler comme inapproprié

Indiquez la raison pour laquelle vous avez signalé cette présentation comme n'étant pas appropriée.

Annuler
    No notes for slide

Transcript

  • 1. Real-Time Big Data inpractice with CassandraMichaël Figuière@mfiguiere
  • 2. Speaker Michaël Figuière @mfiguiere©2012 DataStax 2
  • 3. Ring Architecture Node Node Node Cassandra Node Node Node©2012 DataStax 3
  • 4. Ring Architecture Node Replica Node Replica Node Replica©2012 DataStax 4
  • 5. Linear Scalability Client Writes/s by Node Count - Replication Factor = 3©2012 DataStax 5
  • 6. Client / Server Communication Client ? Node Replica Client Node Replica Client Node Client Replica©2012 DataStax 6
  • 7. Client / Server Communication Client Node Replica Client Node Replica Client Node Client Replica Coordinator node: Forwards all R/W requests to corresponding replicas©2012 DataStax 7
  • 8. Tunable Consistency Time A A A 3 replicas©2012 DataStax 8
  • 9. Tunable Consistency Time A A A B A A Write ‘B’ Write and wait for acknowledge from one node©2012 DataStax 9
  • 10. Tunable Consistency Time R +W < N A A A B A A B A A Read waiting for one node Write and wait for to answer acknowledge from one node©2012 DataStax 10
  • 11. Tunable Consistency Time R +W = N A A A B B A B B A Read waiting for one node Write and wait for to answer acknowledges from two nodes©2012 DataStax 11
  • 12. Tunable Consistency Time R +W > N A A A B B A B B A Read waiting for two nodes Write and wait for to answer acknowledges from two nodes©2012 DataStax 12
  • 13. Tunable Consistency Time R = W = QUORUM A A A B B A B B A QUORUM = (N / 2) + 1©2012 DataStax 13
  • 14. Request Path 1 Client Node Replica 2 3 Client 4 2 Node Replica 3 Client 2 3 Node Client Replica Coordinator node©2012 DataStax 14
  • 15. Column Family Data Model name email address state jbellis Jonathan jb@ds.com 123 main TX name email address state dhutch Daria dh@ds.com 45 2nd st CA name email egilmore Eric eg@ds.com Row Key Columns©2012 DataStax 15
  • 16. Column Family Data Model dhutch egilmore datastax mzcassie jbellis egilmore dhutch datastax mzcassie egilmore Row Key Columns©2012 DataStax 16
  • 17. CQL3 Data Model Timeline Table user_id tweet_id author body gmason 1765 phenry Give me liberty or give me death gmason 1742 gwashington I chopped down the cherry tree ahamilton 1797 jadams A government of laws, not men ahamilton 1742 gwashington I chopped down the cherry tree Partition Remaining Key Key©2012 DataStax 17
  • 18. CQL3 Data Model Timeline Table user_id tweet_id author body gmason 1765 phenry Give me liberty or give me death gmason 1742 gwashington I chopped down the cherry tree ahamilton 1797 jadams A government of laws, not men ahamilton 1742 gwashington I chopped down the cherry tree CQL CREATE TABLE timeline ( user_id varchar, tweet_id uuid, author varchar, body varchar, PRIMARY KEY (user_id, tweet_id));©2012 DataStax 18
  • 19. CQL3 Data Model Timeline Table user_id tweet_id author body gmason 1765 phenry Give me liberty or give me death gmason 1742 gwashington I chopped down the cherry tree ahamilton 1797 jadams A government of laws, not men ahamilton 1742 gwashington I chopped down the cherry treeTimeline Physical Layout [1742, author] [1742, body] [1765, author] [1765, body] gmason gwashington I chopped down the... phenry Give me liberty or give... [1742, author] [1742, body] [1797, author] [1797, body] ahamilton gwashington I chopped down the... jadams A government of laws...©2012 DataStax 19
  • 20. Denormalized Data Model Data duplicated over several tables©2012 DataStax 20
  • 21. Real-Time Analytics Google Analytics gives you immediate statistics about your website traffic©2012 DataStax 21
  • 22. Web Analytics Data Model Analytics Table url time views from_search direct from_referrer /index.html 12:00 354 300 20 34 /index.html 12:01 402 333 25 44 /contacts.html 12:00 23 3 0 20 /contacts.html 12:01 20 4 1 15 CQL CREATE TABLE analytics ( url varchar, time timestamp, views counter, from_search counter, direct counter, from_referrer counter, PRIMARY KEY (url, time));©2012 DataStax 22
  • 23. Web Analytics Data Model Analytics Table url time views from_search direct from_referrer /index.html 12:00 354 300 20 34 /index.html 12:01 402 333 25 44 /contacts.html 12:00 23 3 0 20 /contacts.html 12:01 20 4 1 15 CQL UPDATE analytics SET views = views + 1, from_search = from_search + 1 WHERE url = /index.html AND time = 2012-10-06 12:00;©2012 DataStax 23
  • 24. Web Analytics Data Model Analytics Table url time views from_search direct from_referrer /index.html 12:00 354 300 20 34 /index.html 12:01 402 333 25 44 /contacts.html 12:00 23 3 0 20 /contacts.html 12:01 20 4 1 15 CQL SELECT * FROM analytics WHERE url = /index.html©2012 DataStax 24
  • 25. Online Business Intelligence Storage for application Distributed batch in production processing Application Cassandra Hadoop Using results in Storage for production results©2012 DataStax 25
  • 26. Stay Tuned! blog.datastax.com @mfiguiere