Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.

Cassandra NYC 2011 Data Modeling

13 367 vues

Publié le

Publié dans : Technologie, Économie & finance
  • Hi Matthew,

    I'm looking for the best way to model my new problem:
    I have many systems which send to the DB a set of metrics (ex: system1 sends every 10 secondes, temp1 - temp2 - acc1 - vel1 - ...)

    When I see your 9th slide, I wonder if it's the best way to store the values, as the systems don't have the same metrics, and maybe the metrics can changes over the time (ex: tomorrow, the system1 won't have the temp2, but have acc2 and acc3).

    What would be the best way to do?

    Philippe (twitter.com/philmod)
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici

Cassandra NYC 2011 Data Modeling

  1. 1. Data Modeling ExamplesMatthew F. Dennis // @mdennis
  2. 2. Overview● general guiding goals for Cassandra data models● Interesting and/or common examples/questions to get us started● Should be plenty of time at the end for questions, so bring them up if you have them !
  3. 3. Data Modeling Goals● Keep data queried together on disk together● In a more general sense think about the efficiency of querying your data and work backward from there to a model in Cassandra● Usually, you shouldnt try to normalize your data (contrary to many use cases in relational databases)● Usually better to keep a record that something happened as opposed to changing a value (not always the best approach though)
  4. 4. Time Series Data● Easily the most common use of Cassandra ● Financial tick data ● Click streams ● Sensor data ● Performance metrics ● GPS data ● Event logs ● etc, etc, etc ...● All of the above are essentially the same as far as C* is concerned
  5. 5. Time Series Thought Model● Things happen in some timestamp ordered stream and consist of values associated with the given timestamp (i.e. “data points”) – Every 30 seconds record location, speed, heading and engine temp – Every 5 minutes record CPU, IO and Memory usage● We are interested in recreating, aggregating and/or analyzing arbitrary time slices of the stream – Where was agent:007 and what was he doing between 11:21am and 2:38pm yesterday? – What are the last N actions foo did on my site?
  6. 6. Data Points Defined● Each data point has 1-N values● Each data point corresponds to a specific point in time or an interval/bucket (e.g. 5 th minute of 17th hour on some date)
  7. 7. Data Points Mapped to Cassandra● Row Key is id of the data point stream bucketed by time – e.g. plane01:jan_2011 or plane01:jan_01_2011 for month or day buckets respectively● Column Name is TimeUUID(timestamp of date point)● Column Value is serialized data point – JSON, XML, pickle, msgpack, thrift, protobuf, avro, BSON, WTFe● Bucketing – Avoids always requiring multiple seeks when only small slices of the stream are requested (e.g. stream is 5 years old but Im on only interested in Jan 5 th 3 years ago and/or yesterday between 2pm and 3pm). – Make it easy to lazily aggregate old stream activity – Reduces compaction overhead since old rows will never have to be merged again (until you “back fill” and/or delete something)
  8. 8. A Slightly More Concrete Example● Sensor data from airplanes● Every 30 seconds each plane sends latitude+longitude, altitude and wine remaining in mdennis glass.
  9. 9. The Visual plane5:jan_2011 TimeUUID0 TimeUUID1 TimeUUID2 p5:j11 28.90, 124.30 45K feet 28.85, 124.25 44K feet 28.81, 124.22 44K feet 70% 50% 95% Middle of the ocean and half a glass of wine at 44K feet● Row Key is the id of stream being recorded (e.g. plane5:jan_2011)● Column Name is timestamp (or TimeUUID) associated with the data point● Column Value is the value of the event (e.g. protobuf serialized lat/long+alt+wine_level)
  10. 10. Querying● When querying, construct TimeUUIDs for the min/max of the time range in question and use them as the start/end in your get_slice call● Or use a empty start and/or end along with a count
  11. 11. Bucket Sizes?● Depends greatly on ● Average size of time slice queried ● Average data point size ● Write rate of data points to a stream ● IO capacity of the nodes
  12. 12. So... Bucket Sizes?● No Bigger than a few GB per row ● bucket_size * write_rate * sizeof(avg_data_point)● Bucket size >= average size of time slice queried● No more than maybe 10M entries per row● No more than a month if you have lots of different streams● NB: there are exceptions to all of the above, which are really nothing more than guidelines
  13. 13. Ordering● In cases where the most recent data is the most interesting (e.g. last N events for entity foo or last hour of events for entity bar), you can reverse the comparator (i.e. sort descending instead of ascending) ● http://thelastpickle.com/2011/10/03/Reverse-Comparators/ ● https://issues.apache.org/jira/browse/CASSANDRA-2355
  14. 14. Spanning Buckets● If your time slice spans buckets, youll need to construct all the row keys in question (i.e. number of unique row keys = spans+1)● If you want all the results between the dates, pass all the row keys to multiget_slice with the start and end of the desired time slice● If you only want the first N results within your time slice, lowest latency comes from multiget_slice as above but best efficiency comes from serially paging one row key at a time until your desired count is reached
  15. 15. Expiring Streams (e.g. “I only care about the past year”)● Just set the TTL to the age you want to keep● yeah, thats pretty much it ...
  16. 16. Counters● Sometimes youre only interested in counting things that happened within some time slice● Minor adaptation to the previous content to use counters (be aware they are not idempotent) ● Column names become buckets ● Values become counters
  17. 17. Example: Counting User Logins user3:system5:logins:by_day 20110107 ... 20110523 U3:S5:L:D 2 ... 7 2 logins on Jan 7th 2011 7 logins on May 23rd 2011 for user 3 on system 5 for user 3 on system 5 user3:system5:logins:by_hour 2011010710 ... 2011052316 U3:S5:L:H 1 ... 7one login for user 3 on system 5 2 logins for user 3 on system 5on Jan 7th 2011 for the 10th hour on May 23rd 2011 for the 16th hour
  18. 18. Eventually Atomic● In a legacy RDBMS atomicity is “easy”● Attempting full ACID compliance in distributed systems is a bad idea (and actually impossible in the strictest sense)● However, consistency is important and can certainly be achieved in C*● Many approaches / alternatives● I like a transaction log approach, especially in the context of C*
  19. 19. Transaction Logs (in this context)● Records what is going to be performed before it is actually performed● Performs the actions that need to be atomic (in the indivisible sense, not the all at once sense which is usually what people mean when they say isolation)● Marks that the actions were performed
  20. 20. In Cassandra● Serialize all actions that need to be performed in a single column – JSON, XML, YAML (yuck!), pickle, JSO, msgpack, protobuf, et cetera ● Row Key = randomly chosen C* node token ● Column Name = TimeUUID(nowish)● Perform actions● Delete Column
  21. 21. Configuration Details● Short gc_grace_seconds on the XACT_LOG Column Family (e.g. 5 minutes)● Write to XACT_LOG at CL.QUORUM or CL.LOCAL_QUORUM for durability ● if it fails with an unavailable exception, pick a different node token and/or node and try again (gives same semantics as a relational DB in terms of knowing the state of your transaction)
  22. 22. Failures● Before insert into the XACT_LOG● After insert, before actions● After insert, in middle of actions● After insert, after actions, before delete● After insert, after actions, after delete
  23. 23. Recovery● Each C* has a crond job offset from every other by some time period● Each job runs the same code: multiget_slice for all node tokens for all columns older than some time period (the “recovery period”)● Any columns need to be replayed in their entirety and are deleted after replay (normally there are no columns because normally things are working)
  24. 24. XACT_LOG Comments● Idempotent writes are awesome (thats why this works so well)● Doesnt work so well for counters (theyre not idempotent)● Clients must be able to deal with temporarily inconsistent data (they have to do this anyway)
  25. 25. Q?Cassandra Data Modeling Examples Matthew F. Dennis // @mdennis