SlideShare une entreprise Scribd logo
1  sur  36
Cozy with Cassandra Getting to know the Cassandra Codebase Gary Dusbabek • Rackspace @gdusbabek Cassandra Summit • Mission Bay Conference Center • San Francisco • 10 August 2010
Outline Code Themes Startup Sequence Key Classes Read Path Write Path Stages & Threading Bootstrap & Streaming Tests & IDE considerations Adding API methods Questions
Themes and Patterns Layers
Themes and Patterns Services
Themes and Patterns Singletons  and  statics
Themes and Patterns Stages & Thread pools
StartupProcess
CassandraDaemon Loads configuration Transport initialization Storage (Keyspace initialization) CommitLogrecovery StorageService.initServer() Initializes CassandraServer Passes it off to transport
CassandraServer Implements IDL interface methods (cassandra.thrift, cassandra.genavro) Good place to start diving when adding or troubleshooting API methods
Configuration DatabaseDescriptor Via CassandraDaemon.setup() Looks for config path, loads yaml Doesn’t spin anything up Defines system tables KS and CF described by CFMetaData and KSMetaData
Code Some Classes
Main Controllers End with *Service or *Manager StorageService, MessagingService CompactionManager, HintedHandoffManager, StageManager, StreamInManager
StorageProxy Put & Get methods Collection of static methods Merges local and distributed operations Tracks latency Exposed via StorageProxyMBean
StorageService initServer()—Starts services Registers verb handlers (in MessagingService) Main event responders Repository of replication strategies and TokenMetadata Ring topology & token information
MessagingService Verb handlers reside here Sets up socket listeners Gateway for outbound messages MS.sendRR() MS.sendOneWay() Inbound too MS.receive()
Table & ColumnFamilyStore Also RowMutation Low-level storage operations o.a.c.db.* SSTable Local operations
Read+Write Paths
Reading	 Socket->CassandraServer Permissions Request validation Marshalling
Reading	 StorageProxy Ranges Collectors Local & remote branches
Reading	 StorageProxy local Table, ColumnFamilyStore CFS Make QueryFilter Query Memtables Query SSTables Coalesce in iterators o.a.c.db package o.a.c.db.filter
Reading	 StorageProxy remote read command Response handler Send to remote nodes
Writing Socket->CassandraServer Validation Convert to Mutation (IDL object) Penalties!
Writing StorageProxy blocking/non-blocking mutate  local/remote branch RowMutation one ColumnFamily per ColumnFamily Collection of column modifications
Writing RM.apply->Table.apply Write to CL Iterate over RM CFs CFS.apply() Overwrites results on pre-existing column families
Writing RM is serialized into a Message and sent to other nodes Waits for ACKs depending on CL
Stages & Threading
Stages SEDA http://www.eecs.harvard.edu/~mdw/papers/seda-sosp01.pdf o.a.c.concurrent.StageManager Read, mutation, stream, gossip, response, anti entropy, load balance, migration Thread pools that consume tasks
Threads MessagingService.listen() spawns thread. Each incoming connection spawns a new short-lived thread (IncomingTcpConnection) Non-stream ops go to MS.messageDeserializerExecutor_ Stream ops handled there. Anti-entropy repair
Bootstraping! Any interest? //FIXMETODOCRAP
<=0.6 Bootstraping A wants data, B has data. StreamingRequestMessage A->B Handled on B by StreamRequestVerbHandler For each range StreamOut.transferRanges() Flush, anticompaction StreamInitiateMessage B->A for each range transfer Meanwhile, back on A… StreamInitiateVerbHandler gets the SIM from B, does some nesting. StreamInitiateDone A->B Back to B… StreamInitiateDoneHandler gets the SID from A Calls StreamOutManager.startNext() which sends a single file to A MessagingService on A picks this up and the file is streamed. Sstable is created STREAM_FINISHED A->B B gets rid of the file, calls SOM.startNext()
0.7 Bootstrapping A wants data, B has data StreamRequestMessage A->B On B, StreamRequestVerbHandler If single file, sends it. If range, StreamOut.transferRangesForRequest() Send next file (first will contain meta data  about all files) On A, IncomingStreamReader.read()	 Data is received, sstable created Ack, request next file
Tests Testable & Untestable Unit tests ant clean build test System tests ant gen-thrift-py nosetests test/system/test_thrift_server.py
IDE Configuration file must be in the classpath Treat as source lib vs build/lib Log at debug
IDE -ea -Xms128M –Xmx2G  -Dcom.sun.management.jmxremote.port=8081  -Dcom.sun.management.jmxremote.ssl=false  -Dcom.sun.management.jmxremote.authenticate=false -Dcassandra-foreground=yes  -Dlog4j.configuration=log4j-server.properties -Dmx4jport=9081
Adding API methods Same goes for modifying Define method and structures in IDL interface/cassandra.thrift Regenerate files ant gen-thrift-java gen-thrift-py Implement methods in o.a.c.thrift.CassandraServer Create a system test (tests/system/test_thrift_server.py)
Questions? gdusbabek@gmail.com @gdusbabek

Contenu connexe

Tendances

Gude for C++11 in Apache Traffic Server
Gude for C++11 in Apache Traffic ServerGude for C++11 in Apache Traffic Server
Gude for C++11 in Apache Traffic ServerApache Traffic Server
 
Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.X
Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.XCassandra SF Meetup - CQL Performance With Apache Cassandra 3.X
Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.Xaaronmorton
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with PrometheusShiao-An Yuan
 
Apache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
Apache Kafka DC Meetup: Replicating DB Binary Logs to KafkaApache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
Apache Kafka DC Meetup: Replicating DB Binary Logs to KafkaMark Bittmann
 
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...Reactivesummit
 
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & KafkaBack-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & KafkaAkara Sucharitakul
 
Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Xavier Lucas
 
Specs2 whirlwind tour at Scaladays 2014
Specs2 whirlwind tour at Scaladays 2014Specs2 whirlwind tour at Scaladays 2014
Specs2 whirlwind tour at Scaladays 2014Eric Torreborre
 
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, KibanaLogging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, KibanaMd Safiyat Reza
 
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...Continuent
 
Introduction to Akka-Streams
Introduction to Akka-StreamsIntroduction to Akka-Streams
Introduction to Akka-Streamsdmantula
 
Testing Kafka components with Kafka for JUnit
Testing Kafka components with Kafka for JUnitTesting Kafka components with Kafka for JUnit
Testing Kafka components with Kafka for JUnitMarkus Günther
 
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 PeopleKafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 Peopleconfluent
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaJoe Stein
 
Reactive Streams / Akka Streams - GeeCON Prague 2014
Reactive Streams / Akka Streams - GeeCON Prague 2014Reactive Streams / Akka Streams - GeeCON Prague 2014
Reactive Streams / Akka Streams - GeeCON Prague 2014Konrad Malawski
 
Deterministic behaviour and performance in trading systems
Deterministic behaviour and performance in trading systemsDeterministic behaviour and performance in trading systems
Deterministic behaviour and performance in trading systemsPeter Lawrey
 
Building Scalable Stateless Applications with RxJava
Building Scalable Stateless Applications with RxJavaBuilding Scalable Stateless Applications with RxJava
Building Scalable Stateless Applications with RxJavaRick Warren
 
Determinism in finance
Determinism in financeDeterminism in finance
Determinism in financePeter Lawrey
 

Tendances (20)

Gude for C++11 in Apache Traffic Server
Gude for C++11 in Apache Traffic ServerGude for C++11 in Apache Traffic Server
Gude for C++11 in Apache Traffic Server
 
Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.X
Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.XCassandra SF Meetup - CQL Performance With Apache Cassandra 3.X
Cassandra SF Meetup - CQL Performance With Apache Cassandra 3.X
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with Prometheus
 
Apache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
Apache Kafka DC Meetup: Replicating DB Binary Logs to KafkaApache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
Apache Kafka DC Meetup: Replicating DB Binary Logs to Kafka
 
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Ka...
 
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & KafkaBack-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka
Back-Pressure in Action: Handling High-Burst Workloads with Akka Streams & Kafka
 
Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28
 
Specs2 whirlwind tour at Scaladays 2014
Specs2 whirlwind tour at Scaladays 2014Specs2 whirlwind tour at Scaladays 2014
Specs2 whirlwind tour at Scaladays 2014
 
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, KibanaLogging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
Logging for OpenStack - Elasticsearch, Fluentd, Logstash, Kibana
 
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
Training Slides: Intermediate 205: Configuring Tungsten Replicator to Extract...
 
Introduction to Akka-Streams
Introduction to Akka-StreamsIntroduction to Akka-Streams
Introduction to Akka-Streams
 
Testing Kafka components with Kafka for JUnit
Testing Kafka components with Kafka for JUnitTesting Kafka components with Kafka for JUnit
Testing Kafka components with Kafka for JUnit
 
NodeJs
NodeJsNodeJs
NodeJs
 
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 PeopleKafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
Kafka Summit NYC 2017 - Running Hundreds of Kafka Clusters with 5 People
 
Real-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache KafkaReal-time streaming and data pipelines with Apache Kafka
Real-time streaming and data pipelines with Apache Kafka
 
Reactive Streams / Akka Streams - GeeCON Prague 2014
Reactive Streams / Akka Streams - GeeCON Prague 2014Reactive Streams / Akka Streams - GeeCON Prague 2014
Reactive Streams / Akka Streams - GeeCON Prague 2014
 
Mario on spark
Mario on sparkMario on spark
Mario on spark
 
Deterministic behaviour and performance in trading systems
Deterministic behaviour and performance in trading systemsDeterministic behaviour and performance in trading systems
Deterministic behaviour and performance in trading systems
 
Building Scalable Stateless Applications with RxJava
Building Scalable Stateless Applications with RxJavaBuilding Scalable Stateless Applications with RxJava
Building Scalable Stateless Applications with RxJava
 
Determinism in finance
Determinism in financeDeterminism in finance
Determinism in finance
 

Similaire à Getting to Know the Cassandra Codebase

Apache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra InternalsApache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra Internalsaaronmorton
 
Service messaging using Kafka
Service messaging using KafkaService messaging using Kafka
Service messaging using KafkaRobert Vadai
 
Apache Cassandra in Bangalore - Cassandra Internals and Performance
Apache Cassandra in Bangalore - Cassandra Internals and PerformanceApache Cassandra in Bangalore - Cassandra Internals and Performance
Apache Cassandra in Bangalore - Cassandra Internals and Performanceaaronmorton
 
Spinnaker VLDB 2011
Spinnaker VLDB 2011Spinnaker VLDB 2011
Spinnaker VLDB 2011sandeep_tata
 
Introduction to apache_cassandra_for_developers-lhg
Introduction to apache_cassandra_for_developers-lhgIntroduction to apache_cassandra_for_developers-lhg
Introduction to apache_cassandra_for_developers-lhgzznate
 
weblogic perfomence tuning
weblogic perfomence tuningweblogic perfomence tuning
weblogic perfomence tuningprathap kumar
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparisonshsedghi
 
Learning spark ch10 - Spark Streaming
Learning spark ch10 - Spark StreamingLearning spark ch10 - Spark Streaming
Learning spark ch10 - Spark Streamingphanleson
 
Recipes for Running Spark Streaming Applications in Production-(Tathagata Das...
Recipes for Running Spark Streaming Applications in Production-(Tathagata Das...Recipes for Running Spark Streaming Applications in Production-(Tathagata Das...
Recipes for Running Spark Streaming Applications in Production-(Tathagata Das...Spark Summit
 
Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Databricks
 
Non-blocking I/O, Event loops and node.js
Non-blocking I/O, Event loops and node.jsNon-blocking I/O, Event loops and node.js
Non-blocking I/O, Event loops and node.jsMarcus Frödin
 
Web program-peformance-optimization
Web program-peformance-optimizationWeb program-peformance-optimization
Web program-peformance-optimizationxiaojueqq12345
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Building a High-Performance Database with Scala, Akka, and Spark
Building a High-Performance Database with Scala, Akka, and SparkBuilding a High-Performance Database with Scala, Akka, and Spark
Building a High-Performance Database with Scala, Akka, and SparkEvan Chan
 
Introduction to JDBC and database access in web applications
Introduction to JDBC and database access in web applicationsIntroduction to JDBC and database access in web applications
Introduction to JDBC and database access in web applicationsFulvio Corno
 

Similaire à Getting to Know the Cassandra Codebase (20)

Apache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra InternalsApache Con NA 2013 - Cassandra Internals
Apache Con NA 2013 - Cassandra Internals
 
NoSql Database
NoSql DatabaseNoSql Database
NoSql Database
 
Service messaging using Kafka
Service messaging using KafkaService messaging using Kafka
Service messaging using Kafka
 
Apache Cassandra in Bangalore - Cassandra Internals and Performance
Apache Cassandra in Bangalore - Cassandra Internals and PerformanceApache Cassandra in Bangalore - Cassandra Internals and Performance
Apache Cassandra in Bangalore - Cassandra Internals and Performance
 
Spinnaker VLDB 2011
Spinnaker VLDB 2011Spinnaker VLDB 2011
Spinnaker VLDB 2011
 
Introduction to apache_cassandra_for_developers-lhg
Introduction to apache_cassandra_for_developers-lhgIntroduction to apache_cassandra_for_developers-lhg
Introduction to apache_cassandra_for_developers-lhg
 
weblogic perfomence tuning
weblogic perfomence tuningweblogic perfomence tuning
weblogic perfomence tuning
 
No sql
No sqlNo sql
No sql
 
Cassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A ComparisonCassandra Java APIs Old and New – A Comparison
Cassandra Java APIs Old and New – A Comparison
 
Learning spark ch10 - Spark Streaming
Learning spark ch10 - Spark StreamingLearning spark ch10 - Spark Streaming
Learning spark ch10 - Spark Streaming
 
Recipes for Running Spark Streaming Applications in Production-(Tathagata Das...
Recipes for Running Spark Streaming Applications in Production-(Tathagata Das...Recipes for Running Spark Streaming Applications in Production-(Tathagata Das...
Recipes for Running Spark Streaming Applications in Production-(Tathagata Das...
 
Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...Building Continuous Application with Structured Streaming and Real-Time Data ...
Building Continuous Application with Structured Streaming and Real-Time Data ...
 
Jdbc ppt
Jdbc pptJdbc ppt
Jdbc ppt
 
Non-blocking I/O, Event loops and node.js
Non-blocking I/O, Event loops and node.jsNon-blocking I/O, Event loops and node.js
Non-blocking I/O, Event loops and node.js
 
Web program-peformance-optimization
Web program-peformance-optimizationWeb program-peformance-optimization
Web program-peformance-optimization
 
Jdbc[1]
Jdbc[1]Jdbc[1]
Jdbc[1]
 
JDBC programming
JDBC programmingJDBC programming
JDBC programming
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Building a High-Performance Database with Scala, Akka, and Spark
Building a High-Performance Database with Scala, Akka, and SparkBuilding a High-Performance Database with Scala, Akka, and Spark
Building a High-Performance Database with Scala, Akka, and Spark
 
Introduction to JDBC and database access in web applications
Introduction to JDBC and database access in web applicationsIntroduction to JDBC and database access in web applications
Introduction to JDBC and database access in web applications
 

Plus de gdusbabek

My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015gdusbabek
 
How To (Not) Open Source - Javazone, Oslo 2014
How To (Not) Open Source - Javazone, Oslo 2014How To (Not) Open Source - Javazone, Oslo 2014
How To (Not) Open Source - Javazone, Oslo 2014gdusbabek
 
Blueflood and Beyond: The Future of Metrics - Berlin Buzzwords 2014
Blueflood and Beyond: The Future of Metrics - Berlin Buzzwords 2014Blueflood and Beyond: The Future of Metrics - Berlin Buzzwords 2014
Blueflood and Beyond: The Future of Metrics - Berlin Buzzwords 2014gdusbabek
 
Measure All the Things! - Austin Data Day 2014
Measure All the Things! - Austin Data Day 2014Measure All the Things! - Austin Data Day 2014
Measure All the Things! - Austin Data Day 2014gdusbabek
 
Blueflood: Open Source Metrics Processing at CassandraEU 2013
Blueflood: Open Source Metrics Processing at CassandraEU 2013Blueflood: Open Source Metrics Processing at CassandraEU 2013
Blueflood: Open Source Metrics Processing at CassandraEU 2013gdusbabek
 
Introduction to Blueflood at Berlin Buzzwords 2013
Introduction to Blueflood at Berlin Buzzwords 2013Introduction to Blueflood at Berlin Buzzwords 2013
Introduction to Blueflood at Berlin Buzzwords 2013gdusbabek
 
Rackspace Cloud Monitoring - Strata NYC
Rackspace Cloud Monitoring - Strata NYCRackspace Cloud Monitoring - Strata NYC
Rackspace Cloud Monitoring - Strata NYCgdusbabek
 
Austin cassandra meetup
Austin cassandra meetupAustin cassandra meetup
Austin cassandra meetupgdusbabek
 
How Rackspace Cloud Monitoring uses Cassandra
How Rackspace Cloud Monitoring uses CassandraHow Rackspace Cloud Monitoring uses Cassandra
How Rackspace Cloud Monitoring uses Cassandragdusbabek
 
Breaking the Relational Headlock: A Survey of NoSQL Datastores
Breaking the Relational Headlock: A Survey of NoSQL DatastoresBreaking the Relational Headlock: A Survey of NoSQL Datastores
Breaking the Relational Headlock: A Survey of NoSQL Datastoresgdusbabek
 
Building Rackspace Cloud Monitoring
Building Rackspace Cloud MonitoringBuilding Rackspace Cloud Monitoring
Building Rackspace Cloud Monitoringgdusbabek
 
Data Modeling with Cassandra Column Families
Data Modeling with Cassandra Column FamiliesData Modeling with Cassandra Column Families
Data Modeling with Cassandra Column Familiesgdusbabek
 
Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)gdusbabek
 
Cassandra Presentation for San Antonio JUG
Cassandra Presentation for San Antonio JUGCassandra Presentation for San Antonio JUG
Cassandra Presentation for San Antonio JUGgdusbabek
 

Plus de gdusbabek (14)

My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
 
How To (Not) Open Source - Javazone, Oslo 2014
How To (Not) Open Source - Javazone, Oslo 2014How To (Not) Open Source - Javazone, Oslo 2014
How To (Not) Open Source - Javazone, Oslo 2014
 
Blueflood and Beyond: The Future of Metrics - Berlin Buzzwords 2014
Blueflood and Beyond: The Future of Metrics - Berlin Buzzwords 2014Blueflood and Beyond: The Future of Metrics - Berlin Buzzwords 2014
Blueflood and Beyond: The Future of Metrics - Berlin Buzzwords 2014
 
Measure All the Things! - Austin Data Day 2014
Measure All the Things! - Austin Data Day 2014Measure All the Things! - Austin Data Day 2014
Measure All the Things! - Austin Data Day 2014
 
Blueflood: Open Source Metrics Processing at CassandraEU 2013
Blueflood: Open Source Metrics Processing at CassandraEU 2013Blueflood: Open Source Metrics Processing at CassandraEU 2013
Blueflood: Open Source Metrics Processing at CassandraEU 2013
 
Introduction to Blueflood at Berlin Buzzwords 2013
Introduction to Blueflood at Berlin Buzzwords 2013Introduction to Blueflood at Berlin Buzzwords 2013
Introduction to Blueflood at Berlin Buzzwords 2013
 
Rackspace Cloud Monitoring - Strata NYC
Rackspace Cloud Monitoring - Strata NYCRackspace Cloud Monitoring - Strata NYC
Rackspace Cloud Monitoring - Strata NYC
 
Austin cassandra meetup
Austin cassandra meetupAustin cassandra meetup
Austin cassandra meetup
 
How Rackspace Cloud Monitoring uses Cassandra
How Rackspace Cloud Monitoring uses CassandraHow Rackspace Cloud Monitoring uses Cassandra
How Rackspace Cloud Monitoring uses Cassandra
 
Breaking the Relational Headlock: A Survey of NoSQL Datastores
Breaking the Relational Headlock: A Survey of NoSQL DatastoresBreaking the Relational Headlock: A Survey of NoSQL Datastores
Breaking the Relational Headlock: A Survey of NoSQL Datastores
 
Building Rackspace Cloud Monitoring
Building Rackspace Cloud MonitoringBuilding Rackspace Cloud Monitoring
Building Rackspace Cloud Monitoring
 
Data Modeling with Cassandra Column Families
Data Modeling with Cassandra Column FamiliesData Modeling with Cassandra Column Families
Data Modeling with Cassandra Column Families
 
Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)Introduction to Cassandra (June 2010)
Introduction to Cassandra (June 2010)
 
Cassandra Presentation for San Antonio JUG
Cassandra Presentation for San Antonio JUGCassandra Presentation for San Antonio JUG
Cassandra Presentation for San Antonio JUG
 

Dernier

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 

Dernier (20)

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 

Getting to Know the Cassandra Codebase

  • 1. Cozy with Cassandra Getting to know the Cassandra Codebase Gary Dusbabek • Rackspace @gdusbabek Cassandra Summit • Mission Bay Conference Center • San Francisco • 10 August 2010
  • 2. Outline Code Themes Startup Sequence Key Classes Read Path Write Path Stages & Threading Bootstrap & Streaming Tests & IDE considerations Adding API methods Questions
  • 5. Themes and Patterns Singletons and statics
  • 6. Themes and Patterns Stages & Thread pools
  • 8. CassandraDaemon Loads configuration Transport initialization Storage (Keyspace initialization) CommitLogrecovery StorageService.initServer() Initializes CassandraServer Passes it off to transport
  • 9. CassandraServer Implements IDL interface methods (cassandra.thrift, cassandra.genavro) Good place to start diving when adding or troubleshooting API methods
  • 10. Configuration DatabaseDescriptor Via CassandraDaemon.setup() Looks for config path, loads yaml Doesn’t spin anything up Defines system tables KS and CF described by CFMetaData and KSMetaData
  • 12. Main Controllers End with *Service or *Manager StorageService, MessagingService CompactionManager, HintedHandoffManager, StageManager, StreamInManager
  • 13. StorageProxy Put & Get methods Collection of static methods Merges local and distributed operations Tracks latency Exposed via StorageProxyMBean
  • 14. StorageService initServer()—Starts services Registers verb handlers (in MessagingService) Main event responders Repository of replication strategies and TokenMetadata Ring topology & token information
  • 15. MessagingService Verb handlers reside here Sets up socket listeners Gateway for outbound messages MS.sendRR() MS.sendOneWay() Inbound too MS.receive()
  • 16. Table & ColumnFamilyStore Also RowMutation Low-level storage operations o.a.c.db.* SSTable Local operations
  • 18. Reading Socket->CassandraServer Permissions Request validation Marshalling
  • 19. Reading StorageProxy Ranges Collectors Local & remote branches
  • 20. Reading StorageProxy local Table, ColumnFamilyStore CFS Make QueryFilter Query Memtables Query SSTables Coalesce in iterators o.a.c.db package o.a.c.db.filter
  • 21. Reading StorageProxy remote read command Response handler Send to remote nodes
  • 22. Writing Socket->CassandraServer Validation Convert to Mutation (IDL object) Penalties!
  • 23. Writing StorageProxy blocking/non-blocking mutate local/remote branch RowMutation one ColumnFamily per ColumnFamily Collection of column modifications
  • 24. Writing RM.apply->Table.apply Write to CL Iterate over RM CFs CFS.apply() Overwrites results on pre-existing column families
  • 25. Writing RM is serialized into a Message and sent to other nodes Waits for ACKs depending on CL
  • 27. Stages SEDA http://www.eecs.harvard.edu/~mdw/papers/seda-sosp01.pdf o.a.c.concurrent.StageManager Read, mutation, stream, gossip, response, anti entropy, load balance, migration Thread pools that consume tasks
  • 28. Threads MessagingService.listen() spawns thread. Each incoming connection spawns a new short-lived thread (IncomingTcpConnection) Non-stream ops go to MS.messageDeserializerExecutor_ Stream ops handled there. Anti-entropy repair
  • 29. Bootstraping! Any interest? //FIXMETODOCRAP
  • 30. <=0.6 Bootstraping A wants data, B has data. StreamingRequestMessage A->B Handled on B by StreamRequestVerbHandler For each range StreamOut.transferRanges() Flush, anticompaction StreamInitiateMessage B->A for each range transfer Meanwhile, back on A… StreamInitiateVerbHandler gets the SIM from B, does some nesting. StreamInitiateDone A->B Back to B… StreamInitiateDoneHandler gets the SID from A Calls StreamOutManager.startNext() which sends a single file to A MessagingService on A picks this up and the file is streamed. Sstable is created STREAM_FINISHED A->B B gets rid of the file, calls SOM.startNext()
  • 31. 0.7 Bootstrapping A wants data, B has data StreamRequestMessage A->B On B, StreamRequestVerbHandler If single file, sends it. If range, StreamOut.transferRangesForRequest() Send next file (first will contain meta data about all files) On A, IncomingStreamReader.read() Data is received, sstable created Ack, request next file
  • 32. Tests Testable & Untestable Unit tests ant clean build test System tests ant gen-thrift-py nosetests test/system/test_thrift_server.py
  • 33. IDE Configuration file must be in the classpath Treat as source lib vs build/lib Log at debug
  • 34. IDE -ea -Xms128M –Xmx2G -Dcom.sun.management.jmxremote.port=8081 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcassandra-foreground=yes -Dlog4j.configuration=log4j-server.properties -Dmx4jport=9081
  • 35. Adding API methods Same goes for modifying Define method and structures in IDL interface/cassandra.thrift Regenerate files ant gen-thrift-java gen-thrift-py Implement methods in o.a.c.thrift.CassandraServer Create a system test (tests/system/test_thrift_server.py)

Notes de l'éditeur

  1. Talk about Cassandra processes and the classes that relate to them.
  2. Data operated on,then passed off to another layer.Evident on R/W pathOnion analogy
  3. DisjointCoupled when neccessary
  4. Model of class/object designSuck it upNot a class project
  5. ServicesGossiperMessagingServiceMigrationManagerBootstrapPreloaded cacheHandle to server-related tasks.
  6. layers