JMXExpress 
Transporting Cassandra Metrics 
To Graphite
Cassandra Is Awesome 
● No Single Point of Failure 
● Fault Tolerant 
● Multi-DC Is A Picnic 
● Great Properties That Let Ops Teams 
Sleep at 2 AM
Robustness Has a Price 
● C* Isn’t A Fire and Forget System :( 
● Most of the Time You Don’t Notice Problems 
o Things can go up/down for a few minutes 
o C* Simply Queues Requests and Services Keep 
Running, so nobody notices
Be Proactive 
Do Daily/Weekly Checkups to detect and 
prevent Problems: 
● Capacity 
● Exceptions 
● Performance Bottlenecks 
● Data Modeling Issues
Reactive 
● Something Will Go Wrong: 
o Hardware Failures 
o Bugs 
o Malicious or Non-Malicious Users 
● Alarms: NOC, Pager-Duty
Proactive or Reactive? 
● You Need Data 
o Form Alerts 
o Find Anomalies 
o Trends 
o Debugging 
● You Should Monitor Everything
Gathering Metrics 
● Cassandra 
o OpsCenter 
o JMX 
o Nodetool 
o Logs 
● Environment 
o CPU, Memory, Disks, Network, … 
o Logs 
o JVM
Give Data Context 
You Should Give the 
Data Context … 
Otherwise it’s just pretty 
Graphs...
JMX 
● Java Management Extensions 
● Complex… 
● Resources are presented as Objects with 
Attributes 
● Used Both for Monitoring and for Actions
Native JMX 
● Unfriendly way to get metrics 
o Requires Java 
o Slow and prone to memory leaks 
o Nightmare for Ops (Network/Security) 
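[Diagram: Client and Cassandra. The client opens an initial connection to port 7199 and Cassandra replies with a Hostname:Port. The client then (1) gets the new host/port, (2) drops the old connection, and (3) connects with the new host/port, somewhere in the range 1024-65535.]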
JMX Tools 
● Visual 
o JConsole 
o VisualVM 
o Commercial 
● Command Line 
o jmxterm 
o jmxsh 
● Jolokia 
● MX4J
JMX Syntax 
[domain]:[key1]=[value1],[key2]=[value2] … 
org.apache.cassandra.metrics:type=ColumnFamily,keyspace=outbrain,scope=user_events,name=TotalDiskSpaceUsed
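A minimal sketch of how the syntax above breaks down, using the example name from this slide. The helper names (`parse_object_name`, `to_graphite_path`) and the key order used for the dotted path are assumptions made for illustration, not part of JMX or Cassandra, and the parser ignores quoted values that a full ObjectName may contain.

```python
EXAMPLE = ("org.apache.cassandra.metrics:type=ColumnFamily,"
           "keyspace=outbrain,scope=user_events,name=TotalDiskSpaceUsed")


def parse_object_name(name):
    """Split '[domain]:[key1]=[value1],...' into (domain, {key: value})."""
    domain, _, rest = name.partition(":")
    props = {}
    for pair in rest.split(","):
        key, _, value = pair.partition("=")
        props[key.strip()] = value.strip()
    return domain, props


def to_graphite_path(name, order=("type", "keyspace", "scope", "name")):
    """Flatten an object name into a dotted path (hypothetical key order)."""
    domain, props = parse_object_name(name)
    parts = [domain] + [props[k] for k in order if k in props]
    return ".".join(parts)


if __name__ == "__main__":
    print(parse_object_name(EXAMPLE))
    print(to_graphite_path(EXAMPLE))
```

Flattening the keys into one dotted string is one way to get a name that a Graphite tree can hold; the order of the keys decides how the tree is grouped.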
JMX Domains 
org.apache.cassandra 
● db 
● internal 
● net 
● request 
org.apache.cassandra.metrics
JMX Types 
org.apache.cassandra.metrics: type= 
● Cache 
● Client 
● ClientRequest 
● ClientRequestMetrics 
● ColumnFamily 
● CommitLog 
● Compaction 
● DroppedMessages 
● FileCache 
● Storage 
● ThreadPools
Coda-Hale Metrics 
● Toolkit called Metrics 
o The Yammer (Coda Hale) Library 
● Easy to Use 
● Easy to Read (If you speak Java) 
● Popular
Types of Metrics 
● Gauge: Instantaneous value 
● Counter: number that can be 
incremented/decremented 
● Meter: Rate of Events Over Time 
(requests/second; 1-minute, 5-minute, 15-minute rates) 
● Histogram: Statistical Distribution 
o 50,75,95,98,99,99.9 percentile 
o average/median/min/max/stddev 
● Timer: Rate of events plus histogram of 
duration
75th percentile is 650.75 us 
(75% took 650.75us or less) 
One Minute Write rate is 
13,915 per second
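To make the "75% took X or less" reading concrete, here is a small nearest-rank percentile sketch. The sample latencies are made up for illustration. The Metrics library works on a sampled reservoir rather than on every raw value, so its reported percentiles can differ from this exact calculation.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100.0 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    # Hypothetical write latencies in microseconds.
    latencies_us = [400, 520, 610, 650, 700, 900, 1200, 4000]
    print("p75:", percentile(latencies_us, 75))
    print("p99:", percentile(latencies_us, 99))
```

Percentiles are usually more useful than an average for latency, because one slow outlier (the 4000 us sample here) moves the mean far more than it moves the 75th percentile.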
Native JMX 
● It’s overwhelming at first 
● Hard to tell what the metrics mean without the source 
● Metrics move around a lot between versions 
● Fortunately there is nodetool
Coda-Hale Reporting Interface 
Coda-Hale Metrics Library: 
● Default 
o JMX 
o Console 
o CSV 
o Slf4J 
● Addons 
o Ganglia / Graphite 
● Community 
o Cassandra / StatsD / NewRelic / Splunk / Cloudwatch 
o Kafka / Riemann / TempDB/ Munin / Riak / InfluxDB / Sematext 
o MongoDB / OpenTSDB/ Librato 
o … More
Reporting Interface Activation 
● Metrics library: 
o Included in Cassandra since 1.1 
o Before 2.0, it required writing your own Java agent reporter
Pluggable Metrics in Cassandra 2.0.2 
● Starting from Cassandra 2.0.2, you only need to configure a special YAML 
file: 
/etc/cassandra/metrics-reporter-config-graphite.yaml 
● Load the Coda-Hale metrics by including the built-in agent in the 
cassandra-env.sh file 
-Dcassandra.metricsReporterConfigFile=yourCoolFile.yaml 
● Save the file in the /etc/cassandra/ directory and give only the file name, 
not the full path; otherwise it will not work
Pluggable Metrics in Cassandra 2.0.2 
Yaml Example: 
graphite:
  -
    period: 60
    timeunit: 'SECONDS'
    hosts:
      - host: 'graphite'
        port: 2003
    predicate:
      color: "white"
      useQualifiedName: true
      patterns:
        - "^org.apache.cassandra.metrics.Cache.+"
        - "^org.apache.cassandra.metrics.ClientRequest.+"
        - "^org.apache.cassandra.metrics.Storage.+"
        - "^org.apache.cassandra.metrics.ThreadPools.+"
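A rough sketch of what the predicate does, assuming that color "white" means a whitelist (only names matching a pattern are reported). The function name `is_reported` and the sample metric names are illustrative only; the real matching happens inside the reporter.

```python
import re

# Patterns copied from the YAML example above.
PATTERNS = [
    r"^org.apache.cassandra.metrics.Cache.+",
    r"^org.apache.cassandra.metrics.ClientRequest.+",
    r"^org.apache.cassandra.metrics.Storage.+",
    r"^org.apache.cassandra.metrics.ThreadPools.+",
]


def is_reported(metric_name, patterns=PATTERNS):
    """Whitelist check: True if any pattern matches the qualified name."""
    return any(re.search(p, metric_name) for p in patterns)


if __name__ == "__main__":
    for name in (
        "org.apache.cassandra.metrics.Storage.Load",
        "org.apache.cassandra.metrics.ColumnFamily.outbrain.user_events.TotalDiskSpaceUsed",
        "org.apache.cassandra.metrics.ClientRequestMetrics.ReadTimeouts",
    ):
        print(name, is_reported(name))
```

Two things to watch in these patterns: the dots are unescaped, so each matches any character, and `ClientRequest.+` also matches names under `ClientRequestMetrics`. With this whitelist, ColumnFamily metrics are not sent at all.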
Caveats of Pluggable Metrics 
- Works only in 2.0.2 or higher 
- Produces bad metric names: some begin 
with ‘.’ and do not fit the Graphite tree 
- Limited ability to manipulate metrics
Our Approach 
- Use an older version (2.0.3) of the Metrics library 
that works with all C* versions (down to 1.1) 
- Write our own Java agent for backward 
compatibility 
- Run the metrics through a manipulator daemon so 
we can reformat them to fit our 
dashboards
The Java Agent 
From the Documentation
The Java Agent 
● Compiling it: 
$ javac -cp $CASSANDRA_HOME/lib/metrics-core-2.0.3.jar:$CASSANDRA_HOME/lib/metrics-graphite-2.0.3.jar \
    com/datastax/example/ReportAgent.java 
$ jar -cfM reporter.jar . 
● Loading the Agent with Cassandra 
(Edit cassandra-env.sh and add the following line to the bottom) 
JVM_OPTS="-javaagent:/path/to/your/reporter.jar $JVM_OPTS"
Manipulating the Metrics 
● Metrics come in org.apache.cassandra… 
syntax 
● They don’t fit into our Graphite scheme 
● Some metrics begin with . (dot) 
● Need to be able to filter and manipulate 
metrics
Manipulating the Metrics 
We have built a simple Bash script that poses 
as a Graphite server and manipulates the 
metrics as we wish: 
● We change the prefix 
● We can filter metrics 
● We keep the output unified 
● We solve some syntax issues, such as IP addresses 
that Graphite reads as separate metric trees
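The talk's manipulator is a Bash script that is not shown here; the following is only a Python sketch of the same kinds of rewrites, applied to one line of Graphite's plaintext format (`path value timestamp`). The function name, the default prefixes, the `keep` filter, and the sample metric names are all assumptions for illustration.

```python
import re

# Dotted IPv4 address inside a metric path.
IPV4 = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})")


def rewrite_line(line, old_prefix="org.apache.cassandra.metrics",
                 new_prefix="cassandra", keep=None):
    """Return the rewritten 'path value timestamp' line, or None to drop it."""
    parts = line.split()
    if len(parts) != 3:
        return None  # not a well-formed plaintext metric line
    path, value, timestamp = parts
    path = path.lstrip(".")                    # names that begin with a dot
    path = IPV4.sub(r"\1_\2_\3_\4", path)      # keep an IP as one tree node
    if keep is not None and not re.search(keep, path):
        return None                            # filtered out
    if path.startswith(old_prefix + "."):
        path = new_prefix + path[len(old_prefix):]  # change the prefix
    return f"{path} {value} {timestamp}"


if __name__ == "__main__":
    print(rewrite_line(".org.apache.cassandra.metrics.Storage.Load 1024 1400000000"))
    print(rewrite_line("org.apache.cassandra.metrics.Net.10.0.0.12.Timeouts 3 1400000000"))
```

Graphite treats every dot as a level in the metric tree, so an unmodified address such as 10.0.0.12 would be spread over four nested levels; replacing its dots with underscores keeps it as a single node.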
Metrics in Graphite (Sample: Write Latency Histograms)
Monitoring Cassandra with graphite using Yammer Coda-Hale Library