SlideShare a Scribd company logo
1 of 66
Download to read offline
Monitoring Big Data Systems Done
"The Simple Way"
Demi Ben-Ari - CTO @ Panorays
About Me
Demi Ben-Ari, Co-Founder & CTO @ Panorays
● BS’c Computer Science – Academic College Tel-Aviv Yaffo
● Co-Founder “Big Things” Big Data Community
In the Past:
● Sr. Data Engineer - Windward
● Team Leader & Sr. Java Software Engineer,
Missile defense and Alert System - “Ofek” – IAF
Interested in almost every kind of technology – A True Geek
Agenda
● A lot of (NOT) funny Jokes
● Problem definition and Environment
● Monitoring pipeline solutions
○ Metrics
○ Datastore
○ Dashboards
○ Alerting
● Summary
● (Not going to address Service discovery and monitoring)
Say “Distributed”, Say “Big Data”,
Say….
What is Big Data (IMHO)? And What to Monitor?
● Systems involving the “3 Vs”:
What are the right questions we want to ask?
○ Volume - How much?
■ Amount per second / minute / hour / day….
■ Gigabytes, Terabytes, Petabytes…
○ Velocity - How fast?
■ Count per second / minute / hour / day….
○ Variety
■ What kind? (Difference)
■ Sensor Data, Logs, Data Streams, Financial Transactions, Geo Locations...
Monolith Structure
OS CPU Memory Disk
Processes Java
Application
Server
Database
Web Server
Load
Balancer
Users - Other Applications
Monitoring
System
UI
Distributed Microservices Architecture
Service A
Queue
DB
Service B
DBCache
Cache DBService C
Web
Server
DB
Analytics Cluster
Master
Slave Slave Slave
Monitoring System???
Some basic concepts
Basic Concepts
● Monitoring
○ Collecting, processing, aggregating, and displaying real-time quantitative data about a
system
● White-box
○ Monitoring based on metrics exposed by the internals of the system
○ logs, interfaces JMX of JVM, etc
● Black-box
○ Testing externally visible behavior as a user would see it.
● Dashboard
○ An application that provides a summary view of a service’s core metrics.
Basic Concepts
● Alert
○ A notification intended to be read by a human and that is pushed to a system such as a
bug or ticket queue, an email alias or a pager.
● Root cause
○ A defect in a software or human system that - if repaired, instills confidence that this
event won’t happen again in the same way.
● Node and machine
○ Used interchangeably to indicate a single instance (physical server, virtual machine or
container). There might be multiple services worth monitoring on a single machine.
● Push
○ Any change to a service’s running software or its configuration.
● KPI - Key Performance Indicator
Data flow and Environment
(Our Use Case)
Data Flow Diagram
External
Data
Source
Analytics
Layers
Data Pipeline
Parsed
Raw
Entity Resolution
Process
Building insights
on top of the entities
Data Output
Layer
Anomaly
Detection
Trends
UI for End Users
Environment Description
Cluster
Dev Testing
Live
Staging
ProductionEnv
OB1K
RESTful Java Services
Situations
MongoDB + Spark
Worker 1
Worker 2
….
….
…
…
Worker N
Spark
Cluster
Master
Write
Read
MasterSahrded
MongoDB
Replica Set
Cassandra + Spark
Worker 1
Worker 2
….
….
…
…
Worker N
Cassandra
Cluster
Spark
Cluster
Write
Read
Cassandra + Serving
Cassandra
Cluster
Write
Read
UI Client
UI Client
UI Client
UI Client
Web
ServiceWeb
ServiceWeb
ServiceWeb
Service
Problems
● Multiple physical servers
● Multiple logical services
● Want Scaling => More Servers
● Even if you had all of the metrics
○ You’ll have an overflow of the data
● Your monitoring becomes a “Big Data” problem itself
The what really “Distributed” Means
The DevOps Guy
(It might be you)
So...Let’s Start!
Report to Where?
● We chose:
● Graphite (InfluxDB) + Grafana
● Can correlate System and
Application metrics in one
place :)
Report to Where?
● Save DevOps efforts if you’re willing to Pay :)
● Hosted Graphite
○ https://www.hostedgraphite.com/
● Throwing the “Big Data” volume monitoring problem at someone else
Connections
Connections...
http://www.mememaker.net/meme/connections-connections-everywhere2/
Drivers to Datastores
● Actions they usually do:
○ Open connection
○ Apply actions
■ Select
■ Insert
■ Update
■ Delete
○ Close connection
● Do you monitor each?
○ Hint: Yes!!!! Hell Yes!!!
● Creating a wrapper in any programming language and reporting the metrics
○ Count, execution times, errors…
○ A bit of Infrastructure code that will give great visibility
Monitoring
Operation System
Monitoring Operation System Metrics
● What to measure:
○ CPU
○ Memory
○ Disk Space
● How to measure:
○ CollectD or StatsD reporting to Graphite
○ New Relic
■ Nice and easy UI
■ Even the free account gives great tool
■ Alerting of thresholds
Monitoring
Cassandra
Monitoring Cassandra
● OpsCenter - by DataStax
Monitoring Cassandra
● Is the enough?...
We can connect it to Graphite also (Blog: “Monitoring the hell out of
Cassandra”)
● Plug & Play the metrics to Graphite - Internal Cassandra mechanism
● Back to the basics: dstat, iostat, iotop, jstack
Monitoring Cassandra
Monitoring Cassandra - Alternative
Monitoring Cassandra - Some more :)
● Cyanite: http://cyanite.io/
Graphite with Cassandra backend as a datasource.
● Nodetool - Cassandra tool
● Back to the basics: dstat, iostat, iotop, jstack
Some help
from “the Cloud”
Monitoring via AWS’s CloudWatch
Google Stackdriver (GCP)
● Can integrate both GCP and Amazon accounts
Monitoring Spark
What to monitor in an Apache Spark Cluster
● Application execution
● Resource consumption and allocation
● Task Failures
● Environment and Amount of servers
● Physical OS metrics
● Infrastructure services
Ways to Monitoring Spark
● Sending Metrics: Spark → Graphite (Execution)
● http://spark.apache.org/docs/latest/monitoring.html
Ways to Monitoring Spark
● Sending Metrics: Spark → Graphite (JVM metrics)
● http://spark.apache.org/docs/latest/monitoring.html
Ways to Monitoring Spark
● Grafana-spark-dashboards
○ Blog: http://www.hammerlab.org/2015/02/27/monitoring-spark-with-graphite-and-grafana/
● Spark UI - Online on each application running
● Spark History Server - Offline (After application finishes)
● Spark REST API
○ Querying via inner tools to do ad-hoc monitoring
● Back to the basics: dstat, iostat, iotop, jstack
● Blog post by Tzach Zohar - “Tips from the Trenches”
Monitoring
Your Data
https://memegenerator.net/instance/53617544
Data Questions?
● Did all of the computation occur?
● Are there any data layers missing?
● How much data do we have? (Volume)
● Is all of the data in the Database?
Data Answers!
● KISS (Keep it simple stupid)
● Jenkins + Maven (JUnit) for the rescue
● Creating a maven “monitoring” project.
○ Running scheduled tasks, each for the relevant data source
■ Database data existence
■ S3 files existence
■ Data flow that keeps on coming from sensors
■ (Any other data source that you can imagine…)
○ Scheduled task that write amount metrics to Graphite -> Dashboards
○ Report task execution to Graphite
Data Answers!
● The method doesn’t really matter, as long as you:
○ Can follow the results over time
○ Know what your data flow and know where things might fail
○ It’s easy for anyone to add more monitoring
(For the ones that add the new data each time…)
○ It don’t trust others to add monitoring
(It will always end up the DevOps’s “fault” -> No monitoring will be
applied)
Logging?
Monitoring?
https://lh4.googleusercontent.com/DFVcH-E5XKj8cbhEtI0qabmf_wwVqWWvk0pK5H5rnC_kVxY2tXClKfzV-
LvAH61YRLJUEvtO9amjWfjcY4Z57VBYCuQ95_hdAVEHgLAuepJiArH0wJERWuzzmgnPysCiIA
● Elastic
● Architecture:
Server
Server
Server
ELK - Elasticsearch + Logstash + Kibana
Shippers
Queue
Indexer Web UIStorage
● (Simpler) Architecture:
○ The problem: Log42 only works with TCP :( => Log4J2 works with UDP too
Server
Server
Server
ELK - Elasticsearch + Logstash + Kibana
Indexer Web UIStorage
TCP / UDP
ELK - Elasticsearch + Logstash + Kibana
http://www.digitalgov.gov/2014/05/07/analyzing-search-data-in-real-time-to-drive-decisions/
ELK - Elasticsearch + Logstash + Kibana
http://blog.takipi.com/log-management-tools-face-off-splunk-vs-logstash-vs-sumo-logic/
Who else Logs?
● Graylog2
● ….
● Logging As a Service :)
○ Logz.io (http://logz.io/blog/deploy-elk-production)
○ Logly
○ sematext
How does it look in real life?
● http://www.digitalgov.gov/2015/01/07/elk/
● http://www.ragedsyscoder.com/monitoring-slides/file/img/tvs.jpg
Did someone say
“Dashboard”?
http://www.funpic.hu/_files/pictures/original/86/71/27186.jpg
Redash
● http://redash.io/
● Open Source: https://github.com/getredash/redash
● Came out as one of many Open source tool by Everything.me
● Created and Maintained by Arik Fraimovich (You rock!)
● Written in Python
● Has an on-premise and hosted solution
●‫רןאאקמ‬
Redash - Data Sources
Redash - Screenshots
Redash - Screenshots
Redash - the “Why?”
● Having multiple data sources in the organization
● Wanting to see all a combination of data sources in one place
● It’s open source and ready to use
● Why implement fancy UI and spend a lot of time?!?!?
● So...just use it!
Alerting
Alerting
● Syren - Open source
● Reporting to:
○ Email, Flowdock, HipChat, HTTP,
Hubot, IRCcat, PagerDuty,
Pushover, SLF4J, Slack,
SNMP, Twilio
ELK - And what about alerting???
● Elastalert
● http://engineeringblog.yelp.com/2015/10/elastalert-alerting-at-scale-with-elasticsearch.html
● http://engineeringblog.yelp.com/2016/03/elastalert-part-two.html
Some more alerting
● Cloudwatch and Stackdriver has their own alerting mechanism
● New Relic has it’s own alerting too
● Even with our Jenkins tests we’ve created alerting via emails
○ Beware of “Spam”
● Find which solution you would like as long as:
○ You can notice what is wrong => when it’s wrong
○ Be able to “Acknowledge” your errors
○ Do something you won’t be able to ignore :)
Summary - Monitoring Stack
Alerting
Metrics Collection
Datastore
Dashboard
Data Monitoring
Log Monitoring
Conclusions
● Correlating Application and System metrics!!!!
● Ask the right monitoring questions and answer them with Dashboards
● KISS - simple is key, what’s hard, we tend not to do at all
● Alert about what you can actually react to (And to the relevant person)
● Measure whatever you can - only way to know if you’re improving
● Monitor your business KPIs too.
● If all of the above is not enough,
Graphs are fricking cool!
http://www.rantlifestyle.com/2013/09/23/how-happy-this-baby-is-will-shock-you/
Questions?
Demi Ben-Ari
● LinkedIn
● Twitter: @demibenari
● Blog: http://progexc.blogspot.com/
● Email: demi.benari@gmail.com
● “Big Things” Community
Meetup, YouTube, Facebook, Twitter
● GDG Cloud
Thanks! my contact:
Resources
● Monitoring distributed systems - A case study in how Google monitors its
complex systems

More Related Content

What's hot

Big Data Monitoring Cockpit
Big Data Monitoring CockpitBig Data Monitoring Cockpit
Big Data Monitoring CockpitStefan Bergstein
 
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriThinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriDemi Ben-Ari
 
Open Source Monitoring Tools Shootout
Open Source Monitoring Tools ShootoutOpen Source Monitoring Tools Shootout
Open Source Monitoring Tools Shootouttomdc
 
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015Institute e-Austria Timisoara
 
Sensu at brightpearl
Sensu at brightpearlSensu at brightpearl
Sensu at brightpearlDavid Tibbs
 
Unreal Engine 4 Blueprints: Odio e amore Roberto De Ioris - Codemotion Rome 2017
Unreal Engine 4 Blueprints: Odio e amore Roberto De Ioris - Codemotion Rome 2017Unreal Engine 4 Blueprints: Odio e amore Roberto De Ioris - Codemotion Rome 2017
Unreal Engine 4 Blueprints: Odio e amore Roberto De Ioris - Codemotion Rome 2017Codemotion
 
The Open-Source Monitoring Landscape
The Open-Source Monitoring LandscapeThe Open-Source Monitoring Landscape
The Open-Source Monitoring LandscapeMike Merideth
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus OverviewBrian Brazil
 
Just enough web ops for web developers
Just enough web ops for web developersJust enough web ops for web developers
Just enough web ops for web developersDatadog
 
Validating Big Data Jobs—Stopping Failures Before Production on Apache Spark...
 Validating Big Data Jobs—Stopping Failures Before Production on Apache Spark... Validating Big Data Jobs—Stopping Failures Before Production on Apache Spark...
Validating Big Data Jobs—Stopping Failures Before Production on Apache Spark...Databricks
 
Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)Brian Brazil
 
Scalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at PinterestScalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at PinterestKrishna Gade
 
Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)Brian Brazil
 
Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da...
Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da...Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da...
Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da...Amazon Web Services
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profilingJon Haddad
 
ELK Wrestling (Leeds DevOps)
ELK Wrestling (Leeds DevOps)ELK Wrestling (Leeds DevOps)
ELK Wrestling (Leeds DevOps)Steve Elliott
 
Logmanagement with Icinga2 and ELK
Logmanagement with Icinga2 and ELKLogmanagement with Icinga2 and ELK
Logmanagement with Icinga2 and ELKIcinga
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy Docker, Inc.
 
Intro to Python for C# Developers
Intro to Python for C# DevelopersIntro to Python for C# Developers
Intro to Python for C# DevelopersSarah Dutkiewicz
 

What's hot (20)

Big Data Monitoring Cockpit
Big Data Monitoring CockpitBig Data Monitoring Cockpit
Big Data Monitoring Cockpit
 
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriThinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
 
Open Source Monitoring Tools Shootout
Open Source Monitoring Tools ShootoutOpen Source Monitoring Tools Shootout
Open Source Monitoring Tools Shootout
 
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
Monitoring in Big Data Frameworks @ Big Data Meetup, Timisoara, 2015
 
Sensu at brightpearl
Sensu at brightpearlSensu at brightpearl
Sensu at brightpearl
 
Unreal Engine 4 Blueprints: Odio e amore Roberto De Ioris - Codemotion Rome 2017
Unreal Engine 4 Blueprints: Odio e amore Roberto De Ioris - Codemotion Rome 2017Unreal Engine 4 Blueprints: Odio e amore Roberto De Ioris - Codemotion Rome 2017
Unreal Engine 4 Blueprints: Odio e amore Roberto De Ioris - Codemotion Rome 2017
 
The Open-Source Monitoring Landscape
The Open-Source Monitoring LandscapeThe Open-Source Monitoring Landscape
The Open-Source Monitoring Landscape
 
Prometheus Overview
Prometheus OverviewPrometheus Overview
Prometheus Overview
 
Just enough web ops for web developers
Just enough web ops for web developersJust enough web ops for web developers
Just enough web ops for web developers
 
Validating Big Data Jobs—Stopping Failures Before Production on Apache Spark...
 Validating Big Data Jobs—Stopping Failures Before Production on Apache Spark... Validating Big Data Jobs—Stopping Failures Before Production on Apache Spark...
Validating Big Data Jobs—Stopping Failures Before Production on Apache Spark...
 
Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)Systems Monitoring with Prometheus (Devops Ireland April 2015)
Systems Monitoring with Prometheus (Devops Ireland April 2015)
 
Scalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at PinterestScalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at Pinterest
 
Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)Prometheus lightning talk (Devops Dublin March 2015)
Prometheus lightning talk (Devops Dublin March 2015)
 
Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da...
Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da...Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da...
Monitoring, Hold the Infrastructure - Getting the Most out of AWS Lambda – Da...
 
Python performance profiling
Python performance profilingPython performance profiling
Python performance profiling
 
ELK Wrestling (Leeds DevOps)
ELK Wrestling (Leeds DevOps)ELK Wrestling (Leeds DevOps)
ELK Wrestling (Leeds DevOps)
 
Logmanagement with Icinga2 and ELK
Logmanagement with Icinga2 and ELKLogmanagement with Icinga2 and ELK
Logmanagement with Icinga2 and ELK
 
Prometheus design and philosophy
Prometheus design and philosophy   Prometheus design and philosophy
Prometheus design and philosophy
 
Intro to Python for C# Developers
Intro to Python for C# DevelopersIntro to Python for C# Developers
Intro to Python for C# Developers
 
Sensu Monitoring
Sensu MonitoringSensu Monitoring
Sensu Monitoring
 

Viewers also liked

TIFF'40 2015 CanCon screening schedule
TIFF'40 2015 CanCon screening scheduleTIFF'40 2015 CanCon screening schedule
TIFF'40 2015 CanCon screening scheduleAgatha Daisley
 
Now and Future of APM
Now and Future of APMNow and Future of APM
Now and Future of APMcowboy93
 
Pig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big DataPig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big DataDataWorks Summit
 
별천지세미나(2회) 세션5 hamlet
별천지세미나(2회) 세션5 hamlet별천지세미나(2회) 세션5 hamlet
별천지세미나(2회) 세션5 hamletJongKwang Kim
 
Jco14 오픈소스를 이용한 모니터링 방법
Jco14 오픈소스를 이용한 모니터링 방법Jco14 오픈소스를 이용한 모니터링 방법
Jco14 오픈소스를 이용한 모니터링 방법정수 한
 
Techique, Methodology, Culture
Techique, Methodology, CultureTechique, Methodology, Culture
Techique, Methodology, CultureBenny Bauer
 
안정적인 서비스 운영 2014.03
안정적인 서비스 운영   2014.03안정적인 서비스 운영   2014.03
안정적인 서비스 운영 2014.03Changyol BAEK
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingDataWorks Summit
 
Monitoring System Targeting OpenStack, Baremetal, and Network Fabric
Monitoring System Targeting OpenStack, Baremetal, and Network FabricMonitoring System Targeting OpenStack, Baremetal, and Network Fabric
Monitoring System Targeting OpenStack, Baremetal, and Network FabricJaesuk Ahn
 
Integrating big data into the monitoring and evaluation of development progra...
Integrating big data into the monitoring and evaluation of development progra...Integrating big data into the monitoring and evaluation of development progra...
Integrating big data into the monitoring and evaluation of development progra...UN Global Pulse
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriDemi Ben-Ari
 
메모리 할당에 관한 기초
메모리 할당에 관한 기초메모리 할당에 관한 기초
메모리 할당에 관한 기초Changyol BAEK
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveDataWorks Summit
 
[오픈소스컨설팅]Zabbix Installation and Configuration Guide
[오픈소스컨설팅]Zabbix Installation and Configuration Guide[오픈소스컨설팅]Zabbix Installation and Configuration Guide
[오픈소스컨설팅]Zabbix Installation and Configuration GuideJi-Woong Choi
 
Serverless - When to FaaS?
Serverless - When to FaaS?Serverless - When to FaaS?
Serverless - When to FaaS?Benny Bauer
 
오픈소스 모니터링비교
오픈소스 모니터링비교오픈소스 모니터링비교
오픈소스 모니터링비교sprdd
 

Viewers also liked (18)

TIFF'40 2015 CanCon screening schedule
TIFF'40 2015 CanCon screening scheduleTIFF'40 2015 CanCon screening schedule
TIFF'40 2015 CanCon screening schedule
 
Now and Future of APM
Now and Future of APMNow and Future of APM
Now and Future of APM
 
Pig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big DataPig on Tez: Low Latency Data Processing with Big Data
Pig on Tez: Low Latency Data Processing with Big Data
 
별천지세미나(2회) 세션5 hamlet
별천지세미나(2회) 세션5 hamlet별천지세미나(2회) 세션5 hamlet
별천지세미나(2회) 세션5 hamlet
 
Jco14 오픈소스를 이용한 모니터링 방법
Jco14 오픈소스를 이용한 모니터링 방법Jco14 오픈소스를 이용한 모니터링 방법
Jco14 오픈소스를 이용한 모니터링 방법
 
Techique, Methodology, Culture
Techique, Methodology, CultureTechique, Methodology, Culture
Techique, Methodology, Culture
 
안정적인 서비스 운영 2014.03
안정적인 서비스 운영   2014.03안정적인 서비스 운영   2014.03
안정적인 서비스 운영 2014.03
 
Apache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data ProcessingApache Tez - A New Chapter in Hadoop Data Processing
Apache Tez - A New Chapter in Hadoop Data Processing
 
Monitoring System Targeting OpenStack, Baremetal, and Network Fabric
Monitoring System Targeting OpenStack, Baremetal, and Network FabricMonitoring System Targeting OpenStack, Baremetal, and Network Fabric
Monitoring System Targeting OpenStack, Baremetal, and Network Fabric
 
Integrating big data into the monitoring and evaluation of development progra...
Integrating big data into the monitoring and evaluation of development progra...Integrating big data into the monitoring and evaluation of development progra...
Integrating big data into the monitoring and evaluation of development progra...
 
Apache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-AriApache Spark 101 - Demi Ben-Ari
Apache Spark 101 - Demi Ben-Ari
 
DevOps Demo
DevOps DemoDevOps Demo
DevOps Demo
 
메모리 할당에 관한 기초
메모리 할당에 관한 기초메모리 할당에 관한 기초
메모리 할당에 관한 기초
 
Hive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep DiveHive + Tez: A Performance Deep Dive
Hive + Tez: A Performance Deep Dive
 
[오픈소스컨설팅]Zabbix Installation and Configuration Guide
[오픈소스컨설팅]Zabbix Installation and Configuration Guide[오픈소스컨설팅]Zabbix Installation and Configuration Guide
[오픈소스컨설팅]Zabbix Installation and Configuration Guide
 
Serverless - When to FaaS?
Serverless - When to FaaS?Serverless - When to FaaS?
Serverless - When to FaaS?
 
From Code to Kubernetes
From Code to KubernetesFrom Code to Kubernetes
From Code to Kubernetes
 
오픈소스 모니터링비교
오픈소스 모니터링비교오픈소스 모니터링비교
오픈소스 모니터링비교
 

Similar to Monitoring Big Data Systems - "The Simple Way"

Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Demi Ben-Ari
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Codemotion
 
Thinking DevOps in the Era of the Cloud - Demi Ben-Ari
Thinking DevOps in the Era of the Cloud - Demi Ben-AriThinking DevOps in the Era of the Cloud - Demi Ben-Ari
Thinking DevOps in the Era of the Cloud - Demi Ben-AriDemi Ben-Ari
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixC4Media
 
Machine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systemsMachine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systemsZhenxiao Luo
 
Big data @ uber vu (1)
Big data @ uber vu (1)Big data @ uber vu (1)
Big data @ uber vu (1)Mihnea Giurgea
 
An EyeWitness View into your Network
An EyeWitness View into your NetworkAn EyeWitness View into your Network
An EyeWitness View into your NetworkCTruncer
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Guglielmo Iozzia
 
OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris BuytaertOSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris BuytaertNETWAYS
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataTrieu Nguyen
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned Omid Vahdaty
 
Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...Omid Vahdaty
 
Extracting Insights from Data at Twitter
Extracting Insights from Data at TwitterExtracting Insights from Data at Twitter
Extracting Insights from Data at TwitterPrasad Wagle
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB
 
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...Red Hat Developers
 
WSO2Con USA 2015: An Introduction to the WSO2 Analytics Platform
WSO2Con USA 2015: An Introduction to the WSO2 Analytics PlatformWSO2Con USA 2015: An Introduction to the WSO2 Analytics Platform
WSO2Con USA 2015: An Introduction to the WSO2 Analytics PlatformWSO2
 
Web-scale data processing: practical approaches for low-latency and batch
Web-scale data processing: practical approaches for low-latency and batchWeb-scale data processing: practical approaches for low-latency and batch
Web-scale data processing: practical approaches for low-latency and batchEdward Capriolo
 

Similar to Monitoring Big Data Systems - "The Simple Way" (20)

Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
Monitoring Big Data Systems "Done the simple way" - Demi Ben-Ari - Codemotion...
 
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
Demi Ben-Ari - Monitoring Big Data Systems Done "The Simple Way" - Codemotion...
 
Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'Handout: 'Open Source Tools & Resources'
Handout: 'Open Source Tools & Resources'
 
Thinking DevOps in the Era of the Cloud - Demi Ben-Ari
Thinking DevOps in the Era of the Cloud - Demi Ben-AriThinking DevOps in the Era of the Cloud - Demi Ben-Ari
Thinking DevOps in the Era of the Cloud - Demi Ben-Ari
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
Machine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systemsMachine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systems
 
Big data @ uber vu (1)
Big data @ uber vu (1)Big data @ uber vu (1)
Big data @ uber vu (1)
 
An EyeWitness View into your Network
An EyeWitness View into your NetworkAn EyeWitness View into your Network
An EyeWitness View into your Network
 
Cloud accounting software uk
Cloud accounting software ukCloud accounting software uk
Cloud accounting software uk
 
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
Building a data pipeline to ingest data into Hadoop in minutes using Streamse...
 
OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris BuytaertOSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
OSMC 2014: From monitoringsucks to monitoringlove (and back) | Kris Buytaert
 
Lambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big dataLambda Architecture and open source technology stack for real time big data
Lambda Architecture and open source technology stack for real time big data
 
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
AWS Big Data Demystified #1.2 | Big Data architecture lessons learned
 
Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...Lessons learned from designing a QA Automation for analytics databases (big d...
Lessons learned from designing a QA Automation for analytics databases (big d...
 
Extracting Insights from Data at Twitter
Extracting Insights from Data at TwitterExtracting Insights from Data at Twitter
Extracting Insights from Data at Twitter
 
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB AtlasMongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
MongoDB World 2019: Packing Up Your Data and Moving to MongoDB Atlas
 
Lambda architecture
Lambda architectureLambda architecture
Lambda architecture
 
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
How To Get The Most Out Of Your Hibernate, JBoss EAP 7 Application (Ståle Ped...
 
WSO2Con USA 2015: An Introduction to the WSO2 Analytics Platform
WSO2Con USA 2015: An Introduction to the WSO2 Analytics PlatformWSO2Con USA 2015: An Introduction to the WSO2 Analytics Platform
WSO2Con USA 2015: An Introduction to the WSO2 Analytics Platform
 
Web-scale data processing: practical approaches for low-latency and batch
Web-scale data processing: practical approaches for low-latency and batchWeb-scale data processing: practical approaches for low-latency and batch
Web-scale data processing: practical approaches for low-latency and batch
 

More from Demi Ben-Ari

CTO Management Tool Box - Demi Ben-Ari at Panorays
CTO Management Tool Box - Demi Ben-Ari at PanoraysCTO Management Tool Box - Demi Ben-Ari at Panorays
CTO Management Tool Box - Demi Ben-Ari at PanoraysDemi Ben-Ari
 
Kubernetes, Toolbox to fail or succeed for beginners - Demi Ben-Ari, VP R&D @...
Kubernetes, Toolbox to fail or succeed for beginners - Demi Ben-Ari, VP R&D @...Kubernetes, Toolbox to fail or succeed for beginners - Demi Ben-Ari, VP R&D @...
Kubernetes, Toolbox to fail or succeed for beginners - Demi Ben-Ari, VP R&D @...Demi Ben-Ari
 
Hacker vs company, Cloud Cyber Security Automated with Kubernetes - Demi Ben-...
Hacker vs company, Cloud Cyber Security Automated with Kubernetes - Demi Ben-...Hacker vs company, Cloud Cyber Security Automated with Kubernetes - Demi Ben-...
Hacker vs company, Cloud Cyber Security Automated with Kubernetes - Demi Ben-...Demi Ben-Ari
 
CTO Management ToolBox - Demi Ben-Ari -- Panorays
CTO Management ToolBox - Demi Ben-Ari -- PanoraysCTO Management ToolBox - Demi Ben-Ari -- Panorays
CTO Management ToolBox - Demi Ben-Ari -- PanoraysDemi Ben-Ari
 
All I Wanted Is to Found a Startup - Demi Ben-Ari - Panorays
All I Wanted Is to Found a Startup - Demi Ben-Ari - PanoraysAll I Wanted Is to Found a Startup - Demi Ben-Ari - Panorays
All I Wanted Is to Found a Startup - Demi Ben-Ari - PanoraysDemi Ben-Ari
 
Hacking for fun & profit - The Kubernetes Way - Demi Ben-Ari - Panorays
Hacking for fun & profit - The Kubernetes Way - Demi Ben-Ari - PanoraysHacking for fun & profit - The Kubernetes Way - Demi Ben-Ari - Panorays
Hacking for fun & profit - The Kubernetes Way - Demi Ben-Ari - PanoraysDemi Ben-Ari
 
Community, Unifying the Geeks to Create Value - Demi Ben-Ari
Community, Unifying the Geeks to Create Value - Demi Ben-AriCommunity, Unifying the Geeks to Create Value - Demi Ben-Ari
Community, Unifying the Geeks to Create Value - Demi Ben-AriDemi Ben-Ari
 
Apache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysApache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysDemi Ben-Ari
 
Know the Startup World - Demi Ben-Ari - Ofek Alumni
Know the Startup World - Demi Ben-Ari - Ofek AlumniKnow the Startup World - Demi Ben-Ari - Ofek Alumni
Know the Startup World - Demi Ben-Ari - Ofek AlumniDemi Ben-Ari
 
Big Data made easy in the era of the Cloud - Demi Ben-Ari
Big Data made easy in the era of the Cloud - Demi Ben-AriBig Data made easy in the era of the Cloud - Demi Ben-Ari
Big Data made easy in the era of the Cloud - Demi Ben-AriDemi Ben-Ari
 
Know the Startup World - Demi Ben Ari - Ofek Alumni
Know the Startup World - Demi Ben Ari - Ofek AlumniKnow the Startup World - Demi Ben Ari - Ofek Alumni
Know the Startup World - Demi Ben Ari - Ofek AlumniDemi Ben-Ari
 
Bootstrapping a Tech Community - Demi Ben-Ari
Bootstrapping a Tech Community - Demi Ben-AriBootstrapping a Tech Community - Demi Ben-Ari
Bootstrapping a Tech Community - Demi Ben-AriDemi Ben-Ari
 
S3 cassandra or outer space? dumping time series data using spark
S3 cassandra or outer space? dumping time series data using sparkS3 cassandra or outer space? dumping time series data using spark
S3 cassandra or outer space? dumping time series data using sparkDemi Ben-Ari
 
Spark 101 – First Steps To Distributed Computing - Demi Ben-Ari @ Ofek Alumni
Spark 101 – First Steps To Distributed Computing - Demi Ben-Ari @ Ofek AlumniSpark 101 – First Steps To Distributed Computing - Demi Ben-Ari @ Ofek Alumni
Spark 101 – First Steps To Distributed Computing - Demi Ben-Ari @ Ofek AlumniDemi Ben-Ari
 
Migrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to CassandraMigrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to CassandraDemi Ben-Ari
 
Spark 101 - First steps to distributed computing
Spark 101 - First steps to distributed computingSpark 101 - First steps to distributed computing
Spark 101 - First steps to distributed computingDemi Ben-Ari
 
Transform & Analyze Time Series Data via Apache Spark @Windward
Transform & Analyze Time Series Data via Apache Spark @WindwardTransform & Analyze Time Series Data via Apache Spark @Windward
Transform & Analyze Time Series Data via Apache Spark @WindwardDemi Ben-Ari
 
Spark in the Maritime Domain
Spark in the Maritime DomainSpark in the Maritime Domain
Spark in the Maritime DomainDemi Ben-Ari
 
Spark to Production @Windward
Spark to Production @WindwardSpark to Production @Windward
Spark to Production @WindwardDemi Ben-Ari
 
Bring the Spark To Your Eyes
Bring the Spark To Your EyesBring the Spark To Your Eyes
Bring the Spark To Your EyesDemi Ben-Ari
 

More from Demi Ben-Ari (20)

CTO Management Tool Box - Demi Ben-Ari at Panorays
CTO Management Tool Box - Demi Ben-Ari at PanoraysCTO Management Tool Box - Demi Ben-Ari at Panorays
CTO Management Tool Box - Demi Ben-Ari at Panorays
 
Kubernetes, Toolbox to fail or succeed for beginners - Demi Ben-Ari, VP R&D @...
Kubernetes, Toolbox to fail or succeed for beginners - Demi Ben-Ari, VP R&D @...Kubernetes, Toolbox to fail or succeed for beginners - Demi Ben-Ari, VP R&D @...
Kubernetes, Toolbox to fail or succeed for beginners - Demi Ben-Ari, VP R&D @...
 
Hacker vs company, Cloud Cyber Security Automated with Kubernetes - Demi Ben-...
Hacker vs company, Cloud Cyber Security Automated with Kubernetes - Demi Ben-...Hacker vs company, Cloud Cyber Security Automated with Kubernetes - Demi Ben-...
Hacker vs company, Cloud Cyber Security Automated with Kubernetes - Demi Ben-...
 
CTO Management ToolBox - Demi Ben-Ari -- Panorays
CTO Management ToolBox - Demi Ben-Ari -- PanoraysCTO Management ToolBox - Demi Ben-Ari -- Panorays
CTO Management ToolBox - Demi Ben-Ari -- Panorays
 
All I Wanted Is to Found a Startup - Demi Ben-Ari - Panorays
All I Wanted Is to Found a Startup - Demi Ben-Ari - PanoraysAll I Wanted Is to Found a Startup - Demi Ben-Ari - Panorays
All I Wanted Is to Found a Startup - Demi Ben-Ari - Panorays
 
Hacking for fun & profit - The Kubernetes Way - Demi Ben-Ari - Panorays
Hacking for fun & profit - The Kubernetes Way - Demi Ben-Ari - PanoraysHacking for fun & profit - The Kubernetes Way - Demi Ben-Ari - Panorays
Hacking for fun & profit - The Kubernetes Way - Demi Ben-Ari - Panorays
 
Community, Unifying the Geeks to Create Value - Demi Ben-Ari
Community, Unifying the Geeks to Create Value - Demi Ben-AriCommunity, Unifying the Geeks to Create Value - Demi Ben-Ari
Community, Unifying the Geeks to Create Value - Demi Ben-Ari
 
Apache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - PanoraysApache Spark 101 - Demi Ben-Ari - Panorays
Apache Spark 101 - Demi Ben-Ari - Panorays
 
Know the Startup World - Demi Ben-Ari - Ofek Alumni
Know the Startup World - Demi Ben-Ari - Ofek AlumniKnow the Startup World - Demi Ben-Ari - Ofek Alumni
Know the Startup World - Demi Ben-Ari - Ofek Alumni
 
Big Data made easy in the era of the Cloud - Demi Ben-Ari
Big Data made easy in the era of the Cloud - Demi Ben-AriBig Data made easy in the era of the Cloud - Demi Ben-Ari
Big Data made easy in the era of the Cloud - Demi Ben-Ari
 
Know the Startup World - Demi Ben Ari - Ofek Alumni
Know the Startup World - Demi Ben Ari - Ofek AlumniKnow the Startup World - Demi Ben Ari - Ofek Alumni
Know the Startup World - Demi Ben Ari - Ofek Alumni
 
Bootstrapping a Tech Community - Demi Ben-Ari
Bootstrapping a Tech Community - Demi Ben-AriBootstrapping a Tech Community - Demi Ben-Ari
Bootstrapping a Tech Community - Demi Ben-Ari
 
S3 cassandra or outer space? dumping time series data using spark
S3 cassandra or outer space? dumping time series data using sparkS3 cassandra or outer space? dumping time series data using spark
S3 cassandra or outer space? dumping time series data using spark
 
Spark 101 – First Steps To Distributed Computing - Demi Ben-Ari @ Ofek Alumni
Spark 101 – First Steps To Distributed Computing - Demi Ben-Ari @ Ofek AlumniSpark 101 – First Steps To Distributed Computing - Demi Ben-Ari @ Ofek Alumni
Spark 101 – First Steps To Distributed Computing - Demi Ben-Ari @ Ofek Alumni
 
Migrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to CassandraMigrating Data Pipeline from MongoDB to Cassandra
Migrating Data Pipeline from MongoDB to Cassandra
 
Spark 101 - First steps to distributed computing
Spark 101 - First steps to distributed computingSpark 101 - First steps to distributed computing
Spark 101 - First steps to distributed computing
 
Transform & Analyze Time Series Data via Apache Spark @Windward
Transform & Analyze Time Series Data via Apache Spark @WindwardTransform & Analyze Time Series Data via Apache Spark @Windward
Transform & Analyze Time Series Data via Apache Spark @Windward
 
Spark in the Maritime Domain
Spark in the Maritime DomainSpark in the Maritime Domain
Spark in the Maritime Domain
 
Spark to Production @Windward
Spark to Production @WindwardSpark to Production @Windward
Spark to Production @Windward
 
Bring the Spark To Your Eyes
Bring the Spark To Your EyesBring the Spark To Your Eyes
Bring the Spark To Your Eyes
 

Recently uploaded

Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...121011101441
 
Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm Systemirfanmechengr
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
National Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfNational Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfRajuKanojiya4
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxKartikeyaDwivedi3
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONjhunlian
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxRomil Mishra
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptMadan Karki
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substationstephanwindworld
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingBootNeck1
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
Industrial Safety Unit-IV workplace health and safety.ppt
Industrial Safety Unit-IV workplace health and safety.pptIndustrial Safety Unit-IV workplace health and safety.ppt
Industrial Safety Unit-IV workplace health and safety.pptNarmatha D
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating SystemRashmi Bhat
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort servicejennyeacort
 

Recently uploaded (20)

Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...Instrumentation, measurement and control of bio process parameters ( Temperat...
Instrumentation, measurement and control of bio process parameters ( Temperat...
 
Class 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm SystemClass 1 | NFPA 72 | Overview Fire Alarm System
Class 1 | NFPA 72 | Overview Fire Alarm System
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
National Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdfNational Level Hackathon Participation Certificate.pdf
National Level Hackathon Participation Certificate.pdf
 
Concrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptxConcrete Mix Design - IS 10262-2019 - .pptx
Concrete Mix Design - IS 10262-2019 - .pptx
 
Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTIONTHE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
THE SENDAI FRAMEWORK FOR DISASTER RISK REDUCTION
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
Mine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptxMine Environment II Lab_MI10448MI__________.pptx
Mine Environment II Lab_MI10448MI__________.pptx
 
Indian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.pptIndian Dairy Industry Present Status and.ppt
Indian Dairy Industry Present Status and.ppt
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
Earthing details of Electrical Substation
Earthing details of Electrical SubstationEarthing details of Electrical Substation
Earthing details of Electrical Substation
 
System Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event SchedulingSystem Simulation and Modelling with types and Event Scheduling
System Simulation and Modelling with types and Event Scheduling
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
Industrial Safety Unit-IV workplace health and safety.ppt
Industrial Safety Unit-IV workplace health and safety.pptIndustrial Safety Unit-IV workplace health and safety.ppt
Industrial Safety Unit-IV workplace health and safety.ppt
 
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
🔝9953056974🔝!!-YOUNG call girls in Rajendra Nagar Escort rvice Shot 2000 nigh...
 
Input Output Management in Operating System
Input Output Management in Operating SystemInput Output Management in Operating System
Input Output Management in Operating System
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort serviceGurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
Gurgaon ✡️9711147426✨Call In girls Gurgaon Sector 51 escort service
 

Monitoring Big Data Systems - "The Simple Way"

  • 1. Monitoring Big Data Systems Done "The Simple Way" Demi Ben-Ari - CTO @ Panorays
  • 2. About Me Demi Ben-Ari, Co-Founder & CTO @ Panorays ● BS’c Computer Science – Academic College Tel-Aviv Yaffo ● Co-Founder “Big Things” Big Data Community In the Past: ● Sr. Data Engineer - Windward ● Team Leader & Sr. Java Software Engineer, Missile defense and Alert System - “Ofek” – IAF Interested in almost every kind of technology – A True Geek
  • 3. Agenda ● A lot of (NOT) funny Jokes ● Problem definition and Environment ● Monitoring pipeline solutions ○ Metrics ○ Datastore ○ Dashboards ○ Alerting ● Summary ● (Not going to address Service discovery and monitoring)
  • 4. Say “Distributed”, Say “Big Data”, Say….
  • 5. What is Big Data (IMHO)? And What to Monitor? ● Systems involving the “3 Vs”: What are the right questions we want to ask? ○ Volume - How much? ■ Amount per second / minute / hour / day…. ■ Gigabytes, Terabytes, Petabytes… ○ Velocity - How fast? ■ Count per second / minute / hour / day…. ○ Variety ■ What kind? (Difference) ■ Sensor Data, Logs, Data Streams, Financial Transactions, Geo Locations...
  • 6. Monolith Structure OS CPU Memory Disk Processes Java Application Server Database Web Server Load Balancer Users - Other Applications Monitoring System UI
  • 7. Distributed Microservices Architecture Service A Queue DB Service B DBCache Cache DBService C Web Server DB Analytics Cluster Master Slave Slave Slave Monitoring System???
  • 9. Basic Concepts ● Monitoring ○ Collecting, processing, aggregating, and displaying real-time quantitative data about a system ● White-box ○ Monitoring based on metrics exposed by the internals of the system ○ logs, interfaces JMX of JVM, etc ● Black-box ○ Testing externally visible behavior as a user would see it. ● Dashboard ○ An application that provides a summary view of a service’s core metrics.
  • 10. Basic Concepts ● Alert ○ A notification intended to be read by a human and that is pushed to a system such as a bug or ticket queue, an email alias or a pager. ● Root cause ○ A defect in a software or human system that - if repaired, instills confidence that this event won’t happen again in the same way. ● Node and machine ○ Used interchangeably to indicate a single instance (physical server, virtual machine or container). There might be multiple services worth monitoring on a single machine. ● Push ○ Any change to a service’s running software or its configuration. ● KPI - Key Performance Indicator
  • 11. Data flow and Environment (Our Use Case)
  • 12. Data Flow Diagram External Data Source Analytics Layers Data Pipeline Parsed Raw Entity Resolution Process Building insights on top of the entities Data Output Layer Anomaly Detection Trends UI for End Users
  • 15. MongoDB + Spark Worker 1 Worker 2 …. …. … … Worker N Spark Cluster Master Write Read MasterSahrded MongoDB Replica Set
  • 16. Cassandra + Spark Worker 1 Worker 2 …. …. … … Worker N Cassandra Cluster Spark Cluster Write Read
  • 17. Cassandra + Serving Cassandra Cluster Write Read UI Client UI Client UI Client UI Client Web ServiceWeb ServiceWeb ServiceWeb Service
  • 18. Problems ● Multiple physical servers ● Multiple logical services ● Want Scaling => More Servers ● Even if you had all of the metrics ○ You’ll have an overflow of the data ● Your monitoring becomes a “Big Data” problem itself
  • 19. The what really “Distributed” Means The DevOps Guy (It might be you)
  • 21. Report to Where? ● We chose: ● Graphite (InfluxDB) + Grafana ● Can correlate System and Application metrics in one place :)
  • 22. Report to Where? ● Save DevOps efforts if you’re willing to Pay :) ● Hosted Graphite ○ https://www.hostedgraphite.com/ ● Throwing the “Big Data” volume monitoring problem at someone else
  • 24. Drivers to Datastores ● Actions they usually do: ○ Open connection ○ Apply actions ■ Select ■ Insert ■ Update ■ Delete ○ Close connection ● Do you monitor each? ○ Hint: Yes!!!! Hell Yes!!! ● Creating a wrapper in any programming language and reporting the metrics ○ Count, execution times, errors… ○ A bit of Infrastructure code that will give great visibility
  • 26. Monitoring Operation System Metrics ● What to measure: ○ CPU ○ Memory ○ Disk Space ● How to measure: ○ CollectD or StatsD reporting to Graphite ○ New Relic ■ Nice and easy UI ■ Even the free account gives great tool ■ Alerting of thresholds
  • 29. Monitoring Cassandra ● Is the enough?... We can connect it to Graphite also (Blog: “Monitoring the hell out of Cassandra”) ● Plug & Play the metrics to Graphite - Internal Cassandra mechanism ● Back to the basics: dstat, iostat, iotop, jstack
  • 31. Monitoring Cassandra - Alternative
  • 32. Monitoring Cassandra - Some more :) ● Cyanite: http://cyanite.io/ Graphite with Cassandra backend as a datasource. ● Nodetool - Cassandra tool ● Back to the basics: dstat, iostat, iotop, jstack
  • 35. Google Stackdriver (GCP) ● Can integrate both GCP and Amazon accounts
  • 37. What to monitor in an Apache Spark Cluster ● Application execution ● Resource consumption and allocation ● Task Failures ● Environment and Amount of servers ● Physical OS metrics ● Infrastructure services
  • 38. Ways to Monitoring Spark ● Sending Metrics: Spark → Graphite (Execution) ● http://spark.apache.org/docs/latest/monitoring.html
  • 39. Ways to Monitoring Spark ● Sending Metrics: Spark → Graphite (JVM metrics) ● http://spark.apache.org/docs/latest/monitoring.html
  • 40. Ways to Monitoring Spark ● Grafana-spark-dashboards ○ Blog: http://www.hammerlab.org/2015/02/27/monitoring-spark-with-graphite-and-grafana/ ● Spark UI - Online on each application running ● Spark History Server - Offline (After application finishes) ● Spark REST API ○ Querying via inner tools to do ad-hoc monitoring ● Back to the basics: dstat, iostat, iotop, jstack ● Blog post by Tzach Zohar - “Tips from the Trenches”
  • 42. Data Questions? ● Did all of the computation occur? ● Are there any data layers missing? ● How much data do we have? (Volume) ● Is all of the data in the Database?
  • 43. Data Answers! ● KISS (Keep it simple stupid) ● Jenkins + Maven (JUnit) for the rescue ● Creating a maven “monitoring” project. ○ Running scheduled tasks, each for the relevant data source ■ Database data existence ■ S3 files existence ■ Data flow that keeps on coming from sensors ■ (Any other data source that you can imagine…) ○ Scheduled task that write amount metrics to Graphite -> Dashboards ○ Report task execution to Graphite
  • 44. Data Answers! ● The method doesn’t really matter, as long as you: ○ Can follow the results over time ○ Know what your data flow and know where things might fail ○ It’s easy for anyone to add more monitoring (For the ones that add the new data each time…) ○ It don’t trust others to add monitoring (It will always end up the DevOps’s “fault” -> No monitoring will be applied)
  • 46. ● Elastic ● Architecture: Server Server Server ELK - Elasticsearch + Logstash + Kibana Shippers Queue Indexer Web UIStorage
  • 47. ● (Simpler) Architecture: ○ The problem: Log42 only works with TCP :( => Log4J2 works with UDP too Server Server Server ELK - Elasticsearch + Logstash + Kibana Indexer Web UIStorage TCP / UDP
  • 48. ELK - Elasticsearch + Logstash + Kibana http://www.digitalgov.gov/2014/05/07/analyzing-search-data-in-real-time-to-drive-decisions/
  • 49. ELK - Elasticsearch + Logstash + Kibana http://blog.takipi.com/log-management-tools-face-off-splunk-vs-logstash-vs-sumo-logic/
  • 50. Who else Logs? ● Graylog2 ● …. ● Logging As a Service :) ○ Logz.io (http://logz.io/blog/deploy-elk-production) ○ Logly ○ sematext
  • 51. How does it look in real life? ● http://www.digitalgov.gov/2015/01/07/elk/ ● http://www.ragedsyscoder.com/monitoring-slides/file/img/tvs.jpg
  • 53. Redash ● http://redash.io/ ● Open Source: https://github.com/getredash/redash ● Came out as one of many Open source tool by Everything.me ● Created and Maintained by Arik Fraimovich (You rock!) ● Written in Python ● Has an on-premise and hosted solution ●‫רןאאקמ‬
  • 54. Redash - Data Sources
  • 57. Redash - the “Why?” ● Having multiple data sources in the organization ● Wanting to see all a combination of data sources in one place ● It’s open source and ready to use ● Why implement fancy UI and spend a lot of time?!?!? ● So...just use it!
  • 59. Alerting ● Syren - Open source ● Reporting to: ○ Email, Flowdock, HipChat, HTTP, Hubot, IRCcat, PagerDuty, Pushover, SLF4J, Slack, SNMP, Twilio
  • 60. ELK - And what about alerting??? ● Elastalert ● http://engineeringblog.yelp.com/2015/10/elastalert-alerting-at-scale-with-elasticsearch.html ● http://engineeringblog.yelp.com/2016/03/elastalert-part-two.html
  • 61. Some more alerting ● Cloudwatch and Stackdriver has their own alerting mechanism ● New Relic has it’s own alerting too ● Even with our Jenkins tests we’ve created alerting via emails ○ Beware of “Spam” ● Find which solution you would like as long as: ○ You can notice what is wrong => when it’s wrong ○ Be able to “Acknowledge” your errors ○ Do something you won’t be able to ignore :)
  • 62. Summary - Monitoring Stack Alerting Metrics Collection Datastore Dashboard Data Monitoring Log Monitoring
  • 63. Conclusions ● Correlating Application and System metrics!!!! ● Ask the right monitoring questions and answer them with Dashboards ● KISS - simple is key, what’s hard, we tend not to do at all ● Alert about what you can actually react to (And to the relevant person) ● Measure whatever you can - only way to know if you’re improving ● Monitor your business KPIs too. ● If all of the above is not enough, Graphs are fricking cool! http://www.rantlifestyle.com/2013/09/23/how-happy-this-baby-is-will-shock-you/
  • 65. Demi Ben-Ari ● LinkedIn ● Twitter: @demibenari ● Blog: http://progexc.blogspot.com/ ● Email: demi.benari@gmail.com ● “Big Things” Community Meetup, YouTube, Facebook, Twitter ● GDG Cloud Thanks! my contact:
  • 66. Resources ● Monitoring distributed systems - A case study in how Google monitors its complex systems