SlideShare une entreprise Scribd logo
1  sur  26
Télécharger pour lire hors ligne
globo.com
globo
.com
a two-year case study
A two-year case study globo
.com
Who are we?
• One of major media player in Brazil
• responsible for content delivery
• Numbers
• 45M unique visitors/day
• 8 major products
• 43% Brazil audience (mobile/computer)
• 3k physical devices (+3k VM's and growing fast)
• 600k items, 160k triggers, 2k nvps
A two-year case study globo
.com
About this talk
• Technical challenges
• and how we overcame them
• New tools
• what we have developed
• Time for questions
A two-year case study globo
.com
About::our team
• 4(12) people
• 80+% time dedicated to Zabbix
• 3 Zabbix Certified Specialist's
• dev + deploy + infra + Zabbix administration
• DBA team
• NGX Labs
• developing custom tools
• 30 people on 24x7 NOC
A two-year case study globo
.com
Know your infrastructure
• Know what to monitor
• how fast and how long you should keep data
• Define reasonable standards
• ease Zabbix management
• high/med/low/control update times
• what trigger priority means
A two-year case study globo
.com
Know your infrastructure
Source: Unirede
A two-year case study globo
.com
Know your infrastructure
• Plan Zabbix capacity ahead
• and then monitor it
A two-year case study globo
.com
Migration from legacy
• Can some consultancy help?
• Unirede did
• What is crucial for your business?
• What do you need to plan your future?
• "Everything" might not be a good answer!
• Talk to your customers!
• synergy will help you get through
• We used (still use) Cricket and Nagios
A two-year case study globo
.com
Migrating from legacy::Cricket
• Cricket (RRD based)
• Import of historical data
• Choose carefully what you need
• we chose only few services
• trends data
• Challenges to send/sort data to Zabbix
• migration took almost one month to complete
A two-year case study globo
.com
Migrating from legacy::Nagios
• Talk to each customer
• make presentations, be present
• (re)design each service template
• teach them how to use the API and Frontend
• no more ticket to CRUD for monitoring requests
• takes time, change is hard
A two-year case study globo
.com
Optimize::database
• Study alongside with your DBA
• tune everything, everywhere
• use read-replicas for large historical data, storage and user
visualization needs
• use partitions and forget about housekeeper
A two-year case study globo
.com
Optimize::database
A two-year case study globo
.com
Infrastructure
• values
A two-year case study globo
.com
Infrastructure
A two-year case study globo
.com
Integrations
• Why
• third party tools and data sources
• completes visualization and data correlation
• help while debugging problems
• automate everyday tasks
• minimize human error
• doc infrastructure
• consolidates information
A two-year case study globo
.com
Integrations::graphix
• Graphix
• Alternative LIVE views through network graphs
• Faster and easier understanding of data
• Triggers, Hosts, Templates, Proxies, etc can be
crossed with external sources of data in
Realtime and at the same screen.
• Integrates data from Zabbix with ServiceNow
(CMDB), Tsuru(PaaS) and Network Topologies
A two-year case study globo
.com
Integrations::cloudmon
• Cloudmon
• manages Cloudstack VM's inside Zabbix
• replicate VM's status (started=monitoring, halted=not monitoring, ...)
• System VM's and VRouters are also monitored
• replicate Cloudstack project as Zabbix host group
• ease visualization, data consolidation, sums, etc
• Based on Hyclops extension
A two-year case study globo
.com
Integrations::gbix
• Gbix
• API and GUI/WebApp
• faster api response through caching (eg: triggers)
• custom methods (fewer steps to interact with api)
• users who can't code are still able to do custom operations
A two-year case study globo
.com
Integrations::macrosync
• Macrosync
• Talks nicely with DBforBix
• No need to edit DBforBix manually for each DB monitor that's gonna
be created
• Updates DBforBix config and reload automatically when changes are
detected to Zabbix designated database hostgroup
• Integrates with Gbix API
A two-year case study globo
.com
Integrations::discovery
• Discovery
• Create "automagically" discovery rules
• Distribute rules across available proxies
• Region aware
• Balance huge networks with many small ones
• VLANS /24 /27
• Integrate with Globo Network API
• github.com/globocom/GloboNetworkAPI
A two-year case study globo
.com
Integrations::git
• Gitlab/Github
• Many internal project
documentations and topology hosted
• Wow effect
A two-year case study globo
.com
Templates
• Cisco
• F5 balancer
• Netapp storage
• http/https
• other vital stuff
• hardware health
• datacenter sensors
A two-year case study globo
.com
Deploy
• Git
• Code your infrastructure
• Puppet
• Manages configuration, rpm, etc
• Scripts
• Faster interventions, debug
• Script everything
A two-year case study globo
.com
Deploy
A two-year case study globo
.com
Questions?
fpaternot@corp.globo.com
gabriel@corp.globo.com
Thank you!
globo
.com
A two-year case study

Contenu connexe

Tendances

Tendances (20)

Nagios Conference 2014 - Scott Wilkerson - Log Monitoring and Log Management ...
Nagios Conference 2014 - Scott Wilkerson - Log Monitoring and Log Management ...Nagios Conference 2014 - Scott Wilkerson - Log Monitoring and Log Management ...
Nagios Conference 2014 - Scott Wilkerson - Log Monitoring and Log Management ...
 
Dave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical ExperienceDave Williams - Nagios Log Server - Practical Experience
Dave Williams - Nagios Log Server - Practical Experience
 
Reporting Large Environment Zabbix Database
Reporting Large Environment Zabbix DatabaseReporting Large Environment Zabbix Database
Reporting Large Environment Zabbix Database
 
NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1NetflixOSS Meetup season 3 episode 1
NetflixOSS Meetup season 3 episode 1
 
Nagios Conference 2014 - Luis Contreras - Monitoring SAP System with Nagios Core
Nagios Conference 2014 - Luis Contreras - Monitoring SAP System with Nagios CoreNagios Conference 2014 - Luis Contreras - Monitoring SAP System with Nagios Core
Nagios Conference 2014 - Luis Contreras - Monitoring SAP System with Nagios Core
 
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise EditionMarcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
Marcus Rochelle - Landis+Gyr - Monitoring with Nagios Enterprise Edition
 
Mike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service ChecksMike Weber - Nagios and Group Deployment of Service Checks
Mike Weber - Nagios and Group Deployment of Service Checks
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
AWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and ResultsAWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and Results
 
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
NGINX Installation and Tuning
NGINX Installation and TuningNGINX Installation and Tuning
NGINX Installation and Tuning
 
Matt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With NagiosMatt Bruzek - Monitoring Your Public Cloud With Nagios
Matt Bruzek - Monitoring Your Public Cloud With Nagios
 
Breaking the Monolith: Organizing Your Team to Embrace Microservices
Breaking the Monolith: Organizing Your Team to Embrace MicroservicesBreaking the Monolith: Organizing Your Team to Embrace Microservices
Breaking the Monolith: Organizing Your Team to Embrace Microservices
 
Nagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson OpeningNagios World Conference 2015 - Scott Wilkerson Opening
Nagios World Conference 2015 - Scott Wilkerson Opening
 
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk ServerUsing ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
Using ELK-Stack (Elasticsearch, Logstash and Kibana) with BizTalk Server
 
Breaking the Monolith - Microservice Extraction at SoundCloud
Breaking the Monolith - Microservice Extraction at SoundCloudBreaking the Monolith - Microservice Extraction at SoundCloud
Breaking the Monolith - Microservice Extraction at SoundCloud
 
Dev309 from asgard to zuul - netflix oss-final
Dev309  from asgard to zuul - netflix oss-finalDev309  from asgard to zuul - netflix oss-final
Dev309 from asgard to zuul - netflix oss-final
 
NGINX Controller: Configuration, Management, and Troubleshooting at Scale – EMEA
NGINX Controller: Configuration, Management, and Troubleshooting at Scale – EMEANGINX Controller: Configuration, Management, and Troubleshooting at Scale – EMEA
NGINX Controller: Configuration, Management, and Troubleshooting at Scale – EMEA
 
Nagios Conference 2014 - Mike Weber - Nagios Rapid Deployment Options
Nagios Conference 2014 - Mike Weber - Nagios Rapid Deployment OptionsNagios Conference 2014 - Mike Weber - Nagios Rapid Deployment Options
Nagios Conference 2014 - Mike Weber - Nagios Rapid Deployment Options
 
Migrating NYSenate.gov
Migrating NYSenate.govMigrating NYSenate.gov
Migrating NYSenate.gov
 

Similaire à Filipe paternot - Case Study: Zabbix Deployment at Globo.com

MongoDB Partner Program Update - November 2013
MongoDB Partner Program Update - November 2013MongoDB Partner Program Update - November 2013
MongoDB Partner Program Update - November 2013
MongoDB
 
Real-Time Dynamic Data Export Using the Kafka Ecosystem
Real-Time Dynamic Data Export Using the Kafka EcosystemReal-Time Dynamic Data Export Using the Kafka Ecosystem
Real-Time Dynamic Data Export Using the Kafka Ecosystem
confluent
 
CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4
Michael Kehoe
 

Similaire à Filipe paternot - Case Study: Zabbix Deployment at Globo.com (20)

DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at PinterestDataEngConf SF16 - Scalable and Reliable Logging at Pinterest
DataEngConf SF16 - Scalable and Reliable Logging at Pinterest
 
Scalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at PinterestScalable and Reliable Logging at Pinterest
Scalable and Reliable Logging at Pinterest
 
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
Unified Framework for Real Time, Near Real Time and Offline Analysis of Video...
 
Modern MySQL Monitoring and Dashboards.
Modern MySQL Monitoring and Dashboards.Modern MySQL Monitoring and Dashboards.
Modern MySQL Monitoring and Dashboards.
 
MongoDB Partner Program Update - November 2013
MongoDB Partner Program Update - November 2013MongoDB Partner Program Update - November 2013
MongoDB Partner Program Update - November 2013
 
Mainframe Application Testing both With and Without Live Data
Mainframe Application Testing both With and Without Live DataMainframe Application Testing both With and Without Live Data
Mainframe Application Testing both With and Without Live Data
 
IBM Impact session CICS V52 overview
IBM Impact session CICS V52 overview IBM Impact session CICS V52 overview
IBM Impact session CICS V52 overview
 
Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications Enabling Telco to Build and Run Modern Applications
Enabling Telco to Build and Run Modern Applications
 
Netflix Recommender System : Big Data Case Study
Netflix Recommender System : Big Data Case StudyNetflix Recommender System : Big Data Case Study
Netflix Recommender System : Big Data Case Study
 
Rakuten’s Journey with Splunk - Evolution of Splunk as a Service
Rakuten’s Journey with Splunk - Evolution of Splunk as a ServiceRakuten’s Journey with Splunk - Evolution of Splunk as a Service
Rakuten’s Journey with Splunk - Evolution of Splunk as a Service
 
USG Summit - September 2014 - Web Management using Drupal
USG Summit - September 2014 - Web Management using DrupalUSG Summit - September 2014 - Web Management using Drupal
USG Summit - September 2014 - Web Management using Drupal
 
Real-Time Dynamic Data Export Using the Kafka Ecosystem
Real-Time Dynamic Data Export Using the Kafka EcosystemReal-Time Dynamic Data Export Using the Kafka Ecosystem
Real-Time Dynamic Data Export Using the Kafka Ecosystem
 
Case study: Life Cycle Management for SAP BusinessObjects platform as well as...
Case study: Life Cycle Management for SAP BusinessObjects platform as well as...Case study: Life Cycle Management for SAP BusinessObjects platform as well as...
Case study: Life Cycle Management for SAP BusinessObjects platform as well as...
 
Share 2014 Pittsburgh CICS Technical Overview
Share 2014 Pittsburgh CICS Technical OverviewShare 2014 Pittsburgh CICS Technical Overview
Share 2014 Pittsburgh CICS Technical Overview
 
AD1545 - Extending the XPages Extension Library
AD1545 - Extending the XPages Extension LibraryAD1545 - Extending the XPages Extension Library
AD1545 - Extending the XPages Extension Library
 
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
Solr Under the Hood at S&P Global- Sumit Vadhera, S&P Global
 
The Need of Cloud-Native Application
The Need of Cloud-Native ApplicationThe Need of Cloud-Native Application
The Need of Cloud-Native Application
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
MIGRATION - PAIN OR GAIN?
MIGRATION - PAIN OR GAIN?MIGRATION - PAIN OR GAIN?
MIGRATION - PAIN OR GAIN?
 
CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4
 

Plus de Zabbix

Plus de Zabbix (20)

Zabbix Conference LatAm 2016 - Jessian Ferreira - Wireless with Zabbix
Zabbix Conference LatAm 2016 - Jessian Ferreira - Wireless with ZabbixZabbix Conference LatAm 2016 - Jessian Ferreira - Wireless with Zabbix
Zabbix Conference LatAm 2016 - Jessian Ferreira - Wireless with Zabbix
 
Zabbix Conference LatAm 2016 - Andre Deo - Zabbix Brazil Community
Zabbix Conference LatAm 2016 - Andre Deo - Zabbix Brazil CommunityZabbix Conference LatAm 2016 - Andre Deo - Zabbix Brazil Community
Zabbix Conference LatAm 2016 - Andre Deo - Zabbix Brazil Community
 
Zabbix Conference LatAm 2016 - Jorge Pretel - Low Level Discovery for ODBC an...
Zabbix Conference LatAm 2016 - Jorge Pretel - Low Level Discovery for ODBC an...Zabbix Conference LatAm 2016 - Jorge Pretel - Low Level Discovery for ODBC an...
Zabbix Conference LatAm 2016 - Jorge Pretel - Low Level Discovery for ODBC an...
 
Zabbix Conference LatAm 2016 - Andre Deo - SNMP and Zabbix
Zabbix Conference LatAm 2016 - Andre Deo - SNMP and ZabbixZabbix Conference LatAm 2016 - Andre Deo - SNMP and Zabbix
Zabbix Conference LatAm 2016 - Andre Deo - SNMP and Zabbix
 
Zabbix Conference LatAm 2016 - Rodrigo Mohr - Challenges on Large Env with Or...
Zabbix Conference LatAm 2016 - Rodrigo Mohr - Challenges on Large Env with Or...Zabbix Conference LatAm 2016 - Rodrigo Mohr - Challenges on Large Env with Or...
Zabbix Conference LatAm 2016 - Rodrigo Mohr - Challenges on Large Env with Or...
 
Zabbix Conference LatAm 2016 - Marcio Prop - Monitoring Complex Environments ...
Zabbix Conference LatAm 2016 - Marcio Prop - Monitoring Complex Environments ...Zabbix Conference LatAm 2016 - Marcio Prop - Monitoring Complex Environments ...
Zabbix Conference LatAm 2016 - Marcio Prop - Monitoring Complex Environments ...
 
Zabbix Conference LatAm 2016 - Daniel Nasiloski - Extending Zabbix - Interact...
Zabbix Conference LatAm 2016 - Daniel Nasiloski - Extending Zabbix - Interact...Zabbix Conference LatAm 2016 - Daniel Nasiloski - Extending Zabbix - Interact...
Zabbix Conference LatAm 2016 - Daniel Nasiloski - Extending Zabbix - Interact...
 
Zabbix Conference LatAm 2016 - Filipe Paternot - Zbx@Globo Automation+Integra...
Zabbix Conference LatAm 2016 - Filipe Paternot - Zbx@Globo Automation+Integra...Zabbix Conference LatAm 2016 - Filipe Paternot - Zbx@Globo Automation+Integra...
Zabbix Conference LatAm 2016 - Filipe Paternot - Zbx@Globo Automation+Integra...
 
Zabbix Conference LatAm 2016 - Douglas Esteves - Zabbix at UNICAMP
Zabbix Conference LatAm 2016 - Douglas Esteves - Zabbix at UNICAMPZabbix Conference LatAm 2016 - Douglas Esteves - Zabbix at UNICAMP
Zabbix Conference LatAm 2016 - Douglas Esteves - Zabbix at UNICAMP
 
Ryan Armstrong - Monitoring More Than 6000 Devices in Zabbix | ZabConf2016
Ryan Armstrong - Monitoring More Than 6000 Devices in Zabbix | ZabConf2016Ryan Armstrong - Monitoring More Than 6000 Devices in Zabbix | ZabConf2016
Ryan Armstrong - Monitoring More Than 6000 Devices in Zabbix | ZabConf2016
 
Rafael Martinez Guerrero - Zabbix at the University of Oslo | ZabConf2016
Rafael Martinez Guerrero - Zabbix at the University of Oslo | ZabConf2016Rafael Martinez Guerrero - Zabbix at the University of Oslo | ZabConf2016
Rafael Martinez Guerrero - Zabbix at the University of Oslo | ZabConf2016
 
Wolfgang Alper - Zabbix Meets OPS Control / Rundeck | ZabConf2016
Wolfgang Alper - Zabbix Meets OPS Control / Rundeck | ZabConf2016Wolfgang Alper - Zabbix Meets OPS Control / Rundeck | ZabConf2016
Wolfgang Alper - Zabbix Meets OPS Control / Rundeck | ZabConf2016
 
Wolfgang Alper - Zabbix Meets OPS Control / Rundeck | ZabConf2016
Wolfgang Alper - Zabbix Meets OPS Control / Rundeck | ZabConf2016Wolfgang Alper - Zabbix Meets OPS Control / Rundeck | ZabConf2016
Wolfgang Alper - Zabbix Meets OPS Control / Rundeck | ZabConf2016
 
Sumit Goel - Monitoring Cloud Applications Using Zabbix | ZabConf2016
Sumit Goel - Monitoring Cloud Applications Using Zabbix | ZabConf2016Sumit Goel - Monitoring Cloud Applications Using Zabbix | ZabConf2016
Sumit Goel - Monitoring Cloud Applications Using Zabbix | ZabConf2016
 
Rihards Olups - Zabbix at Nokia - Case Study
Rihards Olups - Zabbix at Nokia - Case StudyRihards Olups - Zabbix at Nokia - Case Study
Rihards Olups - Zabbix at Nokia - Case Study
 
Raymond Kuiper - Zen and The Art of Zabbix Template Design | ZabConf2016
Raymond Kuiper - Zen and The Art of Zabbix Template Design | ZabConf2016Raymond Kuiper - Zen and The Art of Zabbix Template Design | ZabConf2016
Raymond Kuiper - Zen and The Art of Zabbix Template Design | ZabConf2016
 
Dimitri Bellini and Pietro Antonacci - Manage Zabbix Proxies in Remote Networ...
Dimitri Bellini and Pietro Antonacci - Manage Zabbix Proxies in Remote Networ...Dimitri Bellini and Pietro Antonacci - Manage Zabbix Proxies in Remote Networ...
Dimitri Bellini and Pietro Antonacci - Manage Zabbix Proxies in Remote Networ...
 
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
Erik Skytthe - Monitoring Mesos, Docker, Containers with Zabbix | ZabConf2016
 
Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016
Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016
Mikhail Serkov - Zabbix for HPC Cluster Support | ZabConf2016
 
Konstantin Yakovlev - Event Analysis Toolset | ZabConf2016
Konstantin Yakovlev - Event Analysis Toolset | ZabConf2016Konstantin Yakovlev - Event Analysis Toolset | ZabConf2016
Konstantin Yakovlev - Event Analysis Toolset | ZabConf2016
 

Dernier

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Dernier (20)

Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 

Filipe paternot - Case Study: Zabbix Deployment at Globo.com

  • 2. A two-year case study globo .com Who are we? • One of major media player in Brazil • responsible for content delivery • Numbers • 45M unique visitors/day • 8 major products • 43% Brazil audience (mobile/computer) • 3k physical devices (+3k VM's and growing fast) • 600k items, 160k triggers, 2k nvps
  • 3. A two-year case study globo .com About this talk • Technical challenges • and how we overcame them • New tools • what we have developed • Time for questions
  • 4. A two-year case study globo .com About::our team • 4(12) people • 80+% time dedicated to Zabbix • 3 Zabbix Certified Specialist's • dev + deploy + infra + Zabbix administration • DBA team • NGX Labs • developing custom tools • 30 people on 24x7 NOC
  • 5. A two-year case study globo .com Know your infrastructure • Know what to monitor • how fast and how long you should keep data • Define reasonable standards • ease Zabbix management • high/med/low/control update times • what trigger priority means
  • 6. A two-year case study globo .com Know your infrastructure Source: Unirede
  • 7. A two-year case study globo .com Know your infrastructure • Plan Zabbix capacity ahead • and then monitor it
  • 8. A two-year case study globo .com Migration from legacy • Can some consultancy help? • Unirede did • What is crucial for your business? • What do you need to plan your future? • "Everything" might not be a good answer! • Talk to your customers! • synergy will help you get through • We used (still use) Cricket and Nagios
  • 9. A two-year case study globo .com Migrating from legacy::Cricket • Cricket (RRD based) • Import of historical data • Choose carefully what you need • we chose only few services • trends data • Challenges to send/sort data to Zabbix • migration took almost one month to complete
  • 10. A two-year case study globo .com Migrating from legacy::Nagios • Talk to each customer • make presentations, be present • (re)design each service template • teach them how to use the API and Frontend • no more ticket to CRUD for monitoring requests • takes time, change is hard
  • 11. A two-year case study globo .com Optimize::database • Study alongside with your DBA • tune everything, everywhere • use read-replicas for large historical data, storage and user visualization needs • use partitions and forget about housekeeper
  • 12. A two-year case study globo .com Optimize::database
  • 13. A two-year case study globo .com Infrastructure • values
  • 14. A two-year case study globo .com Infrastructure
  • 15. A two-year case study globo .com Integrations • Why • third party tools and data sources • completes visualization and data correlation • help while debugging problems • automate everyday tasks • minimize human error • doc infrastructure • consolidates information
  • 16. A two-year case study globo .com Integrations::graphix • Graphix • Alternative LIVE views through network graphs • Faster and easier understanding of data • Triggers, Hosts, Templates, Proxies, etc can be crossed with external sources of data in Realtime and at the same screen. • Integrates data from Zabbix with ServiceNow (CMDB), Tsuru(PaaS) and Network Topologies
  • 17. A two-year case study globo .com Integrations::cloudmon • Cloudmon • manages Cloudstack VM's inside Zabbix • replicate VM's status (started=monitoring, halted=not monitoring, ...) • System VM's and VRouters are also monitored • replicate Cloudstack project as Zabbix host group • ease visualization, data consolidation, sums, etc • Based on Hyclops extension
  • 18. A two-year case study globo .com Integrations::gbix • Gbix • API and GUI/WebApp • faster api response through caching (eg: triggers) • custom methods (fewer steps to interact with api) • users who can't code are still able to do custom operations
  • 19. A two-year case study globo .com Integrations::macrosync • Macrosync • Talks nicely with DBforBix • No need to edit DBforBix manually for each DB monitor that's gonna be created • Updates DBforBix config and reload automatically when changes are detected to Zabbix designated database hostgroup • Integrates with Gbix API
  • 20. A two-year case study globo .com Integrations::discovery • Discovery • Create "automagically" discovery rules • Distribute rules across available proxies • Region aware • Balance huge networks with many small ones • VLANS /24 /27 • Integrate with Globo Network API • github.com/globocom/GloboNetworkAPI
  • 21. A two-year case study globo .com Integrations::git • Gitlab/Github • Many internal project documentations and topology hosted • Wow effect
  • 22. A two-year case study globo .com Templates • Cisco • F5 balancer • Netapp storage • http/https • other vital stuff • hardware health • datacenter sensors
  • 23. A two-year case study globo .com Deploy • Git • Code your infrastructure • Puppet • Manages configuration, rpm, etc • Scripts • Faster interventions, debug • Script everything
  • 24. A two-year case study globo .com Deploy
  • 25. A two-year case study globo .com Questions? fpaternot@corp.globo.com gabriel@corp.globo.com