Monitoring tools for ElasticSearch
SF Meetup, 2013.03.06

Sushant Shankar
Shyam Kuttikkad
• Why and how we use ElasticSearch
• Monitoring
  – Tools
  – Index Building
  – Query Performance
Who is 33Across
• Social Sharing and Content Discovery platform
   – We help >600,000 publishers with content distribution, user
     engagement, and advertising monetization
   – 450 Fortune 1000 brand marketers leverage our unique social signals
     to deliver impactful advertising
• We develop Machine Learning algorithms operating on Big
  Data to:
   – Provide content sharing insights to Publishers
   – Build customized audience segments for advertising campaigns
   – Extract actionable insights out of social and interest data




www.33Across.com
www.tynt.com
Data firehose of 30B monthly events, 1.25B cookies
   – Interaction with web content
   – Shares – images, copies
   – Searches

Build, understand, analyze
Real-time view
ElasticSearch!
Social Audiences, Behavior, Context, Knowledge
Production ElasticSearch cluster

Hardware
   – 6 nodes, 24GB RAM
   – 16GB for ES service
   – 4 cores
   – 3x 1.5TB drive

Index
   – >1TB/index (replicated)
   – ~300M documents
   – ~5KB / document

Build index
   – Using MR job and Bulk API
   – ~3 hours
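
A rough sizing check of the figures above, as a sketch only. It assumes decimal units (1 KB = 1000 bytes) and that "~3 hours" is the wall-clock time of one full index build; neither is stated on the slide.

```python
# Figures from the cluster slide. Decimal units are an assumption.
NUM_DOCS = 300_000_000
DOC_SIZE_BYTES = 5_000
NUM_NODES = 6
# Assumption: "~3 hours" is the duration of one full index build.
BUILD_SECONDS = 3 * 60 * 60

# Raw (unreplicated) data volume implied by document count and size.
raw_bytes = NUM_DOCS * DOC_SIZE_BYTES
raw_terabytes = raw_bytes / 1e12

# Sustained indexing throughput needed to finish within the build window.
docs_per_second = NUM_DOCS / BUILD_SECONDS
megabytes_per_second = raw_bytes / BUILD_SECONDS / 1e6
megabytes_per_second_per_node = megabytes_per_second / NUM_NODES

print(f"raw data: {raw_terabytes:.1f} TB")
print(f"cluster throughput: {docs_per_second:,.0f} docs/s, "
      f"{megabytes_per_second:,.0f} MB/s")
print(f"per node: {megabytes_per_second_per_node:,.0f} MB/s")
```

Under these assumptions the raw data comes to about 1.5 TB, consistent with the ">1TB/index" figure, and the build has to sustain roughly 28,000 documents per second across the cluster.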
System monitoring using Zabbix
[Figure: Zabbix graphs; label: Index Build]
ElasticSearch specific monitoring using SPM

Scalable Performance Monitoring (http://sematext.com/spm/index.html)
• Index stats – Total/Refreshed/Merged documents
• Shards – Total/Active/Relocating/Initializing
• Search – Request rate and latency
• Cache – {Filter, field} cache {count, evictions, size}
• Machine – CPU, Memory, JVM, GC, Network, Disk
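
SPM collects these metrics itself. As a hedged illustration of where the shard numbers come from, the sketch below summarizes the shard counters found in an ElasticSearch cluster health response (GET /_cluster/health). The sample values and the `summarize_shards` helper are made up for illustration and are not part of SPM.

```python
import json

# Hypothetical sample of a GET /_cluster/health response body.
# Field names follow the cluster health API; the values are invented.
SAMPLE_HEALTH = json.loads("""
{
  "cluster_name": "example",
  "status": "yellow",
  "number_of_nodes": 6,
  "active_primary_shards": 48,
  "active_shards": 48,
  "relocating_shards": 2,
  "initializing_shards": 4,
  "unassigned_shards": 44
}
""")


def summarize_shards(health):
    """Pick out the shard counters and flag whether the cluster has settled."""
    summary = {
        "status": health["status"],
        "active": health["active_shards"],
        "relocating": health["relocating_shards"],
        "initializing": health["initializing_shards"],
        "unassigned": health["unassigned_shards"],
    }
    # "Settled" here means no shard is moving, starting up, or missing.
    summary["settled"] = (
        summary["relocating"] == 0
        and summary["initializing"] == 0
        and summary["unassigned"] == 0
    )
    return summary


if __name__ == "__main__":
    print(summarize_shards(SAMPLE_HEALTH))
```

In a real setup the dictionary would come from an HTTP call to the cluster rather than a hard-coded sample.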
Index Building Optimization using Zabbix and SPM
[Figure labels: Amount bulk indexed; Time taken, CPU util., Mem util., Disk I/O, Network; # Shards]
in practice…
Debugging and Validating using SPM
Index Building: Learnings
• 2 shards / CPU
• 10,000 documents (users) per indexing
  request

• Bulk API for our use case
• No replicas
• Refresh off (index.refresh_interval = -1)
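
The learnings above can be sketched as request payloads, without talking to a cluster. This is a minimal illustration, not the pipeline used in the talk: the index name, document shape, and helper names are hypothetical, reading "CPU" as "core" is an assumption, and older ElasticSearch versions also expect a `_type` field in the bulk action line.

```python
import json
from itertools import islice

BATCH_SIZE = 10_000  # documents per bulk request, from the slide

# Body for PUT /<index>/_settings before the build: refresh off, no replicas.
BUILD_SETTINGS = {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}


def shard_count(nodes, cores_per_node, shards_per_core=2):
    """Shards for a new index, assuming "2 shards / CPU" means per core."""
    return nodes * cores_per_node * shards_per_core


def batches(docs, size=BATCH_SIZE):
    """Yield lists of at most `size` documents from any iterable."""
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def bulk_body(index_name, chunk):
    """Build a newline-delimited Bulk API body for (doc_id, source) pairs."""
    lines = []
    for doc_id, source in chunk:
        # Action line, then the document itself on the next line.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    users = ((str(i), {"user_id": i}) for i in range(25_000))  # made-up data
    for chunk in batches(users):
        body = bulk_body("users", chunk)
        print(len(chunk), "documents,", len(body), "bytes")
    print("shards:", shard_count(nodes=6, cores_per_node=4))
```

Each body would be sent as one POST to the `_bulk` endpoint. The shard count has to be chosen when the index is created, while the replica count and refresh interval can be changed afterwards.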
Query Performance: Learnings
•   1-2 Replicas (and for reliability)
•   Turn refresh on again (5s default)
•   Warm up effect (Index Warm up API 0.20+)
•   Optimize API
•   Simulate multiple users
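
One way to "simulate multiple users" is to replay queries from several threads and look at the latency distribution. The sketch below is an assumption-laden illustration: `fake_search` is a stand-in for a real search call, and `simulate_users` and `percentile` are hypothetical helpers, not part of any ElasticSearch client. The settings body mirrors the slide (refresh back on at 5s, one replica).

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Body for PUT /<index>/_settings once the build has finished.
QUERY_SETTINGS = {"index": {"refresh_interval": "5s", "number_of_replicas": 1}}


def timed_call(search_fn, query):
    """Run one query and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    search_fn(query)
    return time.perf_counter() - start


def simulate_users(search_fn, queries, num_users):
    """Replay `queries` with `num_users` concurrent workers; return latencies."""
    with ThreadPoolExecutor(max_workers=num_users) as pool:
        return list(pool.map(lambda q: timed_call(search_fn, q), queries))


def percentile(latencies, fraction):
    """Simple rank-based percentile; `fraction` is between 0 and 1."""
    ordered = sorted(latencies)
    index = min(len(ordered) - 1, int(fraction * len(ordered)))
    return ordered[index]


def fake_search(query):
    """Stand-in for a real search request; just sleeps briefly."""
    time.sleep(0.01)


if __name__ == "__main__":
    latencies = simulate_users(fake_search, ["example query"] * 50, num_users=5)
    print(f"median: {percentile(latencies, 0.5) * 1000:.1f} ms")
    print(f"p95:    {percentile(latencies, 0.95) * 1000:.1f} ms")
```

Running the same query set twice against a real cluster, before and after warming, would show the warm-up effect as a drop in the first-run latencies.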
QUERIES?
Sushant Shankar
sushant.shankar@33across.com


     Shyam Kuttikkad
shyam.kuttikkad@33across.com
Why we really need a search engine
Batch! Good for complicated tasks
(Machine Learning, Graph Algorithms, etc.)
Warm Up: load into memory and cache
Other cool features
• Custom Scoring functions
• Scripts – MVEL, Python
• Facets

• Exploring:
   – Real-time indexing
   – Indexing images, files, etc.
   – Parent-child relationships

SF ElasticSearch Meetup 2013.04.06 - Monitoring


Editor's notes

  1. http://www.zabbix.com/ - "Enterprise class monitoring solution for everyone"
  2. Collect information on over 1B users internationally – text copied from over 600K publisher sites, images, searches, pages visited. Different slices of data – now!