Play and learn
with
Elasticsearch
Emanuil
@emanuil_tolev
I’ve done things
Used Elasticsearch since v.0.18 (2011)
Been on-call for production systems using Elasticsearch since 2013
Paired it with (mostly) Python, also Ruby and JavaScript
Used it as the sole place to hold data
Also used it in a more usual way - paired with a database
Elasticsearch is
a really fast and easily scalable
Open source
Distributed
RESTful
Search and Analytics
Engine
Part of an ecosystem of tools for analytics
(massage, store and graph data)
The features of Elasticsearch
A walk through the woods. Features.
Many features that can be categorised as:
- Indexing
- Querying
- Aggregating (Analysing)
Indexing
Receive raw data
Analyse
Record
You can just throw data at it
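A minimal sketch of what "throwing data at it" can look like over the REST API, assuming Elasticsearch 7 or later (typeless `_doc` endpoints) on the default `http://localhost:9200`. The index name `listings`, the field names and both helper functions are made up for illustration; the code only builds the requests and does not send them.

```python
import json

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured node


def build_index_request(index, doc, doc_id=None, base_url=BASE_URL):
    """Return (method, url, body) for indexing one JSON document."""
    if doc_id is None:
        # POST without an id lets Elasticsearch generate one
        return "POST", f"{base_url}/{index}/_doc", json.dumps(doc)
    # PUT with an id creates or replaces that specific document
    return "PUT", f"{base_url}/{index}/_doc/{doc_id}", json.dumps(doc)


def build_mapping_request(index, properties, base_url=BASE_URL):
    """Return (method, url, body) for creating an index with an explicit mapping."""
    body = {"mappings": {"properties": properties}}
    return "PUT", f"{base_url}/{index}", json.dumps(body)


# No schema: just send the document and let ES guess the field types.
doc = {"description": "Sunny flat, no pets", "us_state": "TX", "price": 1200}
index_request = build_index_request("listings", doc)

# More careful: declare up front that us_state is an exact-match keyword.
mapping_request = build_mapping_request(
    "listings",
    {"description": {"type": "text"}, "us_state": {"type": "keyword"}},
)
```

Each tuple can be handed to any HTTP client with a `Content-Type: application/json` header. The mapping request has to be sent before the first document if the explicit types are meant to take effect.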
Querying
Receive the query
Analyse the query
Search
Fetch (return results)
Control paging and sorting
Many types of query to support many use cases
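A rough sketch of a search request body with paging and sorting, in the JSON form that would be sent to `/<index>/_search`. The helper name and the `description` and `price` fields are hypothetical; `match`, `from`, `size` and `sort` are the standard request keys.

```python
def build_search_body(field, text, page=0, page_size=10,
                      sort_field=None, descending=True):
    """Build a match query with paging and optional sorting."""
    body = {
        "query": {"match": {field: text}},  # analysed full-text match
        "from": page * page_size,           # offset of the first hit to return
        "size": page_size,                  # number of hits per page
    }
    if sort_field is not None:
        order = "desc" if descending else "asc"
        body["sort"] = [{sort_field: {"order": order}}]
    return body


# Third page (pages are zero-based here) of five results, cheapest first.
search_body = build_search_body(
    "description", "no pets",
    page=2, page_size=5, sort_field="price", descending=False,
)
```

Without a `sort` key, results come back ordered by relevance score, which is usually what a plain search box wants.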
Aggregating
An aggregation is some analysis over some documents
Types
Buckets are very useful
You can nest aggregations
They’re cleverly cached
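As an illustration of nesting, the sketch below builds a request body with one Terms aggregation inside another. It also includes a tiny local stand-in that shows the shape of the buckets a single Terms aggregation returns. The field names are assumptions and should be `keyword` fields in a real index. The local function is only a teaching aid, not how Elasticsearch computes buckets.

```python
from collections import Counter


def build_nested_terms_aggs(fields, size=10):
    """Nest one terms aggregation per field, outermost field first."""
    aggs = {}
    # Build from the innermost field outwards so each level wraps the next.
    for field in reversed(fields):
        level = {"terms": {"field": field, "size": size}}
        if aggs:
            level["aggs"] = aggs
        aggs = {f"by_{field}": level}
    # "size": 0 asks for aggregation results only, no search hits.
    return {"size": 0, "aggs": aggs}


def simulate_terms_buckets(docs, field):
    """Local stand-in: count documents per distinct value of a field."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return [{"key": key, "doc_count": n} for key, n in counts.most_common()]


aggs_body = build_nested_terms_aggs(["us_state", "category"])

sample_docs = [
    {"us_state": "TX", "category": "museum"},
    {"us_state": "TX", "category": "gallery"},
    {"us_state": "CA", "category": "museum"},
]
state_buckets = simulate_terms_buckets(sample_docs, "us_state")
```

In a real response each `by_us_state` bucket would carry its own `by_category` buckets, which is what makes drill-down interfaces straightforward to build.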
You can do quite a lot with
Elasticsearch
Search
through
Natural
Language
~30 minutes to prototype
Ingredients
The text you want to search through
The searches you want to do (queries)
Elasticsearch
Preparation
1. Put text into Elasticsearch. No schema or configuration necessary (for basics).
2. Put queries into Elasticsearch
3. Get results
Let me show you quickly.
Logs
~60 minutes to prototype
Put logs in. Run aggregations.
Get insight into app and traffic.
The Elastic Stack is geared towards
this with multiple products tackling
log formats, ingestion and analysis.
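One possible "run aggregations over logs" request, sketched as a body builder: requests per time bucket, broken down by status. It assumes logs indexed with an `@timestamp` date field (the Logstash convention) and a hypothetical `status` keyword field. `fixed_interval` assumes Elasticsearch 7.2 or later; older versions used `interval`.

```python
def build_log_histogram_body(timestamp_field="@timestamp",
                             interval="1h",
                             status_field="status"):
    """Count log entries per time bucket, split by a status field."""
    return {
        "size": 0,  # aggregation results only, no individual log lines
        "aggs": {
            "requests_over_time": {
                "date_histogram": {
                    "field": timestamp_field,
                    "fixed_interval": interval,
                },
                # Sub-aggregation: one terms bucket per status in each interval
                "aggs": {"by_status": {"terms": {"field": status_field}}},
            }
        },
    }


log_body = build_log_histogram_body(interval="15m")
```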
Custom
Dashboards
~180 minutes to prototype
Put data in. Run aggregations.
Get insight.
Plays really well with D3 and other
common visualisation libraries.
Can also use Kibana + Elasticsearch
Further use cases
Search
Faceting
“Did you mean?”
Autocomplete
Sounds-like suggestions
“People who buy this also buy...”
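Two of these use cases sketched as request bodies for `/<index>/_search`: a term suggester for "Did you mean?" and a fuzzy `match` query that tolerates misspellings. The function names, the suggestion label `did_you_mean` and the `description` field are illustrative assumptions.

```python
def build_did_you_mean_body(text, field):
    """Term suggester: propose corrections for each word in `text`."""
    return {
        "suggest": {
            "did_you_mean": {  # arbitrary label, echoed back in the response
                "text": text,
                "term": {"field": field},
            }
        }
    }


def build_fuzzy_match_body(text, field):
    """Match query that tolerates small typos in the search text."""
    return {
        "query": {
            "match": {
                # "AUTO" scales the allowed edit distance with term length
                field: {"query": text, "fuzziness": "AUTO"}
            }
        }
    }


suggest_body = build_did_you_mean_body("gardn flat", "description")
fuzzy_body = build_fuzzy_match_body("gardn flat", "description")
```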
Do you have a nail? Elasticsearch is a hammer
ES is not great at:
● Relational integrity
● Transactions
Problems you should not try to solve with ES:
● Calculate inventory
● Grand totals
● Rollback-able stuff
● User accounts
Let’s play!
I was your host
and would love feedback
Emanuil Tolev
emanuil@cottagelabs.com
@emanuil_tolev on Twitter
Link to slides: http://tinyurl.com/es-intro-slides
Really, really good intro blog post to ES with use cases and further reading,
like securing your Elasticsearch: http://tinyurl.com/es-intro-blog .
US State map came from http://greasethewheels.org/cpi/ , actually a US corruption research paper.

Elasticsearch workshop presentation


Editor's notes

  1. Am a consultant, specialising in performance and robust technical architecture. The right tools for the right problems, etc. Work in a loose partnership of other consultants and freelancers called Cottage Labs.
  2. About to use it a lot more with RDBMS
  3. Open source - 1-2 of the usual positives. Strong, resilient community in this case.
     Distributed - stuff can go down and the system rebalances itself automatically.
     RESTful - very easy to use; you only need a browser. Very good, simple HTTP API speaking JSON.
     Note the Search vs. Analytics distinction.
     The Elastic Stack is more than Elasticsearch, but that is out of scope here.
  4. Indexing (= putting data in).
     Querying (= finding a needle in a haystack). Includes things like searching, fuzzy searching, autocompletion and instant searches (train apps).
     Aggregating (= analysing data and counting things).
  5. Throw data at it: ES will guess data types and enforce them for you. You can’t save a number into a field that ES has learned is a date. Of course, you can also be much more careful and thorough - use Mappings.
     ES will always analyse by default. Is it possible that we might not always want that?
     Advanced: asciifolding, tokenisation, finding a document by its translation, and more. Index-time analysis and analysers.
     Common pitfall: avoiding analysis for exact string matches.
  6. Paging and sorting directly in the URL (?sort, ?size) or in JSON.
     Queries: match, terms, geo, More Like This (takes a doc as input and returns similar docs).
  7. Types: matrix, metrics, bucket, pipeline.
     Buckets are very useful, especially Terms buckets.
     Aggregations are cached with some very clever algorithms and great cache management by default, ensuring both low resource use and no stale results.
     Say we have a field called “us_state” in some data we’ve got. A Terms aggregation over that data will tell us the unique US state codes which are present in our data. If it’s a comprehensive dataset, we’ll essentially just get a list of the US states. Not that useful, right? But you can nest aggregations, so you have sub-aggregations. Which means we could ask for a Terms aggregation that drills further and further down into some category. Fashion may be a good metaphor, e.g. All Stock -> Shoes -> Ladies’ -> Red -> Size 6.5.
     TODO replace with housing example
     Bucketing: all the bucket criteria are evaluated on every document in the context and, when a criterion matches, the document is considered to "fall in" the relevant bucket. By the end of the aggregation process, we’ll end up with a list of buckets - each one with a set of documents that "belong" to it.
     Metric: aggregations that keep track of and compute metrics over a set of documents. Min, max, avg, sum, ranking, geo bounds and geo centroid. (If asked) Geo bounds gives you the box containing all locations. Geo centroid gives you the center of a set of points.
     Matrix: operate on multiple fields and produce a matrix result based on the values. Experimental. Statistics (variance, covariance, correlation).
     Pipeline: aggregations that aggregate the output of other aggregations and their associated metrics. More advanced.
  8. Just an example. Example aggregation using geo centroid and the number of, say, museums in the USA - the exact data is not important. But now, let’s see what bucketing the documents by US state gives us.
  9. So this is what “bucketing” is. You’ll find it very useful for building intuitive analytics dashboards and user interfaces that deal with search and discovery. I’ll give you a sneak peek of what the data, the request and the response might look like. The Elastic example is museums in Europe. https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-geocentroid-aggregation.html
  10. Predefined aggregations available. Logstash capable of understanding many log formats, and you can add custom ones.
  11. Why the ugly dashboard? Dashboards should be useful first, pretty … later. Netflix built an open source application metrics project based on Java and ES, called Servo.
  12. Searching a large number of descriptions for the best match for a specific phrase (e.g. property search, say “no pets”) and returning the best results.
      Faceting: get a breakdown of the types of dwelling that forbid pets :(
      “Did you mean …?” suggestions.
      Auto-completing a search box from partially typed words, based on previously issued searches, while accounting for misspellings.
      Searching text for words that sound like another word.
      Product and information suggestions: “People who were interested in / bought this also look at…”
  13. Not great at:
      - Instant availability in search results after indexing
      - High cardinality & high precision analysis
      Problems you should not try to solve:
      - Very limited resource projects (embedded devices, tiny websites)
      Elasticsearch is generally fantastic at providing approximate answers from data, such as scoring the results by quality. While Elasticsearch can perform exact matching and statistical calculations, its primary task of search is inherently approximate. Finding approximate answers is a property that separates Elasticsearch from more traditional databases. That being said, traditional relational databases excel at precision and data integrity.
  14. The Elastic website has a lot of blogs and videos on user stories, including top senior dogs from Netflix, Rightmove, banks, supercomputer and AI people, fighting Ebola, the BBC and many more! It was a pleasure! I hope you had fun. Please leave a comment on the meetup page or send me an email with feedback.