Finding the right stuff
Michael Reinsch
an intro to Elasticsearch with Ruby/Rails
at Ruby User Group Berlin, Feb 2016
How does it fit into my app?
Blackbox
with REST API
elasticsearch
Update API: your app pushes updates (updates are fast, but asynchronous)
Search API: returns search results
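A minimal sketch of what those two REST calls might look like, using only Ruby's standard library. The index name `events`, the type `event`, and the helper names `index_request` and `search_request` are assumptions for illustration; the typed path follows the Elasticsearch 1.x/2.x layout used in these slides. The requests are only built here, not sent.

```ruby
require 'json'
require 'net/http'

# Index (or replace) one document: PUT /<index>/<type>/<id>
def index_request(id, doc)
  req = Net::HTTP::Put.new("/events/event/#{id}", 'Content-Type' => 'application/json')
  req.body = JSON.generate(doc)
  req
end

# Search the index: the query travels as a JSON body to /<index>/_search
def search_request(text)
  req = Net::HTTP::Post.new('/events/_search', 'Content-Type' => 'application/json')
  req.body = JSON.generate({ query: { simple_query_string: { query: text } } })
  req
end

update = index_request(1, { title: 'Drop in Ruby', featured: true })
search = search_request('tokyo rubyist')
puts "#{update.method} #{update.path} #{update.body}"
puts "#{search.method} #{search.path} #{search.body}"
```

To actually send a request, it could be handed to `Net::HTTP.start(host, port) { |http| http.request(req) }` against a running node; the gems on the next slide wrap all of this.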
For Ruby / Rails
• https://github.com/elastic/elasticsearch-rails
• gems for Rails:
  • elasticsearch-model & elasticsearch-rails
• without Rails / AR:
  • elasticsearch-persistence
class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # The document that is sent to Elasticsearch for this record
  def as_indexed_json(options={})
    { title: title,
      description: description,
      starts_at: starts_at.iso8601,
      featured: group.featured? }
  end

  settings do
    # dynamic: 'false' keeps fields not listed here out of the mapping
    mapping dynamic: 'false' do
      indexes :title, type: 'string'
      indexes :description, type: 'string'
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end
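To see what `as_indexed_json` actually produces, here is a small sketch that runs without Rails or Elasticsearch. The `Struct` classes are hypothetical stand-ins for the ActiveRecord models, and the sample values are made up; only the shape of the hash follows the slide.

```ruby
require 'json'
require 'time'

# Hypothetical stand-in for the Group model
Group = Struct.new(:featured) do
  def featured?
    featured
  end
end

# Hypothetical stand-in for the Event model
Event = Struct.new(:title, :description, :starts_at, :group) do
  # Same shape as on the slide: only these four keys end up in the index
  def as_indexed_json(_options = {})
    { title: title,
      description: description,
      starts_at: starts_at.iso8601,
      featured: group.featured? }
  end
end

event = Event.new('Drop in Ruby', 'Casual meetup',
                  Time.utc(2016, 2, 4, 19, 0), Group.new(true))
doc = event.as_indexed_json
puts JSON.pretty_generate(doc)
```

Note that `featured` is copied from the associated group into the event document, so it can be filtered on without a join.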
Event.import
Elasticsearch cluster
  Index: events
    Type: event (doc 1)
  Index: creations
    Type: creation (doc 1)
    Type: activity (doc 1, doc 2)
Documents, not relationships
compose documents with all relevant data
➜ "denormalize" your data
class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # Embed everything a search needs directly in the event document
  def as_indexed_json(options={})
    { titles: [ title1, title2 ],
      locations: locs.map(&:as_indexed_json)
    }
  end

  settings do
    mapping dynamic: 'false' do
      indexes :titles, type: 'string'
      indexes :locations, type: 'nested' do
        indexes :name, type: 'string'
        indexes :address, type: 'string'
        indexes :location, type: 'geo_point'
      end
    end
  end
end
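A sketch of the denormalized document this mapping expects, again with hypothetical `Struct` stand-ins and made-up sample data. The `{ lat:, lon: }` object is one of the formats Elasticsearch accepts for a `geo_point` field.

```ruby
require 'json'

# Hypothetical stand-in for a Location model
Location = Struct.new(:name, :address, :lat, :lon) do
  def as_indexed_json(_options = {})
    { name: name, address: address, location: { lat: lat, lon: lon } }
  end
end

# Hypothetical stand-in for the Event model
Event = Struct.new(:title1, :title2, :locs) do
  # The event carries full copies of its locations, not references to them
  def as_indexed_json(_options = {})
    { titles: [title1, title2],
      locations: locs.map(&:as_indexed_json) }
  end
end

venue = Location.new('Example Hall', '1-2-3 Example, Tokyo', 35.68, 139.76)
event = Event.new('Drop in Ruby', 'ドロップイン Ruby', [venue])
doc = event.as_indexed_json
puts JSON.pretty_generate(doc)
```

The trade-off is that a change to a location means re-indexing every event that embeds it.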
Event.search 'tokyo rubyist'
response = Event.search 'tokyo rubyist'
response.took
# => 28
response.results.total
# => 2075
response.results.first._score
# => 0.921177
response.results.first._source.title
# => "Drop in Ruby"
response.page(2).results
# => second page of results
supports kaminari / will_paginate
response = Event.search 'tokyo rubyist'
response.records.to_a
# => [#<Event id: 12409, ...>, ...]
response.page(2).records
# => second page of result records
response.records.each_with_hit do |rec, hit|
  puts "* #{rec.title}: #{hit._score}"
end
# * Drop in Ruby: 0.9205564
# * Javascript meets Ruby in Kamakura: 0.8947
# * Meetup at EC Navi: 0.8766844
# * Pair Programming Session #3: 0.8603562
# * Kickoff Party: 0.8265461
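The difference between `results` (raw hits) and `records` (database rows) is easier to see in a plain-Ruby simulation. This is only a sketch of the idea, not the gem's implementation: `load_records` is a hypothetical stand-in for a database lookup, and the ids and scores are sample values.

```ruby
# Hits as they might come back from Elasticsearch, best score first
hits = [
  { _id: '12409', _score: 0.92 },
  { _id: '7',     _score: 0.89 },
  { _id: '311',   _score: 0.87 }
]

Record = Struct.new(:id, :title)

# Stand-in for the database: returns rows in primary-key order, not score order
def load_records(ids)
  rows = { 7 => 'Javascript meets Ruby in Kamakura',
           311 => 'Meetup at EC Navi',
           12409 => 'Drop in Ruby' }
  rows.select { |id, _| ids.include?(id) }.sort.map { |id, title| Record.new(id, title) }
end

records = load_records(hits.map { |hit| hit[:_id].to_i })
records_by_id = records.map { |rec| [rec.id, rec] }.to_h

# Re-pair each hit with its record, keeping the order of the hits
pairs = hits.map { |hit| [records_by_id.fetch(hit[:_id].to_i), hit] }
pairs.each { |rec, hit| puts "* #{rec.title}: #{hit[:_score]}" }
```

Working with `records` costs an extra database query per search, while `results` only exposes what was stored in the index.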
Event.search 'tokyo rubyist'
only upcoming events?
sorted by start date?
Event.search query: {
  filtered: {
    # our query
    query: {
      simple_query_string: {
        query: 'tokyo rubyist',
        default_operator: 'and'
      }
    },
    # filtered by conditions
    filter: {
      and: [
        { range: { starts_at: { gte: 'now' } } },
        { term: { featured: true } }
      ]
    }
  }
}, sort: { starts_at: { order: 'asc' } }   # sorted by start time
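The `filtered` query shown on this slide was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer clusters the same intent is usually expressed as a `bool` query with a `filter` clause. A sketch of that form follows; the helper name `upcoming_featured_query` is hypothetical, and the hash is only built and printed here.

```ruby
require 'json'

# Builds a search body: scored text query plus non-scoring filters
def upcoming_featured_query(text)
  {
    query: {
      bool: {
        # must: contributes to the relevance score
        must: {
          simple_query_string: { query: text, default_operator: 'and' }
        },
        # filter: only includes or excludes documents, no scoring
        filter: [
          { range: { starts_at: { gte: 'now' } } },
          { term: { featured: true } }
        ]
      }
    },
    sort: { starts_at: { order: 'asc' } }
  }
end

body = upcoming_featured_query('tokyo rubyist')
puts JSON.pretty_generate(body)
```

With the gem, such a hash could be passed straight to `Event.search(body)`.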
Query DSL
query: { <query_type>: <arguments> }   ➜ scores matching documents
filter: { <filter_type>: <arguments> }   ➜ filters documents
valid arguments depend on query / filter type
Match Query
Multi Match Query
Bool Query
Boosting Query
Common Terms Query
Constant Score Query
Dis Max Query
Filtered Query
Fuzzy Like This Query
Fuzzy Like This Field Query
Function Score Query
Fuzzy Query
GeoShape Query
Has Child Query
Has Parent Query
Ids Query
Indices Query
Match All Query
More Like This Query
Nested Query
Prefix Query
Query String Query
Simple Query String Query
Range Query
Regexp Query
Span First Query
Span Multi Term Query
Span Near Query
Span Not Query
Span Or Query
Span Term Query
Term Query
Terms Query
Top Children Query
Wildcard Query
Minimum Should Match
Multi Term Query Rewrite
Template Query
And Filter
Bool Filter
Exists Filter
Geo Bounding Box Filter
Geo Distance Filter
Geo Distance Range Filter
Geo Polygon Filter
GeoShape Filter
Geohash Cell Filter
Has Child Filter
Has Parent Filter
Ids Filter
Indices Filter
Limit Filter
Match All Filter
Missing Filter
Nested Filter
Not Filter
Or Filter
Prefix Filter
Query Filter
Range Filter
Regexp Filter
Script Filter
Term Filter
Terms Filter
Type Filter
Event.search query: {
  bool: {
    should: [
      {
        simple_query_string: {
          query: "tokyo rubyist",
          default_operator: "and"
        }
      }, {
        function_score: {
          filter: {
            and: [
              { range: { starts_at: { lte: 'now' } } },
              { term: { featured: true } }
            ]
          },
          gauss: {
            starts_at: {
              origin: 'now',
              scale: '10d',
              decay: 0.5
            }
          },
          boost_mode: "sum"
        }
      }
    ],
    minimum_should_match: 2
  }
}
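To get a feel for what `scale: '10d'` and `decay: 0.5` do, the Gaussian decay can be computed by hand. This sketch follows the formula given in the Elasticsearch function_score documentation; the method name `gauss_decay` and the choice of days as the unit are assumptions for illustration.

```ruby
# Gaussian decay:
#   score = exp(-d^2 / (2 * sigma^2)), with sigma^2 = -scale^2 / (2 * ln(decay))
# where d is the distance from the origin beyond the optional offset.
def gauss_decay(distance, scale:, decay: 0.5, offset: 0.0)
  sigma_sq = -(scale**2) / (2.0 * Math.log(decay))
  d = [0.0, distance.abs - offset].max
  Math.exp(-(d**2) / (2.0 * sigma_sq))
end

# Distance in days between starts_at and the origin ('now')
[0, 5, 10, 20].each do |days|
  puts format('%2d days from now: %.4f', days, gauss_decay(days, scale: 10))
end
```

By construction, a document exactly `scale` away from the origin scores `decay`, so an event 10 days from now gets a factor of 0.5, and the factor drops quickly beyond that.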
Create service objects
class EventSearch
  def initialize
    @filters = []
  end

  # Each method adds one filter and returns self, so calls can be chained
  def starting_after(time)
    tap { @filters << { range: { starts_at: { gte: time } } } }
  end

  def featured
    tap { @filters << { term: { featured: true } } }
  end

  def in_group(group_id)
    tap { @filters << { term: { group_id: group_id } } }
  end
  # (the slide ends here; the rest of the class is not shown)
end
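The slide stops before showing how the collected filters become a request. One possible completion is sketched below; the `text` constructor argument and the `to_query` method are assumptions, not the author's code. It uses the `filtered` form from the earlier slides.

```ruby
class EventSearch
  def initialize(text = nil)
    @text = text
    @filters = []
  end

  def starting_after(time)
    tap { @filters << { range: { starts_at: { gte: time } } } }
  end

  def featured
    tap { @filters << { term: { featured: true } } }
  end

  def in_group(group_id)
    tap { @filters << { term: { group_id: group_id } } }
  end

  # Hypothetical: assemble the search body from the text and the filters
  def to_query
    text_query =
      if @text
        { simple_query_string: { query: @text, default_operator: 'and' } }
      else
        { match_all: {} }
      end
    return { query: text_query } if @filters.empty?

    { query: { filtered: { query: text_query, filter: { and: @filters } } } }
  end
end

search = EventSearch.new('tokyo rubyist').starting_after('now').featured.in_group(42)
p search.to_query
```

Keeping the DSL inside one object means controllers only see intent-revealing methods, and a change of query syntax touches a single class.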
Event.search '東京rubyist'
Dealing with different languages
built-in analysers for arabic, armenian, basque, brazilian, bulgarian, catalan, cjk, czech, danish,
dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian,
latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish,
swedish, turkish, thai.
class Event < ActiveRecord::Base
  include Elasticsearch::Model

  # One sub-field per language, so each can get its own analyzer
  def as_indexed_json(options={})
    { title: { en: title_en, de: title_de, ja: title_ja },
      description: { en: desc_en, de: desc_de, ja: desc_ja },
      starts_at: starts_at.iso8601,
      featured: group.featured? }
  end

  settings do
    mapping dynamic: 'false' do
      indexes :title do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :de, type: 'string', analyzer: 'german'
        indexes :ja, type: 'string', analyzer: 'cjk'
      end
      indexes :description do
        indexes :en, type: 'string', analyzer: 'english'
        indexes :de, type: 'string', analyzer: 'german'
        indexes :ja, type: 'string', analyzer: 'cjk'
      end
      indexes :starts_at, type: 'date'
      indexes :featured, type: 'boolean'
    end
  end
end
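With one sub-field per language, a search has to name the fields it wants to hit. A sketch using a `multi_match` query across all language variants follows; the helper name `multilingual_query` and the `LANGUAGES` constant are hypothetical, and the hash is only built and printed.

```ruby
require 'json'

# The language sub-fields defined in the mapping
LANGUAGES = %w[en de ja].freeze

# Builds a query that searches every language variant of the given fields
def multilingual_query(text, fields: %w[title description])
  paths = fields.product(LANGUAGES).map { |field, lang| "#{field}.#{lang}" }
  { query: { multi_match: { query: text, fields: paths } } }
end

body = multilingual_query('東京rubyist')
puts JSON.pretty_generate(body)
```

By default each field analyzes the query text with its own analyzer, so the same input is tokenized differently for `title.en` and `title.ja`.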
Changes to mappings?
⚠ can't change field types / analysers ⚠
but: we can add new field mappings
class AddCreatedAtToES < ActiveRecord::Migration
  def up
    client = Elasticsearch::Client.new
    # Adding a new field to an existing mapping is allowed
    client.indices.put_mapping(
      index: Event.index_name,
      type: Event.document_type,
      body: {
        properties: {
          created_at: { type: 'date' }
        }
      }
    )
    # Re-import the existing records
    Event.__elasticsearch__.import
  end

  def down
  end
end
Automated tests
Index names with environment
class Event < ActiveRecord::Base
  include Elasticsearch::Model
  index_name "drkpr_#{Rails.env}_events"
end
Test helpers
• everything is asynchronous!
• Helpers:
  wait_for_elasticsearch
  wait_for_elasticsearch_removal
  clear_elasticsearch!
  ➜ https://gist.github.com/mreinsch/094dc9cf63362314cef4
• specs: Tag tests which require elasticsearch
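Because indexing is asynchronous, a spec has to wait until a document is searchable before asserting on results. The helpers listed above live in the linked gist; the sketch below is not that code but a generic polling helper with a hypothetical name, `wait_until`, to show the idea.

```ruby
# Polls the block until it returns a truthy value or the timeout expires
def wait_until(timeout: 2.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "condition not met within #{timeout}s"
    end
    sleep interval
  end
end

# Stand-in for "the document shows up in search results a moment later"
visible_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 0.2
found = wait_until { Process.clock_gettime(Process::CLOCK_MONOTONIC) >= visible_at }
puts "indexed: #{found}"
```

In a real spec the block would run a search and check for the expected hit, for example a count query against the test index.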
Production ready?
• use elastic.co/found or AWS ES
• use two clustered instances for redundancy
• Elasticsearch could go away
  • keep impact at a minimum!
  • update Elasticsearch from background worker
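A sketch of what "keep impact at a minimum" could mean in code: the indexing call is isolated in a small object that a background job invokes, and a failure is logged instead of propagating into the request. The class names are hypothetical, and `client` is anything that responds to `index(index:, id:, body:)`. The author's own indexer job is linked under Resources; this is not that code.

```ruby
require 'logger'

# Hypothetical indexer meant to be called from a background worker
class EventIndexer
  def initialize(client, logger: Logger.new($stdout))
    @client = client
    @logger = logger
  end

  # Returns true on success. On failure it logs and returns false, so the
  # job can be retried later without affecting the user-facing request.
  def perform(id, document)
    @client.index(index: 'events', id: id, body: document)
    true
  rescue StandardError => e
    @logger.warn("Elasticsearch indexing failed for event #{id}: #{e.message}")
    false
  end
end

# Fake client that simulates an unreachable cluster
class DownClient
  def index(**)
    raise IOError, 'connection refused'
  end
end

puts EventIndexer.new(DownClient.new).perform(1, { title: 'Drop in Ruby' })
```

A job framework would typically re-enqueue on a `false` return or let the exception through to use its built-in retry; which is better depends on the queue in use.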
Questions?
Resources:
Elastic Docs: https://www.elastic.co/guide/index.html
Ruby Gem Docs: https://github.com/elastic/elasticsearch-rails
Elasticsearch rspec helpers: https://gist.github.com/mreinsch/094dc9cf63362314cef4
Elasticsearch indexer job example: https://gist.github.com/mreinsch/acb2f6c58891e5cd4f13
or ask me later:
michael@movingfast.io
@mreinsch
