FASTER AND BETTER SEARCH RESULTS WITH ELASTICSEARCH
TAKE YOUR SITE-WIDE SEARCHES TO THE NEXT LEVEL
WEB SITE SEARCH
Search across different fields (title, content, ...);
show relevant results first;
categorize results;
filter by various attributes;
withstand user typos;
treat synonyms as the same word;
be scalable;
be fault tolerant;
be easy to deploy.
PLONE SITE SEARCH
ZCatalog:
fully integrated in Plone;
no advanced features (like synonym support);
not very scalable.
Apache Solr (Plone add-ons: collective.solr, alm.solrindex):
based on the Java search library Apache Lucene;
better result ranking;
advanced features;
more configurable;
some clustering support (using ZooKeeper).
Elasticsearch (Plone add-on: collective.elasticsearch):
based (again) on Lucene;
search features similar to Solr's;
great scalability;
less XML, more JSON.
ELASTIC STACK
Also known as ELK:
Elasticsearch,
Logstash,
Kibana,
Beats.
Two main classes of use cases:
almost static data: search engines,
time series data: logs and metrics.
ELASTICSEARCH
INDEX A DOCUMENT
POST plone/_doc
{
  "title": "Getting started with plone and Elasticsearch",
  "author": "Enrico Polesel",
  "content": "We want to index the entire content of our Plone website into elasticsearch...",
  "tags": ["plone", "search", "elasticsearch", "cluster", "performance", "high availability"],
  "date": "2019-10-25T11:50:00+0200"
}
Response:
{
  "_index" : "plone",
  "_type" : "_doc",
  "_id" : "Y0MZ7W0B3-sU3YTrncfM",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
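The same indexing call can be prepared from Python. This is only a sketch using the standard library: the base URL assumes a local node on the default port (as in the homework slide), and the helper name build_index_request is made up for illustration. The request is built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

# Assumption: a local Elasticsearch node listening on the default port.
BASE_URL = "http://localhost:9200"


def build_index_request(index, document, base_url=BASE_URL):
    """Build (but do not send) the HTTP request for POST <index>/_doc."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


document = {
    "title": "Getting started with plone and Elasticsearch",
    "author": "Enrico Polesel",
    "tags": ["plone", "search", "elasticsearch"],
    "date": "2019-10-25T11:50:00+0200",
}
request = build_index_request("plone", document)
print(request.get_method(), request.full_url)

# To really index the document (requires a running node):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["result"])
```

Elasticsearch generates the document id when none is given in the URL, which is why the response above contains a random-looking "_id".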
DATA TYPES
short, long,
float, double,
ip,
geo_point,
ranges (integer_range, date_range, ...),
keyword (not analyzed strings),
text (analyzed strings),
object, array, nested object,
...
TEXT ANALYSIS
ANALYZERS
1. Char filters
   convert HTML escape codes
   normalize Unicode symbols
   replace patterns
2. Tokenizer
   split on whitespace
   split on punctuation
   may be grammar based
   may generate partial words
   special tokenizers for special strings (like paths)
3. Token filters
   normalize tokens
   stemming
   remove stopwords
   translate synonyms
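The three analyzer stages can be imitated in a few lines of Python. This is a toy sketch, not how Elasticsearch implements analysis: the stopword list, the synonym map and the "strip a trailing s" stemmer are all invented for the example.

```python
import html
import re

# Hypothetical stopword list and synonym map, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "of", "their"}
SYNONYMS = {"website": "site"}


def char_filter(text):
    """Stage 1: drop HTML tags, then decode HTML escape codes."""
    text = re.sub(r"<[^>]+>", " ", text)
    return html.unescape(text)


def tokenize(text):
    """Stage 2: split on whitespace and punctuation."""
    return re.findall(r"\w+", text)


def naive_stem(token):
    """Toy stemmer: strip one trailing 's' from longer words."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def token_filters(tokens):
    """Stage 3: lowercase, stem, drop stopwords, translate synonyms."""
    result = []
    for token in tokens:
        token = naive_stem(token.lower())
        if token in STOPWORDS:
            continue
        result.append(SYNONYMS.get(token, token))
    return result


def analyze(text):
    return token_filters(tokenize(char_filter(text)))


print(analyze("The <b>Plone</b> websites &amp; their documents"))
# → ['plone', 'site', 'document']
```

The same pipeline has to run on both the indexed text and the query text, otherwise the tokens would not line up.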
QUERY - MATCH
GET plone/_search
{
  "query": {
    "match": {
      "content": "elasticsearch"
    }
  }
}
Response:
{
  ...
  "hits" : {
    "total" : { ... },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "plone",
        "_id" : "Y0MZ7W0B3-sU3YTrncfM",
        "_score" : 0.2876821,
        "_source" : {
          "title" : "Getting started with plone and Elasticsearch",
          ...
        }
      }
    ]
  }
}
QUERY - FUZZY MATCH
An edit distance of 1 allows one of:
changing a character (box → fox)
removing a character (black → lack)
inserting a character (sic → sick)
transposing two adjacent characters (act → cat)
GET plone/_search
{
  "query": {
    "match": {
      "content": {
        "query": "ploMe",
        "fuzziness": 1
      }
    }
  }
}
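The four edit operations listed above correspond to an edit distance that counts a swap of adjacent characters as one step. A small sketch of that distance (the "optimal string alignment" variant), written here for illustration and not taken from Lucene:

```python
def edit_distance(a, b):
    """Edits needed to turn a into b: change, remove, insert a character,
    or transpose two adjacent characters (each counts as 1)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # remove all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # remove
                d[i][j - 1] + 1,         # insert
                d[i - 1][j - 1] + cost,  # change (or keep)
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


for pair in [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat")]:
    print(pair, edit_distance(*pair))  # each pair is at distance 1
```

Assuming the content field uses an analyzer that lowercases tokens, the query text "ploMe" becomes "plome", which is at distance 1 from "plone", so the document matches with "fuzziness": 1.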
QUERY - MULTI MATCH
Matches in the title field will be boosted!
GET plone/_search
{
  "query": {
    "multi_match": {
      "query": "plome",
      "fields": [ "title^2", "content" ],
      "fuzziness": 1
    }
  }
}
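The "field^boost" notation is easy to generate from application code. A minimal sketch; the helper name multi_match_query and its parameters are hypothetical, it only builds the JSON body shown above and does not talk to Elasticsearch.

```python
import json


def multi_match_query(text, boosts, fuzziness=None):
    """Build a multi_match body; boosts maps field name -> boost factor."""
    fields = [
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in boosts.items()
    ]
    multi_match = {"query": text, "fields": fields}
    if fuzziness is not None:
        multi_match["fuzziness"] = fuzziness
    return {"query": {"multi_match": multi_match}}


body = multi_match_query("plome", {"title": 2, "content": 1}, fuzziness=1)
print(json.dumps(body, indent=2))
```

With a boost of 2, a match in the title contributes twice as much to the score as the same match would without a boost.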
QUERY
And much more!
Suggestions,
search as you type,
geo query,
external ranking,
more like this,
...
AGGREGATIONS
GET plone/_search
{
  "query": {
    "match": {
      "content": "elasticsearch"
    }
  },
  "aggs": {
    "Authors": {
      "terms": {
        "field": "author",
        "size": 10
      },
      "aggs": {
        "Tags": {
          "terms": {
            "field": "tags",
            "size": 100
          }
        }
      }
    }
  }
}
AGGREGATIONS
GET plone/_search
{
  "query": { ... },
  "aggs": {
    "Authors": {
      "terms": {
        "field": "author",
        "size": 10
      },
      "aggs": {
        "Avg-length": {
          "avg": {
            "field": "length"
          }
        },
        "Last-published": {
          "max": {
            "field": "date"
          }
        }
      }
    }
  }
}
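What the nested aggregation computes can be reproduced on a handful of in-memory documents. This is a plain-Python imitation with made-up data; the result shape is simplified compared to a real Elasticsearch response, and ties between buckets are broken by key here.

```python
from collections import defaultdict

# Hypothetical documents, for illustration only.
docs = [
    {"author": "Enrico Polesel", "length": 1200, "date": "2019-10-25"},
    {"author": "Enrico Polesel", "length": 800, "date": "2019-09-01"},
    {"author": "Another Author", "length": 500, "date": "2018-05-17"},
]


def terms_with_metrics(documents, field, size=10):
    """Group documents by `field` (like a terms aggregation) and compute
    an average and a maximum inside each bucket (like avg/max sub-aggregations)."""
    buckets = defaultdict(list)
    for doc in documents:
        buckets[doc[field]].append(doc)
    # Largest buckets first; equal sizes are ordered by key.
    ordered = sorted(buckets.items(), key=lambda item: (-len(item[1]), item[0]))
    result = []
    for key, members in ordered[:size]:
        result.append({
            "key": key,
            "doc_count": len(members),
            "Avg-length": sum(d["length"] for d in members) / len(members),
            # ISO dates in the same format sort correctly as strings.
            "Last-published": max(d["date"] for d in members),
        })
    return result


for bucket in terms_with_metrics(docs, "author"):
    print(bucket)
```

Each bucket is one author; the metrics are computed only over the documents that fall into that bucket and that matched the query.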
AGGREGATIONS
And much more!
Advanced stats,
geo centroid,
cardinality,
significant terms,
...
RUNNING ELASTICSEARCH
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.4.0-linux-x86_64.tar.gz
$ tar -xf elasticsearch-7.4.0-linux-x86_64.tar.gz
$ cd elasticsearch-7.4.0
$ bin/elasticsearch
$ wget https://artifacts.elastic.co/downloads/kibana/kibana-7.4.0-linux-x86_64.tar.gz
$ tar -xf kibana-7.4.0-linux-x86_64.tar.gz
$ cd kibana-7.4.0-linux-x86_64
$ bin/kibana
Configuration files: config/elasticsearch.yml, config/jvm.options
Docker, yum/apt, Windows and macOS are also supported! See https://www.elastic.co/downloads/
CLUSTERING
Need high availability? Install two data nodes! (one replica is enabled by default)
Need more space? Increase the number of nodes! (and of indices/shards)
Need more search performance? Increase the number of replicas!
Have disks of different types (fast/slow)? Use a hot-cold architecture!
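The effect of adding a second data node can be worked out with simple arithmetic. The sketch below relies on one rule: Elasticsearch does not place two copies of the same shard on the same node. Everything else (disk limits, allocation filtering) is ignored, and the function name shard_copies is made up.

```python
def shard_copies(primaries, replicas, data_nodes):
    """Count shard copies an index wants and how many can be placed,
    given that copies of one shard must live on different nodes."""
    total = primaries * (1 + replicas)
    copies_per_shard = min(1 + replicas, data_nodes)
    assigned = primaries * copies_per_shard
    return {"total": total, "assigned": assigned, "unassigned": total - assigned}


# One primary and one replica (the 7.x defaults) on a single node:
print(shard_copies(primaries=1, replicas=1, data_nodes=1))
# → {'total': 2, 'assigned': 1, 'unassigned': 1}

# Same index after adding a second data node:
print(shard_copies(primaries=1, replicas=1, data_nodes=2))
# → {'total': 2, 'assigned': 2, 'unassigned': 0}
```

The single-node case matches the "_shards" section of the indexing response shown earlier ("total": 2, "successful": 1): the replica exists in the index settings but has nowhere to go until a second node joins.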
WHAT'S NEXT? ELASTIC APP SEARCH
HOMEWORK
Download Elasticsearch from https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.4.tar.gz
Untar, cd, and run Elasticsearch (bin/elasticsearch)
Test it: curl http://localhost:9200/
Add collective.elasticsearch to your project eggs and re-run buildout
Restart Plone
Go to the Control Panel
Add "Elastic Search" in Add-on Products
Click "Elastic Search" in "Add-on Configuration"
Enable
Click "Convert Catalog"
Click "Rebuild Catalog"

Faster and better search results with Elasticsearch

  • 1. FASTER AND BETTER SEARCHFASTER AND BETTER SEARCH RESULTS WITH ELASTICSEARCHRESULTS WITH ELASTICSEARCH TAKE YOUR SITE-WIDE SEARCHES TO THE NEXT LEVELTAKE YOUR SITE-WIDE SEARCHES TO THE NEXT LEVEL 1
  • 2. WEB SITE SEARCHWEB SITE SEARCH Search across different fields (title, content,...); show relevant results first; 2
  • 3. WEB SITE SEARCHWEB SITE SEARCH Search across different fields (title, content,...); show relevant results first; categorize results; filter by various attributes; 2
  • 4. WEB SITE SEARCHWEB SITE SEARCH Search across different fields (title, content,...); show relevant results first; categorize results; filter by various attributes; withstand user typos; treat synonyms as the same word; 2
  • 5. WEB SITE SEARCHWEB SITE SEARCH Search across different fields (title, content,...); show relevant results first; categorize results; filter by various attributes; withstand user typos; treat synonyms as the same word; be scalable; be fault tolerant; easy to deploy. 2
  • 6. PLONE SITE SEARCHPLONE SITE SEARCH ZCatalog: fully integrated in Plone; no advanced features (like synonyms support); not very scalable. 3
  • 7. PLONE SITE SEARCHPLONE SITE SEARCH ZCatalog: fully integrated in Plone; no advanced features (like synonyms support); not very scalable. Apache Solr: based on the Java search library Apache Lucene; better results ranking; advanced features; more configurable; some clustering support (using Zookeper) 3
  • 8. PLONE SITE SEARCHPLONE SITE SEARCH ZCatalog: fully integrated in Plone; no advanced features (like synonyms support); not very scalable. Apache Solr: based on the Java search library Apache Lucene; better results ranking; advanced features; more configurable; some clustering support (using Zookeper) Elasticsearch: based (again) on Lucene; similar search features of Solr great scalability; less XML, more JSON. 3
  • 9. PLONE SITE SEARCHPLONE SITE SEARCH ZCatalog: fully integrated in Plone; no advanced features (like synonyms support); not very scalable. Apache Solr: collective.solr, alm.solrindex based on the Java search library Apache Lucene; better results ranking; advanced features; more configurable; some clustering support (using Zookeper) Elasticsearch: collective.elasticsearch based (again) on Lucene; similar search features of Solr great scalability; less XML, more JSON. 3
  • 10. ELASTIC STACKELASTIC STACK Also know as ELK: Elasticsearch, Logstash, Kibana, Beats. Two main class of use cases: Almost static data: search engines, Time series data: logs and metrics. 4
  • 13. INDEX A DOCUMENT
      Request:
        POST plone/_doc
        {
          "title": "Getting started with plone and Elasticsearch",
          "author": "Enrico Polesel",
          "content": "We want to index the entire content of our Plone website into elasticsearch...",
          "tags": ["plone", "search", "elasticsearch", "cluster", "performance", "high availability"],
          "date": "2019-10-25T11:50:00+0200"
        }
      Response:
        {
          "_index" : "plone",
          "_type" : "_doc",
          "_id" : "Y0MZ7W0B3-sU3YTrncfM",
          "_version" : 1,
          "result" : "created",
          "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 },
          "_seq_no" : 0,
          "_primary_term" : 1
        }
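The indexing request above is a plain HTTP POST with a JSON body, so it can be issued from any language. The following is a minimal Python sketch using only the standard library; the helper name `build_index_request` and the base URL `http://localhost:9200` are assumptions for illustration, not part of the slides. It only builds the request, so it can be inspected without a running node.

```python
import json
import urllib.request


def build_index_request(doc, index="plone", base_url="http://localhost:9200"):
    """Build (but do not send) the HTTP request for `POST <index>/_doc`.

    `base_url` is an assumed local node address; adjust it to your setup.
    """
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The document from the slide.
doc = {
    "title": "Getting started with plone and Elasticsearch",
    "author": "Enrico Polesel",
    "content": "We want to index the entire content of our Plone website into elasticsearch...",
    "tags": ["plone", "search", "elasticsearch", "cluster", "performance", "high availability"],
    "date": "2019-10-25T11:50:00+0200",
}
request = build_index_request(doc)

# To actually index the document against a running node, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["result"])
```

Because no `_id` is given in the URL, Elasticsearch generates one, as seen in the `_id` field of the response.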
  • 14. DATA TYPES short, long, float, double; ip; geo_point; range types (integer_range, date_range, ...); keyword (not analyzed strings); text (analyzed strings); object, array, nested object, ...
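To make the keyword/text distinction concrete, here is a hedged sketch of an explicit mapping for the example document, written as the Python dict that would be sent as the body of `PUT plone` when creating the index. The choice of types per field is an assumption made for illustration; the slides do not show a mapping.

```python
import json


def build_plone_mapping():
    """Return an index-creation body with an explicit mapping (illustrative choice of types)."""
    return {
        "mappings": {
            "properties": {
                # Analyzed strings: full-text searchable.
                "title": {"type": "text"},
                "content": {"type": "text"},
                # Not analyzed strings: exact match, usable in aggregations.
                "author": {"type": "keyword"},
                # There is no dedicated array type: any field may hold several values.
                "tags": {"type": "keyword"},
                "date": {"type": "date"},
            }
        }
    }


mapping = build_plone_mapping()
body = json.dumps(mapping, indent=2)
```

Mapping `author` and `tags` as `keyword` is what would let them be used as bucket keys in a terms aggregation.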
  • 22. ANALYZERS
      1. Char filters: convert HTML escape codes; normalize Unicode symbols; replace patterns.
      2. Tokenizer: separate on whitespace; separate on punctuation; may be grammar based; may generate partial words; special tokenizers for special strings (like paths).
      3. Token filters: normalize tokens; stemming; remove stopwords; translate synonyms.
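The three analyzer stages can be imitated in a few lines of plain Python. This is a toy sketch of the pipeline idea only, not how Elasticsearch or Lucene implement it; the stopword list and the synonym table are hypothetical, and stemming is left out for brevity.

```python
import html
import re
import unicodedata

# Hypothetical, deliberately tiny word lists for the example.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "with"}
SYNONYMS = {"es": "elasticsearch"}


def char_filter(text):
    """Stage 1: decode HTML escape codes and normalize Unicode symbols."""
    return unicodedata.normalize("NFKC", html.unescape(text))


def tokenize(text):
    """Stage 2: split on whitespace and punctuation (runs of word characters)."""
    return re.findall(r"\w+", text)


def token_filters(tokens):
    """Stage 3: lowercase, drop stopwords, map synonyms to one canonical term."""
    result = []
    for token in tokens:
        token = token.lower()
        if token in STOPWORDS:
            continue
        result.append(SYNONYMS.get(token, token))
    return result


def analyze(text):
    """Run the three stages in order."""
    return token_filters(tokenize(char_filter(text)))


tokens = analyze("Getting started with Plone &amp; ES")
print(tokens)  # ['getting', 'started', 'plone', 'elasticsearch']
```

The same analysis is applied to both indexed text and query text, which is why a search for "ES" could match a document that only contains "Elasticsearch" under this synonym table.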
  • 23. QUERY - MATCH
      Request:
        GET plone/_search
        {
          "query": {
            "match": { "content": "elasticsearch" }
          }
        }
      Response:
        {
          ...
          "hits" : {
            "total" : { ... },
            "max_score" : 0.2876821,
            "hits" : [
              {
                "_index" : "plone",
                "_id" : "Y0MZ7W0B3-sU3YTrncfM",
                "_score" : 0.2876821,
                "_source" : {
                  "title" : "Getting started with plone and Elasticsearch",
                  ...
                }
              }
            ]
          }
        }
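A search response nests the matching documents under `hits.hits`, each with its `_id`, `_score` and original `_source`. The sketch below shows one way to flatten that structure in Python; the helper name `summarize_hits` is hypothetical, and the sample response is the one from the slide with the elided parts simply omitted.

```python
def summarize_hits(response):
    """Return (id, score, title) tuples for every hit in a search response."""
    return [
        (hit["_id"], hit["_score"], hit["_source"].get("title"))
        for hit in response["hits"]["hits"]
    ]


# The response from the slide, without the elided ("...") parts.
response = {
    "hits": {
        "max_score": 0.2876821,
        "hits": [
            {
                "_index": "plone",
                "_id": "Y0MZ7W0B3-sU3YTrncfM",
                "_score": 0.2876821,
                "_source": {"title": "Getting started with plone and Elasticsearch"},
            }
        ],
    }
}

for doc_id, score, title in summarize_hits(response):
    print(f"{score:.4f}  {doc_id}  {title}")
```

Hits arrive sorted by descending `_score` by default, so the first tuple is the most relevant result.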
  • 24. QUERY - FUZZY MATCH
      With distance 1 we have:
        changing a character (box → fox);
        removing a character (black → lack);
        inserting a character (sic → sick);
        transposing two adjacent characters (act → cat).
      GET plone/_search
      {
        "query": {
          "match": {
            "content": { "query": "ploMe", "fuzziness": 1 }
          }
        }
      }
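The four edit operations listed above define an edit distance in which an adjacent transposition counts as a single step. A minimal Python sketch of that metric (the optimal string alignment variant of Damerau-Levenshtein distance) follows. It only illustrates the distance itself; it is not a claim about how Lucene evaluates fuzzy queries internally.

```python
def edit_distance(a, b):
    """Optimal string alignment distance: substitutions, deletions,
    insertions and adjacent transpositions each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Adjacent transposition, e.g. "ac" <-> "ca".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


for left, right in [("box", "fox"), ("black", "lack"), ("sic", "sick"), ("act", "cat")]:
    print(left, right, edit_distance(left, right))  # each pair prints distance 1
```

With this metric the misspelled query term "plome" is at distance 1 from "plone", so it falls within `"fuzziness": 1`.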
  • 25. QUERY - MULTI MATCH
      Matches in the title field will be boosted!
      GET plone/_search
      {
        "query": {
          "multi_match": {
            "query": "plome",
            "fields": [ "title^2", "content" ],
            "fuzziness": 1
          }
        }
      }
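The `field^boost` notation is just a string convention, so a site search form can assemble the query body programmatically. The following Python sketch does that; the helper name `multi_match_query` and its parameters are hypothetical, chosen for illustration.

```python
def multi_match_query(text, boosts, fuzziness=None):
    """Build a multi_match search body.

    `boosts` maps field name -> boost factor; a boost of 1 is written
    as the bare field name, anything else as "field^boost".
    """
    fields = [
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in boosts.items()
    ]
    body = {"query": text, "fields": fields}
    if fuzziness is not None:
        body["fuzziness"] = fuzziness
    return {"query": {"multi_match": body}}


# Matches in "title" count twice as much as matches in "content".
query = multi_match_query("plome", {"title": 2, "content": 1}, fuzziness=1)
```

Keeping the boosts in a dict makes it easy to tune relevance per field without touching the rest of the query.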
  • 26. QUERY And much more! Suggestions, search as you type, geo queries, external ranking, more like this, ...
  • 27. AGGREGATIONS
      GET plone/_search
      {
        "query": {
          "match": { "content": "elasticsearch" }
        },
        "aggs": {
          "Authors": {
            "terms": { "field": "author", "size": 10 }
          }
        }
      }
  • 28. AGGREGATIONS
      GET plone/_search
      {
        "query": {
          "match": { "content": "elasticsearch" }
        },
        "aggs": {
          "Authors": {
            "terms": { "field": "author", "size": 10 },
            "aggs": {
              "Tags": {
                "terms": { "field": "tags", "size": 100 }
              }
            }
          }
        }
      }
  • 29. AGGREGATIONS
      GET plone/_search
      {
        "query": { ... },
        "aggs": {
          "Authors": {
            "terms": { "field": "author", "size": 10 },
            "aggs": {
              "Avg-length": { "avg": { "field": "length" } },
              "Last-published": { "max": { "field": "date" } }
            }
          }
        }
      }
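Conceptually, a terms aggregation with metric sub-aggregations groups the matching documents into buckets and computes one value per bucket. The plain-Python sketch below mimics the "Authors" example in memory to show the idea. The sample documents are invented for illustration, and the output shape is simplified compared with a real Elasticsearch response.

```python
from collections import defaultdict


def aggregate_by_author(docs, size=10):
    """Bucket documents by author; compute average length and latest date per bucket."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc["author"]].append(doc)

    buckets = []
    for author, items in groups.items():
        buckets.append({
            "key": author,
            "doc_count": len(items),
            "Avg-length": sum(d["length"] for d in items) / len(items),
            # ISO dates in the same format compare correctly as strings.
            "Last-published": max(d["date"] for d in items),
        })
    # Largest buckets first, ties broken by key; keep the top `size`.
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets[:size]


# Hypothetical sample data (not from the slides).
docs = [
    {"author": "Enrico Polesel", "length": 1200, "date": "2019-10-25"},
    {"author": "Enrico Polesel", "length": 800, "date": "2019-09-01"},
    {"author": "Jane Doe", "length": 500, "date": "2019-08-15"},
]
buckets = aggregate_by_author(docs)
```

In Elasticsearch the same computation runs on the shards over the query's result set, so only the small bucket summary travels back to the client.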
  • 30. AGGREGATIONS And much more! Advanced stats, geo centroid, cardinality, significant terms, ...
  • 32. RUNNING ELASTICSEARCH
      Configuration files: config/elasticsearch.yml, config/jvm.options
      Docker, yum/apt, Windows and macOS are also supported! See https://www.elastic.co/downloads/
      Elasticsearch:
        $ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.4.0-linux-x86_64.tar.gz
        $ tar -xf elasticsearch-7.4.0-linux-x86_64.tar.gz
        $ cd elasticsearch-7.4.0
        $ bin/elasticsearch
      Kibana (in a second terminal):
        $ wget https://artifacts.elastic.co/downloads/kibana/kibana-7.4.0-linux-x86_64.tar.gz
        $ tar -xf kibana-7.4.0-linux-x86_64.tar.gz
        $ cd kibana-7.4.0-linux-x86_64
        $ bin/kibana
  • 33. CLUSTERING
      Need high availability? Install two data nodes! (Replication is enabled by default.)
      Need more space? Increase the number of nodes! (And of indices/shards.)
      Need more search performance? Increase the number of replicas!
      Have disks of different types (fast/slow)? Use a hot-cold architecture!
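The shard and replica counts mentioned above are ordinary index settings. As a rough sketch, the Python helpers below build such a settings body and compute how many shard copies the cluster has to place; the function names are hypothetical, and the defaults of one primary shard and one replica reflect Elasticsearch 7.x.

```python
def index_settings(primary_shards=1, replicas=1):
    """Body for creating an index with explicit shard and replica counts."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


def total_shard_copies(primary_shards, replicas):
    """Each primary shard exists once plus once per replica."""
    return primary_shards * (1 + replicas)


# Default 7.x layout: 1 primary + 1 replica = 2 copies.
default_copies = total_shard_copies(1, 1)

# More space: more primaries. More search throughput: more replicas.
bigger = index_settings(primary_shards=3, replicas=2)
bigger_copies = total_shard_copies(3, 2)
```

Since a replica is never allocated on the same node as its primary, a single-node cluster cannot place the replica, which is consistent with the `"total": 2, "successful": 1` seen in the indexing response earlier in the deck.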
  • 34. WHAT'S NEXT? ELASTIC APP SEARCH
  • 35. HOMEWORK
      Download Elasticsearch from https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.4.tar.gz
      Untar, cd, and run Elasticsearch (bin/elasticsearch)
      Test it: curl http://localhost:9200/
      Add collective.elasticsearch to your project eggs & re-run buildout
      Restart Plone
      Go to Control Panel
      Add "Elastic Search" in Add-on Products
      Click "Elastic Search" in "Add-on Configuration"
      Enable
      Click "Convert Catalog"
      Click "Rebuild Catalog"
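The `curl http://localhost:9200/` check in the homework can also be done from Python, for example from a Plone debug session. The sketch below is a hedged illustration: the function names are hypothetical, and the sample payload is a trimmed, hand-written stand-in for what the root endpoint returns, not captured output.

```python
import json
import urllib.request


def parse_node_info(payload):
    """Extract (version number, cluster name) from the root endpoint's JSON."""
    info = json.loads(payload)
    return info["version"]["number"], info.get("cluster_name")


def fetch_node_info(url="http://localhost:9200/"):
    """Query a running node; raises URLError if nothing is listening."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_node_info(response.read())


# Hand-written sample payload (assumption), trimmed to the fields used here.
sample = json.dumps({
    "name": "node-1",
    "cluster_name": "elasticsearch",
    "version": {"number": "6.8.4"},
    "tagline": "You Know, for Search",
})
version, cluster = parse_node_info(sample)
```

Calling `fetch_node_info()` after starting `bin/elasticsearch` should return the version of the node you just downloaded.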