Faster and better search results with Elasticsearch

FASTER AND BETTER SEARCH RESULTS WITH ELASTICSEARCH
TAKE YOUR SITE-WIDE SEARCHES TO THE NEXT LEVEL

WEB SITE SEARCH
Search across different fields (title, content, ...);
show relevant results first;
categorize results;
filter by various attributes;
withstand user typos;
treat synonyms as the same word;
be scalable;
be fault tolerant;
easy to deploy.

PLONE SITE SEARCH
ZCatalog:
  fully integrated in Plone;
  no advanced features (like synonym support);
  not very scalable.
Apache Solr (collective.solr, alm.solrindex):
  based on the Java search library Apache Lucene;
  better results ranking;
  advanced features;
  more configurable;
  some clustering support (using ZooKeeper).
Elasticsearch (collective.elasticsearch):
  based (again) on Lucene;
  search features similar to Solr's;
  great scalability;
  less XML, more JSON.

ELASTIC STACK
Also known as ELK:
  Elasticsearch,
  Logstash,
  Kibana,
  Beats.
Two main classes of use cases:
  Almost static data: search engines,
  Time series data: logs and metrics.

INDEX A DOCUMENT
Request:
POST plone/_doc
{
  "title": "Getting started with plone and Elasticsearch",
  "author": "Enrico Polesel",
  "content": "We want to index the entire content of our Plone website into elasticsearch...",
  "tags": ["plone", "search", "elasticsearch", "cluster", "performance", "high availability"],
  "date": "2019-10-25T11:50:00+0200"
}
Response:
{
  "_index" : "plone",
  "_type" : "_doc",
  "_id" : "Y0MZ7W0B3-sU3YTrncfM",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
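Outside the Kibana console, the same request can be sent as plain HTTP. The following is a minimal sketch using only the Python standard library. It assumes a node listening on http://localhost:9200 (the address used in the homework slide), and the helper name build_index_request is made up for this example. The request is only built here; sending it is left as a comment because it needs a running node.

```python
import json
import urllib.request


def build_index_request(base_url, index, document):
    """Build (but do not send) the HTTP request for POST <index>/_doc."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_doc",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


doc = {
    "title": "Getting started with plone and Elasticsearch",
    "author": "Enrico Polesel",
    "tags": ["plone", "search", "elasticsearch"],
    "date": "2019-10-25T11:50:00+0200",
}
# Assumption: a local node on the default port.
request = build_index_request("http://localhost:9200", "plone", doc)

# To actually index the document (requires a running node):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["result"])
```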

DATA TYPES
short, long,
float, double
ip
geo_point
range types (integer_range, date_range, ...)
keyword (not analyzed strings),
text (analyzed strings),
object, array, nested object,
...

ANALYZERS
1. Char filters
  convert HTML escape codes
  normalize unicode symbols
  replace patterns
2. Tokenizer
  split on whitespace
  split on punctuation
  may be grammar based
  may generate partial words
  special tokenizers for special strings (like paths)
3. Token filters
  normalize tokens
  stemming
  remove stopwords
  translate synonyms
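The three stages can be imitated in a few lines of plain Python. This is only a toy sketch of the pipeline idea, not how Elasticsearch or Lucene implement analyzers: the stopword list, the synonym mapping and the plural-stripping "stemmer" are all made up for illustration.

```python
import html
import re

STOPWORDS = {"the", "a", "of", "and"}  # tiny illustrative list
SYNONYMS = {"cms": "plone"}            # made-up synonym mapping


def char_filter(text):
    """1. Char filter: convert HTML escape codes to plain characters."""
    return html.unescape(text)


def tokenizer(text):
    """2. Tokenizer: split on whitespace and punctuation."""
    return re.findall(r"\w+", text)


def token_filters(tokens):
    """3. Token filters: lowercase, drop stopwords, map synonyms, crude stemming."""
    result = []
    for token in tokens:
        token = token.lower()
        if token in STOPWORDS:
            continue
        token = SYNONYMS.get(token, token)
        if token.endswith("s") and len(token) > 3:
            token = token[:-1]  # naive plural stripping, not a real stemmer
        result.append(token)
    return result


def analyze(text):
    """Run the three stages in order."""
    return token_filters(tokenizer(char_filter(text)))


tokens = analyze("The CMS &amp; the search engines")
# tokens == ["plone", "search", "engine"]
```

Because the same analysis is applied to both the indexed text and the query text, a search for "engine" can match a document that only contains "engines".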

QUERY - MATCH
Request:
GET plone/_search
{
  "query": {
    "match": {
      "content": "elasticsearch"
    }
  }
}
Response:
{
  ...
  "hits" : {
    "total" : { ... },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "plone",
        "_id" : "Y0MZ7W0B3-sU3YTrncfM",
        "_score" : 0.2876821,
        "_source" : {
          "title" : "Getting started with plone and Elasticsearch",
          ...
        }
      }
    ]
  }
}

QUERY - FUZZY MATCH
With distance 1 we have:
  Changing a character (box → fox)
  Removing a character (black → lack)
  Inserting a character (sic → sick)
  Transposing two adjacent characters (act → cat)
GET plone/_search
{
  "query": {
    "match": {
      "content": {
        "query": "ploMe",
        "fuzziness": 1
      }
    }
  }
}
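The four operations listed above define an edit distance in which a transposition of adjacent characters counts as a single edit. A minimal Python sketch of that distance (the "optimal string alignment" variant) is given below. It only illustrates the definition; Lucene does not compute fuzzy matches this way internally.

```python
def edit_distance(a, b):
    """Optimal string alignment distance: substitutions, deletions,
    insertions and transpositions of adjacent characters each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or match)
            )
            if (i > 1 and j > 1
                    and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


# The slide's examples, plus the typo used in the query (lowercased).
pairs = [("box", "fox"), ("black", "lack"), ("sic", "sick"),
         ("act", "cat"), ("plome", "plone")]
distances = [edit_distance(x, y) for x, y in pairs]
```

Every pair above is at distance 1, so each would be accepted by a query with "fuzziness": 1.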

QUERY - MULTI MATCH
Matches in the title field will be boosted!
GET plone/_search
{
  "query": {
    "multi_match": {
      "query": "plome",
      "fields": [ "title^2", "content" ],
      "fuzziness": 1
    }
  }
}
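To see what the ^2 suffix does, here is a toy Python sketch of how per-field scores could be combined. It assumes the default multi_match behaviour (best_fields, where the best-scoring field wins) and uses made-up per-field scores; real scores come from Lucene's relevance model. The helper names are hypothetical.

```python
def parse_field(spec):
    """Split a field spec such as "title^2" into (name, boost)."""
    name, _, boost = spec.partition("^")
    return name, float(boost) if boost else 1.0


def combined_score(field_specs, per_field_scores):
    """Multiply each field's score by its boost and keep the best one."""
    best = 0.0
    for spec in field_specs:
        name, boost = parse_field(spec)
        best = max(best, per_field_scores.get(name, 0.0) * boost)
    return best


fields = ["title^2", "content"]
# Made-up scores: the same raw match quality, once in each field.
title_hit = combined_score(fields, {"title": 0.25})
content_hit = combined_score(fields, {"content": 0.25})
```

With these numbers the document matching in the title ends up with twice the score of the one matching only in the content, so it is listed first.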

QUERY
And much more!
  Suggestions,
  search as you type,
  geo query,
  external ranking,
  more like this,
  ...

AGGREGATIONS
GET plone/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "Authors": {
      "terms": {
        "field": "author",
        "size": 10
      }
    }
  }
}

GET plone/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "Authors": {
      "terms": {
        "field": "author",
        "size": 10
      },
      "aggs": {
        "Tags": {
          "terms": {
            "field": "tags",
            "size": 100
          }
        }
      }
    }
  }
}

GET plone/_search
{
  "query": { ... },
  "aggs": {
    "Authors": {
      "terms": {
        "field": "author",
        "size": 10
      },
      "aggs": {
        "Avg-length": {
          "avg": {
            "field": "length"
          }
        },
        "Last-published": {
          "max": {
            "field": "date"
          }
        }
      }
    }
  }
}
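Conceptually, a terms aggregation groups the matching documents into one bucket per value, and the nested aggregations are then computed inside each bucket. The plain-Python sketch below mimics the last request on a handful of made-up documents. It assumes "length" is a numeric field and "date" an ISO date string; the function name is hypothetical, and this is an illustration of the idea, not of how Elasticsearch executes it.

```python
from collections import defaultdict

# Hypothetical documents, invented for this example.
docs = [
    {"author": "alice", "length": 100, "date": "2019-10-01"},
    {"author": "alice", "length": 300, "date": "2019-10-25"},
    {"author": "bob", "length": 50, "date": "2019-09-15"},
]


def aggregate_by_author(documents, size=10):
    """Bucket by author, then compute avg(length) and max(date) per bucket."""
    buckets = defaultdict(list)
    for doc in documents:
        buckets[doc["author"]].append(doc)
    # Like a terms aggregation: biggest buckets first, keep at most `size`.
    ordered = sorted(buckets.items(), key=lambda item: len(item[1]), reverse=True)
    return [
        {
            "key": author,
            "doc_count": len(group),
            "Avg-length": sum(d["length"] for d in group) / len(group),
            "Last-published": max(d["date"] for d in group),  # ISO dates sort as text
        }
        for author, group in ordered[:size]
    ]


result = aggregate_by_author(docs)
```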

And much more!
Advanced stats,
geo centroid,
cardinality,
significant terms,
...

RUNNING ELASTICSEARCH
$ wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.4.0-linux-x86_64.tar.gz
$ tar -xf elasticsearch-7.4.0-linux-x86_64.tar.gz
$ cd elasticsearch-7.4.0
$ bin/elasticsearch
$ wget https://artifacts.elastic.co/downloads/kibana/kibana-7.4.0-linux-x86_64.tar.gz
$ tar -xf kibana-7.4.0-linux-x86_64.tar.gz
$ cd kibana-7.4.0-linux-x86_64
$ bin/kibana
Configuration files: config/elasticsearch.yml and config/jvm.options
Docker, yum/apt, Windows and macOS are also supported! See https://www.elastic.co/downloads/

CLUSTERING
Need high availability? Install two data nodes! (replica is enabled by default)
Need more space? Increase the number of nodes! (and of indices/shards)
Need more search performance? Increase the number of replicas!
Have disks of different types (fast/slow)? Use a hot-cold architecture!
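The arithmetic behind the first point can be sketched in Python. An index is controlled by the number_of_shards and number_of_replicas settings, and a replica is never placed on the same node as its primary. That is why the index response shown earlier reports 2 shard copies in total but only 1 successful on a single node. The function names below are made up for this example.

```python
def shard_copies(primaries, replicas):
    """Total shard copies for one index: each primary plus its replicas."""
    return primaries * (1 + replicas)


def nodes_for_full_allocation(replicas):
    """Copies of the same shard never share a node, so every copy is only
    assigned once there are at least replicas + 1 data nodes."""
    return replicas + 1


# Elasticsearch 7 defaults: one primary shard with one replica.
total = shard_copies(primaries=1, replicas=1)
needed = nodes_for_full_allocation(replicas=1)
```

With the defaults this gives 2 shard copies and 2 data nodes, matching the advice to install two data nodes for high availability.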

WHAT'S NEXT? ELASTIC APP SEARCH

HOMEWORK
Download Elasticsearch from https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.8.4.tar.gz
Untar, cd, and run Elasticsearch (bin/elasticsearch)
Test it: curl http://localhost:9200/
Add collective.elasticsearch to your project eggs and re-run buildout
Restart Plone
Go to the Control Panel
Add "Elastic Search" in Add-on Products
Click "Elastic Search" in "Add-on Configuration"
Enable
Click "Convert Catalog"
Click "Rebuild Catalog"
