elasticsearch basics
Mathieu Elie at Giroll
Tuesday, 17 December 2013
speaker : @mathieuel
• freelance & founder @oneplaylist
• full stack skills
• see what i’ve done on

• go from the first steps
• get over the first frustration
• give you the power needed to learn by

• make sure you have a Java runtime
• apt-get install openjdk-6-jre-headless -y
• consider the Oracle JVM

unzip and run!
## Get the latest stable archive
## Extract the archive
cd elasticsearch-0.90.7
## Run!
# This runs elasticsearch in the foreground.
./bin/elasticsearch -f

it's alive!
[2013-12-13 15:45:25,187][INFO ][node ] [Bridge, George Washington] version[0.90.7], pid[37998], build[36897d0/2013-11-13T12:06:54Z]
[2013-12-13 15:45:25,189][INFO ][node ] [Bridge, George Washington] initializing ...
[2013-12-13 15:45:25,202][INFO ][plugins ] [Bridge, George Washington] loaded [], sites []
[2013-12-13 15:45:28,342][INFO ][node ] [Bridge, George Washington]
[2013-12-13 15:45:28,342][INFO ][node ] [Bridge, George Washington] starting ...
[2013-12-13 15:45:28,491][INFO ][transport ] [Bridge, George Washington] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/]}
[2013-12-13 15:45:31,545][INFO ][cluster.service ] [Bridge, George Washington] new_master [Bridge, George Washington][pKCdh1b_TP2TlurO1gm4_g][inet[/]], reason: zen-disco-join (elected_as_master)
[2013-12-13 15:45:31,577][INFO ][discovery ] [Bridge, George Washington]
[2013-12-13 15:45:31,595][INFO ][http ] [Bridge, George Washington] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/]}
[2013-12-13 15:45:31,596][INFO ][node ] [Bridge, George Washington]
[2013-12-13 15:45:31,629][INFO ][gateway ] [Bridge, George Washington] recovered [0] indices into cluster_state

ping es on port 9200
{
  "ok" : true,
  "status" : 200,
  "name" : "Gideon, Gregory",
  "version" : {
    "number" : "0.90.6",
    "build_hash" : "e2a24efdde0cb7cc1b2071ffbbd1fd874a6d8d6b",
    "build_timestamp" : "2013-11-04T13:44:16Z",
    "build_snapshot" : false,
    "lucene_version" : "4.5.1"
  },
  "tagline" : "You Know, for Search"
}

Store a document
curl -XPUT http://localhost:9200/workshop/site/1 -d '{
  "url": "",
  "title": "Open Source Distributed Real Time Search & Analytics",
  "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.",
  "tags": ["Open Source", "elasticsearch", "Distributed"]
}'

retrieve the document
curl -XGET http://localhost:9200/workshop/site/1
"_source" : {
  "url": "",
  "title": "Open Source Distributed Real Time Search & Analytics",
  "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.",
  "tags": ["Open Source", "elasticsearch", "Distributed"]
}

add more documents
curl -XPUT http://localhost:9200/workshop/site/2 -d '{
  "url": "",
  "title": "Mathieu ELIE Freelance - Full Stack Data Engineer, Data ...",
  "description": "Freelance Consultant in Bordeaux, System & Software Architect. Love dataviz, redis, elasticsearch, architecture scalability recipes and playing with data.",
  "tags": ["elasticsearch", "Data Visualization"]
}'
curl -XPUT http://localhost:9200/workshop/site/3 -d '{
  "url": "",
  "title": "Collectif Giroll - Gironde Logiciels Libres",
  "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à 20h30 et organisation d''Install Party Linux tous les six",
  "tags": ["Open Source", "Collectif"]
}'

now search!

curl 'http://localhost:9200/workshop/_search?pretty=true'
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "workshop",
      "_type" : "site",
      "_id" : "1",
      "_score" : 1.0, "_source" : {
        "url": "",
        "title": "Open Source Distributed Real Time Search & Analytics",
        "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.",
        "tags": ["Open Source", "elasticsearch", "Distributed"]
      }
    }, {
      "_index" : "workshop",
      "_type" : "site",
      "_id" : "3",
      "_score" : 1.0, "_source" : {
        "url": "",
        "title": "Collectif Giroll - Gironde Logiciels Libres",
        "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à 20h30 et organisation dInstall Party Linux tous les six",
...

ok great, but now I want to search for text!

step 1: pass the query as a request body
curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{
  "query" : {
    "match_all" : { }
  }
}'

It returns all documents because we use the match_all query.

The match_all query is part of the query DSL.
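
For readers who would rather script this than type curl, here is a minimal Python sketch of the same idea: a query DSL fragment is wrapped in a JSON request body and attached to a POST. The helper name `build_search_request` is hypothetical (it is not part of any elasticsearch client), and the URL is simply the workshop one. Nothing is sent until `urlopen` is called against a node that is actually running.

```python
import json
import urllib.request

SEARCH_URL = "http://localhost:9200/workshop/site/_search?pretty=true"


def build_search_request(query, url=SEARCH_URL):
    """Wrap a query DSL fragment in a request body and prepare a POST."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request({"match_all": {}})
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To run it for real (assumes a local node listening on port 9200):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Swapping `{"match_all": {}}` for another query type changes only the fragment passed in; the surrounding body stays the same.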

so let's use the query_string query
curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{
  "query" : {
    "query_string" : {
      "query" : "elasticsearch"
    }
  }
}'

the result is quite verbose; let's get only the title and tags fields
curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{
  "fields" : ["title", "tags"],
  "query" : {
    "query_string" : {
      "query" : "elasticsearch"
    }
  }
}'

{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.081366636,
    "hits" : [ {
      "_index" : "workshop",
      "_type" : "site",
      "_id" : "1",
      "_score" : 0.081366636,
      "fields" : {
        "tags" : [ "Open Source", "elasticsearch", "Distributed" ],
        "title" : "Open Source Distributed Real Time Search & Analytics"
      }
    }, {
      "_index" : "workshop",
      "_type" : "site",
      "_id" : "2",
      "_score" : 0.06780553,
      "fields" : {
        "tags" : [ "elasticsearch", "Data Visualization" ],
        "title" : "Mathieu ELIE Freelance - Full Stack Data Engineer, Data
...

let's go for facets on tags!!

do you see the wall ??? ;)

Facets DSL
curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{
  "fields" : ["title", "tags"],
  "query" : {
    "query_string" : {
      "query" : "elasticsearch"
    }
  },
  "facets" : {
    "tags" : { "terms" : {"field" : "tags"} }
  }
}'

"facets" : {
  "tags" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 7,
    "other" : 0,
    "terms" : [ {
      "term" : "elasticsearch",
      "count" : 2
    }, {
      "term" : "visualization",
      "count" : 1
    }, {
      "term" : "source",
      "count" : 1
    }, {
      "term" : "open",
      "count" : 1
    }, {
      "term" : "distributed",
      "count" : 1
    }, {
      "term" : "data",
      "count" : 1
    } ]
  }
}

oh no!!
• hey! see "Open Source"!

it is lowercased
and split into multiple tokens!

• this is done by the default mapping and

curl 'http://localhost:9200/workshop/site/_mapping?pretty=true'
{
  "site" : {
    "properties" : {
      "description" : {
        "type" : "string"
      },
      "tags" : {
        "type" : "string"
      },
      "title" : {
        "type" : "string"
      },
      "url" : {
        "type" : "string"
      }
    }
  }
}

• tags is of type string and we have a default

• An analyzer of type standard is built using
the Standard Tokenizer with the Standard
Token Filter, Lower Case Token Filter, and
Stop Token Filter.
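
To make that pipeline concrete, here is a rough Python sketch of what a tokenizer followed by lowercase and stop filters does to a tag. It is only an approximation written for this workshop: the regular expression, the tiny stopword set, and the function name `standard_like_analyze` are assumptions, and the real analyzer applies its own tokenization rules and stopword list.

```python
import re

# Tiny illustrative stopword set (an assumption, not the analyzer's real list).
STOPWORDS = {"a", "an", "and", "the", "of", "to"}


def standard_like_analyze(text):
    """Rough stand-in for the standard analyzer.

    1. split the text into word tokens
    2. lowercase every token
    3. drop tokens found in the stopword set
    """
    tokens = re.findall(r"\w+", text)
    lowered = [token.lower() for token in tokens]
    return [token for token in lowered if token not in STOPWORDS]


print(standard_like_analyze("Open Source"))         # ['open', 'source']
print(standard_like_analyze("Data Visualization"))  # ['data', 'visualization']
```

Each tag becomes several lowercased terms, which is why the facet above counts "open" and "source" separately.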

test the default analyzer
curl -XGET 'localhost:9200/workshop/_analyze?pretty=true' -d 'Open Source'
{
  "tokens" : [ {
    "token" : "open",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "<ALPHANUM>",
    "position" : 1
  }, {
    "token" : "source",
    "start_offset" : 5,
    "end_offset" : 11,
    "type" : "<ALPHANUM>",
    "position" : 2
  } ]
}

• what about the keyword analyzer?

curl -XGET 'localhost:9200/workshop/_analyze?analyzer=keyword&pretty=true' -d 'Open Source'
{
  "tokens" : [ {
    "token" : "Open Source",
    "start_offset" : 0,
    "end_offset" : 11,
    "type" : "word",
    "position" : 1
  } ]
}

got it! now how do we apply this to our tags field?

curl -XPUT 'http://localhost:9200/workshop/site/_mapping?pretty=true' -d '{
  "site" : {
    "properties" : {
      "url" : {"type" : "string"},
      "title" : {"type" : "string"},
      "description" : {"type" : "string"},
      "tags" : {"type" : "string", "analyzer": "keyword" }
    }
  }
}'

{
  "error" : "MergeMappingException[Merge failed with failures {[mapper [tags] has different index_analyzer]}]",
  "status" : 400
}

oops! we need to drop something...
curl -XDELETE 'http://localhost:9200/workshop/'
# the index must exist before we can put a mapping
curl -XPUT 'http://localhost:9200/workshop/'

curl -XPUT 'http://localhost:9200/workshop/site/_mapping?pretty=true' -d '{
  "site" : {
    "properties" : {
      "url" : {"type" : "string"},
      "title" : {"type" : "string"},
      "description" : {"type" : "string"},
      "tags" : {"type" : "string", "analyzer": "keyword" }
    }
  }
}'


# test on the field analysis
curl -XGET 'localhost:9200/workshop/_analyze?pretty=true&field=site.tags' -d 'Open Source'
{
  "tokens" : [ {
    "token" : "Open Source",
    "start_offset" : 0,
    "end_offset" : 11,
    "type" : "word",
    "position" : 1
  } ]
}
# congrats!

# let's push the data again
curl -XPUT http://localhost:9200/workshop/site/1 -d '{
  "url": "",
  "title": "Open Source Distributed Real Time Search & Analytics",
  "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.",
  "tags": ["Open Source", "elasticsearch", "Distributed"]
}'

curl -XPUT http://localhost:9200/workshop/site/2 -d '{
  "url": "",
  "title": "Mathieu ELIE Freelance - Full Stack Data Engineer, Data ...",
  "description": "Freelance Consultant in Bordeaux, System & Software Architect. Love dataviz, redis, elasticsearch, architecture scalability recipes and playing with data.",
  "tags": ["elasticsearch", "Data Visualization"]
}'

curl -XPUT http://localhost:9200/workshop/site/3 -d '{
  "url": "",
  "title": "Collectif Giroll - Gironde Logiciels Libres",
  "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à 20h30 et organisation d''Install Party Linux tous les six",
  "tags": ["Open Source", "Collectif"]
}'

# faceting ok???
curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{
  "fields" : ["title", "tags"],
  "query" : {
    "query_string" : {
      "query" : "elasticsearch"
    }
  },
  "facets" : {
    "tags" : { "terms" : {"field" : "tags"} }
  }
}'

"facets" : {
  "tags" : {
    "_type" : "terms",
    "missing" : 0,
    "total" : 5,
    "other" : 0,
    "terms" : [ {
      "term" : "elasticsearch",
      "count" : 2
    }, {
      "term" : "Open Source",
      "count" : 1
    }, {
      "term" : "Distributed",
      "count" : 1
    }, {
      "term" : "Data Visualization",
      "count" : 1
    } ]
  }
}

cool! our facets contain whole tags! great job!!
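
To see why the analyzer changes the facet counts, here is a small Python sketch that counts terms over the tags of the two documents matching "elasticsearch". It is a simulation under simplifying assumptions, not how elasticsearch computes facets internally: `split_and_lowercase` and `whole_tag` are hypothetical stand-ins for the default and keyword analysis, and `terms_facet` just counts documents per term.

```python
import re
from collections import Counter

# Tags of the two documents that match the "elasticsearch" query.
MATCHING_TAGS = [
    ["Open Source", "elasticsearch", "Distributed"],
    ["elasticsearch", "Data Visualization"],
]


def split_and_lowercase(tag):
    """Approximates the default analysis: one lowercased token per word."""
    return [word.lower() for word in re.findall(r"\w+", tag)]


def whole_tag(tag):
    """Approximates the keyword analyzer: the tag stays a single term."""
    return [tag]


def terms_facet(docs, analyze):
    """Count, for each term, how many documents contain it."""
    counts = Counter()
    for tags in docs:
        terms = {term for tag in tags for term in analyze(tag)}
        counts.update(terms)
    return counts


before = terms_facet(MATCHING_TAGS, split_and_lowercase)
after = terms_facet(MATCHING_TAGS, whole_tag)
print(sum(before.values()), dict(before))
print(sum(after.values()), dict(after))
```

On this data the sketch gives 7 terms with the split tokens and 5 with whole tags, the same totals as the two facet responses in the slides.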

if we want only the docs with the "Open Source" tag,
we use filters,
and here the term filter

curl -XGET 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{
  "query" : {
    "match_all" : { }
  },
  "filter" : {
    "term" : { "tags" : "Open Source"}
  }
}'

• more efficient than full text search
• cached / indexed
• you can filter using facet items
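
As a rough illustration of what the term filter does with our keyword-analyzed tags, the Python sketch below keeps only the documents whose tags contain the exact value. It is an in-memory simulation, not the engine's implementation; `DOCS` mirrors the three workshop documents and `term_filter` is a hypothetical helper.

```python
# Tags of the three workshop documents, keyed by document id.
DOCS = {
    "1": ["Open Source", "elasticsearch", "Distributed"],
    "2": ["elasticsearch", "Data Visualization"],
    "3": ["Open Source", "Collectif"],
}


def term_filter(docs, value):
    """Return the ids of documents whose tags contain the exact term."""
    return sorted(doc_id for doc_id, tags in docs.items() if value in tags)


print(term_filter(DOCS, "Open Source"))  # ['1', '3']
print(term_filter(DOCS, "open source"))  # []
```

The second call returns nothing because the comparison is exact: with the keyword analyzer the stored term is "Open Source", so the filter value has to match it character for character. That is also why a facet item can be passed straight back as a filter value.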

• the elasticsearch doc is great
• but it is exhaustive
• so at the beginning it's a bit frustrating

Think about JSON
curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{
  "fields" : ["title", "tags"],
  "query" : {
    "query_string" : {
      "query" : "elasticsearch"
    }
  },
  "facets" : {
    "tags" : { "terms" : {"field" : "tags"} }
  }
}'

In this one request:
• you're hitting the search API
• you're using the query DSL
• you're using different types of queries
• this query is of type query_string, with its query parameter set to elasticsearch
• we also use faceting
• we use a terms facet

• common mistake: the code examples in the doc do not always show the whole query

• so you should place the code from the doc back into the whole DSL hierarchy

• think about the hierarchy and everything becomes clearer
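
One way to practice this is to keep each documentation fragment as its own small object and then place it under the right key of the full request body. The Python sketch below does that for the workshop query; the variable names are illustrative assumptions, and the body is the same one used in the faceting slides.

```python
import json

# Fragments as they tend to appear, out of context, in the documentation.
query_fragment = {"query_string": {"query": "elasticsearch"}}
facet_fragment = {"terms": {"field": "tags"}}

# The whole request body: every fragment sits under its top-level key.
search_body = {
    "fields": ["title", "tags"],
    "query": query_fragment,             # query DSL goes under "query"
    "facets": {"tags": facet_fragment},  # a named facet goes under "facets"
}

print(json.dumps(search_body, indent=2))
```

The printed JSON can be pasted as the `-d` argument of the curl command, which makes it easy to check where a new fragment from the doc belongs.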

the end for me...

the beginning for you...

questions and more
• twitter @mathieuel
• contact on my freelance website
• thanks to giroll for hosting this workshop !
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar

Dernier (20)

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club

elasticsearch basics workshop

  • 1. elasticsearch basics workshop, Mathieu Elie at Giroll, Tuesday 17 December 2013
  • 2. speaker : @mathieuel • freelance & founder @oneplaylist • full stack skills • see what i’ve done on
  • 3. goal • go from first steps • get over the first frustration • give you the power needed to learn by yourself
  • 4. install • be sure you have a Java runtime • apt-get install openjdk-6-jre-headless -y • consider the Oracle JVM
  • 5. unzip and run ! ## Get the latest stable archive wget elasticsearch/ ## Extract the archive unzip cd elasticsearch-0.90.7 ## run ! # This will run elasticsearch in the foreground. ./bin/elasticsearch -f
  • 6. it's alive ! [2013-12-13 15:45:25,187][INFO ][node ] [Bridge, George Washington] version[0.90.7], pid[37998], build[36897d0/2013-11-13T12:06:54Z] [2013-12-13 15:45:25,189][INFO ][node ] [Bridge, George Washington] initializing ... [2013-12-13 15:45:25,202][INFO ][plugins ] [Bridge, George Washington] loaded [], sites [] [2013-12-13 15:45:28,342][INFO ][node ] [Bridge, George Washington] initialized [2013-12-13 15:45:28,342][INFO ][node ] [Bridge, George Washington] starting ... [2013-12-13 15:45:28,491][INFO ][transport ] [Bridge, George Washington] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/]} [2013-12-13 15:45:31,545][INFO ][cluster.service ] [Bridge, George Washington] new_master [Bridge, George Washington][pKCdh1b_TP2TlurO1gm4_g][inet[/]], reason: zen-disco-join (elected_as_master) [2013-12-13 15:45:31,577][INFO ][discovery ] [Bridge, George Washington] elasticsearch/pKCdh1b_TP2TlurO1gm4_g [2013-12-13 15:45:31,595][INFO ][http ] [Bridge, George Washington] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/]} [2013-12-13 15:45:31,596][INFO ][node ] [Bridge, George Washington] started [2013-12-13 15:45:31,629][INFO ][gateway ] [Bridge, George Washington] recovered [0] indices into cluster_state
  • 7. ping es on port 9200 curl { "ok" : true, "status" : 200, "name" : "Gideon, Gregory", "version" : { "number" : "0.90.6", "build_hash" : "e2a24efdde0cb7cc1b2071ffbbd1fd874a6d8d6b", "build_timestamp" : "2013-11-04T13:44:16Z", "build_snapshot" : false, "lucene_version" : "4.5.1" }, "tagline" : "You Know, for Search" }%
  • 8. Store a Document curl -XPUT http://localhost:9200/workshop/site/1 -d ' { "url": "", "title": "Open Source Distributed Real Time Search & Analytics", "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.", "tags": ["Open Source", "elasticsearch", "Distributed"] }' {"ok":true,"_index":"workshop","_type":"sites","_id":"1","_version":1}%
  • 9. retrieve the document curl -XGET http://localhost:9200/workshop/site/1 {"_index":"workshop","_type":"site","_id":"1","_version":2,"exists":true, "_source" : { "url": "", "title": "Open Source Distributed Real Time Search & Analytics", "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.", "tags": ["Open Source", "elasticsearch", "Distributed"] }}%
  • 10. add more documents curl -XPUT http://localhost:9200/workshop/site/2 -d ' { "url": "", "title": "Mathieu ELIE Freelance - Full Stack Data Engineer, Data Visualization", "description": "Freelance Consultant in Bordeaux, System & Software Architect. Love dataviz, redis, elasticsearch, architecture scalability recipes and playing with data.", "tags": ["elasticsearch", "Data Visualization"] }' curl -XPUT http://localhost:9200/workshop/site/3 -d ' { "url": "", "title": "Collectif Giroll - Gironde Logiciels Libres", "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à 20h30 et organisation d''Install Party Linux tous les six", "tags": ["Open Source", "Collectif"] }'
  • 11. now search !
  • 12. curl 'http://localhost:9200/workshop/_search?pretty=true' { "took" : 1, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 3, "max_score" : 1.0, "hits" : [ { "_index" : "workshop", "_type" : "site", "_id" : "1", "_score" : 1.0, "_source" : { "url": "", "title": "Open Source Distributed Real Time Search & Analytics", "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.", "tags": ["Open Source", "elasticsearch", "Distributed"] } }, { "_index" : "workshop", "_type" : "site", "_id" : "3", "_score" : 1.0, "_source" : { "url": "", "title": "Collectif Giroll - Gironde Logiciels Libres", "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à 20h30 et organisation dInstall Party Linux tous les six",
  • 13. ok great, but now i want to search for text !
  • 14. step 1 : pass query as a request body curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "query" : { "match_all" : { } } }'
  • 15. It returns all documents because we use the match_all query: reference/current/query-dsl-match-all-query.html
  • 16. the match_all query is part of the query DSL: reference/current/query-dsl-queries.html
  • 17. so let's use the query_string query curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "query" : { "query_string" : { "query" : "elasticsearch" } } }'
  • 18. the result is quite verbose, let's get only the title and tags fields curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } } }'
  • 19. { "took" : 6, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 2, "max_score" : 0.081366636, "hits" : [ { "_index" : "workshop", "_type" : "site", "_id" : "1", "_score" : 0.081366636, "fields" : { "tags" : [ "Open Source", "elasticsearch", "Distributed" ], "title" : "Open Source Distributed Real Time Search & Analytics" } }, { "_index" : "workshop", "_type" : "site", "_id" : "2", "_score" : 0.06780553, "fields" : { "tags" : [ "elasticsearch", "Data Visualization" ], "title" : "Mathieu ELIE Freelance - Full Stack Data Engineer, Data Visualization" }
  • 20. let's go for facets on tags !! reference/current/search-facets.html do you see the wall ??? ;)
  • 21. Facets dsl curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 22. "facets" : { "tags" : { "_type" : "terms", "missing" : 0, "total" : 7, "other" : 0, "terms" : [ { "term" : "elasticsearch", "count" : 2 }, { "term" : "visualization", "count" : 1 }, { "term" : "source", "count" : 1 }, { "term" : "open", "count" : 1 }, { "term" : "distributed", "count" : 1 }, { "term" : "data", "count" : 1 } ] } } oh no!!
  • 23. • hey ! see "Open Source" ! it is lowercased and split into multiple tokens ! • this is done by the default mapping and analyzer
  • 24. curl 'http://localhost:9200/workshop/site/_mapping?pretty=true' { "site" : { "properties" : { "description" : { "type" : "string" }, "tags" : { "type" : "string" }, "title" : { "type" : "string" }, "url" : { "type" : "string" } } } }
  • 25. • tags is of type string, so it gets the default analyzer • elasticsearch/reference/current/analysis-standard-analyzer.html • An analyzer of type standard is built using the Standard Tokenizer with the Standard Token Filter, Lower Case Token Filter, and Stop Token Filter.
  • 26. test the default analyzer curl -XGET 'localhost:9200/workshop/_analyze?pretty=true' -d 'Open Source' { "tokens" : [ { "token" : "open", "start_offset" : 0, "end_offset" : 4, "type" : "<ALPHANUM>", "position" : 1 }, { "token" : "source", "start_offset" : 5, "end_offset" : 11, "type" : "<ALPHANUM>", "position" : 2 } ] }
  • 27. • what about the keyword analyzer ? • elasticsearch/reference/current/analysis-keyword-analyzer.html
  • 28. curl -XGET 'localhost:9200/workshop/_analyze?analyzer=keyword&pretty=true' -d 'Open Source' { "tokens" : [ { "token" : "Open Source", "start_offset" : 0, "end_offset" : 11, "type" : "word", "position" : 1 } ] } got it ! now how to apply this to our tags field ?
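The difference between the two _analyze calls on slides 26 and 28 can be approximated locally, without a running node. This is only a rough sketch for plain ASCII input: `standard_like` and `keyword_like` are hypothetical helper names, and the real standard analyzer uses a Unicode-aware tokenizer and a stop-word filter that this sketch ignores.

```shell
#!/bin/sh
# Rough local approximation of two analyzers (ASCII input only).
# standard_like and keyword_like are hypothetical helpers, not Elasticsearch APIs.

# Lowercase the input, then emit one token per run of letters/digits.
standard_like() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs '[:alnum:]' '\n' \
    | sed '/^$/d'
}

# Emit the whole input unchanged as a single token.
keyword_like() {
  printf '%s\n' "$1"
}

standard_like "Open Source"   # two tokens, one per line
keyword_like "Open Source"    # one token, case and space preserved
```

The point of the comparison: a field analyzed the first way can never give back "Open Source" as one term, because that term is never indexed.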
  • 29. curl 'http://localhost:9200/workshop/site/_mapping?pretty=true' -d ' { "site" : { "properties" : { "url" : {"type" : "string"}, "title" : {"type" : "string"}, "description" : {"type" : "string"}, "tags" : {"type" : "string", "analyzer": "keyword" } } } } ' { "error" : "MergeMappingException[Merge failed with failures {[mapper [tags] has different index_analyzer]}]", "status" : 400 } oops ! we need to drop something..
  • 30. curl -XDELETE 'http://localhost:9200/workshop/' {"ok":true,"acknowledged":true}% # the index must exist before we can put a mapping.. curl -XPUT 'http://localhost:9200/workshop/' {"ok":true,"acknowledged":true}% curl 'http://localhost:9200/workshop/site/_mapping?pretty=true' -d ' { "site" : { "properties" : { "url" : {"type" : "string"}, "title" : {"type" : "string"}, "description" : {"type" : "string"}, "tags" : {"type" : "string", "analyzer": "keyword" } } } } ' {"ok":true,"acknowledged":true}%
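The order of operations on slide 30 (drop the index, recreate it, then put the mapping) can be wrapped in a small script. This is a sketch under assumptions: `run`, `reset_index`, `ES_URL` and `DRY_RUN` are hypothetical names, the node is assumed to be the local 0.90.x one from the slides, and by default the script only prints the commands instead of sending them.

```shell
#!/bin/sh
# Sketch of the "drop, recreate, map" sequence from slide 30.
ES_URL="${ES_URL:-http://localhost:9200}"   # assumed local node, as in the slides
DRY_RUN="${DRY_RUN:-1}"                     # 1 = only print the commands

# Same mapping as on the slide: tags uses the keyword analyzer.
mapping='{ "site" : { "properties" : { "url" : {"type" : "string"}, "title" : {"type" : "string"}, "description" : {"type" : "string"}, "tags" : {"type" : "string", "analyzer": "keyword" } } } }'

# run (hypothetical helper): print the command, execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; echo; fi
}

reset_index() {
  run curl -XDELETE "$ES_URL/workshop/"                                   # 1. drop the old index
  run curl -XPUT "$ES_URL/workshop/"                                      # 2. recreate it, empty
  run curl "$ES_URL/workshop/site/_mapping?pretty=true" -d "$mapping"     # 3. put the mapping
}

reset_index
```

Dropping the index also drops its documents, which is why the slides push the data again afterwards.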
  • 31. # test on the field analysis curl -XGET 'localhost:9200/workshop/_analyze?pretty=true&field=site.tags' -d 'Open Source' { "tokens" : [ { "token" : "Open Source", "start_offset" : 0, "end_offset" : 11, "type" : "word", "position" : 1 } ] } # congrats !
  • 32. # let's push the data again curl -XPUT http://localhost:9200/workshop/site/1 -d ' { "url": "", "title": "Open Source Distributed Real Time Search & Analytics", "description": "Elasticsearch is a powerful open source search and analytics engine that makes data easy to explore.", "tags": ["Open Source", "elasticsearch", "Distributed"] }' curl -XPUT http://localhost:9200/workshop/site/2 -d ' { "url": "", "title": "Mathieu ELIE Freelance - Full Stack Data Engineer, Data Visualization", "description": "Freelance Consultant in Bordeaux, System & Software Architect. Love dataviz, redis, elasticsearch, architecture scalability recipes and playing with data.", "tags": ["elasticsearch", "Data Visualization"] }' curl -XPUT http://localhost:9200/workshop/site/3 -d ' { "url": "", "title": "Collectif Giroll - Gironde Logiciels Libres", "description": "Giroll, collectif basé à Bordeaux, réunis autour des Logiciels et des Cultures libres. Ateliers tous les mardis de 18h30 à
  • 33. # faceting ok ??? curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 34. "facets" : { "tags" : { "_type" : "terms", "missing" : 0, "total" : 5, "other" : 0, "terms" : [ { "term" : "elasticsearch", "count" : 2 }, { "term" : "Open Source", "count" : 1 }, { "term" : "Distributed", "count" : 1 }, { "term" : "Data Visualization", "count" : 1 } ] } } cool ! our facets contain whole tags ! great job !!
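The two facet results (slide 22 with the default analyzer, slide 34 with the keyword analyzer) come down to counting indexed terms. As a minimal local sketch, assuming the tags of the two documents that match "elasticsearch" (ids 1 and 2); `count_terms`, `whole_tag_facet` and `tokenized_facet` are hypothetical helpers, not part of Elasticsearch:

```shell
#!/bin/sh
# Tags of the two documents matching "elasticsearch" (ids 1 and 2), one per line.
tags='Open Source
elasticsearch
Distributed
elasticsearch
Data Visualization'

# count_terms (hypothetical helper): read one term per line,
# print "count<TAB>term", highest count first.
count_terms() {
  awk '{ n[$0]++ } END { for (t in n) printf "%d\t%s\n", n[t], t }' \
    | LC_ALL=C sort -t "$(printf '\t')" -k1,1nr -k2
}

# Keyword-style field: every tag is indexed as one term.
whole_tag_facet() {
  printf '%s\n' "$tags" | count_terms
}

# Standard-style field: tags are lowercased and split before counting.
tokenized_facet() {
  printf '%s\n' "$tags" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs '[:alnum:]' '\n' \
    | sed '/^$/d' \
    | count_terms
}

echo "whole tags:"; whole_tag_facet
echo "tokens:";     tokenized_facet
```

The first list has four distinct terms over five tags, the second has six distinct terms over seven tokens, which matches the "total" values 5 and 7 in the two facet responses.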
  • 35. if we want only docs with the "Open Source" tag, we use filters (reference/current/query-dsl-filters.html), here the term filter
  • 36. curl -XGET 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "query" : { "match_all" : { } }, "filter" : { "term" : { "tags" : "Open Source"} } }' • more efficient than full text search • cached / indexed • you can filter using facet items
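A term filter compares the given value with the indexed terms as-is; the value itself is not analyzed. The sketch below mimics that with an exact string comparison over the keyword-analyzed tags of the three workshop documents. `term_filter` and the `id|term` layout are assumptions made for the illustration only.

```shell
#!/bin/sh
# Indexed terms of the keyword-analyzed tags field, as "doc id|term" lines.
index='1|Open Source
1|elasticsearch
1|Distributed
2|elasticsearch
2|Data Visualization
3|Open Source
3|Collectif'

# term_filter (hypothetical helper): print the ids of documents
# that hold exactly this term; no lowercasing, no splitting.
term_filter() {
  printf '%s\n' "$index" | awk -F'|' -v term="$1" '$2 == term { print $1 }'
}

echo "Open Source:"; term_filter "Open Source"   # docs 1 and 3
echo "open source:"; term_filter "open source"   # no match: the case differs
```

This is also why facet items work well as filter values: a facet returns the indexed terms exactly as the filter expects them.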
  • 37. RTFM WAY • the elasticsearch doc is great • but it is exhaustive • so at the beginning it's a bit frustrating
  • 38. Think about the JSON hierarchy curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 39. you're hitting the search api reference/current/search-search.html curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 40. you're using the query dsl reference/current/query-dsl.html curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 41. you're using different types of queries reference/current/query-dsl-queries.html curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 42. this query is a query_string type with a query parameter set to elasticsearch reference/current/query-dsl-query-string-query.html curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 43. we also use faceting reference/current/search-facets.html curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 44. we use a terms facet reference/current/search-facets-terms-facet.html curl -XPOST 'http://localhost:9200/workshop/site/_search?pretty=true' -d '{ "fields" : ["title", "tags"], "query" : { "query_string" : { "query" : "elasticsearch" } }, "facets" : { "tags" : { "terms" : {"field" : "tags"} } } }'
  • 45. RTFM WAY • common pitfall: the code examples in the doc do not always show the whole query • so you have to place the snippet from the doc back into the whole dsl hierarchy • think about the hierarchy and everything becomes clearer
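Placing a doc snippet back into the hierarchy can be sketched as plain string assembly. The two fragments below are the inner parts of the request used throughout the workshop; the variable names are hypothetical, and the outer `query` and `facets` keys are the levels a doc page typically leaves out.

```shell
#!/bin/sh
# Fragments as they might appear on a doc page, without their parent keys.
query_fragment='"query_string" : { "query" : "elasticsearch" }'
facet_fragment='"tags" : { "terms" : { "field" : "tags" } }'

# Wrap each fragment in its parent key to get a complete search request body.
body=$(printf '{ "query" : { %s }, "facets" : { %s } }' \
  "$query_fragment" "$facet_fragment")

printf '%s\n' "$body"
```

The resulting `$body` is what would go after `-d` in the curl calls to `/workshop/site/_search` shown on the slides.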
  • 46. the end for me... the beginning for you...
  • 47. questions and more • twitter @mathieuel • contact on my freelance website • thanks to giroll for hosting this workshop !