99 Problems, But
The Search Ain’t One
Andrei Zmievski • PHP UK • Feb 25, 2011
who am I?
 curl http://localhost:9200/speaker/info/andrei


{"name":       "Andrei Zmievski",
 "projects":   ["PHP", "PHP-GTK", "Smarty", "Unicode/i18n"],
 "likes":      ["coding", "beer", "brewing", "photography"],
 "twitter":    "@a",
 "email":      "andrei@zmievski.org"}
what is elasticsearch?

a search engine for the NoSQL generation

  domain-driven

  distributed

  RESTful

  Hitchhiker’s Guide to the Galaxy (no, really)
document model


document-oriented

JSON-based

schema-free
engine


based on Lucene

multi-tenancy

distributed, out of the box
nomenclature

index

type

document

  _id

node
3 easy steps
1. index
request

           curl -XPOST http://localhost:9200/conf/speaker/1 -d '
           {
               "name": "Andrei Zmievski",
               "talk": "99 Problems, but the Search Ain'\''t One",
               "likes": ["coding", "beer", "photography"],
               "twitter": "a",
               "height": 1…
           }'

response

           {
               "ok": true,
               "_index": "conf",
               "_type": "speaker",
               "_id": "1"
           }
2. search
request

           curl http://localhost:9200/conf/speaker/_search?q=beer

response

           { "took" : …,
             "_shards" : {
               "total" : 1,
               "successful" : 1,
               "failed" : 0
             },
             "hits" : {
               "total" : 1,
               "max_score" : 0.…,
               "hits" : [ {
                 "_index" : "conf",
                 "_type" : "speaker",
                 "_id" : "2",
                 "_score" : 0.…,
                 "_source" :
                 {
                   "name": "Andrei Zmievski",
                   "talk": "99 Problems, but the Search Ain't One",
                   "likes": ["coding", "beer", "photography"],
                   "twitter": "a",
                   "height": 1…
                 } } ] } }
2. search
response fields

           hits.total    total number of hits
           _index        the index of the doc
           _type         the type of the doc
           _id           the id of the doc
           _score        the hit score
           _source       the original source
           took          the execution time
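
The fields called out on the search slides can be read out of a response with a few lines of Python. This is only a sketch: the `took`, score, and `height` numbers in the sample are made-up placeholders, not the values from the slide, and `summarize` is a hypothetical helper name.

```python
import json

# Sample response shaped like the one on the slide.
# took, max_score, _score and height are placeholder values.
RESPONSE = """
{
  "took": 3,
  "_shards": {"total": 1, "successful": 1, "failed": 0},
  "hits": {
    "total": 1,
    "max_score": 0.5,
    "hits": [
      {
        "_index": "conf",
        "_type": "speaker",
        "_id": "2",
        "_score": 0.5,
        "_source": {
          "name": "Andrei Zmievski",
          "talk": "99 Problems, but the Search Ain't One",
          "likes": ["coding", "beer", "photography"],
          "twitter": "a",
          "height": 180
        }
      }
    ]
  }
}
"""


def summarize(raw):
    """Return (execution time, total hits, one tuple per hit)."""
    body = json.loads(raw)
    hits = body["hits"]
    rows = [
        (h["_index"], h["_type"], h["_id"], h["_score"], h["_source"]["name"])
        for h in hits["hits"]
    ]
    return body["took"], hits["total"], rows


took, total, rows = summarize(RESPONSE)
print("took:", took, "total hits:", total)
for row in rows:
    print(row)
```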
3. profit


that’s up to you
demo
distributed model


provides:

  performance

  resiliency (high-availability)
shards
a portion of the document space

each one is a separate Lucene index

  thus, many per-index settings are available

document is sharded by its _id value

  but can be assigned (routed) to a shard
  deterministically
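
A minimal sketch of how a document could be mapped to a shard, assuming a plain hash-modulo scheme. `zlib.crc32` is a stand-in picked for illustration and is not claimed to be the hash Elasticsearch uses; `shard_for` and `route` are hypothetical names.

```python
import zlib


def shard_for(routing_value, number_of_shards):
    """Map a routing value to a shard number in [0, number_of_shards).

    zlib.crc32 is an illustrative stand-in for the real hash function.
    """
    digest = zlib.crc32(routing_value.encode("utf-8"))
    return digest % number_of_shards


def route(doc_id, number_of_shards, routing=None):
    """Use an explicit routing value when given, otherwise the _id."""
    key = routing if routing is not None else doc_id
    return shard_for(key, number_of_shards)


for doc_id in ["1", "2", "3", "4"]:
    print(doc_id, "-> shard", route(doc_id, 2))

# Documents that share a routing value always land on the same shard.
print(route("1", 2, routing="andrei") == route("2", 2, routing="andrei"))
```

The same input always yields the same shard, which is what makes explicit routing deterministic.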
zero-conf discovery


zen (multicast and unicast)

cloud (EC2 via API)
auto-routing

master node:

  maintains cluster state

  reassigns shards if nodes leave/join cluster

any node can serve as the request router

the query is handled via scatter-gather mechanism
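
Scatter-gather can be sketched in a few lines: send the query to every shard, collect the partial results, and merge them into one ranked list. The scoring here is a bare term count standing in for Lucene's scoring, and the shards are plain dictionaries; both are assumptions of this sketch.

```python
import heapq


def search_shard(shard_docs, term):
    """Score matching docs in one shard; the score is the term count."""
    results = []
    for doc_id, text in shard_docs.items():
        score = text.lower().split().count(term)
        if score > 0:
            results.append((score, doc_id))
    return results


def scatter_gather(shards, term, size=10):
    """Scatter the query to every shard, then gather and merge the top hits."""
    partial = []
    for shard_docs in shards:  # sequential here; a cluster would do this in parallel
        partial.extend(search_shard(shard_docs, term))
    # gather: keep the `size` best hits across all shards
    return heapq.nlargest(size, partial)


shards = [
    {"1": "thomas likes beer", "3": "andrei likes brewing"},
    {"2": "thomas thomas and more thomas"},
]
top = scatter_gather(shards, "thomas", size=2)
print(top)
```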
replicas

each shard can have 1 or more replicas

# of replicas can be updated dynamically after
index creation

replicas can be used for querying in parallel
shard allocation
               node 1




       start with a single node
shard allocation
                node 1
                 person1
                 person2




      PUT /person {
         "index": {
            "number_of_shards": 2,
            "number_of_replicas": 1
      }}
shard allocation
       node 1          node 2
       person1         person1
       person2         person2




        start the second node
shard allocation
node 1    node 2         node 3   node 4
person1   person1
person2   person2




            start 2 more nodes
shard allocation
node 1    node 2         node 3    node 4
person1                  person1
          person2                  person2




            start 2 more nodes
document sharding
node 1    node 2         node 3     node 4
person1                   person1
          person2                   person2




            PUT /person/info/1
            {…}
document sharding
     node 1         node 2         node 3     node 4
     person1                        person1
                    person2                   person2




                      PUT /person/info/1
hashed to shard 1     {…}
document sharding
node 1    node 2         node 3      node 4
person1                   person1
          person2                    person2




                        replicated

            PUT /person/info/1
            {…}
document sharding
node 1    node 2         node 3     node 4
person1                   person1
          person2                   person2




            PUT /person/info/2
            {…}
document sharding
node 1         node 2            node 3     node 4
person1                           person1
               person2                      person2




hashed to shard 2
                    PUT /person/info/2
                    {…}
document sharding
node 1    node 2         node 3      node 4
person1                   person1
          person2                    person2




                                    replicated

            PUT /person/info/2
            {…}
scatter-gather
node 1          node 2        node 3          node 4
person1                        person1
                person2                       person2




          GET /person/_search?q=name:thomas
shard allocation
node 1          node 2        node 3          node 4
person1                        person1
                person2                       person2




          GET /person/_search?q=name:thomas
transactional model

per-document consistency

no need to commit/flush

uses write-behind transaction log

write consistency (W) can be controlled

  one, quorum, or all
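
The three write consistency levels can be read as "how many copies of the shard must be available before the write goes ahead". The sketch below models quorum as a simple majority of primary plus replicas; that is an assumption, and the engine's actual rule may differ in edge cases. `required_copies` is a hypothetical helper.

```python
def required_copies(consistency, number_of_replicas):
    """Number of shard copies that must be available for a write.

    A shard has 1 primary plus `number_of_replicas` replicas.
    "quorum" is modelled as a simple majority (an assumption).
    """
    copies = 1 + number_of_replicas
    if consistency == "one":
        return 1
    if consistency == "quorum":
        return copies // 2 + 1
    if consistency == "all":
        return copies
    raise ValueError("unknown consistency level: %r" % consistency)


for level in ("one", "quorum", "all"):
    print(level, required_copies(level, number_of_replicas=2))
```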
(near) real-time search


1 second refresh rate by default

_refresh API also
index storage

node data considered transient

can be stored in local file system, JVM heap,
native OS memory, or FS & memory combination

persistent storage requires a gateway
gateways
persistent store for cluster state and indices

asynchronous, translog-based write strategy

allows full recovery if a cluster restart is needed

supported gateways:
  local
  shared FS
  Hadoop via HDFS
  S3
mapping
describes document structure to the search
engine

automatically created with sensible defaults

explicit mapping can be provided (generally, a
good idea)

can run into merge conflicts
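
To make "sensible defaults" and "merge conflicts" concrete, here is a deliberately crude sketch: guess a type for each field from the first value seen, then complain when a later document disagrees. The type names and rules are simplified assumptions, not the engine's real dynamic-mapping logic.

```python
def infer_type(value):
    """Guess a mapping type from a JSON value (crude, illustrative rules)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        # an array takes the type of its first element
        return infer_type(value[0]) if value else None
    if isinstance(value, dict):
        return "object"
    return None


def merge_mapping(mapping, doc):
    """Add newly seen fields to `mapping`; raise on a type conflict."""
    for field, value in doc.items():
        new_type = infer_type(value)
        if new_type is None:
            continue  # null or empty array: nothing to learn
        old_type = mapping.setdefault(field, new_type)
        if old_type != new_type:
            raise ValueError(
                "merge conflict on %r: %s vs %s" % (field, old_type, new_type)
            )
    return mapping


mapping = merge_mapping({}, {"user": "derick", "tags": ["php"], "priority": 2})
print(mapping)
try:
    merge_mapping(dict(mapping), {"priority": "high"})
except ValueError as err:
    print(err)
```

Providing the mapping explicitly up front avoids depending on whichever document happens to be indexed first.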
mapping

important meta fields:

  _source

  _all

  _boost
mapping types

simple:

  string, integer/long, float/double, boolean, and
  null

complex:

  array, object
sample mapping
document

           {"user":      "derick",
            "title":     "Don't Panic",
            "tags":      ["profiling", "debugging", "php"],
            "postDate":  "2010-12-22T1…:1…:12",
            "priority":  2}

mapping

           {"post": {
             "properties" : {
               "user": {"type": "string", "index": "not_analyzed"},
               "message": {"type": "string", "boost": 1.…},
               "tags": {"type": "string", "include_in_all": "no"},
               "postDate" : {"type" : "date", "store": "no"},
               "priority" : {"type" : "integer"}
           }}}
analyzers
break down (tokenize) and normalize fields during
indexing and query strings at search time

analyzer = tokenizer + token filters (0 or more)
Standard Analyzer =
   Standard Tokenizer +
       Standard Token Filter +
       Lowercase Token Filter +
       Stop Token Filter
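
The "tokenizer + token filters" composition can be sketched as a small pipeline. The regex tokenizer and the five-word stop list are illustrative stand-ins, not Lucene's implementations, and the Standard Token Filter step is left out.

```python
import re

STOP_WORDS = {"a", "an", "and", "but", "the"}  # tiny illustrative list


def standard_tokenizer(text):
    """Split on anything that is not a letter, digit, or apostrophe."""
    return re.findall(r"[A-Za-z0-9']+", text)


def lowercase_filter(tokens):
    return [t.lower() for t in tokens]


def stop_filter(tokens):
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(text, tokenizer=standard_tokenizer,
            filters=(lowercase_filter, stop_filter)):
    """analyzer = tokenizer + token filters, applied in order."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


print(analyze("99 Problems, but the Search Ain't One"))
```

Because the same analyzer runs on fields at index time and on query strings at search time, a query for "SEARCH" and an indexed "Search" both reduce to the token "search".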
analyzers
analyzers, tokenizers, and filters can be customized

elasticsearch.yml

                            index:
                              analysis:
                                analyzer:
                                  eulang:
                                    type: custom
                                    tokenizer: standard
                                    filter: [standard, lowercase, stop,
                                             asciifolding, porterStem]

mapping

                            …
                            "title": {"type": "string", "analyzer": "eulang"},
                            …
API
API conventions


append ?pretty=true to get readable JSON

boolean values: false/0/off = false, rest is true

JSONP support via callback parameter
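
The boolean convention is easy to mirror on the client side. A minimal sketch, assuming parameter values arrive as strings and are compared exactly as written on the slide; `parse_bool` is a hypothetical helper.

```python
FALSE_VALUES = {"false", "0", "off"}


def parse_bool(value):
    """false/0/off mean False; anything else is True."""
    return value not in FALSE_VALUES


for raw in ("false", "0", "off", "true", "1", "yes"):
    print(raw, "->", parse_bool(raw))
```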
API structure

http://host:port/[index]/[type]/[_action/id]

 GET http://es:9200/_status

 GET http://es:9200/twitter/_status

 POST http://es:9200/twitter/tweet/1

 GET http://es:9200/twitter/tweet/1
API structure
http://host:port/[index]/[type]/[_action/id]

 GET http://es:9200/twitter/tweet/_search

 GET http://es:9200/twitter/user/_search

 GET http://es:9200/twitter/tweet,user/_search

 GET http://es:9200/twitter,facebook/_search

 GET http://es:9200/_search
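
The URL pattern above can be captured in a small helper. This is a sketch: `build_url` is a hypothetical function, and it only assembles the path from optional index names, type names, an action, and an id.

```python
def build_url(host, port, indices=(), types=(), action=None, doc_id=None):
    """Assemble http://host:port/[index]/[type]/[_action/id].

    Several indices or types are joined with commas; empty parts are skipped.
    """
    parts = []
    if indices:
        parts.append(",".join(indices))
    if types:
        parts.append(",".join(types))
    if action:
        parts.append(action)
    if doc_id is not None:
        parts.append(str(doc_id))
    return "http://%s:%d/%s" % (host, port, "/".join(parts))


print(build_url("es", 9200, action="_search"))
print(build_url("es", 9200, ["twitter", "facebook"], action="_search"))
print(build_url("es", 9200, ["twitter"], ["tweet", "user"], action="_search"))
print(build_url("es", 9200, ["twitter"], ["tweet"], doc_id=1))
```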
_cluster API structure

GET /_cluster/health

GET /_cluster/health/index1,index2

GET /_cluster/nodes/stats

GET /_cluster/nodes/nodeId1,nodeId2/stats
API {core}
index             search

bulk               query

delete             from/size paging

delete by query    sort

get                highlighting

count              selective fields
API {indices}
create                   optimize

delete                   snapshot

open/close               update settings

get/put/delete mapping   analyze

refresh                  status

                         flush
API {cluster}

health

state

nodes info

nodes stats

nodes shutdown
Query DSL
term / terms   query_string

range            default_operator

prefix            analyzer

bool             phrase_slop

fuzzy            etc

wildcard
filters


share some similar features with queries (term,
range, etc)

why use a filter?
filters
faster than queries

cached (depends on the filter)

  the cache is used for different queries against
  the same filter

no scoring

more useful ones: term, terms, range, prefix, and,
or, not, exists, missing, query
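
Why caching works for filters: a filter only answers yes or no for each document, so the set of matching ids can be stored once and reused by any later query that applies the same filter. A toy sketch with an in-memory dictionary as the cache; the class and its fields are hypothetical.

```python
class CachedTermFilter:
    """A term filter that remembers which doc ids matched."""

    def __init__(self, docs):
        self.docs = docs    # {doc_id: {field: value}}
        self.cache = {}     # {(field, value): set of matching doc ids}
        self.computed = 0   # how many times a filter was actually evaluated

    def matching(self, field, value):
        key = (field, value)
        if key not in self.cache:
            self.computed += 1
            self.cache[key] = {
                doc_id for doc_id, doc in self.docs.items()
                if doc.get(field) == value
            }
        return self.cache[key]


docs = {
    1: {"user": "andrei", "lang": "php"},
    2: {"user": "derick", "lang": "php"},
    3: {"user": "thomas", "lang": "java"},
}
f = CachedTermFilter(docs)
first = f.matching("lang", "php")   # evaluated against the documents
second = f.matching("lang", "php")  # served from the cache
print(sorted(first), sorted(second), f.computed)
```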
facets

provide aggregated data based on the search
request

terms, histogram, date histogram, range,
statistical, and more
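
A terms facet is essentially a value count over the documents matched by the search. A minimal sketch over in-memory hits; `terms_facet` is a hypothetical helper, and the sample documents are invented for the example.

```python
from collections import Counter


def terms_facet(hits, field, size=10):
    """Count how often each value of `field` occurs among the hits."""
    counts = Counter()
    for doc in hits:
        value = doc.get(field, [])
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts.most_common(size)


hits = [
    {"name": "a", "likes": ["coding", "beer"]},
    {"name": "b", "likes": ["beer", "photography"]},
    {"name": "c", "likes": ["beer", "coding"]},
]
print(terms_facet(hits, "likes"))
```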
geo search

implemented as filters (and a facet)

  geo_distance

  geo_bounding_box

  geo_polygon
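
What a geo_distance filter does can be sketched with the haversine formula: keep only the documents whose location lies within a given radius of a point. The `location` field layout, the Earth radius constant, and the sample coordinates are assumptions of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_distance_filter(docs, lat, lon, distance_km):
    """Keep docs whose "location" lies within distance_km of (lat, lon)."""
    return [
        d for d in docs
        if haversine_km(lat, lon,
                        d["location"]["lat"], d["location"]["lon"]) <= distance_km
    ]


docs = [
    {"name": "London", "location": {"lat": 51.5074, "lon": -0.1278}},
    {"name": "Paris", "location": {"lat": 48.8566, "lon": 2.3522}},
]
near = geo_distance_filter(docs, 51.5, -0.12, distance_km=50)
print([d["name"] for d in near])
```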
interfaces
REST

  including memcached

Java / Groovy

Language clients (REST/Thrift):

  pyes, PHP (standalone and symfony), Ruby, Perl

Flume sink implementation
elastica

similar to the other PHP ElasticSearch client

API naming is consistent with Zend Framework

can be extended for new filters, facets, etc

still under development
elastica
example

          $es = new Elastica_Client('vm', 9200);
          $index = new Elastica_Index($es, 'test');
          $index->create(array(), true);
          $type = new Elastica_Type($index, 'person');
          $doc = new Elastica_Document(1, array('name' => 'Andrei Zmievski',
                                                 'email' => 'andrei@test.com',
                                                 'username' => 'andrei',
                                                 'bills' => array(2, 3, 5)));
          $type->addDocument($doc);

          $qs = new Elastica_Query_QueryString('andrei');
          $query = new Elastica_Query($qs);
          $resultSet = $type->search($query);
          print $resultSet->count();
data import

ES is not the primary data store (usually)

to import/synchronize data:

  write an agent (Gearman, message queues, etc)

  use rivers (CouchDB, RabbitMQ, Twitter)
10 more features
versioning          load balancing nodes

index aliases       plugins

parent/child docs   more_like_this

scripting           multi_field mapping

dynamic mapping     percolation
templates
References

http://github.com/elasticsearch/elasticsearch

http://www.elasticsearch.org/community/forum

IRC: #elasticsearch on irc.freenode.net

twitter: @elasticsearch


             HTTP://ZMIEVSKI.ORG/TALKS

99 Problems, But The Search Ain't One

  • 1. 99 Problems, But The Search Ain’t One Andrei Zmievski • PHP UK • Feb 25, 2011
  • 2. who am I? curl http://localhost:9200/speaker/info/andrei {"name": "Andrei Zmievski", "projects": ["PHP", "PHP-GTK", "Smarty", "Unicode/i18n"], "likes": ["coding", "beer", "brewing", "photography"], "twitter": "@a", "email": "andrei@zmievski.org"}
  • 3. what is elasticsearch? a search engine for the NoSQL generation: domain-driven; distributed; RESTful; Hitchhiker’s Guide to the Galaxy (no, really)
  • 8. 1. index
    request:
      curl -XPOST http://localhost:9200/conf/speaker/1 -d '
      {
          "name": "Andrei Zmievski",
          "talk": "99 Problems, but the Search Ain'\''t One",
          "likes": ["coding", "beer", "photography"],
          "twitter": "a",
          "height": 185
      }'
    response:
      {
          "ok": true,
          "_index": "conf",
          "_type": "speaker",
          "_id": "1"
      }
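As a rough sketch of the indexing step, the same request can be assembled in Python with only the standard library. The helper name `build_index_request` and the `Content-Type` header are assumptions added for illustration; the URL, index (`conf`), type (`speaker`), and document fields come from the slide. The request is built but not sent, so no running node is needed.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) the HTTP request that indexes one document."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumption: the slide's curl call sets no header; added here for clarity.
        headers={"Content-Type": "application/json"},
    )


speaker = {
    "name": "Andrei Zmievski",
    "talk": "99 Problems, but the Search Ain't One",
    "likes": ["coding", "beer", "photography"],
    "twitter": "a",
    "height": 185,
}

request = build_index_request("http://localhost:9200", "conf", "speaker", 1, speaker)
print(request.get_method(), request.full_url)
```

To send it against a running node, pass `request` to `urllib.request.urlopen`.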
  • 9. 2. search
    request:
      curl http://localhost:9200/conf/speaker/_search?q=beer
    response:
      { "took" : ?,
        "_shards" : {
          "total" : 1,
          "successful" : 1,
          "failed" : 0
        },
        "hits" : {
          "total" : 1,
          "max_score" : 0.?908509,
          "hits" : [ {
            "_index" : "conf",
            "_type" : "speaker",
            "_id" : "2",
            "_score" : 0.?908509,
            "_source" : {
              "name": "Andrei Zmievski",
              "talk": "99 Problems, but the Search Ain't One",
              "likes": ["coding", "beer", "photography"],
              "twitter": "a",
              "height": 185
            }
          } ]
        }
      }
  • 10. 2. search: hits "total" : 1 is the total number of hits
  • 11. 2. search: "_index" : "conf" is the index of the doc
  • 12. 2. search: "_type" : "speaker" is the type of the doc
  • 13. 2. search: "_id" : "2" is the id of the doc
  • 15. 2. search: "_score" is the hit score
  • 16. 2. search: "_source" is the original source
  • 17. 2. search: "took" is the execution time
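The fields called out on these slides can be pulled from a parsed response as in the sketch below. The response string is a placeholder with the same shape as the slide's; its `took` and score values are made up for illustration, and `summarize` is a hypothetical helper name.

```python
import json

# Placeholder response shaped like the one on the slides; the "took" and
# score values are invented for this example.
RAW_RESPONSE = """
{"took": 1,
 "_shards": {"total": 1, "successful": 1, "failed": 0},
 "hits": {"total": 1, "max_score": 1.0,
          "hits": [{"_index": "conf", "_type": "speaker", "_id": "2",
                    "_score": 1.0,
                    "_source": {"name": "Andrei Zmievski", "height": 185}}]}}
"""


def summarize(response):
    """Collect the execution time, total hit count, and per-hit metadata."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "total": hits["total"],
        "docs": [
            (h["_index"], h["_type"], h["_id"], h["_score"], h["_source"])
            for h in hits["hits"]
        ],
    }


summary = summarize(json.loads(RAW_RESPONSE))
print(summary["total"], summary["docs"][0][:3])
```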
  • 19. demo
  • 20. distributed model provides: performance; resiliency (high-availability)
  • 21. shards: a portion of the document space; each one is a separate Lucene index, thus many per-index settings are available; a document is sharded by its _id value, but can be assigned (routed) to a shard deterministically
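The idea of deterministic routing can be sketched as a hash of the routing key taken modulo the shard count. This is an illustration only: `shard_for` is a hypothetical helper and CRC32 is a stand-in hash, not necessarily the function elasticsearch uses internally.

```python
import zlib


def shard_for(routing_key: str, number_of_shards: int) -> int:
    """Map a routing key (by default the document _id) to a shard number."""
    # Assumption: CRC32 stands in for the engine's real hash function.
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_shards


# The same key always lands on the same shard, which is what makes
# lookups by _id (or by a custom routing value) cheap.
for doc_id in ("1", "2", "andrei"):
    print(doc_id, "->", shard_for(doc_id, 2))
```

Passing a custom routing value instead of the `_id` (for example a user name) would keep all of that user's documents on one shard.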
  • 22. zero-conf discovery: zen (multicast and unicast); cloud (EC2 via API)
  • 23. auto-routing: the master node maintains cluster state and reassigns shards if nodes leave/join the cluster; any node can serve as the request router; the query is handled via a scatter-gather mechanism
  • 24. replicas: each shard can have 1 or more replicas; # of replicas can be updated dynamically after index creation; replicas can be used for querying in parallel
  • 25. shard allocation node 1 start with a single node
  • 26. shard allocation node 1 person1 person2 PUT /person { "index": { "number_of_shards": 2, "number_of_replicas": 1 }}
  • 27. shard allocation node 1 node 2 person1 person1 person2 person2 start the second node
  • 28. shard allocation node 1 node 2 node 3 node 4 person1 person1 person2 person2 start 2 more nodes
  • 30. document sharding node 1 node 2 node 3 node 4 person1 person1 person2 person2 PUT /person/info/1 {…}
  • 31. document sharding node 1 node 2 node 3 node 4 person1 person1 person2 person2 PUT /person/info/1 hashed to shard 1 {…}
  • 32. document sharding node 1 node 2 node 3 node 4 person1 person1 person2 person2 replicated PUT /person/info/1 {…}
  • 33. document sharding node 1 node 2 node 3 node 4 person1 person1 person2 person2 PUT /person/info/2 {…}
  • 34. document sharding node 1 node 2 node 3 node 4 person1 person1 person2 person2 hashed to shard 2 PUT /person/info/2 {…}
  • 35. document sharding node 1 node 2 node 3 node 4 person1 person1 person2 person2 replicated PUT /person/info/2 {…}
  • 36. scatter-gather node 1 node 2 node 3 node 4 person1 person1 person2 person2 GET /person/_search?q=name:thomas
  • 39. shard allocation node 1 node 2 node 3 node 4 person1 person1 person2 person2 GET /person/_search?q=name:thomas
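A toy version of the scatter-gather step: the query goes to every shard, each shard returns its own top hits, and the routing node merges them by score. The scoring (term frequency in the `name` field) and the function names are assumptions made for this sketch; real scoring is done by Lucene.

```python
import heapq


def search_shard(docs, term, size):
    """Return one shard's top `size` hits as (score, _id) pairs."""
    hits = []
    for doc in docs:
        # Toy score: how often the term occurs in the name field.
        score = doc["name"].lower().split().count(term)
        if score > 0:
            hits.append((score, doc["_id"]))
    return heapq.nlargest(size, hits)


def scatter_gather(shards, term, size=10):
    """Query every shard (scatter), then merge the partial results (gather)."""
    partial = [search_shard(docs, term, size) for docs in shards]
    return heapq.nlargest(size, (hit for hits in partial for hit in hits))


shards = [
    [{"_id": "1", "name": "Thomas Anderson"}, {"_id": "3", "name": "Jane Doe"}],
    [{"_id": "2", "name": "Thomas Thomas"}],
]
print(scatter_gather(shards, "thomas"))
```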
  • 40. transactional model: per-document consistency; no need to commit/flush; uses a write-behind transaction log; write consistency (W) can be controlled: one, quorum, or all
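The three write-consistency levels can be read as "how many shard copies must be available before a write proceeds". The sketch below uses the textbook majority rule for `quorum`; the function name is hypothetical, and the engine may special-case small replica counts, so treat the numbers as an illustration rather than its exact behavior.

```python
def required_active_copies(consistency: str, number_of_replicas: int) -> int:
    """Number of shard copies (primary + replicas) a write waits for."""
    copies = 1 + number_of_replicas  # one primary plus its replicas
    if consistency == "one":
        return 1
    if consistency == "all":
        return copies
    if consistency == "quorum":
        return copies // 2 + 1  # simple majority (assumed rule)
    raise ValueError(f"unknown consistency level: {consistency}")


for level in ("one", "quorum", "all"):
    print(level, required_active_copies(level, number_of_replicas=2))
```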
  • 41. (near) real-time search: 1 second refresh rate by default; _refresh API also available
  • 42. index storage: node data is considered transient; can be stored in the local file system, JVM heap, native OS memory, or an FS & memory combination; persistent storage requires a gateway
  • 43. gateways: persistent store for cluster state and indices; asynchronous, translog-based write strategy; allows full recovery if a cluster restart is needed; supported gateways: local, shared FS, Hadoop via HDFS, S3
  • 44. mapping: describes document structure to the search engine; automatically created with sensible defaults; explicit mapping can be provided (generally, a good idea); can run into merge conflicts
  • 45. mapping: important meta fields: _source, _all, _boost
  • 46. mapping types: simple: string, integer/long, float/double, boolean, and null; complex: array, object
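To make the type list concrete, here is a minimal sketch of how a dynamic mapping could guess a type name from a parsed JSON value. The function `infer_type` is hypothetical and deliberately naive: it does not detect dates, for example, and is not the engine's actual detection logic.

```python
def infer_type(value):
    """Guess a mapping type name for one parsed JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


document = {"user": "derick", "tags": ["profiling", "php"], "priority": 2}
guessed = {field: infer_type(value) for field, value in document.items()}
print(guessed)
```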
  • 47. sample mapping
    document:
      {"user":     "derick",
       "title":    "Don't Panic",
       "tags":     ["profiling", "debugging", "php"],
       "postDate": "2010-12-22T15:1?:12",
       "priority": 2}
    mapping:
      {"post": {
        "properties" : {
          "user": {"type": "string", "index": "not_analyzed"},
          "message": {"type": "string", "boost": 1.?},
          "tags": {"type": "string", "include_in_all": "no"},
          "postDate" : {"type" : "date", "store": "no"},
          "priority" : {"type" : "integer"}
      }}}
  • 48. analyzers: break down (tokenize) and normalize fields during indexing and query strings at search time; analyzer = tokenizer + token filters (0 or more); Standard Analyzer = Standard Tokenizer + Standard Token Filter + Lowercase Token Filter + Stop Token Filter
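The "tokenizer + token filters" composition can be sketched as a small pipeline. The regular expression and the stop-word list are simplified assumptions for illustration; Lucene's standard tokenizer and stop filter are more elaborate.

```python
import re

# Assumption: a tiny stop-word list, just enough for the example.
STOP_WORDS = {"a", "an", "and", "but", "of", "the", "to"}


def standard_tokenizer(text):
    """Split text into word tokens, keeping in-word apostrophes."""
    return re.findall(r"\w+(?:'\w+)?", text)


def lowercase_filter(tokens):
    return [token.lower() for token in tokens]


def stop_filter(tokens):
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text, tokenizer, filters):
    """Run the tokenizer, then apply each token filter in order."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


terms = analyze(
    "99 Problems, but the Search Ain't One",
    standard_tokenizer,
    [lowercase_filter, stop_filter],
)
print(terms)
```

The same analyzer has to run on the query string at search time, otherwise a query for "Search" would never match the indexed term "search".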
  • 49. analyzers: analyzers, tokenizers, and filters can be customized
    elasticsearch.yml:
      index:
        analysis:
          analyzer:
            eulang:
              type: custom
              tokenizer: standard
              filter: [standard, lowercase, stop, asciifolding, porterStem]
    mapping:
      ...
      "title": {"type": "string", "analyzer": "eulang"},
      ...
  • 50. API
  • 51. API conventions: append ?pretty=true to get readable JSON; boolean values: false/0/off = false, the rest is true; JSONP support via the callback parameter
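The boolean convention is easy to state in code. This is a sketch of the rule as written on the slide; `parse_bool` is a hypothetical name, not part of any client library.

```python
def parse_bool(value: str) -> bool:
    """Apply the slide's rule: false/0/off mean false, anything else is true."""
    return value not in ("false", "0", "off")


for raw in ("false", "0", "off", "true", "1", "yes"):
    print(raw, "->", parse_bool(raw))
```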
  • 52. API structure: http://host:port/[index]/[type]/[_action/id]; GET http://es:9200/_status; GET http://es:9200/twitter/_status; POST http://es:9200/twitter/tweet/1; GET http://es:9200/twitter/tweet/1
  • 53. API structure: http://host:port/[index]/[type]/[_action/id]; GET http://es:9200/twitter/tweet/_search; GET http://es:9200/twitter/user/_search; GET http://es:9200/twitter/tweet,user/_search; GET http://es:9200/twitter,facebook/_search; GET http://es:9200/_search
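The search URLs listed here follow one pattern: optional comma-separated indices, optional comma-separated types, then `_search`. A small sketch of a builder for that pattern, where `search_url` is a hypothetical helper and the host `es:9200` is taken from the slide:

```python
def search_url(base, indices=(), types=()):
    """Build a _search URL over any number of indices and types."""
    if types and not indices:
        # Kept strict for the sketch: a type needs an index segment before it.
        raise ValueError("types require at least one index")
    parts = [base.rstrip("/")]
    if indices:
        parts.append(",".join(indices))
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/".join(parts)


print(search_url("http://es:9200", ["twitter"], ["tweet", "user"]))
print(search_url("http://es:9200", ["twitter", "facebook"]))
print(search_url("http://es:9200"))
```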
  • 54. _cluster API structure: GET /_cluster/health; GET /_cluster/health/index1,index2; GET /_cluster/nodes/stats; GET /_cluster/nodes/nodeId1,nodeId2/stats
  • 55. API {core}: index; bulk; delete; delete by query; get; count; search (query, from/size paging, sort, highlighting, selective fields)
  • 56. API {indices}: create; delete; open/close; get/put/delete mapping; refresh; optimize; snapshot; update settings; analyze; status; flush
  • 58. Query DSL: term / terms; range; prefix; bool; fuzzy; wildcard; query_string (default_operator, analyzer, phrase_slop, etc.)
  • 59. filters: share some features with queries (term, range, etc.); why use a filter?
  • 60. filters: faster than queries; cached (depends on the filter); the cache is reused by different queries that use the same filter; no scoring; more useful ones: term, terms, range, prefix, and, or, not, exists, missing, query
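One way a query and a filter were combined in elasticsearch releases of this period is the `filtered` query, sketched below as a Python dict serialized to JSON. The field names come from the speaker document earlier in the talk; the range bounds are invented, and the exact DSL keys should be checked against the documentation for the version in use, since later releases changed this syntax.

```python
import json

# Full-text part: scored as usual.
query_part = {"query_string": {"query": "beer"}}

# Filter part: a yes/no restriction, no scoring, and a candidate for caching.
# Assumption: the bounds 180..190 are made up for this example.
filter_part = {"range": {"height": {"gte": 180, "lte": 190}}}

search_body = {"query": {"filtered": {"query": query_part, "filter": filter_part}}}
payload = json.dumps(search_body, indent=2)
print(payload)
```

Because the filter carries no score, the same cached `range` result can be reused by any other query that restricts on the same bounds.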
  • 61. facets: provide aggregated data based on the search request; terms, histogram, date histogram, range, statistical, and more
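What a terms facet reports can be imitated locally: count how often each term occurs in a field across the matching documents and return the most frequent ones. The sample documents and the helper name `terms_facet` are made up for this sketch; the engine computes the counts per shard and merges them.

```python
from collections import Counter


def terms_facet(hits, field, size=10):
    """Count term occurrences in `field` across hits, most frequent first."""
    counts = Counter()
    for hit in hits:
        counts.update(hit.get(field, []))
    return counts.most_common(size)


# Assumption: three invented documents standing in for search hits.
hits = [
    {"tags": ["php", "search"]},
    {"tags": ["php", "beer"]},
    {"tags": ["php", "search"]},
]
print(terms_facet(hits, "tags", size=2))
```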
  • 62. geo search: implemented as filters (and a facet); geo_distance, geo_bounding_box, geo_polygon
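The idea behind a geo_distance filter is to keep only documents whose location lies within a given radius of a point. The sketch below uses the haversine great-circle formula; the helper names, the sample coordinates, and the `location` field layout are assumptions for illustration, not the engine's implementation.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geo_distance_filter(docs, lat, lon, max_km):
    """Keep documents located within max_km of (lat, lon)."""
    return [
        doc for doc in docs
        if haversine_km(lat, lon, doc["location"]["lat"], doc["location"]["lon"]) <= max_km
    ]


# Assumption: approximate city-centre coordinates.
docs = [
    {"name": "London", "location": {"lat": 51.5074, "lon": -0.1278}},
    {"name": "Paris", "location": {"lat": 48.8566, "lon": 2.3522}},
]
nearby = geo_distance_filter(docs, 51.5, -0.12, max_km=50)
print([doc["name"] for doc in nearby])
```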
  • 63. interfaces: REST (including memcached); Java / Groovy; language clients (REST/Thrift): pyes, PHP (standalone and symfony), Ruby, Perl; Flume sink implementation
  • 64. elastica: similar to the other PHP ElasticSearch client; API naming is consistent with Zend Framework; can be extended for new filters, facets, etc.; still under development
  • 65. elastica example
      $es = new Elastica_Client('vm', 9200);
      $index = new Elastica_Index($es, 'test');
      $index->create(array(), true);
      $type = new Elastica_Type($index, 'person');
      $doc = new Elastica_Document(1, array('name' => 'Andrei Zmievski',
                                            'email' => 'andrei@test.com',
                                            'username' => 'andrei',
                                            'bills' => array(2, 3, 5)));
      $type->addDocument($doc);
      $qs = new Elastica_Query_QueryString('andrei');
      $query = new Elastica_Query($qs);
      $resultSet = $type->search($query);
      print $resultSet->count();
  • 66. data import: ES is (usually) not the primary data store; to import/synchronize data: write an agent (Gearman, message queues, etc.) or use rivers (CouchDB, RabbitMQ, Twitter)
  • 67. 10 more features: versioning; load balancing nodes; index aliases; plugins; parent/child docs; more_like_this; scripting; multi_field mapping; dynamic mapping templates; percolation