Solr Performance & Key Innovations

Yonik Seeley, Lucid Imagination
yonik@lucidimagination.com, May 26 2011
Solr 3.1 Highlights
§  Numeric range facets (similar to date faceting).
§  New spatial search, including spatial filtering,
    boosting and sorting capabilities.
§  Example Velocity driven search UI at
    http://localhost:8983/solr/browse
§  A new faster termvector-based highlighter.
§  Extended dismax (edismax) query parser with
    support for fielded queries, enhanced relevancy, and
    full lucene syntax support.
§  Distributed search support for the Spell check and
    Terms components.
Solr 3.1 Highlights (continued)
§  Suggester, a fast trie-based autocomplete
    component.
§  Sort results by any function query.
§  JSON document indexing.
§  CSV response format
§  Apache UIMA integration for metadata
    extraction.
§  Tons of optimizations, bugfixes, and new
    analysis capabilities via Apache Lucene 3.1.


What’s not in 3.1?
§  Result Grouping (AKA Field Collapsing)
§  Pivot Faceting
§  SolrCloud
§  Pseudo-fields
§  Pseudo-join
§  Relevancy function queries
§  Per-segment faceting
§  *Tons* of new Lucene performance/efficiency
    goodness
Recent Lucene Performance
§  TieredMergePolicy – the new default
  •    Much better for incremental indexing / NRT
  •    Ignores segment order when selecting best merge
  •    Takes deletes into account
  •    Does not over-merge (no cascading merges)
§  Finite State Transducer (FST) based terms index




DocumentsWriterPerThread (DWPT)
§  Flushing a new segment is now concurrent with indexing
§  Use multiple indexing threads/connections
§  When max memory is hit, the biggest DWPT is concurrently flushed

[Diagram: indexing threads feed the IndexWriter, which holds several in-memory DWPTs; each DWPT flushes its own segment to disk (_1_0.tiv/.prx/.frq, _2_0.tiv/.prx/.frq, _3_0.tiv/.prx/.frq, …)]
Solr Cloud
http://.../solr/collection1?distrib=true

[Diagram: the request fans out as load-balanced sub-requests to shard1 (replica1, replica2, replica3) and shard2 (replica1, replica2, replica3). A ZooKeeper quorum (five ZK nodes in the diagram) holds the following tree:]

/livenodes
  server1:8983/solr
  server2:8983/solr
/collections
  /collection1  configName=myconf
    /shards
      /shard1
        server1:8983/solr
        server2:8983/solr
      /shard2
        server3:8983/solr
        server4:8983/solr
/configs
  /myconf
    solrconfig.xml
    schema.xml
Solr Cloud: Getting Started
   http://wiki.apache.org/solr/SolrCloud
   java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=myconf -DzkRun -jar start.jar

   •  -Dbootstrap_confdir, -Dcollection.configName: upload /solr/conf to ZK and call it “myconf”
   •  -DzkRun: run an internal ZK server

http://localhost:8983/solr/collection1/admin/zookeeper.jsp
Distributed Requests
l  Explicitly   specify node addresses to load-balance across
   shards=localhost:8983/solr|localhost:8900/solr,	
  
   	
  	
  	
  	
  	
  	
  	
  localhost:7574/solr|localhost:7500/solr	
  
   l    A list of equivalent nodes are separated by “|”
   l    Different phases of the same distributed request use the same node
l  Specify    logical shard ids to search across
   shards=NY_shard,NJ_shard	
  
l  Query    across all shards in the collection
   http://localhost:8983/solr/collection1/select?distrib=true	
  
   	
  
l  public	
  CloudSolrServer(String	
  zkHost)	
  
   l    SolrJ Java client that load-balances across all nodes in cluster
Extended Dismax Parser
l  Supersetof dismax
l  Designed to directly handle user queries w/o exceptions
    &defType=edismax&q=foo&qf=body	
  
l  Fixes         edge cases where dismax could still throw exceptions
    OR	
  	
  	
  AND	
  	
  	
  NOT	
  	
  	
  -­‐	
  	
  	
   	
  
l  Full     lucene syntax support
    l    Tries lucene syntax first
    l    Smart escaping is done if syntax errors
l  Optionally                  supports treating and / or as AND/OR in lucene
    syntax
l  Fielded queries (e.g. myfield:foo) even in degraded mode
    l    uf parameter controls what field names may be directly specified in q
Extended Dismax Parser (continued)
l  boost parameter for multiplicative boost-by-function
l  Pure negative query clauses
   Example: solr	
  OR	
  (-­‐solr)	
  
l  Enhanced        term proximity boosting
   l    pf2=myfield – results in term bigrams in sloppy phrase queries
     	
  myfield: aa	
  bb	
  cc -­‐>	
  	
  myfield: aa	
  bb 	
  	
  myfield: bb	
  cc 	
  
l  Enhanced        stopword handling
   l    stopwords omitted in main query, but added in optional proximity
         boosting part
         Example: q=solr	
  is	
  awesome	
  &	
  qf=myfield	
  &	
  pf2=myfield	
  	
  	
  -­‐>	
  	
  	
  	
  
     	
  +myfield:(solr	
  awesome)	
  	
  (myfield: solr	
  is 	
  myfield: is	
  
         awesome )	
  
   l  Currently controlled by the absence of StopWordFilter in index analyzer,
         and presence in query analyzer
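
The bigram expansion described for pf2 can be pictured with a few lines of Python. This is only a sketch of the idea, not the parser's implementation: it assumes plain whitespace tokenization, and the helper name pf2_phrases is hypothetical.

```python
from typing import List


def pf2_phrases(field: str, query: str) -> List[str]:
    """Return one phrase clause per pair of adjacent query terms (term bigrams)."""
    terms = query.split()  # assumption: whitespace tokenization, no analysis chain
    return ['%s:"%s %s"' % (field, a, b) for a, b in zip(terms, terms[1:])]


bigrams = pf2_phrases("myfield", "aa bb cc")
stopword_bigrams = pf2_phrases("myfield", "solr is awesome")
print(" ".join(bigrams))
print(" ".join(stopword_bigrams))
```

The second call mirrors the stopword example: "is" is kept in the proximity clauses even though it is dropped from the main query.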
Faceting Performance Improvements

l  For   facet.method=enum, speed up initial population of the
    filterCache (i.e. first time facet): from 30% to 32x
    improvement
l  Optimized facet.method=fc for multi-valued fields and large
    facet.limit – up to 3x faster
l  Optimized deep facet paging – up to 10x faster with really
    large facet.offsets
l  Less memory consumed by field cache entries

l  Per-segment faceting with facet.method=fcs
    l    Only faster when re-opening index frequently (many times a second)
    l    Only works for single-valued fields
Pivot Faceting
l  Other    names that could have made sense:
   l    Grid Faceting, Cross-Product Faceting, Matrix Faceting
l  Syntax:    facet.pivot=field1,field2,field3,…

 facet.pivot=cat,inStock
                             #docs #docs w/             #docs w/
                                   inStock:true         instock:false
 cat:electronics             14     10                  4
 cat:memory                  3      3                   0
 cat:connector               2      0                   2
 cat:graphics card           2      0                   2
 cat:hard drive              2      2                   0
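
To make the table above concrete, here is a small in-memory Python sketch of what a two-field pivot counts. It is an illustration of the cross-product idea only, not how Solr computes it; the documents are invented, and cat is treated as single-valued for simplicity.

```python
from collections import Counter
from typing import Dict, List, Tuple


def pivot_counts(docs: List[Dict], field1: str, field2: str) -> Tuple[Counter, Counter]:
    """Count docs per field1 value, and per (field1 value, field2 value) pair."""
    totals: Counter = Counter()
    cells: Counter = Counter()
    for doc in docs:
        v1 = doc[field1]
        totals[v1] += 1
        cells[(v1, doc[field2])] += 1
    return totals, cells


# Hypothetical documents chosen to match two rows of the table.
docs = (
    [{"cat": "memory", "inStock": True}] * 3
    + [{"cat": "connector", "inStock": False}] * 2
)
totals, cells = pivot_counts(docs, "cat", "inStock")
for cat, n in totals.items():
    print(cat, n, cells[(cat, True)], cells[(cat, False)])
```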
Pivot Faceting
   http://...&facet=true&facet.pivot=cat,popularity
   "facet_counts":{
     "facet_pivot":{
       "cat,popularity":[{
           "field":"cat",
           "value":"electronics",
           "count":14,                    <- 14 docs w/ cat==electronics
           "pivot":[{
               "field":"popularity",
               "value":"6",
               "count":5},                <- 5 docs w/ cat==electronics && popularity==6
             {
               "field":"popularity",
               "value":"7",
               "count":4},
   (continued)
             {
               "field":"popularity",
               "value":"1",
               "count":2}]},
         {
           "field":"cat",
           "value":"memory",
           "count":3,
           "pivot":[]},
         […]
Range Faceting
§  Like Date faceting, but more generic

http://...&facet=true
&facet.range=price
&facet.range.start=0
&facet.range.end=500
&facet.range.gap=50

"facet_counts":{
  "facet_ranges":{
    "price":{
      "counts":{
        "0.0":5,
        "50.0":2,
        "100.0":0,
        "150.0":2,
        "200.0":0,
        "250.0":1,
        "300.0":2,
        "350.0":2,
        "400.0":0,
        "450.0":1},
      "gap":50.0,
      "start":0.0,
      "end":500.0}}}}
Spatial Search
Step1: Index some locations!
<field name="name">The Alpine Shop</field>
<field name="store">44.013617,-73.168264</field>

Step2: Decide where you are
&pt=44.0153371,-73.16734
&d=1
&sfield=store

Step3: Profit!

Spatial Filter: &fq={!geofilt}

Bounding Box: &fq={!bbox}

Distance Function: &sort=geodist() asc

Returning the distance: &fl=geodist()

Pseudo-fields! (returning geodist() via fl)
Note: You can now sort by any arbitrary function query!
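
For intuition about what the distance filter decides, the Python sketch below computes a great-circle distance with the haversine formula for the two points on this slide. It assumes d is expressed in kilometers and uses a mean Earth radius of 6371 km; it is a stand-in for illustration, not Solr's own geodist() code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two lat/lon points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


store = (44.013617, -73.168264)   # indexed "store" field value
pt = (44.0153371, -73.16734)      # the pt request parameter
dist = haversine_km(pt[0], pt[1], store[0], store[1])
within = dist <= 1                # d=1
print(round(dist, 3), within)
```

Under these assumptions the store lies well inside the d=1 radius, so it would pass the filter.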
Pseudo-Fields
Returns other info along with document stored fields
§  Function queries
   fl=name,location,geodist(),add(myfield,10)
§  Fieldname globs
   fl=id,attr_*
§  Multiple “fl” (field list) values
   &fl=id,attr_*&fl=geodist()&fl=termfreq(text,'solr')
§  Aliasing
   fl=id,location:loc,_dist_:geodist()
§  Future: inlined highlighting, “explain”, sort-values,
    group-value
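
Sending several fl values from a client mostly comes down to repeating the parameter. A minimal Python sketch, with the helper name field_list_query invented for the example:

```python
from urllib.parse import urlencode, parse_qs


def field_list_query(*fl_values: str) -> str:
    """Encode each value as its own fl= parameter (doseq repeats the key)."""
    return urlencode({"fl": list(fl_values)}, doseq=True)


# Globs, function queries and aliases are passed as ordinary fl values.
qs = field_list_query("id,attr_*", "geodist()", "_dist_:geodist()")
fl_values = parse_qs(qs)["fl"]
print(qs)
```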
Result Grouping / Field Collapsing
§  Goal
   •  Limit the number of results per category
   •  Category normally defined by unique values in a field
§  Uses
   •  Web Search – collapse by web site
   •  Email threads – collapse by thread id
   •  Ecommerce/retail
      -  Show the top 5 items for each store category (music, movies, etc)
Field Collapsing by Site
Result Grouping by Category
Field Collapse on Product Type
Group by Field
http://...&fl=id,name&q=ipod&group=true&group.field=manu_exact
  "grouped":{
    "manu_exact":{
     "matches":3,
     "groups":[{
        "groupValue":"Belkin",
        "doclist":{"numFound":2,"start":0,"docs":[
            {
              "id":"IW-02",
              "name":"iPod & iPod Mini USB 2.0 Cable"}]
        }},
      {
        "groupValue":"Apple Computer Inc.",
        "doclist":{"numFound":1,"start":0,"docs":[
            {
Group by Query
http://...&group=true&group.query=price:[0 TO 99.99]
  &group.query=price:[100 TO *]&group.limit=5
  "grouped":{
    "price:[0 TO 99.99]":{
     "matches":3,
     "doclist":{"numFound":2,"start":0,"docs":[
         {
           "id":"IW-02",
           "name":"iPod & iPod Mini USB 2.0 Cable"},
         {
           "id":"F8V7067-APL-KIT",
           "name":"Belkin Mobile Power Cord for iPod"}]
     }},
    "price:[100 TO *]":{
     "matches":3,
     "doclist":{"numFound":1,"start":0,"docs":[
Grouping Params
parameter                         meaning                                            default

group.field=<field>               Like facet.field – group by unique field values
group.query=<query>               Like facet.query – top docs that also match
group.function=<function query>   Group by unique values produced by the function
                                  query
group.limit=<n>                   How many docs per group                            1
group.sort=<sort spec>            How to sort documents within a group               Same as sort
rows=<n>                          How many groups to return                          10
sort=<sort spec>                  How to sort the groups relative to each other
                                  (based on top doc)
group.format=<format>             grouped/simple – if simple, a single flat list     grouped
                                  is used and rows units are “docs”
group.main=true/false             If true, the first field grouping command is       false
                                  used as main result set
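
The interplay of group.field and group.limit can be mimicked in memory with a short Python sketch. It is meant only to show the shape of the grouped response, under the assumption that the input list is already in the main sort order; the id "apple-doc" is a made-up placeholder and group_by_field is not a Solr API.

```python
from typing import Dict, List


def group_by_field(docs: List[Dict], field: str, limit: int = 1) -> List[Dict]:
    """Group docs by a field value, keeping at most `limit` docs per group."""
    groups: Dict[str, List[Dict]] = {}
    for doc in docs:  # insertion order keeps groups ordered by their top doc
        groups.setdefault(doc[field], []).append(doc)
    return [
        {"groupValue": value,
         "doclist": {"numFound": len(members), "docs": members[:limit]}}
        for value, members in groups.items()
    ]


docs = [
    {"id": "IW-02", "manu_exact": "Belkin"},
    {"id": "apple-doc", "manu_exact": "Apple Computer Inc."},  # hypothetical id
    {"id": "F8V7067-APL-KIT", "manu_exact": "Belkin"},
]
grouped = group_by_field(docs, "manu_exact", limit=1)
print(grouped)
```

With limit=1, numFound still reports the full group size while docs holds only the top document, matching the default shown in the table.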
Pseudo-Join
Blogs:
    id: blog1
    name: Solr ‘n Stuff
    owner: Yonik Seeley
    started: 2007-10-26

    id: blog2
    name: lifehacker
    owner: Gawker Media
    started: 2005-1-31

Posts:
    id: post1
    blog_id: blog1
    author: Yonik Seeley
    title: Solr relevancy function queries
    body: Lucene’s default ranking […]

    id: post2
    blog_id: blog1
    author: Yonik Seeley
    title: Solr result grouping
    body: Result Grouping, also called […]

    id: post3
    blog_id: blog2
    author: Whitson Gordon
    title: How to Install Netflix on Almost Any Android Device

Restrict to blogs mentioning netflix
fq={!join from=blog_id to=id}body:netflix

-  Finds all documents matching “netflix”
-  Maps to different docs by following blog_id to id
Pseudo-Join Examples
§  Only show posts from blogs started after 2010
  q=foo&fq={!join from=id to=blog_id}started:[2010 TO *]


§  If any post in a blog mentions “obama”, then search
    all posts in that blog for “bomb” (self-join)
  q=bomb&fq={!join from=blog_id to=blog_id}obama


§  If any blog post mentions “obama”, then search all
    websites with the same blog owner for “bomb”
  q=bomb&fq={!join from=owner to=website_owner}{!join
  from=blog_id to=id}obama

Cross-Core Join
Single Solr Server, two cores:

collection1
   id: doc1
   security: managers
   title: doc for managers only
   body: …

   id: doc2
   security: managers, employees
   title: doc for everyone
   body: …

sec1
   id: mary
   security_groups: managers, employees

   id: john
   security_groups: employees


http://localhost:8983/solr/collection1/select?q=foo&fq={!join
fromIndex=sec1 from=security_groups to=security}user:john
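
A security filter of this kind can be modelled in Python with two lists standing in for the two cores. This is a sketch under simplifying assumptions: both join fields are multi-valued lists, the user is looked up by id (the slide's query uses a user field), and cross_core_join is a hypothetical name.

```python
from typing import Callable, Dict, List


def cross_core_join(from_docs: List[Dict], to_docs: List[Dict],
                    from_field: str, to_field: str,
                    matches: Callable[[Dict], bool]) -> List[Dict]:
    """Gather from_field values in one core, then keep docs in the other core sharing any value."""
    keys = set()
    for doc in from_docs:
        if matches(doc):
            keys.update(doc.get(from_field, []))
    return [doc for doc in to_docs if keys.intersection(doc.get(to_field, []))]


sec1 = [
    {"id": "mary", "security_groups": ["managers", "employees"]},
    {"id": "john", "security_groups": ["employees"]},
]
collection1 = [
    {"title": "doc for managers only", "security": ["managers"]},
    {"title": "doc for everyone", "security": ["managers", "employees"]},
]

visible_to_john = cross_core_join(sec1, collection1, "security_groups", "security",
                                  lambda d: d["id"] == "john")
visible_to_mary = cross_core_join(sec1, collection1, "security_groups", "security",
                                  lambda d: d["id"] == "mary")
print([d["title"] for d in visible_to_john])
```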

Pseudo-Join vs Grouping
Pseudo-Join                                   Result Grouping / Field Collapsing

O(n_terms_in_join_fields)                     O(n_docs_in_result)

Single or multi-valued fields                 Single-valued fields only

Filters only (no info currently passed from   Can order docs within a group and groups
the “from” docs to the “to” docs).            by top doc within that group using normal
                                              sort criteria.

Chainable (one join can be the input to       Not currently chainable – can only group
another)                                      one field deep

Affects which documents match a request,      Grouping does not currently affect the set
so naturally affects facet numbers (e.g.      of documents matching the query, so
you can search posts and get numbers of       faceting is unaffected.
blogs)




Auto-Suggest
l  Many    people previously used terms component
   l    Can be slow for a large corpus
l  New    auto-suggest builds off SpellCheck component
   l    TST implementation: compact memory based trie
   l    FST implementation: slower to build, but smaller & faster lookup
   l    Based on a field in the main index, or on a dictionary file
http://localhost:8983/solr/suggest?wt=json&indent=true&q=ult

                  "spellcheck":{
                    "suggestions":[
                     "ult",{
                       "numFound":1,
                       "startOffset":0,
                       "endOffset":3,
                       "suggestion":["ultrasharp"]},
                     "collation","ultrasharp"]}}
Index with JSON
$ URL=http://localhost:8983/solr/update/json
$ curl $URL -H 'Content-type:application/json' -d '
[
  {
    "id" : "978-0641723445",
    "cat" : ["book","hardcover"],
    "title" : "The Lightning Thief",
    "author" : "Rick Riordan",
    "series_t" : "Percy Jackson and the Olympians",
    "sequence_i" : 1,
    "genre_s" : "fantasy",
    "inStock" : true,
    "price" : 12.50,
    "pages_i" : 384
  }
]'
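
The same request could be prepared from Python with the standard library, roughly as sketched below. The helper build_update_request is hypothetical, and the actual send is left as a comment so that the example runs without a Solr server.

```python
import json
import urllib.request
from typing import Dict, List


def build_update_request(url: str, docs: List[Dict]) -> urllib.request.Request:
    """Prepare a POST of a JSON array of documents to the JSON update handler."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-type": "application/json"}, method="POST"
    )


book = {
    "id": "978-0641723445",
    "cat": ["book", "hardcover"],
    "title": "The Lightning Thief",
    "author": "Rick Riordan",
    "inStock": True,
    "price": 12.50,
    "pages_i": 384,
}
req = build_update_request("http://localhost:8983/solr/update/json", [book])
# urllib.request.urlopen(req) would send it to a running server.
print(req.get_method(), req.full_url)
```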
Query Results in CSV
http://localhost:8983/solr/select?q=ipod&fl=name,price,cat,popularity&wt=csv

name,price,cat,popularity
iPod & iPod Mini USB 2.0 Cable,11.5,"electronics,connector",1
Belkin Mobile Power Cord for iPod w/ Dock,19.95,"electronics,connector",1
Apple 60 GB iPod with Video Playback Black,399.0,"electronics,music",10

l  Can handle multi-valued fields (see cat field in example)
l  Completely compatible with the CSV update handler (can round-trip)

l  Results are streamed – good for dumping entire parts of the index
http://localhost:8983/solr/browse
Q&A

Contenu connexe

Tendances

Do we need Unsafe in Java?
Do we need Unsafe in Java?Do we need Unsafe in Java?
Do we need Unsafe in Java?Andrei Pangin
 
Down to Stack Traces, up from Heap Dumps
Down to Stack Traces, up from Heap DumpsDown to Stack Traces, up from Heap Dumps
Down to Stack Traces, up from Heap DumpsAndrei Pangin
 
No more (unsecure) secrets, Marty
No more (unsecure) secrets, MartyNo more (unsecure) secrets, Marty
No more (unsecure) secrets, MartyMathias Herberts
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Adrian Huang
 
Everything you wanted to know about Stack Traces and Heap Dumps
Everything you wanted to know about Stack Traces and Heap DumpsEverything you wanted to know about Stack Traces and Heap Dumps
Everything you wanted to know about Stack Traces and Heap DumpsAndrei Pangin
 
Plongée profonde dans les technos de haute disponibilité d’Exchange 2010 par...
Plongée profonde  dans les technos de haute disponibilité d’Exchange 2010 par...Plongée profonde  dans les technos de haute disponibilité d’Exchange 2010 par...
Plongée profonde dans les technos de haute disponibilité d’Exchange 2010 par...Microsoft Technet France
 
Tutorial to set up a case for chtMultiRegionFoam in OpenFOAM 2.0.0
Tutorial to set up a case for chtMultiRegionFoam in OpenFOAM 2.0.0Tutorial to set up a case for chtMultiRegionFoam in OpenFOAM 2.0.0
Tutorial to set up a case for chtMultiRegionFoam in OpenFOAM 2.0.0ARPIT SINGHAL
 
OOUG: Oracle transaction locking
OOUG: Oracle transaction lockingOOUG: Oracle transaction locking
OOUG: Oracle transaction lockingKyle Hailey
 
ch3-pv1-memory-management
ch3-pv1-memory-managementch3-pv1-memory-management
ch3-pv1-memory-managementyushiang fu
 
The Art of JVM Profiling
The Art of JVM ProfilingThe Art of JVM Profiling
The Art of JVM ProfilingAndrei Pangin
 
東急Ruby会議向け「rubyの細かい話」
東急Ruby会議向け「rubyの細かい話」東急Ruby会議向け「rubyの細かい話」
東急Ruby会議向け「rubyの細かい話」Masaya TARUI
 
UKOUG, Oracle Transaction Locks
UKOUG, Oracle Transaction LocksUKOUG, Oracle Transaction Locks
UKOUG, Oracle Transaction LocksKyle Hailey
 
Oracle 10g Performance: chapter 09 enqueues
Oracle 10g Performance: chapter 09 enqueuesOracle 10g Performance: chapter 09 enqueues
Oracle 10g Performance: chapter 09 enqueuesKyle Hailey
 
BlockChain implementation by python
BlockChain implementation by pythonBlockChain implementation by python
BlockChain implementation by pythonwonyong hwang
 
Cs757 ns2-tutorial-exercise
Cs757 ns2-tutorial-exerciseCs757 ns2-tutorial-exercise
Cs757 ns2-tutorial-exercisePratik Joshi
 
Thinking outside the box, learning a little about a lot
Thinking outside the box, learning a little about a lotThinking outside the box, learning a little about a lot
Thinking outside the box, learning a little about a lotMark Broadbent
 
Fatkulin presentation
Fatkulin presentationFatkulin presentation
Fatkulin presentationEnkitec
 
[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오
[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오
[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오PgDay.Seoul
 
Range reader/writer locking for the Linux kernel
Range reader/writer locking for the Linux kernelRange reader/writer locking for the Linux kernel
Range reader/writer locking for the Linux kernelDavidlohr Bueso
 

Tendances (20)

Do we need Unsafe in Java?
Do we need Unsafe in Java?Do we need Unsafe in Java?
Do we need Unsafe in Java?
 
Down to Stack Traces, up from Heap Dumps
Down to Stack Traces, up from Heap DumpsDown to Stack Traces, up from Heap Dumps
Down to Stack Traces, up from Heap Dumps
 
No more (unsecure) secrets, Marty
No more (unsecure) secrets, MartyNo more (unsecure) secrets, Marty
No more (unsecure) secrets, Marty
 
Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...Process Address Space: The way to create virtual address (page table) of user...
Process Address Space: The way to create virtual address (page table) of user...
 
Everything you wanted to know about Stack Traces and Heap Dumps
Everything you wanted to know about Stack Traces and Heap DumpsEverything you wanted to know about Stack Traces and Heap Dumps
Everything you wanted to know about Stack Traces and Heap Dumps
 
Plongée profonde dans les technos de haute disponibilité d’Exchange 2010 par...
Plongée profonde  dans les technos de haute disponibilité d’Exchange 2010 par...Plongée profonde  dans les technos de haute disponibilité d’Exchange 2010 par...
Plongée profonde dans les technos de haute disponibilité d’Exchange 2010 par...
 
Tutorial to set up a case for chtMultiRegionFoam in OpenFOAM 2.0.0
Tutorial to set up a case for chtMultiRegionFoam in OpenFOAM 2.0.0Tutorial to set up a case for chtMultiRegionFoam in OpenFOAM 2.0.0
Tutorial to set up a case for chtMultiRegionFoam in OpenFOAM 2.0.0
 
OOUG: Oracle transaction locking
OOUG: Oracle transaction lockingOOUG: Oracle transaction locking
OOUG: Oracle transaction locking
 
ch3-pv1-memory-management
ch3-pv1-memory-managementch3-pv1-memory-management
ch3-pv1-memory-management
 
The Art of JVM Profiling
The Art of JVM ProfilingThe Art of JVM Profiling
The Art of JVM Profiling
 
東急Ruby会議向け「rubyの細かい話」
東急Ruby会議向け「rubyの細かい話」東急Ruby会議向け「rubyの細かい話」
東急Ruby会議向け「rubyの細かい話」
 
UKOUG, Oracle Transaction Locks
UKOUG, Oracle Transaction LocksUKOUG, Oracle Transaction Locks
UKOUG, Oracle Transaction Locks
 
Oracle 10g Performance: chapter 09 enqueues
Oracle 10g Performance: chapter 09 enqueuesOracle 10g Performance: chapter 09 enqueues
Oracle 10g Performance: chapter 09 enqueues
 
BlockChain implementation by python
BlockChain implementation by pythonBlockChain implementation by python
BlockChain implementation by python
 
Cs757 ns2-tutorial-exercise
Cs757 ns2-tutorial-exerciseCs757 ns2-tutorial-exercise
Cs757 ns2-tutorial-exercise
 
Thinking outside the box, learning a little about a lot
Thinking outside the box, learning a little about a lotThinking outside the box, learning a little about a lot
Thinking outside the box, learning a little about a lot
 
Fatkulin presentation
Fatkulin presentationFatkulin presentation
Fatkulin presentation
 
[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오
[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오
[Pgday.Seoul 2017] 3. PostgreSQL WAL Buffers, Clog Buffers Deep Dive - 이근오
 
Range reader/writer locking for the Linux kernel
Range reader/writer locking for the Linux kernelRange reader/writer locking for the Linux kernel
Range reader/writer locking for the Linux kernel
 
Computer vision
Computer vision Computer vision
Computer vision
 

En vedette

C:\Fakepath\I Love You Mommy
C:\Fakepath\I Love You MommyC:\Fakepath\I Love You Mommy
C:\Fakepath\I Love You MommyNyiah
 
Speed Up Web 2012
Speed Up Web 2012Speed Up Web 2012
Speed Up Web 2012彰 村地
 
Amazing grace[1]
Amazing grace[1]Amazing grace[1]
Amazing grace[1]tanica
 
20101023 ie9 cache
20101023 ie9 cache20101023 ie9 cache
20101023 ie9 cache彰 村地
 
Maroon5
Maroon5Maroon5
Maroon5tanica
 
Cancer
CancerCancer
Cancertanica
 
Oslb office365
Oslb office365Oslb office365
Oslb office365彰 村地
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchLucidworks (Archived)
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessLucidworks (Archived)
 
Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Lucidworks (Archived)
 
最新ブラウザー UI 比較
最新ブラウザー UI 比較最新ブラウザー UI 比較
最新ブラウザー UI 比較彰 村地
 
Ingles haiti
Ingles haitiIngles haiti
Ingles haititanica
 
ブラウザー勉強会始めました
ブラウザー勉強会始めましたブラウザー勉強会始めました
ブラウザー勉強会始めました彰 村地
 
Integrating Advanced Text Analytics into Solr
Integrating Advanced Text Analytics into SolrIntegrating Advanced Text Analytics into Solr
Integrating Advanced Text Analytics into SolrLucidworks (Archived)
 
"A Study of I/O and Virtualization Performance with a Search Engine based on ...
"A Study of I/O and Virtualization Performance with a Search Engine based on ..."A Study of I/O and Virtualization Performance with a Search Engine based on ...
"A Study of I/O and Virtualization Performance with a Search Engine based on ...Lucidworks (Archived)
 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineLucidworks (Archived)
 
Guidelines for Managers: What Lucene and Solr Open Source Search can do for E...
Guidelines for Managers: What Lucene and Solr Open Source Search can do for E...Guidelines for Managers: What Lucene and Solr Open Source Search can do for E...
Guidelines for Managers: What Lucene and Solr Open Source Search can do for E...Lucidworks (Archived)
 
Already, just, still, yet
Already, just, still, yetAlready, just, still, yet
Already, just, still, yettanica
 
Kelly Clarkson
Kelly ClarksonKelly Clarkson
Kelly Clarksontanica
 

En vedette (20)

C:\Fakepath\I Love You Mommy
C:\Fakepath\I Love You MommyC:\Fakepath\I Love You Mommy
C:\Fakepath\I Love You Mommy
 
Speed Up Web 2012
Speed Up Web 2012Speed Up Web 2012
Speed Up Web 2012
 
Amazing grace[1]
Amazing grace[1]Amazing grace[1]
Amazing grace[1]
 
20101023 ie9 cache
20101023 ie9 cache20101023 ie9 cache
20101023 ie9 cache
 
Maroon5
Maroon5Maroon5
Maroon5
 
Cancer
CancerCancer
Cancer
 
Oslb office365
Oslb office365Oslb office365
Oslb office365
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
 
Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"
 
最新ブラウザー UI 比較
最新ブラウザー UI 比較最新ブラウザー UI 比較
最新ブラウザー UI 比較
 
Ingles haiti
Ingles haitiIngles haiti
Ingles haiti
 
ブラウザー勉強会始めました
ブラウザー勉強会始めましたブラウザー勉強会始めました
ブラウザー勉強会始めました
 
Juan gris
Juan grisJuan gris
Juan gris
 
Integrating Advanced Text Analytics into Solr
Integrating Advanced Text Analytics into SolrIntegrating Advanced Text Analytics into Solr
Integrating Advanced Text Analytics into Solr
 
"A Study of I/O and Virtualization Performance with a Search Engine based on ...
"A Study of I/O and Virtualization Performance with a Search Engine based on ..."A Study of I/O and Virtualization Performance with a Search Engine based on ...
"A Study of I/O and Virtualization Performance with a Search Engine based on ...
 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
 
Guidelines for Managers: What Lucene and Solr Open Source Search can do for E...
Guidelines for Managers: What Lucene and Solr Open Source Search can do for E...Guidelines for Managers: What Lucene and Solr Open Source Search can do for E...
Guidelines for Managers: What Lucene and Solr Open Source Search can do for E...
 
Already, just, still, yet
Already, just, still, yetAlready, just, still, yet
Already, just, still, yet
 
Kelly Clarkson
Kelly ClarksonKelly Clarkson
Kelly Clarkson
 

Seeley yonik solr performance key innovations

  • 1. Solr Performance & Key Innovations Yonik Seeley, Lucid Imagination yonik@lucidimagination.com, May 26 2011
  • 2. Solr 3.1 Highlights
    §  Numeric range facets (similar to date faceting).
    §  New spatial search, including spatial filtering, boosting and sorting capabilities.
    §  Example Velocity driven search UI at http://localhost:8983/solr/browse
    §  A new faster termvector-based highlighter.
    §  Extended dismax (edismax) query parser with support for fielded queries, enhanced relevancy, and full lucene syntax support.
    §  Distributed search support for the Spell check and Terms components.
  • 3. Solr 3.1 Highlights (continued)
    §  Suggester, a fast trie-based autocomplete component.
    §  Sort results by any function query.
    §  JSON document indexing.
    §  CSV response format
    §  Apache UIMA integration for metadata extraction.
    §  Tons of optimizations, bugfixes, and new analysis capabilities via Apache Lucene 3.1.
  • 4. What’s not in 3.1?
    §  Result Grouping (AKA Field Collapsing)
    §  Pivot Faceting
    §  SolrCloud
    §  Pseudo-fields
    §  Pseudo-join
    §  Relevancy function queries
    §  Per-segment faceting
    §  *Tons* of new Lucene performance/efficiency goodness
  • 5. Recent Lucene Performance
    §  TieredMergePolicy – the new default
       •  Much better for incremental indexing / NRT
       •  Ignores segment order when selecting best merge
       •  Takes deletes into account
       •  Does not over-merge (no cascading merges)
    §  Finite State Transducer (FST) based terms index
  • 6. DocumentWriterPerThread (DWPT)
    §  Flushing new segment is now concurrent w/ indexing
    §  Use multiple indexing threads/connections
    §  When max mem is hit, biggest DWPT is concurrently flushed
    [Diagram: indexing threads feed the Index Writer, which holds several in-memory DWPTs; each DWPT flushes its own segment to disk (_1_0.tiv, _1_0.prx, _1_0.frq, …; _2_0.tiv, _2_0.prx, _2_0.frq, …; _3_0.tiv, _3_0.prx, _3_0.frq, …)]
  • 7. Solr Cloud
    [Diagram: a request to http://.../solr/collection1?distrib=true is sent as load-balanced sub-requests to shard1 and shard2, each shown with replica1, replica2 and replica3. A ZooKeeper quorum of ZK nodes holds the cluster state tree below.]
    /livenodes
    /collections
       /collection1   configName=myconf
          /shards
             /shard1
             /shard2
    /configs
       /myconf
          solrconfig.xml
          schema.xml
    [Server addresses shown in the tree: server1:8983/solr, server2:8983/solr, server3:8983/solr, server4:8983/solr]
  • 8. Solr Cloud: Getting Started
    http://wiki.apache.org/solr/SolrCloud
    java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=myconf -DzkRun -jar start.jar
    §  -Dbootstrap_confdir, -Dcollection.configName: Upload /solr/conf to ZK and call it “myconf”
    §  -DzkRun: Run an internal ZK server
    http://localhost:8983/solr/collection1/admin/zookeeper.jsp
  • 9. Distributed Requests
    §  Explicitly specify node addresses to load-balance across
       shards=localhost:8983/solr|localhost:8900/solr,localhost:7574/solr|localhost:7500/solr
    §  Equivalent nodes in a list are separated by “|”
    §  Different phases of the same distributed request use the same node
    §  Specify logical shard ids to search across
       shards=NY_shard,NJ_shard
    §  Query across all shards in the collection
       http://localhost:8983/solr/collection1/select?distrib=true
    §  public CloudSolrServer(String zkHost)
    §  SolrJ Java client that load-balances across all nodes in cluster
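The shards syntax above can be sketched in a few lines of Python. This is only an illustration: the helper names (build_shards_param, distributed_query_url) are invented for this example and are not part of Solr or SolrJ. Only the "|" and "," separators, the host addresses and the /select handler come from the slide.

```python
from urllib.parse import urlencode

# Each inner list holds equivalent nodes (replicas) for one shard.
SHARDS = [
    ["localhost:8983/solr", "localhost:8900/solr"],
    ["localhost:7574/solr", "localhost:7500/solr"],
]


def build_shards_param(shards):
    """Join equivalent nodes with '|' and separate shards with ','."""
    return ",".join("|".join(replicas) for replicas in shards)


def distributed_query_url(base_url, query, shards):
    """Build a /select URL carrying an explicit shards parameter (hypothetical helper)."""
    params = {"q": query, "shards": build_shards_param(shards)}
    return base_url + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(build_shards_param(SHARDS))
    print(distributed_query_url("http://localhost:8983/solr", "foo", SHARDS))
```

The URL-encoding step matters in practice because "|" and "," are reserved characters in a query string.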
  • 10. Extended Dismax Parser
    §  Superset of dismax
    §  Designed to directly handle user queries w/o exceptions
       &defType=edismax&q=foo&qf=body
    §  Fixes edge cases where dismax could still throw exceptions
       OR   AND   NOT   -
    §  Full lucene syntax support
    §  Tries lucene syntax first
    §  Smart escaping is done if there are syntax errors
    §  Optionally supports treating “and” / “or” as AND/OR in lucene syntax
    §  Fielded queries (e.g. myfield:foo) even in degraded mode
    §  uf parameter controls what field names may be directly specified in q
  • 11. Extended Dismax Parser (continued)
    §  boost parameter for multiplicative boost-by-function
    §  Pure negative query clauses
       Example: solr OR (-solr)
    §  Enhanced term proximity boosting
    §  pf2=myfield – results in term bigrams in sloppy phrase queries
       myfield:"aa bb cc" -> myfield:"aa bb" myfield:"bb cc"
    §  Enhanced stopword handling
    §  stopwords omitted in main query, but added in optional proximity boosting part
       Example: q=solr is awesome & qf=myfield & pf2=myfield
       -> +myfield:(solr awesome) (myfield:"solr is" myfield:"is awesome")
    §  Currently controlled by the absence of StopWordFilter in index analyzer, and presence in query analyzer
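To make the pf2 and stopword behaviour above concrete, here is a small Python sketch that reproduces only the shape of the rewritten query from the slide's example. It is not how edismax is implemented: the STOPWORDS set and both function names are assumptions made for this illustration, and real analysis happens in the field's query analyzer.

```python
# Hypothetical stopword list, used only for this illustration.
STOPWORDS = {"is", "a", "the"}


def term_bigrams(terms):
    """Return adjacent term pairs, the units pf2 turns into phrase queries."""
    return list(zip(terms, terms[1:]))


def sketch_edismax_query(user_query, field):
    """Mimic the slide's example: stopwords dropped from the mandatory
    clause but kept inside the optional bigram phrase clauses."""
    terms = user_query.split()
    main_terms = [t for t in terms if t not in STOPWORDS]
    main = "+%s:(%s)" % (field, " ".join(main_terms))
    phrases = " ".join(
        '%s:"%s %s"' % (field, a, b) for a, b in term_bigrams(terms)
    )
    return "%s (%s)" % (main, phrases)


if __name__ == "__main__":
    print(term_bigrams(["aa", "bb", "cc"]))
    print(sketch_edismax_query("solr is awesome", "myfield"))
```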
  • 12. Faceting Performance Improvements
    §  For facet.method=enum, speed up initial population of the filterCache (i.e. first time facet): from 30% to 32x improvement
    §  Optimized facet.method=fc for multi-valued fields and large facet.limit – up to 3x faster
    §  Optimized deep facet paging – up to 10x faster with really large facet.offsets
    §  Less memory consumed by field cache entries
    §  Per-segment faceting with facet.method=fcs
       •  Only faster when re-opening index frequently (many times a second)
       •  Only works for single-valued fields
  • 13. Pivot Faceting
    §  Other names that could have made sense:
      •  Grid Faceting, Cross-Product Faceting, Matrix Faceting
    §  Syntax: facet.pivot=field1,field2,field3,…

    facet.pivot=cat,inStock

                          #docs   #docs w/ inStock:true   #docs w/ inStock:false
    cat:electronics          14                      10                        4
    cat:memory                3                       3                        0
    cat:connector             2                       0                        2
    cat:graphics card         2                       0                        2
    cat:hard drive            2                       2                        0
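The cross-product counting shown in the table can be sketched in Python as follows. This is a minimal in-memory illustration, not how Solr computes pivots; the function name pivot_counts and the three sample documents are assumptions made for the example.

```python
from collections import Counter, defaultdict


def pivot_counts(docs, outer_field, inner_field):
    """Count docs per outer value, and per inner value within each outer value."""
    totals = Counter()
    nested = defaultdict(Counter)
    for doc in docs:
        # The outer field is treated as multi-valued (a list of values).
        for outer in doc.get(outer_field, []):
            totals[outer] += 1
            nested[outer][doc[inner_field]] += 1
    return totals, nested


# Illustrative documents only (not the data set behind the slide's numbers).
docs = [
    {"cat": ["electronics", "memory"], "inStock": True},
    {"cat": ["electronics", "connector"], "inStock": False},
    {"cat": ["electronics"], "inStock": True},
]
totals, nested = pivot_counts(docs, "cat", "inStock")
```

Each row of the table corresponds to one key of totals, and each inStock column to one key of the nested counter for that row.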
  • 14. Pivot Faceting
    http://...&facet=true&facet.pivot=cat,popularity

    "facet_counts":{
      "facet_pivot":{
        "cat,popularity":[{
            "field":"cat",
            "value":"electronics",
            "count":14,                  <- 14 docs w/ cat==electronics
            "pivot":[{
                "field":"popularity",
                "value":"6",
                "count":5},              <- 5 docs w/ cat==electronics && popularity==6
              {
                "field":"popularity",
                "value":"7",
                "count":4},

    (continued)
              {
                "field":"popularity",
                "value":"1",
                "count":2}]},
          {
            "field":"cat",
            "value":"memory",
            "count":3,
            "pivot":[]},
          […]
  • 15. Range Faceting
    §  Like Date faceting, but more generic

    http://...&facet=true
      &facet.range=price
      &facet.range.start=0
      &facet.range.end=500
      &facet.range.gap=50

    "facet_counts":{
      "facet_ranges":{
        "price":{
          "counts":{
            "0.0":5,
            "50.0":2,
            "100.0":0,
            "150.0":2,
            "200.0":0,
            "250.0":1,
            "300.0":2,
            "350.0":2,
            "400.0":0,
            "450.0":1},
          "gap":50.0,
          "start":0.0,
          "end":500.0}}}}
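The bucketing implied by start, end and gap can be sketched in Python. This is a simplified illustration that assumes lower-inclusive, upper-exclusive buckets and a gap that divides the range evenly; the function name range_facet and the sample prices are hypothetical.

```python
def range_facet(values, start, end, gap):
    """Count values in [lower, lower + gap) buckets covering [start, end)."""
    counts = {}
    lower = float(start)
    # Create every bucket up front so that empty buckets report a count of 0.
    while lower < end:
        counts[str(lower)] = 0
        lower += gap
    for v in values:
        if start <= v < end:
            bucket = start + ((v - start) // gap) * gap
            counts[str(float(bucket))] += 1
    return counts


# Illustrative prices only; the parameters mirror the request on the slide.
prices = [11.5, 19.95, 74.99, 185.0, 399.0, 479.95]
counts = range_facet(prices, 0, 500, 50)
```

Bucket keys are the lower bounds rendered as floats, which matches the shape of the "counts" object in the response above.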
  • 16. Spatial Search
    Step 1: Index some locations!
      <field name="name">The Alpine Shop</field>
      <field name="store">44.013617,-73.168264</field>
    Step 2: Decide where you are
      &pt=44.0153371,-73.16734
      &d=1
      &sfield=store
    Step 3: Profit!
      Spatial Filter:          &fq={!geofilt}
      Bounding Box:            &fq={!bbox}
      Distance Function:       &sort=geodist() asc
      Returning the distance:  &fl=geodist()      Pseudo-fields!
    Note: You can now sort by any arbitrary function query!
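To make the distance filter concrete, here is a hedged Python sketch of a great-circle (haversine) distance between the query point and the indexed store location. It is not Solr's code; the kilometre unit, the Earth radius constant and the function name haversine_km are assumptions of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


store = (44.013617, -73.168264)  # indexed "store" field from the slide
pt = (44.0153371, -73.16734)     # the pt parameter from the slide
distance = haversine_km(*pt, *store)

# With d=1, the store would pass a distance filter if it lies within 1 unit.
within_d = distance <= 1
```

Sorting by distance then amounts to ordering documents by this value in ascending order.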
  • 17. Pseudo-Fields
    Returns other info along with document stored fields
    §  Function queries
         fl=name,location,geodist(),add(myfield,10)
    §  Fieldname globs
         fl=id,attr_*
    §  Multiple “fl” (field list) values
         &fl=id,attr_*&fl=geodist()&fl=termfreq(text,'solr')
    §  Aliasing
         fl=id,location:loc,_dist_:geodist()
    §  Future: inlined highlighting, “explain”, sort-values, group-value
  • 18. Result Grouping / Field Collapsing
    §  Goal
      •  Limit the number of results per category
      •  Category normally defined by unique values in a field
    §  Uses
      •  Web Search – collapse by web site
      •  Email threads – collapse by thread id
      •  Ecommerce/retail
           Show the top 5 items for each store category (music, movies, etc)
  • 20. Result Grouping by Category Field Collapse on Product Type
  • 21. Group by Field
    http://...&fl=id,name&q=ipod&group=true&group.field=manu_exact

    "grouped":{
      "manu_exact":{
        "matches":3,
        "groups":[{
            "groupValue":"Belkin",
            "doclist":{"numFound":2,"start":0,"docs":[
                {
                  "id":"IW-02",
                  "name":"iPod & iPod Mini USB 2.0 Cable"}]
            }},
          {
            "groupValue":"Apple Computer Inc.",
            "doclist":{"numFound":1,"start":0,"docs":[
                {
  • 22. Group by Query
    http://...&group=true&group.query=price:[0 TO 99.99]
      &group.query=price:[100 TO *]&group.limit=5

    "grouped":{
      "price:[0 TO 99.99]":{
        "matches":3,
        "doclist":{"numFound":2,"start":0,"docs":[
            {
              "id":"IW-02",
              "name":"iPod & iPod Mini USB 2.0 Cable"},
            {
              "id":"F8V7067-APL-KIT",
              "name":"Belkin Mobile Power Cord for iPod"}]
        }},
      "price:[100 TO *]":{
        "matches":3,
        "doclist":{"numFound":1,"start":0,"docs":[
  • 23. Grouping Params

    parameter                        meaning                                                     default
    group.field=<field>              Like facet.field – group by unique field values
    group.query=<query>              Like facet.query – top docs that also match
    group.function=<function query>  Group by unique values produced by the function query
    group.limit=<n>                  How many docs per group                                     1
    group.sort=<sort spec>           How to sort documents within a group                        Same as sort
    rows=<n>                         How many groups to return                                   10
    sort=<sort spec>                 How to sort the groups relative to each other
                                     (based on top doc)
    group.format=<format>            grouped/simple – if simple, a single flat list is used      grouped
                                     and rows units are “docs”
    group.main=true/false            If true, the first field grouping command is used as        false
                                     main result set
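The effect of group.field, group.limit and rows can be sketched in Python over an already-sorted result list. This is a minimal in-memory illustration, not Solr's grouping code; the function name group_by_field and the third document (id "hypothetical-apple-doc") are assumptions made for the example.

```python
def group_by_field(ranked_docs, field, group_limit=1, rows=10):
    """Collapse an already-sorted result list into groups keyed by `field`."""
    groups = {}  # dict insertion order follows each group's top document
    for doc in ranked_docs:
        value = doc.get(field)
        group = groups.setdefault(
            value, {"groupValue": value, "numFound": 0, "docs": []})
        group["numFound"] += 1
        # Keep at most group_limit docs per group (default 1, as in the table).
        if len(group["docs"]) < group_limit:
            group["docs"].append(doc)
    # rows limits the number of groups, not the number of documents.
    return list(groups.values())[:rows]


ranked_docs = [
    {"id": "IW-02", "manu_exact": "Belkin"},
    {"id": "F8V7067-APL-KIT", "manu_exact": "Belkin"},
    {"id": "hypothetical-apple-doc", "manu_exact": "Apple Computer Inc."},
]
groups = group_by_field(ranked_docs, "manu_exact")
```

With the default group_limit of 1, a group can report a numFound larger than the number of documents it returns, which is the same shape as the Group by Field response.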
  • 24. Pseudo-Join

    id: blog1                    id: post1
    name: Solr ‘n Stuff          blog_id: blog1
    owner: Yonik Seeley          author: Yonik Seeley
    started: 2007-10-26          title: Solr relevancy function queries
                                 body: Lucene’s default ranking […]

    id: blog2                    id: post2
    name: lifehacker             blog_id: blog1
    owner: Gawker Media          author: Yonik Seeley
    started: 2005-1-31           title: Solr result grouping
                                 body: Result Grouping, also called […]

                                 id: post3
                                 blog_id: blog2
                                 author: Whitson Gordon
                                 title: How to Install Netflix on Almost Any Android Device

    Restrict to blogs mentioning netflix
      fq={!join from=blog_id to=id}body:netflix
      -  Finds all documents matching “netflix”
      -  Maps to different docs by following blog_id to id
  • 25. Pseudo-Join Examples
    §  Only show posts from blogs started after 2010
         q=foo&fq={!join from=id to=blog_id}started:[2010 TO *]
    §  If any post in a blog mentions “obama”, then search all posts in that blog for “bomb” (self-join)
         q=bomb&fq={!join from=blog_id to=blog_id}obama
    §  If any blog post mentions “obama”, then search all websites with the same blog owner for “bomb”
         q=bomb&fq={!join from=owner to=website_owner}{!join from=blog_id to=id}obama
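The from/to semantics of the join can be sketched in Python over in-memory documents. This only illustrates the mapping (collect the "from" field values of the matching docs, then keep docs whose "to" field holds one of them); it is not Solr's implementation. The function name join, the simplified numeric started values and the body text of post3 are assumptions of the sketch.

```python
def join(from_docs, from_field, to_docs, to_field):
    """Return the docs in to_docs whose to_field matches a from_field value."""
    keys = set()
    for doc in from_docs:
        value = doc.get(from_field)
        if value is not None:
            # Accept single-valued or multi-valued (list) fields.
            keys.update(value if isinstance(value, list) else [value])
    matched = []
    for doc in to_docs:
        value = doc.get(to_field)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        if any(v in keys for v in values):
            matched.append(doc)
    return matched


# Simplified documents; "started" is reduced to a year for the example.
blogs = [{"id": "blog1", "started": 2007}, {"id": "blog2", "started": 2005}]
posts = [
    {"id": "post1", "blog_id": "blog1", "body": "Lucene's default ranking"},
    {"id": "post2", "blog_id": "blog1", "body": "Result Grouping, also called"},
    {"id": "post3", "blog_id": "blog2", "body": "install netflix"},  # hypothetical body
]

# {!join from=blog_id to=id}body:netflix -> blogs with a post mentioning netflix
netflix_posts = [p for p in posts if "netflix" in p["body"].lower()]
netflix_blogs = join(netflix_posts, "blog_id", blogs, "id")

# {!join from=id to=blog_id}started:[2007 TO *] -> posts from blogs started 2007 or later
recent_blogs = [b for b in blogs if b["started"] >= 2007]
recent_posts = join(recent_blogs, "id", posts, "blog_id")
```

Chaining joins, as in the last example on the slide, corresponds to feeding the output of one join call into another as from_docs.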
  • 26. Cross-Core Join

    collection1                              sec1
    id: doc1                                 id: mary
    security: managers                       security_groups: managers, employees
    title: doc for managers only
    body: …

    id: doc1                                 id: john
    security: managers, employees            security_groups: employees
    title: doc for everyone
    body: …

    Single Solr Server
    http://localhost:8983/solr/collection1/select?q=foo&fq={!join fromIndex=sec1 from=security_groups to=security}user:john
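  • 27. Pseudo-Join vs Grouping

    Pseudo-Join                                  | Result Grouping / Field Collapsing
    ---------------------------------------------+---------------------------------------------
    O(n_terms_in_join_fields)                    | O(n_docs_in_result)
    ---------------------------------------------+---------------------------------------------
    Single or multi-valued fields                | Single-valued fields only
    ---------------------------------------------+---------------------------------------------
    Filters only (no info currently passed from  | Can order docs within a group and groups
    the “from” docs to the “to” docs).           | by top doc within that group using normal
                                                 | sort criteria.
    ---------------------------------------------+---------------------------------------------
    Chainable (one join can be the input to      | Not currently chainable – can only group
    another)                                     | one field deep
    ---------------------------------------------+---------------------------------------------
    Affects which documents match a request,     | Grouping does not currently affect the set
    so naturally affects facet numbers (e.g.     | of documents matching the query, so
    you can search posts and get numbers of      | faceting is unaffected.
    blogs)                                       |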
  • 28. Auto-Suggest
    §  Many people previously used terms component
      •  Can be slow for a large corpus
    §  New auto-suggest builds off SpellCheck component
      •  TST implementation: compact memory-based trie
      •  FST implementation: slower to build, but smaller & faster lookup
      •  Based on a field in the main index, or on a dictionary file

    http://localhost:8983/solr/suggest?wt=json&indent=true&q=ult

    "spellcheck":{
      "suggestions":[
        "ult",{
          "numFound":1,
          "startOffset":0,
          "endOffset":3,
          "suggestion":["ultrasharp"]},
        "collation","ultrasharp"]}}
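For readers unfamiliar with the TST (ternary search tree) structure mentioned above, here is a minimal Python sketch of prefix lookup in such a trie. It is a generic textbook structure written for illustration, not Solr's or Lucene's implementation, and the class and method names are hypothetical.

```python
class Node:
    __slots__ = ("ch", "lo", "eq", "hi", "is_word")

    def __init__(self, ch):
        self.ch = ch
        self.lo = self.eq = self.hi = None
        self.is_word = False


class TernarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, word):
        if word:
            self.root = self._insert(self.root, word, 0)

    def _insert(self, node, word, i):
        ch = word[i]
        if node is None:
            node = Node(ch)
        if ch < node.ch:
            node.lo = self._insert(node.lo, word, i)
        elif ch > node.ch:
            node.hi = self._insert(node.hi, word, i)
        elif i + 1 < len(word):
            node.eq = self._insert(node.eq, word, i + 1)
        else:
            node.is_word = True
        return node

    def suggest(self, prefix):
        """Return all stored words starting with `prefix`, in sorted order."""
        node, i = self.root, 0
        # Walk down to the node that matches the last character of the prefix.
        while node is not None and prefix:
            ch = prefix[i]
            if ch < node.ch:
                node = node.lo
            elif ch > node.ch:
                node = node.hi
            elif i + 1 < len(prefix):
                node, i = node.eq, i + 1
            else:
                break
        if node is None or not prefix:
            return []
        results = [prefix] if node.is_word else []
        self._collect(node.eq, prefix, results)
        return results

    def _collect(self, node, prefix, results):
        # In-order traversal: smaller characters, this character, larger ones.
        if node is None:
            return
        self._collect(node.lo, prefix, results)
        word = prefix + node.ch
        if node.is_word:
            results.append(word)
        self._collect(node.eq, word, results)
        self._collect(node.hi, prefix, results)


tree = TernarySearchTree()
for term in ["ultrasharp", "ultra", "usb", "video"]:  # illustrative terms
    tree.insert(term)
suggestions = tree.suggest("ult")
```

Each node stores one character and three child links, which is what keeps the structure compact compared with a trie that holds a full child map per node.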
  • 29. Index with JSON
    $ URL=http://localhost:8983/solr/update/json
    $ curl $URL -H 'Content-type:application/json' -d '
    [
      {
        "id" : "978-0641723445",
        "cat" : ["book","hardcover"],
        "title" : "The Lightning Thief",
        "author" : "Rick Riordan",
        "series_t" : "Percy Jackson and the Olympians",
        "sequence_i" : 1,
        "genre_s" : "fantasy",
        "inStock" : true,
        "price" : 12.50,
        "pages_i" : 384
      }
    ]'
  • 30. Query Results in CSV
    http://localhost:8983/solr/select?q=ipod&fl=name,price,cat,popularity&wt=csv

    name,price,cat,popularity
    iPod & iPod Mini USB 2.0 Cable,11.5,"electronics,connector",1
    Belkin Mobile Power Cord for iPod w/ Dock,19.95,"electronics,connector",1
    Apple 60 GB iPod with Video Playback Black,399.0,"electronics,music",10

    §  Can handle multi-valued fields (see cat field in example)
    §  Completely compatible with the CSV update handler (can round-trip)
    §  Results are streamed – good for dumping entire parts of the index
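On the client side, a CSV response like the one above can be read with Python's standard csv module. This is a small sketch that parses a hard-coded copy of two rows rather than calling a server, and it assumes that multi-valued cells use a plain comma as the inner separator with no escaped commas inside values.

```python
import csv
import io

# Two rows copied from the example response above.
response_text = (
    "name,price,cat,popularity\n"
    'iPod & iPod Mini USB 2.0 Cable,11.5,"electronics,connector",1\n'
    'Apple 60 GB iPod with Video Playback Black,399.0,"electronics,music",10\n'
)

rows = list(csv.DictReader(io.StringIO(response_text)))
for row in rows:
    # The multi-valued cat field arrives as one quoted cell; split it into a list.
    row["cat"] = row["cat"].split(",")
```

All other cells are returned as strings, so numeric fields such as price would need an explicit conversion.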
  • 32. Q&A