Digging into solr
Rails Usergroup Hamburg 13. April 2011
Overview
●   What is solr
●   Solr integration into Rails
●   Challenges for the search
●   Experiences
What is solr
●   Matthew 7:7b / Luke 11:9b
●   (Sermon on the Mount)
●   "Seek and you will find"
What is solr
●   Architecture (diagram)
    ●   Admin, HTTP request servlet, update servlet (XML update)
    ●   Different request handlers
    ●   Solr core: schema, config, caching, concurrency
    ●   Lucene underneath
    ●   Replication
What is solr
●   Unstructured rows
●   Denormalization of data
●   Dynamic fields
●   Schema → Tokenizer, Filters, etc.
●   Tons of XML
What is solr
●   Indexing: strings → tokenizer → token filter → index
●   Query: query → tokenizer → filter → index → results
What is solr
●   Get Requests
hl.fragsize=0
&spellcheck=true
&spellcheck.extendedResults=true
&qf=everything_phonetic_wa^1+display_name_phonetic_wa^2+comment_en_wa^4+review_en_wa^8+everything_en_wa^16+everything_wa^32+display_name_en_wa^64+display_name_wa^128
&spellcheck.collate=true
&wt=ruby
&hl=true
&rows=100
&fl=pk_i,score
&start=0
&q=chipotle+bbq
&spellcheck.dictionary=spell_en
&bf=linear(en_rating_points_i,100,0)
&spellcheck.count=1
&qt=dismax
&fq=closed_b:false+AND+domain_id_s:uki*+AND+(type_s:Place)
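A minimal sketch of how the GET request above could be assembled, using only Python's standard library. The parameter names and values are taken from the slide; the function names `build_dismax_params` and `build_query_string` and the `domain_prefix` argument are illustrative assumptions and not part of acts_as_solr or of the code behind the talk.

```python
from urllib.parse import urlencode

# Boosted query fields as shown on the slide: field name -> boost factor.
QUERY_FIELDS = {
    "everything_phonetic_wa": 1,
    "display_name_phonetic_wa": 2,
    "comment_en_wa": 4,
    "review_en_wa": 8,
    "everything_en_wa": 16,
    "everything_wa": 32,
    "display_name_en_wa": 64,
    "display_name_wa": 128,
}


def build_dismax_params(query, domain_prefix, rows=100, start=0):
    """Collect the request parameters from the slide into a dict (hypothetical helper)."""
    qf = " ".join(f"{field}^{boost}" for field, boost in QUERY_FIELDS.items())
    return {
        "qt": "dismax",
        "q": query,
        "qf": qf,
        "bf": "linear(en_rating_points_i,100,0)",
        "fq": f"closed_b:false AND domain_id_s:{domain_prefix}* AND (type_s:Place)",
        "fl": "pk_i,score",
        "rows": rows,
        "start": start,
        "wt": "ruby",
        "hl": "true",
        "hl.fragsize": 0,
        "spellcheck": "true",
        "spellcheck.extendedResults": "true",
        "spellcheck.collate": "true",
        "spellcheck.dictionary": "spell_en",
        "spellcheck.count": 1,
    }


def build_query_string(params):
    """URL-encode the parameters; spaces become '+', as in the slide."""
    return urlencode(params)


query_string = build_query_string(build_dismax_params("chipotle bbq", "uki"))
```

Unlike the slide, `urlencode` also percent-encodes characters such as `^`, `:` and `*`; the decoded parameter values are the same.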
What is solr
●   Response type
    ●   XML
    ●   Ruby
    ●   JSON
    ●   XML + XSLT
    ●   etc.
Solr integration into Rails
●   Sunspot
●   acts_as_solr
●   Qype → acts_as_solr
●   Optimized Queries for solr
    ●   Monkey patching
    ●   Defined queries without dynamic fields
    ●   Names of search fields differ from ActiveRecord (AR) attribute names
Solr integration into Rails
●   Data consistency
    ●   Synchronous
        –   AR stores in MySQL and Solr
        –   Longer response times
        –   Not truly synchronous when replication is involved
    ●   Asynchronous
        –   AR stores in MySQL
        –   Solr master imports the data via MySQL queries
        –   Out of sync for some minutes
        –   Deletion by flag first, physical deletion later
        –   JavaScript preprocessing of data possible
Challenges - Spellchecking
●   Pool of words for spellchecking: words from real data
●   Beeeeeeer
●   9 languages
●   New → spellchecker for different kinds of data
●   Suggestion → Locator → Facet → best match?
●   Similar word → fuzzy search vs. spellchecking
(Photo: CC BY-ND 2.0 - JM3)
Challenges - Spellchecking
●   Chipotle BBQ
●   Chinese Baby
●   shingles
(Photos: CC BY-ND 2.0 - raybdbomb; Meindert Arnold Jacob; joshDubya; michael clarke stuff)
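The slide names shingles without further detail. Assuming it refers to word-level shingles (runs of adjacent words treated as one token), a spellcheck dictionary built from them contains an entry such as "chipotle bbq" as a whole, so a multi-word suggestion can be checked against phrases that occur in real data. The sketch below is illustrative only; `word_shingles` is a hypothetical helper, and the word "sauce" in the usage note is made up for the example.

```python
def word_shingles(text, size=2):
    """Return all runs of `size` adjacent words, lowercased."""
    words = text.lower().split()
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]
```

For example, `word_shingles("Chipotle BBQ sauce")` yields the two entries `"chipotle bbq"` and `"bbq sauce"`.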
Challenges – Stemming
●   Stemming vs. Lemmatizing
●   9 Languages
●   Hafen – Hafer (Harbor – Oat)
●   Performance
●   Stemming → Solr SnowballPorterFilterFactory
●   Polish → Lemmatizing → OpenOffice
Challenges – Synonyms
●   9 Languages
●   OpenOffice rules!
●   Not all languages available → Dutch (NL) is missing
Challenges – NGrams
●   Huge index
●   Tee matches Steeb
●   EdgeNGrams
●   Bar → Sofabar → Barmbek
    ●   The unmatched remainder should itself be a word → performance cost
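The difference between plain n-grams and edge n-grams can be sketched as follows. This is a toy illustration of the idea, not Solr's filter implementation; the function names and the gram sizes (2 to 5) are assumptions chosen for the example.

```python
def ngrams(word, min_n, max_n):
    """All substrings of length min_n..max_n, as a plain n-gram filter would index."""
    word = word.lower()
    return {
        word[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(word) - n + 1)
    }


def edge_ngrams(word, min_n, max_n):
    """Only the prefixes of length min_n..max_n, as an edge n-gram filter would index."""
    word = word.lower()
    return {word[:n] for n in range(min_n, min(max_n, len(word)) + 1)}


# Plain n-grams: "tee" is found inside "Steeb", and "bar" inside "Sofabar".
tee_hits_steeb = "tee" in ngrams("Steeb", 2, 5)
# Edge n-grams: "bar" matches "Barmbek" (prefix) but no longer "Sofabar".
bar_hits_barmbek = "bar" in edge_ngrams("Barmbek", 2, 5)
bar_hits_sofabar = "bar" in edge_ngrams("Sofabar", 2, 5)
```

Plain n-grams index every substring, which explains both the index size and matches in the middle of a word; edge n-grams only index prefixes.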
Challenges – Phrases
●   Boost matching of phrases → whole entry
    ●   'Europa Passage'
●   Boost matching of phrases → left-anchored
    ●   'Galeria Kaufhof in Hamburg'
    ●   'Boutique in Galeria Kaufhof'
    ●   JavaScript preprocessing
●   Boost matching of a phrase anywhere in the entry
●   How to handle matches of only some words of a given phrase?
Challenges – Whitespace in index
●   Index: 'Ping Pong'
●   Search word: 'Pingpong'
●   JavaScript preprocessing
(Photos: CC BY-ND 2.0 - zimpenfish; CC BY-ND 2.0 - Ewan-M)
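One way to read the preprocessing idea: at import time, store an extra variant of each name with the whitespace removed, so that 'Pingpong' finds 'Ping Pong'. The slides say the preprocessing was done in JavaScript during the import; the Python below only illustrates the idea, and `index_variants` and `matches` are hypothetical helpers.

```python
def index_variants(name):
    """Return the original name plus a variant with all whitespace removed."""
    collapsed = "".join(name.split())
    variants = [name]
    if collapsed != name:
        variants.append(collapsed)
    return variants


def matches(search_word, indexed_name):
    """Case-insensitive exact match against any stored variant."""
    wanted = search_word.lower()
    return any(wanted == variant.lower() for variant in index_variants(indexed_name))
```

With this, `matches("Pingpong", "Ping Pong")` is true, while the original spelling still matches as before.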
Experiences – server setup
●   Live
    ●   Load balancer distributes Solr queries to 3 slaves
    ●   Slaves replicate from the master
    ●   Master imports from a DB slave
●   Staging
    ●   Slave, master, DB slave
●   Dev
    ●   iMac running Solr & MySQL
Experiences – size of indices
●   Staging System → Sunday evening
●   Places in simple format: 712 MB
●   Previews in simple format: 5.519 GB
●   Places, previews, comments (extended format): 3.5 GB
●   Big spellchecker: 16 GB
●   New combined index: 15 GB
    ●   Index: 14 GB
    ●   Spellchecker: 1 GB
Experiences – server setup
●   Live Servers
●   2 x 8 cores, 2 x 16 cores
●   32 GB RAM
●   Max. CPU usage: up to 500%
●   Solr loves RAM → 32 GB filled with cache
Experiences – Solr loves RAM
●   Dev → 1 Gig
●   Staging → 4.5 Gig (no load)
●   Import → 11 Gig and more
●   Production → 14 Gig
Experiences – Solr loves RAM (prod. slave)
(Graph not preserved)
Experiences – accesses
●   More than ~60 requests per second is not recommended
●   Up to 40 requests per second is OK
Experiences – CPU load
●   Last import → up to 250%
●   Production (slave): (graph not preserved)
Experiences – Response times
●   Spellchecking 'pizzt', big index (staging):
    ●   1502 / 48 / 47 / 48 / 31 ms
●   Spellchecking 'pizzt', small index (staging):
    ●   603 / 12 / 8 / 9 / 9 ms
Experiences – Response times
●   Facet for spellchecking:
●   facet=true&facet.mincount=0&facet.limit=1&wt=ruby&rows=0&fl=pk_i,score&
    facet.query=comment_de_wa:"pizza"+OR+review_de_wa:"pizza"+OR+everything_de_wa:"pizza"+OR+everything_wa:"pizza"+OR+display_name_de_wa:"pizza"+OR+display_name_wa:"pizza"+OR+display_name_ngram:"pizza"&
    facet.query=comment_de_wa:"pizze"+OR+review_de_wa:"pizze"+OR+everything_de_wa:"pizze"+OR+everything_wa:"pizze"+OR+display_name_de_wa:"pizze"+OR+display_name_wa:"pizze"+OR+display_name_ngram:"pizze"&
    facet.query=comment_de_wa:"pizz"+OR+review_de_wa:"pizz"+OR+everything_de_wa:"pizz"+OR+everything_wa:"pizz"+OR+display_name_de_wa:"pizz"+OR+display_name_wa:"pizz"+OR+display_name_ngram:"pizz"&
    facet.query=comment_de_wa:"pizzi"+OR+review_de_wa:"pizzi"+OR+everything_de_wa:"pizzi"+OR+everything_wa:"pizzi"+OR+display_name_de_wa:"pizzi"+OR+display_name_wa:"pizzi"+OR+display_name_ngram:"pizzi"&
    facet.query=comment_de_wa:"pizzs"+OR+review_de_wa:"pizzs"+OR+everything_de_wa:"pizzs"+OR+everything_wa:"pizzs"+OR+display_name_de_wa:"pizzs"+OR+display_name_wa:"pizzs"+OR+display_name_ngram:"pizzs"&
    facet.query=comment_de_wa:"pizzo"+OR+review_de_wa:"pizzo"+OR+everything_de_wa:"pizzo"+OR+everything_wa:"pizzo"+OR+display_name_de_wa:"pizzo"+OR+display_name_wa:"pizzo"+OR+display_name_ngram:"pizzo"&
    facet.query=comment_de_wa:"pizzy"+OR+review_de_wa:"pizzy"+OR+everything_de_wa:"pizzy"+OR+everything_wa:"pizzy"+OR+display_name_de_wa:"pizzy"+OR+display_name_wa:"pizzy"+OR+display_name_ngram:"pizzy"&
    facet.query=comment_de_wa:"pizzn"+OR+review_de_wa:"pizzn"+OR+everything_de_wa:"pizzn"+OR+everything_wa:"pizzn"+OR+display_name_de_wa:"pizzn"+OR+display_name_wa:"pizzn"+OR+display_name_ngram:"pizzn"&
    facet.query=comment_de_wa:"pezzt"+OR+review_de_wa:"pezzt"+OR+everything_de_wa:"pezzt"+OR+everything_wa:"pezzt"+OR+display_name_de_wa:"pezzt"+OR+display_name_wa:"pezzt"+OR+display_name_ngram:"pezzt"&
    facet.query=comment_de_wa:"pizzä"+OR+review_de_wa:"pizzä"+OR+everything_de_wa:"pizzä"+OR+everything_wa:"pizzä"+OR+display_name_de_wa:"pizzä"+OR+display_name_wa:"pizzä"+OR+display_name_ngram:"pizzä"&
    q=*:*&qt=standard&fq=closed_b:false+AND+domain_id_s:de600-hamburg*+AND+(type_s:Place)
●   10 facets: 231 / 5 / 4 / 22 / 3 (→ XML) ms
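The facet request above repeats one pattern per spelling candidate: the candidate is searched as a quoted term in each of seven fields, joined with OR. A hedged sketch of generating such a request, with field names and fixed parameters taken from the slide; `facet_query`, `build_facet_params`, `pick_best` and the `domain_prefix` argument are hypothetical names, and choosing the candidate with the highest facet count is an assumption about what "best match" means here.

```python
from urllib.parse import urlencode

# Fields searched for every spelling candidate, as on the slide.
FIELDS = [
    "comment_de_wa",
    "review_de_wa",
    "everything_de_wa",
    "everything_wa",
    "display_name_de_wa",
    "display_name_wa",
    "display_name_ngram",
]


def facet_query(candidate):
    """One facet.query value: the candidate as a quoted term in every field."""
    return " OR ".join(f'{field}:"{candidate}"' for field in FIELDS)


def build_facet_params(candidates, domain_prefix):
    """Parameters as a list of pairs, because facet.query is repeated."""
    params = [
        ("facet", "true"),
        ("facet.mincount", "0"),
        ("facet.limit", "1"),
        ("wt", "ruby"),
        ("rows", "0"),
        ("fl", "pk_i,score"),
    ]
    params += [("facet.query", facet_query(c)) for c in candidates]
    params += [
        ("q", "*:*"),
        ("qt", "standard"),
        ("fq", f"closed_b:false AND domain_id_s:{domain_prefix}* AND (type_s:Place)"),
    ]
    return params


def pick_best(counts):
    """Assumed selection rule: the candidate with the most matching documents."""
    return max(counts, key=counts.get)


query_string = urlencode(build_facet_params(["pizza", "pizze", "pizz"], "de600-hamburg"))
```

Each `facet.query` returns a document count for its candidate within the filtered set (`fq`), so a single request can rank all suggestions for one location.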
Experiences – Response times

●   Warming up → Staging vs. Production
●   Staging: slow
●   Production: fast
Experiences – Response times

●   Staging / index schema on prod
●   Standard query 'pizza': 106 / 0 / 0 (9122)
●   Fuzzy (pizza~0.3): 4440 / 663 / 0 (40149)
●   Fuzzy (pizza~0.5): 822 / 0 / 0 (12129)
●   Fuzzy (pizza~0.8): 34 / 1 / 0 (9122)
●   Wildcard (rest*): 39 / 0 / 0 (41031)
Experiences - Monitoring
●   Munin
●   New Relic
