Introductions…!
  Who the hell am I?
    Jay Hill, Lucid Imagination
    7 years Lucene experience
    4 years Solr experience
    Author of Lucid Training
    SME for Lucid Certification
  Who the hell are you?
    New to search?
    New to Lucene/Solr?
    Battle-tested veterans?

© Lucid Imagination, Inc.
  
We'll Leave Time For Q&A!
  Who's doing what?
    Solr 3.1?
    Solr 1.4.1?
    Nightly build?
    Solr 1.3 or older?
  Are there any specific problems you're having?
  Meanwhile, interrupt, ask questions as we go, etc.
  
A Brief Word About Lucid Imagination!
  Lucid Imagination:
    The commercial company supporting Lucene/Solr open source search.
    Founded by
      Yonik Seeley – Creator of Solr
      Erik Hatcher – Co-author, Lucene In Action
      Grant Ingersoll – Apache PMC Chair
      Marc Krellenstein – Lucid CTO
    Staff includes 9 Lucene/Solr committers
    Training, certification, support, LucidWorks Enterprise
  
Lucid Customers (That I've Worked With)!




  
…On To The Sinning!!




  
Sins As Anti-Patterns?!
  "Sorta kinda"
    Specify Nothing (Sloth)
    Creeping Featuritis (Greed)
    Blowhard Jamboree (Pride)
    Boat Anchor (Lust)
    Not Invented Here (Envy)
    Phatware (Gluttony)
    Emperor's New Clothes (Wrath)
  
Sins Can Contradict One Another!!
  You'll notice that many of the "sins" we see will be the exact opposite of others
  Just as some of us tend towards laziness, others towards excess
  Sometimes you -
    "Look before you leap."
  Other times,
    "He who hesitates is lost."
  In Solr (or any search app), one size never fits all
  
"I don't know and I don't care."
  
Sloth!
  "We aren't really into open source."
    Lack of commitment to Solr and/or the search application itself
  Not developing in-house Solr expertise
  Not paying enough attention to JVM settings, garbage collection, and RAM allocation.
  
Sloth!
  Neglecting to get familiar with the source code
    It is open source after all!
  Not taking the time to understand the main parts of Solr:
    Request Handlers
    Search components
    Query parsers
      Extend QParserPlugin class
    ValueSource & ValueSourceParser – custom functions
      New pseudo-fields in 4.x
    Response writers
  
Sloth!
  Not keeping up with new features and developments in Lucene and Solr

    CHANGES.txt – use "diff" to keep up on changes
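The "diff" tip can also be scripted. What follows is only a minimal sketch, assuming you keep the CHANGES.txt of the release you run next to the CHANGES.txt of the release you are considering. The function name `new_change_lines` and the sample text are made up for illustration.

```python
import difflib


def new_change_lines(old_text, new_text):
    """Return the lines that appear in the newer CHANGES.txt but not in the older one."""
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="CHANGES.txt (old)",
        tofile="CHANGES.txt (new)",
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    )
    # Keep added lines; skip the "+++" file header that unified_diff emits.
    return [
        line[1:]
        for line in diff
        if line.startswith("+") and not line.startswith("+++")
    ]


if __name__ == "__main__":
    # Hypothetical excerpts standing in for two real CHANGES.txt files.
    old = "New Features\n- faceting\n"
    new = "New Features\n- faceting\n- spatial search\n"
    for line in new_change_lines(old, new):
        print(line)
```

In practice you would read both files from disk and skim the added lines for features that could replace custom code.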
  
Sloth!
  New features in Solr 3.1:
    Solr spatial
    Edismax query parser
      NOT experimental!
    Dynamic metadata extraction via UIMA
    Numeric range faceting (like date faceting)
    Lucene RAMDirectoryFactory available
    Faceting performance improvements
    Spellcheck and Terms components now work for distributed search
    Suggester component – better autosuggest!
      Can add custom dict., phrases, etc.
  
Sloth!
  New features coming in Solr 4.x:
    Lucene DocumentsWriterPerThread (DWPT)
      Moving towards "real time"
    UpdateHandler upgrade to work with real-time
    Field collapsing/grouping
    Pivot facets
    SolrCloud (Zookeeper)
    Fuzzy queries 100 times faster
    Pseudo fields via functions
    Relevancy function queries: tf, idf, docFreq, norm, …
  
Sloth: The Path To Salvation!
  Commit to the project and to learning Solr
  Stay up to date on Solr changes
  Stay current with ongoing releases
  Get familiar with the source code
  Spend some time to understand the main configuration files:
    solrconfig.xml
    schema.xml
  Read through the entire Solr Wiki once every so often
  Develop in-house Solr expertise
  
Save a penny, lose a customer.
  
Greed!
  Skimping on resources such as:
    RAM
      "Here's a quarter buddy, go buy some RAM!"
    Storage space
  You will get what you pay for!
    …on the other hand, not every company has "deep pockets"
  
Greed!
  Trying to "squeeze by", indexing to, and searching on, the same server

  [Diagram labels: Indexing; Shards (Indexers); Slave/Searchers; Load Balancer; Searches]
  
Greed!
  Not making the effort to find the right balance between precision and recall

    Recall: What fraction of the relevant documents in the collection were returned by the system?
    Precision: What fraction of the returned results are relevant to the information need?
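The two definitions above reduce to a few lines of arithmetic. This is a small sketch, assuming you have a hand-judged set of relevant document ids for a test query. The function name and the ids are illustrative only.

```python
def precision_recall(returned_ids, relevant_ids):
    """Compute (precision, recall) for one query from two collections of document ids."""
    returned = set(returned_ids)
    relevant = set(relevant_ids)
    hits = returned & relevant  # relevant documents that were actually returned

    # Precision: fraction of the returned results that are relevant.
    precision = len(hits) / len(returned) if returned else 0.0
    # Recall: fraction of the relevant documents that were returned.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical judgments: 4 documents returned, 5 documents known to be relevant.
    print(precision_recall([1, 2, 3, 4], [2, 4, 6, 8, 10]))
```

Tracking both numbers for a fixed set of test queries makes it easier to see whether a relevance change traded one for the other.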
  
Greed!
  A few thoughts about relevance:
    Get feedback from domain experts
    Is it better to have lots of results with less precision, or fewer, more targeted results?
    Different sites will have very different requirements
  
Greed: The Path To Salvation!
  Pry open your wallet – don't be cheap
  You don't have to push the envelope
  Find the right balance between recall and precision
  Don't push for more results over precision – unless that is a clear requirement (sometimes it is)
  
"What could possibly go wrong?"
  
Pride!
  Reinventing the wheel
    "Why don't we just write our own search libraries?"
    Nobody has a use case like us – right?
    "We need to change the scoring algorithms."
  
Pride!
  Thinking you can "do it all" in Solr
    Solr is rarely a good choice as a SOR (system of record)
  Consider other tools to work with Solr:
    Nutch
    Mahout
    OpenNLP
    Google Connector Framework
    Your own code
  
Pride!
  Stubbornly refusing to use resources such as the mailing lists:
    Solr user list:
      solr-user@lucene.apache.org
    Solr developer list:
      dev@lucene.apache.org
    Lucene user list:
      java-user@lucene.apache.org

  LucidFind: http://www.lucidimagination.com/search/
  
Pride!
  "I will not yield!"
    Trying to "win battles" on the mailing lists
    Good Karma – be a good citizen in the community
  
Pride: The Path To Salvation!
  Ask for help when needed
  Let the business needs define the project – don't let the tail wag the dog
  Get a feel for the Solr community and respect the experience of others
  Your situation, while possibly unique, is probably not completely dissimilar to others. Learn from the pioneers and Solr veterans
  
"Someone stop me!"
  
Lust!
  Obsessing over unimportant details too early in the project
    Agile approach is well suited to Solr development – iterate!
  Trying to "push the envelope"
    Necessary sometimes, but it's not called the "bleeding edge" without reason
    "Ease in" to major changes
  Too much attention to JVM settings
    Solr experts are not usually JVM/GC experts
  
Lust!
  "Anti-greed" – Committing too many resources to Solr
    Make sure the OS has plenty of RAM to cache files, etc.
  "If one is good, a dozen must be better!"
    As much as possible, try to get a sense of what your query volume will be, and don't just throw money at building a monstrous farm of searchers
    Solr has proven to be much more efficient than some large, commercial search solutions
  
Lust!
  Blood from a turnip:
    Trying some absurd new technique, "just because"

  RAMDirectoryFactory – not a secret way to faster indexing/searching
    No disk-backed persistence
    Usually not worth it
    …but you never know…

  Research first before going "extreme"
  
Lust!
  No need to index millions of docs for development
  Better to work with small sets of data while getting started.
  Don't worry too much about field types as you get started. Get data in the index, then analyze and refine.
  
Lust: The Path To Salvation!
  Use an agile approach – start simply, build your application slowly, iterate
  Deal with the low-hanging fruit first
  Measure twice, cut once
  Don't miss the forest for the trees – no need to obsess over details in the early stages
  Do some due diligence before trying unorthodox approaches
  Get a small sample of data indexed w/o worrying about type, then iterations of refinement
  
"If we had some bacon we could have some bacon and eggs – if we had some eggs."
  
Envy!
  Adding "cool" features you see on other sites, but don't really need
    Keep it "lean and mean", especially to start
    Resist the urge to include the "kitchen sink"
  
Envy!
  You too can master dismax!
    Don't be afraid of dismax/edismax
    Lots of controls to learn, but also lots of power
    Flexibility to search multiple fields
    Boost different fields
    Boost phrase fields (pf) higher than query fields (qf)
    Use boost queries (bq) and function queries (bf)
    Most intimidating params:
      tie
      mm
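To make the parameters on this slide concrete, here is a rough sketch of an edismax request built as a query string. The parameter names (defType, qf, pf, mm, tie, bq, bf) are Solr's own. The field names (title, body, in_stock, published), the boost values, and the helper names are assumptions made up for illustration, not recommendations.

```python
from urllib.parse import urlencode


def edismax_params(user_query):
    """Build request parameters for an edismax query over hypothetical fields."""
    return {
        "defType": "edismax",
        "q": user_query,
        # qf: the fields to search, with per-field boosts.
        "qf": "title^2 body",
        # pf: phrase fields, boosted higher than qf so whole-phrase matches rank first.
        "pf": "title^10 body^5",
        # mm: minimum-should-match; here, queries with more than 2 clauses need 75% to match.
        "mm": "2<75%",
        # tie: how much the lower-scoring fields contribute next to the best-matching field.
        "tie": "0.1",
        # bq: a boost query that favours documents matching an extra clause.
        "bq": "in_stock:true^2",
        # bf: a boost function; this one favours recently published documents.
        "bf": "recip(ms(NOW,published),3.16e-11,1,1)",
    }


def build_query_string(user_query):
    """URL-encode the parameters so they can be appended to a Solr select URL."""
    return urlencode(edismax_params(user_query))


if __name__ == "__main__":
    print(build_query_string("red shoes"))
```

Starting with only qf and pf, then adding mm and tie once the basics behave, keeps the tuning manageable.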
  
Envy!
  Spatial search – seems complicated, but major sites make it look easy
  Now, in Solr 3.1 – it is easy!
  You can:
    Store spatial data in your index
    Filter by distance
    Sort by distance
    Boost/bias by distance
    Facet by distance
  Also consider: Search-based navigation such as "Show me in-stock items only"
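Filtering and sorting by distance can be sketched with the Solr 3.1 spatial parameters (sfield, pt, d, the geofilt filter, and the geodist() function). This is only an illustration: the field name "store" is hypothetical and is assumed to be a lat/lon point field in your schema, and the helper name is made up.

```python
from urllib.parse import urlencode


def spatial_params(lat, lon, radius_km, field="store"):
    """Build parameters that filter to a radius around a point and sort nearest first."""
    return {
        "q": "*:*",
        "sfield": field,            # the spatial field to measure against (assumed name)
        "pt": f"{lat},{lon}",       # the centre point as "lat,lon"
        "d": str(radius_km),        # the radius in kilometres
        "fq": "{!geofilt}",         # filter by distance using sfield, pt and d
        "sort": "geodist() asc",    # sort by distance, closest first
    }


if __name__ == "__main__":
    print(urlencode(spatial_params(45.15, -93.85, 5)))
```

Because sfield, pt, and d are passed as top-level parameters, the same values serve both the filter and the sort.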
  
Envy: The Path To Salvation!
  Focus on your requirements, don't try to add "bells and whistles" you don't need
  Don't be hesitant to dive into the power of dismax/edismax
  Take advantage of new features such as Solr spatial, if those features will add value to the end user experience
  
"A fat stomach never breeds fine thoughts."
  
Gluttony!
  "Staying fit and trim" is usually good practice when designing and running Solr applications
    Once again – keep it "lean and mean"
  A lot of these issues cross over into the "Sloth" category
    The effort needed to keep your configuration and data efficiently managed is not considered important
  Don't lose control of your configuration files
    Remove unnecessary elements
    Version control all configuration files
  
Gluttony!
  Slim down those "bloated" queries:

    q="red shoes"&accountId=(12343 OR 338899
    OR 554443 OR 243445 OR 55442 OR 3330899
    OR 59927 OR 3888999 OR 549 OR 440293579
    34201 OR 339917 OR 300191 OR 339338 OR
    109823 OR 679176 OR 31407815 OR 3001756
    OR 134322 OR 311123 OR 987888 OR 997181 OR 771819 OR
    100292 OR 3389474 OR 5505759 OR 2459577 OR 4499957 OR
    1996571 OR 559590 OR 220299 OR 4404872 OR 151510 OR
    66017 OR 666 OR 113459 OR 890575 OR 505725 OR 330393 OR
    349940 OR 4094994 OR 1245995 OR 2459959 OR 4255909 OR
    899955 OR 7878899 OR 100999 … ∞ )
  
Gluttony!
  Stay in shape – Flex Your Solr Muscles!
    Keep up on new features
    Training, when appropriate
    Certification
    Contribute!
    Follow the user lists
    Refactor when new features can help
    Keep up to date on new releases
  
Gluttony: The Path To Salvation!
  Keep configuration files clean and trim. Remove unused elements
  Periodically review queries to make sure they are efficient
  Refactor when necessary – keep your application fit and trim
  
"Hope is the denial of reality."
  
Wrath!
  Wrath - usually synonymous with anger, but…
  Let's use an older definition here:
    "A vehement denial of the truth, both to others and in the form of self-denial and impatience."
  Step back every now and then and look objectively at your application
  
Wrath!
  Resist the push to rush to production…
  
Wrath!
  Ignoring new Solr releases
    OK to wait until a release is proven
    But getting too far behind makes upgrading more painful with each release

  We don't have time to do it right, but we always have time to fix it
  
Wrath!
  Ignoring complaints about results relevance
  Disregarding feedback from stakeholders
  Remember – the point of your search application is to support the business, not to "build cool stuff"
  Not taking advantage of log files
    Consider mining log files, storing data in relational DB for generating reports
    Capturing user queries and query counts can be extremely useful
      Can also be used for query-based autosuggest (not just indexed terms).
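Mining the logs for query counts might look something like the sketch below. It assumes request log lines that contain a `params={...}` section with a `q=` parameter. Check the pattern against your own logs, since the exact format depends on your Solr version and logging setup. The table name, function names, and sample lines are hypothetical, and the simple pattern will stop early if a parameter value itself contains a closing brace.

```python
import re
import sqlite3
from collections import Counter
from urllib.parse import parse_qs

# Captures the text between "params={" and the next "}" (assumed log shape).
PARAMS_RE = re.compile(r"params=\{(.*?)\}")


def count_queries(log_lines):
    """Count how often each user query (the q parameter) appears in the log lines."""
    counts = Counter()
    for line in log_lines:
        match = PARAMS_RE.search(line)
        if not match:
            continue
        for query in parse_qs(match.group(1)).get("q", []):
            counts[query.strip().lower()] += 1  # normalise case and whitespace
    return counts


def store_counts(counts, db_path=":memory:"):
    """Store the counts in a relational table so they can be reported on."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS query_counts (query TEXT PRIMARY KEY, hits INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO query_counts VALUES (?, ?)", counts.items()
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    sample = [
        "INFO: [] webapp=/solr path=/select params={q=red+shoes&rows=10} hits=12 status=0 QTime=3",
        "INFO: [] webapp=/solr path=/select params={q=Red+Shoes} hits=12 status=0 QTime=2",
    ]
    conn = store_counts(count_queries(sample))
    for row in conn.execute("SELECT query, hits FROM query_counts ORDER BY hits DESC"):
        print(row)
```

The same table of popular queries could later feed a query-based autosuggest list.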
  
Wrath: The Path To Salvation!
  Keep your version of Solr up to date
    OK to wait "awhile", but don't skip versions
  Seek and embrace feedback from business and domain experts
  Constantly gauge and improve relevance as an ongoing task
  Avoid the push to release too soon (as best you can)
  Take advantage of log files to understand what users are doing, and what is not working well
  
Search, and you will find!
  

Contenu connexe

En vedette

Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Lucidworks (Archived)
 
C:\Fakepath\6620millardmodule3b
C:\Fakepath\6620millardmodule3bC:\Fakepath\6620millardmodule3b
C:\Fakepath\6620millardmodule3bDonna Millard
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and SolrLucidworks (Archived)
 
Practical Search with Solr: Beyond just Looking it Up
Practical Search with Solr: Beyond just Looking it UpPractical Search with Solr: Beyond just Looking it Up
Practical Search with Solr: Beyond just Looking it UpLucidworks (Archived)
 
Network Forensics Puzzle Contest に挑戦 #1
Network Forensics Puzzle Contest に挑戦 #1Network Forensics Puzzle Contest に挑戦 #1
Network Forensics Puzzle Contest に挑戦 #1彰 村地
 
Shining new light on lucene solr performance and monitoring
Shining new light on lucene solr performance and monitoringShining new light on lucene solr performance and monitoring
Shining new light on lucene solr performance and monitoringLucidworks (Archived)
 
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Lucidworks (Archived)
 
Technology opportunities in hampton roads (kaszubowski ), nasa technology day...
Technology opportunities in hampton roads (kaszubowski ), nasa technology day...Technology opportunities in hampton roads (kaszubowski ), nasa technology day...
Technology opportunities in hampton roads (kaszubowski ), nasa technology day...Marty Kaszubowski
 
Creating Custom Finishes
Creating Custom FinishesCreating Custom Finishes
Creating Custom Finishesguest0a3c64a
 
Building SaaS Solutions for Online Media Using Apache Solr
Building SaaS Solutions for Online Media Using Apache SolrBuilding SaaS Solutions for Online Media Using Apache Solr
Building SaaS Solutions for Online Media Using Apache SolrLucidworks (Archived)
 
Updated: Marketing your Technology
Updated: Marketing your TechnologyUpdated: Marketing your Technology
Updated: Marketing your TechnologyMarty Kaszubowski
 
Descritores de linguagem
Descritores de linguagemDescritores de linguagem
Descritores de linguagemgindri
 
How The Guardian Embraced the Internet using Content, Search, and Open Source
How The Guardian Embraced the Internet using Content, Search, and Open SourceHow The Guardian Embraced the Internet using Content, Search, and Open Source
How The Guardian Embraced the Internet using Content, Search, and Open SourceLucidworks (Archived)
 
Jonh Lennon
Jonh LennonJonh Lennon
Jonh Lennontanica
 
HTML5 と次世代のネットワーク プロトコル
HTML5 と次世代のネットワーク プロトコルHTML5 と次世代のネットワーク プロトコル
HTML5 と次世代のネットワーク プロトコル彰 村地
 
Ecma 262 5th Edition を読む #5 第9条
Ecma 262 5th Edition を読む #5 第9条Ecma 262 5th Edition を読む #5 第9条
Ecma 262 5th Edition を読む #5 第9条彰 村地
 
Azure と世間様
Azure と世間様Azure と世間様
Azure と世間様彰 村地
 
Tv ролики
Tv роликиTv ролики
Tv роликиtarodnova
 
Using Solr to find the Right Person for the Right Job
Using Solr to find the Right Person for the Right JobUsing Solr to find the Right Person for the Right Job
Using Solr to find the Right Person for the Right JobLucidworks (Archived)
 
Tennis
TennisTennis
Tennisaritz
 

En vedette (20)

Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
 
C:\Fakepath\6620millardmodule3b
C:\Fakepath\6620millardmodule3bC:\Fakepath\6620millardmodule3b
C:\Fakepath\6620millardmodule3b
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 
Practical Search with Solr: Beyond just Looking it Up
Practical Search with Solr: Beyond just Looking it UpPractical Search with Solr: Beyond just Looking it Up
Practical Search with Solr: Beyond just Looking it Up
 
Network Forensics Puzzle Contest に挑戦 #1
Network Forensics Puzzle Contest に挑戦 #1Network Forensics Puzzle Contest に挑戦 #1
Network Forensics Puzzle Contest に挑戦 #1
 
Shining new light on lucene solr performance and monitoring
Shining new light on lucene solr performance and monitoringShining new light on lucene solr performance and monitoring
Shining new light on lucene solr performance and monitoring
 
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
 
Technology opportunities in hampton roads (kaszubowski ), nasa technology day...
Technology opportunities in hampton roads (kaszubowski ), nasa technology day...Technology opportunities in hampton roads (kaszubowski ), nasa technology day...
Technology opportunities in hampton roads (kaszubowski ), nasa technology day...
 
Creating Custom Finishes
Creating Custom FinishesCreating Custom Finishes
Creating Custom Finishes
 
Building SaaS Solutions for Online Media Using Apache Solr
Building SaaS Solutions for Online Media Using Apache SolrBuilding SaaS Solutions for Online Media Using Apache Solr
Building SaaS Solutions for Online Media Using Apache Solr
 
Updated: Marketing your Technology
Updated: Marketing your TechnologyUpdated: Marketing your Technology
Updated: Marketing your Technology
 
Descritores de linguagem
Descritores de linguagemDescritores de linguagem
Descritores de linguagem
 
How The Guardian Embraced the Internet using Content, Search, and Open Source
How The Guardian Embraced the Internet using Content, Search, and Open SourceHow The Guardian Embraced the Internet using Content, Search, and Open Source
How The Guardian Embraced the Internet using Content, Search, and Open Source
 
Jonh Lennon
Jonh LennonJonh Lennon
Jonh Lennon
 
HTML5 と次世代のネットワーク プロトコル
HTML5 と次世代のネットワーク プロトコルHTML5 と次世代のネットワーク プロトコル
HTML5 と次世代のネットワーク プロトコル
 
Ecma 262 5th Edition を読む #5 第9条
Ecma 262 5th Edition を読む #5 第9条Ecma 262 5th Edition を読む #5 第9条
Ecma 262 5th Edition を読む #5 第9条
 
Azure と世間様
Azure と世間様Azure と世間様
Azure と世間様
 
Tv ролики
Tv роликиTv ролики
Tv ролики
 
Using Solr to find the Right Person for the Right Job
Using Solr to find the Right Person for the Right JobUsing Solr to find the Right Person for the Right Job
Using Solr to find the Right Person for the Right Job
 
Tennis
TennisTennis
Tennis
 

Similaire à The Seven Deadly Sins of Solr

Would you buy an open source company?
Would you buy an open source company?Would you buy an open source company?
Would you buy an open source company?Bertrand Delacretaz
 
Solr: Search at the Speed of Light
Solr: Search at the Speed of LightSolr: Search at the Speed of Light
Solr: Search at the Speed of LightErik Hatcher
 
Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher lucenerevolution
 
Cognitum Ontorion: Knowledge Representation and Reasoning System
Cognitum Ontorion: Knowledge Representation and Reasoning SystemCognitum Ontorion: Knowledge Representation and Reasoning System
Cognitum Ontorion: Knowledge Representation and Reasoning SystemCognitum
 
Technologies for startup
Technologies for startupTechnologies for startup
Technologies for startupDzung Nguyen
 
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014
Boston Spark User Group - Spark's Role at MediaCrossing - July 15, 2014gmalouf678
 
7 New Tools Java Developers Should Know
7 New Tools Java Developers Should Know7 New Tools Java Developers Should Know
7 New Tools Java Developers Should KnowTakipi
 
[CB19] Spyware, Ransomware and Worms. How to prevent the next SAP tragedy by ...
[CB19] Spyware, Ransomware and Worms. How to prevent the next SAP tragedy by ...[CB19] Spyware, Ransomware and Worms. How to prevent the next SAP tragedy by ...
[CB19] Spyware, Ransomware and Worms. How to prevent the next SAP tragedy by ...CODE BLUE
 
Node.js Deeper Dive
Node.js Deeper DiveNode.js Deeper Dive
Node.js Deeper DiveJustin Reock
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionAnant Corporation
 
Moved to https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmx
Moved to https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmxMoved to https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmx
Moved to https://slidr.io/azzazzel/web-application-performance-tuning-beyond-xmxMilen Dyankov
 
Planning JavaScript and Ajax for larger teams
Planning JavaScript and Ajax for larger teamsPlanning JavaScript and Ajax for larger teams
Planning JavaScript and Ajax for larger teamsChristian Heilmann
 
Getting started faster with LucidWorks for Solr
Getting started faster with LucidWorks for SolrGetting started faster with LucidWorks for Solr
Getting started faster with LucidWorks for SolrLucidworks (Archived)
 
2019 StartIT - Boosting your performance with Blackfire
2019 StartIT - Boosting your performance with Blackfire2019 StartIT - Boosting your performance with Blackfire
2019 StartIT - Boosting your performance with BlackfireMarko Mitranić
 
API Description Languages
API Description LanguagesAPI Description Languages
API Description LanguagesAkana
 
API Description Languages
API Description LanguagesAPI Description Languages
API Description LanguagesAkana
 
Becoming an IBM Connections Developer
Becoming an IBM Connections DeveloperBecoming an IBM Connections Developer
Becoming an IBM Connections DeveloperRob Novak
 


The Seven Deadly Sins of Solr

  • 1.
  • 2. Introductions…! Who the hell am I? Jay Hill, Lucid Imagination; 7 years Lucene experience; 4 years Solr experience; author of Lucid Training; SME for Lucid Certification. Who the hell are you? New to search? New to Lucene/Solr? Battle-tested veterans? © Lucid Imagination, Inc.
  • 3. We'll Leave Time For Q&A! Who's doing what? Solr 3.1? Solr 1.4.1? Nightly build? Solr 1.3 or older? Are there any specific problems you're having? Meanwhile, interrupt, ask questions as we go, etc.
  • 4. A Brief Word About Lucid Imagination! Lucid Imagination: the commercial company supporting Lucene/Solr open source search. Founded by Yonik Seeley (creator of Solr), Erik Hatcher (co-author, Lucene in Action), Grant Ingersoll (Apache PMC Chair), and Marc Krellenstein (Lucid CTO). Staff includes 9 Lucene/Solr committers. Training, certification, support, LucidWorks Enterprise.
  • 5. Lucid Customers (That I've Worked With)!
  • 6. …On To The Sinning!!
  • 7. Sins As Anti-Patterns?! "Sorta kinda": Specify Nothing (Sloth); Creeping Featuritis (Greed); Blowhard Jamboree (Pride); Boat Anchor (Lust); Not Invented Here (Envy); Phatware (Gluttony); Emperor's New Clothes (Wrath).
  • 8. Sins Can Contradict One Another!! You'll notice that many of the "sins" we see will be the exact opposite of others, just as some of us tend towards laziness and others towards excess. Sometimes: "Look before you leap." Other times: "He who hesitates is lost." In Solr (or any search app), one size never fits all.
  • 9. "I don't know and I don't care."
  • 10. Sloth! "We aren't really into open source." Lack of commitment to Solr and/or the search application itself. Not developing in-house Solr expertise. Not paying enough attention to JVM settings, garbage collection, and RAM allocation.
  • 11. Sloth! Neglecting to get familiar with the source code. It is open source after all! Not taking the time to understand the main parts of Solr: request handlers; search components; query parsers (extend the QParserPlugin class); ValueSource and ValueSourceParser for custom functions (new pseudo-fields in 4.x); response writers.
  • 12. Sloth! Not keeping up with new features and developments in Lucene and Solr. CHANGES.txt: use "diff" to keep up on changes.
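The "diff CHANGES.txt" habit from slide 12 can be sketched in a few lines of Python using the standard `difflib` module. This is only an illustration: the helper name `added_changelog_lines` and the two short changelog snippets are made up for the example and are not taken from a real Solr release.

```python
import difflib


def added_changelog_lines(old_text, new_text):
    """Return the lines present in new_text but not in old_text.

    Hypothetical helper: a rough stand-in for running `diff` on two
    copies of CHANGES.txt and reading only the added lines.
    """
    diff = difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="CHANGES.txt (old)",
        tofile="CHANGES.txt (new)",
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    )
    # Keep added lines, skipping the "+++" file header.
    return [line[1:] for line in diff
            if line.startswith("+") and not line.startswith("+++")]


# Made-up sample text, standing in for two versions of CHANGES.txt.
old = "Solr 1.4.1\n* Bug fixes\n"
new = ("Solr 3.1\n* Spatial search\n* Edismax query parser\n"
       "Solr 1.4.1\n* Bug fixes\n")

added = added_changelog_lines(old, new)
for line in added:
    print(line)
```

In practice the two inputs would be the CHANGES.txt files from the release you run and the release you are considering.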
  • 13. Sloth! New features in Solr 3.1: Solr spatial; edismax query parser (NOT experimental!); dynamic metadata extraction via UIMA; numeric range faceting (like date faceting); Lucene RAMDirectoryFactory available; faceting performance improvements; Spellcheck and Terms components now work for distributed search; Suggester component for better autosuggest (can add custom dictionaries, phrases, etc.).
  • 14. Sloth! New features coming in Solr 4.x: Lucene DocumentsWriterPerThread (DWPT), moving towards "real time"; UpdateHandler upgrade to work with real-time; field collapsing/grouping; pivot facets; SolrCloud (ZooKeeper); fuzzy queries 100 times faster; pseudo-fields via functions; relevancy function queries: tf, idf, docFreq, norm, …
  • 15. Sloth: The Path To Salvation! Commit to the project and to learning Solr. Stay up to date on Solr changes. Stay current with ongoing releases. Get familiar with the source code. Spend some time to understand the main configuration files: solrconfig.xml and schema.xml. Read through the entire Solr Wiki once every so often. Develop in-house Solr expertise.
  • 16. Save a penny, lose a customer.
  • 17. Greed! Skimping on resources such as: RAM ("Here's a quarter buddy, go buy some RAM!"); storage space. You will get what you pay for! …On the other hand, not every company has "deep pockets".
  • 18. Greed! Trying to "squeeze by", indexing to, and searching on, the same server. [Diagram labels: Indexing; Shards (Indexers); Slave/Searchers; Load Balancer; Searches]
  • 19. Greed! Not making the effort to find the right balance between precision and recall. Recall: What fraction of the relevant documents in the collection were returned by the system? Precision: What fraction of the returned results are relevant to the information need?
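The two definitions on slide 19 translate directly into arithmetic. A minimal sketch, assuming you have a hand-judged list of relevant document IDs for a query; the function name and the document IDs are invented for the example:

```python
def precision_recall(returned_ids, relevant_ids):
    """Compute (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant documents
    """
    returned = set(returned_ids)
    relevant = set(relevant_ids)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgments: the engine returned four docs, two of which are
# among the three docs a domain expert marked as relevant.
precision, recall = precision_recall(
    ["d1", "d2", "d3", "d4"],
    ["d2", "d4", "d7"],
)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Returning more results tends to raise recall and lower precision, which is the trade-off the slide asks you to balance.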
  • 20. Greed! A few thoughts about relevance: Get feedback from domain experts. Is it better to have lots of results with less precision, or fewer, more targeted results? Different sites will have very different requirements.
  • 21. Greed: The Path To Salvation! Pry open your wallet; don't be cheap. You don't have to push the envelope. Find the right balance between recall and precision. Don't push for more results over precision unless that is a clear requirement (sometimes it is).
  • 22. "What could possibly go wrong?"
  • 23. Pride! Reinventing the wheel: "Why don't we just write our own search libraries?" Nobody has a use case like us, right? "We need to change the scoring algorithms."
  • 24. Pride! Thinking you can "do it all" in Solr. Solr is rarely a good choice as a system of record (SOR). Consider other tools to work with Solr: Nutch; Mahout; OpenNLP; Google Connector Framework; your own code.
  • 25. Pride! Stubbornly refusing to use resources such as the mailing lists. Solr user list: solr-user@lucene.apache.org. Solr developer list: dev@lucene.apache.org. Lucene user list: java-user@lucene.apache.org. LucidFind: http://www.lucidimagination.com/search/
  • 26. Pride! "I will not yield!" Trying to "win battles" on the mailing lists. Good karma: be a good citizen in the community.
  • 27. Pride: The Path To Salvation! Ask for help when needed. Let the business needs define the project; don't let the tail wag the dog. Get a feel for the Solr community and respect the experience of others. Your situation, while possibly unique, is probably not completely dissimilar to others. Learn from the pioneers and Solr veterans.
  • 28. "Someone  stop  me!"   ©  Lucid  Imagina-on,  Inc.  
  • 29. Lust!   Obsessing  over  unimportant  details  too  early   in  the  project    Agile  approach  is  well  suited  to  Solr   development  –  iterate!     Trying  to  "push  the  envelope"    Necessary  some-mes,  but  it's  not  called   the  "bleeding  edge"  without  reason    "Ease  in"  to  major  changes     Too  much  aKen-on  to  JVM  sebngs    Solr  experts  are  not  usually  JVM/GC  experts   ©  Lucid  Imagina-on,  Inc.  
  • 30. Lust!   "An--­‐greed"  –  CommiEng  too  many  resources     to  Solr    Make  sure  the  OS  has  plenty  of  RAM   to  cache  files,  etc     "If  one  is  good,  a  dozen  must  be  beKer!"    As  much  as  possible,  try  to  get  a  sense  of  what   your  query  volume  will  be,  and  don't  just  throw   money  at  building  a  monstrous  farm  of  searchers    Solr  has  proven  to  be  much  more  efficient  than  some     large,  commercial  search  solu-ons   ©  Lucid  Imagina-on,  Inc.  
  • 31. Lust!   Blood  from  a  turnip:    Trying  some  absurd  new  technique,     "just  because"     RAMDirectoryFactory  –  not  a  secret  way  to  faster   indexing/searching    No  disk-­‐backed  persistence    Usually  not  worth  it    …but  you  never  know…     Research  first  before  going  "extreme"   ©  Lucid  Imagina-on,  Inc.  
  • 32. Lust!   No  need  to  index  millions  of  docs  for  development     BeKer  to  work  with  small  sets  of  data  while   gebng  started.     Don't  worry  too  much  about  field  types  as  you  get   started.  Get  data  in  the  index,  then  analyze  and   refine.   ©  Lucid  Imagina-on,  Inc.  
  • 33. Lust: The Path To Salvation!   Use  an  agile  approach  –  start  simply,  build  your   applica-on  slowly,  iterate     Deal  with  the  low-­‐hanging  fruit  first     Measure  twice,  cut  once     Don't  miss  the  forest  for  the  trees  –  no  need  to   obsess  over  details  in  the  early  stages     Do  some  due  diligence  before  trying  unorthodox   approaches     Get  a  small  sample  of  data  indexed  w/o  worrying  about  type,   then  itera-ons  of  refinement   ©  Lucid  Imagina-on,  Inc.  
  • 34. "If we had some bacon we could have some bacon and eggs – if we had some eggs."
  • 35. Envy!   Adding "cool" features you see on other sites, but don't really need   Keep it "lean and mean", especially to start   Resist the urge to include the "kitchen sink"
  • 36. Envy!   You too can master dismax!   Don't be afraid of dismax/edismax   Lots of controls to learn, but also lots of power   Flexibility to search multiple fields   Boost different fields   Boost phrase fields (pf) higher than query fields (qf)   Use boost queries (bq) and function queries (bf)   Most intimidating params: tie, mm
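The dismax controls listed on this slide can be sketched as a set of request parameters. This is only an illustration, not the presenter's configuration: the field names (`title`, `description`, `inStock`, `releaseDate`), the boost values, and the base URL are all assumptions made up for the example. The parameter names themselves (`defType`, `qf`, `pf`, `bq`, `bf`, `tie`, `mm`) are the standard dismax/edismax ones.

```python
from urllib.parse import urlencode


def build_edismax_params(user_query):
    """Return edismax request parameters for a hypothetical product index."""
    return {
        "defType": "edismax",
        "q": user_query,
        # qf: fields searched for each term (hypothetical fields and boosts).
        "qf": "title^2 description",
        # pf: phrase fields, boosted higher than qf so that documents
        # containing the whole phrase rank above scattered term matches.
        "pf": "title^10 description^5",
        # bq: boost query, here nudging in-stock items upward (assumed field).
        "bq": "inStock:true^3",
        # bf: function query boost, here favoring recent documents
        # (assumes a date field named releaseDate).
        "bf": "recip(ms(NOW,releaseDate),3.16e-11,1,1)",
        # tie: how much the non-best field matches contribute to the score.
        "tie": "0.1",
        # mm: minimum-should-match; up to 2 terms all required, beyond
        # that 75% of them.
        "mm": "2<75%",
    }


def build_select_url(base_url, user_query):
    """Build a /select URL; base_url is an assumed Solr location."""
    return base_url + "/select?" + urlencode(build_edismax_params(user_query))
```

Calling `build_select_url("http://localhost:8983/solr", "red shoes")` yields a URL that can be pasted into a browser while experimenting with `tie` and `mm`, one change at a time.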
  • 37. Envy!   Spatial search – seems complicated, but major sites make it look easy   Now, in Solr 3.1 – it is easy!   You can:   Store spatial data in your index   Filter by distance   Sort by distance   Boost/bias by distance   Facet by distance   Also consider: Search-based navigation such as "Show me in-stock items only"
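A minimal sketch of the filter-by-distance and sort-by-distance cases, combined with an "in-stock only" filter, might look like the following. The field names `store` and `inStock` and the sample coordinates are assumptions for illustration; `{!geofilt}`, `geodist()`, `sfield`, `pt`, and `d` are the Solr 3.1 spatial parameters.

```python
from urllib.parse import urlencode


def build_spatial_params(user_query, lat, lon, radius_km, in_stock_only=False):
    """Return request parameters that filter and sort by distance."""
    filters = ["{!geofilt}"]  # keep only documents within d km of pt
    if in_stock_only:
        # Search-based navigation expressed as an extra filter query
        # (assumes a boolean field named inStock).
        filters.append("inStock:true")
    return {
        "q": user_query,
        "sfield": "store",        # hypothetical field holding lat,lon values
        "pt": f"{lat},{lon}",     # the user's location
        "d": str(radius_km),      # radius in kilometers
        "fq": filters,
        "sort": "geodist() asc",  # nearest first
    }


def to_query_string(params):
    """Encode parameters, repeating fq once per filter."""
    return urlencode(params, doseq=True)
```

Keeping the distance restriction in `fq` rather than in `q` leaves the relevance score driven by the user's keywords alone.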
  • 38. Envy: The Path To Salvation!   Focus on your requirements, don't try to add "bells and whistles" you don't need   Don't be hesitant to dive into the power of dismax/edismax   Take advantage of new features such as Solr spatial, if those features will add value to the end user experience
  • 39. "A fat stomach never breeds fine thoughts."
  • 40. Gluttony!   “Staying fit and trim” is usually good practice when designing and running Solr applications   Once again – keep it "lean and mean"   A lot of these issues cross over into the “Sloth” category   The effort needed to keep your configuration and data efficiently managed is not considered important   Don't lose control of your configuration files   Remove unnecessary elements   Version control all configuration files
  • 41. Gluttony!   Slim down those "bloated" queries:   q="red shoes"& accountId=(12343 OR 338899 OR 554443 OR 243445 OR 55442 OR 3330899 OR 59927 OR 3888999 OR 549 OR 440293579 34201 OR 339917 OR 300191 OR 339338 OR 109823 OR 679176 OR 31407815 OR 3001756 OR 134322 OR 311123 OR 987888 OR 997181 OR 771819 OR 100292 OR 3389474 OR 5505759 OR 2459577 OR 4499957 OR 1996571 OR 559590 OR 220299 OR 4404872 OR 151510 OR 66017 OR 666 OR 113459 OR 890575 OR 505725 OR 330393 OR 349940 OR 4094994 OR 1245995 OR 2459959 OR 4255909 OR 899955 OR 7878899 OR 100999 … ∞ )
  • 42. Gluttony!   Stay in shape – Flex Your Solr Muscles!   Keep up on new features   Training, when appropriate   Certification   Contribute!   Follow the user lists   Refactor when new features can help   Keep up to date on new releases
  • 43. Gluttony: The Path To Salvation!   Keep configuration files clean and trim. Remove unused elements   Periodically review queries to make sure they are efficient   Refactor when necessary – keep your application fit and trim
  • 44. "Hope is the denial of reality."
  • 45. Wrath!   Wrath - usually synonymous with anger, but…   Let’s use an older definition here:   “A vehement denial of the truth, both to others and in the form of self-denial and impatience.”   Step back every now and then and look objectively at your application
  • 46. Wrath!   Resist the push to rush to production…
  • 47. Wrath!   Ignoring new Solr releases   OK to wait until a release is proven   But getting too far behind makes upgrading more painful with each release   We don't have time to do it right, but we always have time to fix it
  • 48. Wrath!   Ignoring complaints about results relevance   Disregarding feedback from stakeholders   Remember – the point of your search application is to support the business, not to "build cool stuff"   Not taking advantage of log files   Consider mining log files, storing data in relational DB for generating reports   Capturing user queries and query counts can be extremely useful   Can also be used for query-based autosuggest (not just indexed terms)
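As a rough sketch of the log-mining idea, the following counts user queries and tracks which ones returned nothing. It assumes request log lines shaped like `... path=/select params={q=red+shoes&rows=10} hits=42 status=0 QTime=3`; the exact format depends on the Solr version and logging setup, so the regular expression is an assumption to adapt, and the function name is made up for the example.

```python
import re
from collections import Counter
from urllib.parse import parse_qs

# Assumed log shape: "... path=/select params={...} hits=N status=0 QTime=N"
REQUEST_RE = re.compile(
    r"path=/select params=\{(?P<params>.*)\} hits=(?P<hits>\d+)"
)


def count_queries(log_lines):
    """Return (query counts, zero-hit query counts) from Solr log lines."""
    counts = Counter()
    zero_hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue  # not a search request line
        q_values = parse_qs(match.group("params")).get("q")
        if not q_values:
            continue  # request carried no q parameter
        # Normalize case and whitespace so "Red Shoes" and "red shoes" merge.
        query = q_values[0].strip().lower()
        counts[query] += 1
        if int(match.group("hits")) == 0:
            zero_hits[query] += 1
    return counts, zero_hits
```

The most frequent entries in `counts` are candidates for query-based autosuggest, and `zero_hits` points at searches that are not working well. Either counter could be loaded into a relational table for reporting.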
  • 49. Wrath: The Path To Salvation!   Keep your version of Solr up to date   OK to wait "awhile", but don't skip versions   Seek and embrace feedback from business and domain experts   Constantly gauge and improve relevance as an ongoing task   Avoid the push to release too soon (as best you can)   Take advantage of log files to understand what users are doing, and what is not working well
  • 50. Seek, and you shall find!