SlideShare une entreprise Scribd logo
1  sur  29
                                  	
      	
  
	
                                   	
      	
  
	
                                   	
      	
  
	
                                   	
      	
  
	
                                   	
      	
  
	
                                   	
      	
  
	
                                   	
      	
  
	
                                   	
      	
     	
  
	
  



	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  

The	
  Case	
  for	
  Lucene/Solr:	
  	
  
A	
  Manager’s	
  Guide	
  	
  
to	
  Real	
  World	
  	
  
Open	
  Source	
  	
  
Search	
  Applications	
  	
  
	
  
	
  
	
  
By	
  Lucid	
  Imagination	
  	
  
                                                                  	
                                                                   	
  
	
                                                                   	
                                                                   	
  
	
                                                                   	
                                                                   	
  
	
                                                                   	
                                                                   	
  
	
                                                                   	
                                                                   	
  
	
                                                                   	
                                                                   	
  
	
                                                                   	
                                                                   	
  
	
                                                                   	
                                                                   	
               	
  




	
  
Abstract	
  
In	
  today’s	
  information-­‐driven	
  environment,	
  search	
  is	
  a	
  critical	
  solution	
  to	
  problems	
  when	
  it	
  slashes	
  
the	
  time	
  and	
  effort	
  separating	
  end	
  users	
  from	
  the	
  data	
  they	
  value.	
  Search	
  spans	
  the	
  range	
  of	
  
business	
  models	
  and	
  use	
  cases—from	
  driving	
  direct	
  customer	
  sales,	
  to	
  analytics	
  and	
  business	
  
intelligence,	
  employee	
  productivity,	
  and	
  reduced	
  administrative	
  overhead.	
  Making	
  the	
  best	
  use	
  of	
  
search	
  requires	
  two	
  perspectives:	
  both	
  a	
  look	
  at	
  the	
  business	
  requirements	
  for	
  a	
  search	
  application	
  
and	
  a	
  view	
  to	
  new	
  business	
  opportunities	
  created	
  by	
  using	
  search	
  to	
  leverage	
  the	
  organization’s	
  
content	
  resources.	
  	
  
	
  
Thousands	
  of	
  organizations	
  across	
  different	
  sectors	
  and	
  business	
  models	
  have	
  harnessed	
  Apache	
  
Lucene/Solr	
  to	
  search	
  their	
  rapidly	
  growing	
  and	
  diversifying	
  content	
  resources.	
  Underlying	
  this	
  
broad	
  adoption	
  is	
  the	
  extraordinary	
  power,	
  scalability,	
  and	
  versatility	
  of	
  open	
  source	
  search	
  
technologies.	
  	
  
	
  
This	
  paper	
  provides	
  an	
  overview	
  of	
  both	
  the	
  requirements	
  and	
  the	
  opportunities	
  for	
  search	
  
applications.	
  It	
  then	
  explores	
  how	
  real	
  world	
  organizations	
  are	
  successfully	
  using	
  Lucene/Solr	
  
search	
  applications	
  to	
  meet	
  those	
  opportunities,	
  presenting	
  how	
  the	
  technology	
  is	
  used	
  for	
  specific	
  
business	
  models	
  and	
  use	
  cases	
  across	
  industries.	
  In	
  addition,	
  it	
  offers	
  a	
  baseline	
  for	
  setting	
  search	
  
requirements	
  that	
  managers	
  and	
  architects	
  can	
  use	
  to	
  adopt	
  Lucene/Solr,	
  and	
  adapt	
  this	
  open	
  
source	
  search	
  technology	
  to	
  the	
  unique	
  needs	
  of	
  their	
  business.	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
	
  
©	
  2010,	
  Lucid	
  Imagination	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                Page ii
                                                                                    	
                                                                                  	
  
	
                                                                                     	
                                                                                  	
  
	
                                                                                     	
                                                                                  	
  
	
                                                                                     	
                                                                                  	
  
	
                                                                                     	
                                                                                  	
  
	
                                                                                     	
                                                                                  	
  
	
                                                                                     	
                                                                                  	
  
	
                                                                                     	
                                                                                  	
            	
  




	
  


Table	
  of	
  Contents	
  
Introduction ............................................................................................................................................................... 1	
  
Understanding	
  Search	
  Opportunities	
  and	
  Requirements ...................................................................... 2	
  
           What	
  Data	
  and	
  Documents	
  Are	
  You	
  Searching? ................................................................................ 3	
  
           Who	
  Needs	
  the	
  Results	
  and	
  Why? ........................................................................................................... 3	
  
           Where	
  Is	
  Search	
  Integrated	
  with	
  IT	
  Infrastructure? ....................................................................... 5	
  
           How	
  Is	
  the	
  Search	
  Interface	
  Presented	
  to	
  the	
  User?........................................................................ 5	
  
The	
  Real	
  World:	
  Applications	
  and	
  Case	
  Studies ......................................................................................... 7	
  
       Yellow	
  Pages,	
  Local	
  Search,	
  and	
  Searching	
  Classifieds........................................................................ 8	
  
       Media .......................................................................................................................................................................10	
  
       E-­‐commerce..........................................................................................................................................................12	
  
       Job	
  and	
  Career	
  Sites ..........................................................................................................................................14	
  
       Libraries,	
  Archives,	
  and	
  Museums	
  (LAMs)	
  Search ..............................................................................16	
  
       Social	
  Media	
  Search...........................................................................................................................................18	
  
       Enterprise	
  (Intranet)	
  Search.........................................................................................................................21	
  
Business	
  Use	
  Case	
  Matrix ...................................................................................................................................23	
  
Appendix:	
  Lucene/Solr	
  Features	
  and	
  Benefits..........................................................................................24	
  
	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                                           Page iii
                                                                                                                                                                                                                                	
     	
  
	
                                                                                                                                                                                                                                 	
     	
  
	
                                                                                                                                                                                                                                 	
     	
  
	
                                                                                                                                                                                                                                 	
     	
  
	
                                                                                                                                                                                                                                 	
     	
  
	
                                                                                                                                                                                                                                 	
     	
  
	
                                                                                                                                                                                                                                 	
     	
  
	
                                                                                                                                                                                                                                 	
     	
              	
  




Introduction
As	
  fast	
  as	
  companies,	
  communities,	
  and	
  consumers	
  produce	
  data—about	
  each	
  other,	
  products,	
  
opinions,	
  research,	
  and	
  everything	
  else	
  imaginable—they	
  need	
  faster,	
  more	
  versatile	
  search	
  
capabilities	
  to	
  find	
  the	
  information	
  they	
  need	
  to	
  create	
  opportunities	
  for	
  competitive	
  advantage.	
  In	
  
today’s	
  information-­‐driven	
  environment,	
  search	
  addresses	
  the	
  critical	
  problems	
  created	
  by	
  the	
  
explosive	
  growth	
  of	
  content	
  by	
  slashing	
  the	
  time	
  and	
  effort	
  users	
  expend	
  in	
  finding	
  data	
  they	
  value.	
  
Search	
  spans	
  the	
  range	
  of	
  business	
  models	
  and	
  use	
  cases:	
  from	
  driving	
  direct	
  customer	
  sales,	
  to	
  
analytics	
  and	
  business	
  intelligence,	
  employee	
  productivity,	
  and	
  reduced	
  administrative	
  overhead.	
  	
  
Apache	
  Lucene/Solr1	
  open	
  source	
  search	
  technology	
  has	
  been	
  implemented	
  across	
  the	
  broadest	
  
range	
  of	
  applications	
  and	
  business	
  models—and	
  likely	
  in	
  ways	
  that	
  can	
  fit	
  the	
  needs	
  of	
  your	
  
organization.	
  In	
  successful	
  operation	
  today	
  at	
  thousands	
  of	
  enterprises,	
  Lucene/Solr	
  technology	
  
scales	
  from	
  tens	
  of	
  thousands	
  to	
  hundreds	
  and	
  billions	
  of	
  documents;	
  searches	
  data	
  that	
  is	
  
structured,	
  unstructured,	
  and	
  in	
  combination;	
  data	
  inside	
  and	
  outside	
  the	
  firewall;	
  and	
  ranges	
  in	
  
use	
  from	
  a	
  simple	
  website	
  search	
  box	
  through	
  sophisticated	
  faceted	
  navigation.	
  It	
  addresses	
  equally	
  
diverse	
  business	
  processes	
  and	
  mission	
  critical	
  applications.	
  Across	
  the	
  spectrum,	
  Lucene/Solr	
  
helps	
  users	
  find,	
  make	
  sense	
  of,	
  and	
  act	
  upon	
  information	
  quickly	
  and	
  efficiently.	
  
In	
  this	
  white	
  paper,	
  we’ll	
  review	
  real-­‐world	
  case	
  studies	
  for	
  Lucene/Solr	
  functionality	
  across	
  
business	
  sectors	
  to	
  demonstrate	
  its	
  versatility	
  and	
  varied	
  applicability.	
  The	
  diversity	
  of	
  examples	
  
provides	
  strong	
  evidence	
  of	
  Lucene/Solr’s	
  flexibility	
  and	
  power	
  as	
  a	
  search	
  technology.	
  The	
  
examples	
  also	
  attest	
  to	
  the	
  innovation	
  and	
  transparency	
  inherent	
  to	
  the	
  open	
  source	
  development	
  
model.	
  Our	
  focus	
  is	
  on	
  familiarizing	
  the	
  audience	
  of	
  business	
  managers	
  and	
  application	
  owners	
  with	
  
existing	
  Lucene/Solr	
  applications;	
  the	
  substantial	
  technical	
  advantages	
  to	
  developers	
  are	
  covered	
  
elsewhere.	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  
	
  
1
 Lucene and Solr are complementary technologies that offer very similar underlying capabilities; Solr is the Lucene
Search Server. Since Lucene serves as the core of Solr’s search capabilities, this paper refers to the two as
Lucene/Solr. For more information, see the Appendix.

The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                                                                                                                Page 1
                                                                       	
                                                                        	
  
	
                                                                        	
                                                                        	
  
	
                                                                        	
                                                                        	
  
	
                                                                        	
                                                                        	
  
	
                                                                        	
                                                                        	
  
	
                                                                        	
                                                                        	
  
	
                                                                        	
                                                                        	
  
	
                                                                        	
                                                                        	
              	
  



We’ll	
  first	
  survey	
  the	
  key	
  requirements	
  and	
  business	
  use	
  cases	
  of	
  search	
  and	
  then	
  look	
  at	
  where	
  
they	
  are	
  built	
  into	
  search	
  applications.	
  Our	
  objective	
  is	
  to	
  provide	
  business	
  managers	
  and	
  
application	
  owners	
  with	
  a	
  broad	
  perspective	
  on	
  how	
  Lucene/Solr	
  search	
  technology	
  is	
  used	
  to	
  build	
  
solutions	
  to	
  compelling	
  business	
  problems.	
  In	
  the	
  Appendix,	
  we	
  provide	
  an	
  overview	
  of	
  
Lucene/Solr’s	
  key	
  features	
  and	
  benefits,	
  with	
  a	
  basic	
  outline	
  of	
  the	
  capabilities	
  offered	
  to	
  meet	
  the	
  
broadest	
  range	
  of	
  business	
  needs.	
  	
  


Understanding Search
Opportunities and Requirements
Search	
  technology	
  has	
  come	
  a	
  long	
  way	
  from	
  its	
  roots	
  in	
  matching	
  keywords	
  with	
  appearance	
  in	
  
documents	
  and	
  obtaining	
  undifferentiated	
  results.	
  Search	
  today	
  empowers	
  users	
  by	
  delivering	
  
actionable	
  information	
  quickly	
  and	
  efficiently,	
  across	
  multiple,	
  diverse	
  sources	
  of	
  data.	
  The	
  
business	
  use	
  cases	
  range	
  from	
  executing	
  mission	
  critical	
  commercial	
  transactions	
  (e.g.,	
  e-­‐commerce	
  
sites)	
  to	
  unlocking	
  employee	
  and	
  end-­‐user	
  productivity	
  in	
  the	
  search	
  for	
  a	
  single	
  relevant	
  document	
  
(e.g.,	
  enterprise	
  search).	
  	
  
Given	
  the	
  breadth	
  of	
  capability	
  of	
  the	
  problem	
  domain,	
  it’s	
  useful	
  to	
  look	
  at	
  search	
  and	
  ask	
  two	
  
fundamental	
  questions:	
  “How	
  it	
  can	
  it	
  solve	
  my	
  business	
  problems?”	
  and	
  “What	
  new	
  business	
  
opportunities	
  can	
  search	
  solve	
  for?”	
  
In	
  considering	
  how	
  search	
  technology	
  solves	
  business	
  problems,	
  it	
  is	
  useful	
  to	
  start	
  with	
  an	
  
elucidation	
  of	
  the	
  requirements	
  you’ll	
  need	
  to	
  consider	
  for	
  your	
  search	
  application.	
  At	
  the	
  same	
  
time,	
  be	
  sure	
  to	
  look	
  more	
  broadly	
  at	
  the	
  capabilities	
  that	
  Lucene/Solr	
  offers,	
  as	
  it	
  can	
  help	
  open	
  up	
  
new	
  frontiers	
  for	
  incorporating	
  search	
  and	
  leveraging	
  more	
  value	
  from	
  data	
  repositories.	
  	
  
Starting	
  with	
  some	
  basic	
  questions—what,	
  who,	
  how,	
  and	
  where—you	
  can	
  clarify	
  the	
  high-­‐level	
  
business	
  requirements	
  specific	
  to	
  your	
  business	
  needs,	
  which	
  in	
  turn	
  allow	
  you	
  to	
  make	
  the	
  best	
  
decisions	
  for	
  your	
  search	
  application.	
  The	
  process	
  of	
  looking	
  at	
  the	
  fundamentals	
  also	
  raises	
  new	
  
questions	
  about	
  how	
  and	
  where	
  the	
  search	
  technology	
  offered	
  by	
  Lucene	
  and	
  Solr	
  can	
  create	
  new	
  
business	
  opportunities.	
  
Let’s	
  look	
  at	
  four	
  fundamental	
  questions	
  you	
  should	
  address	
  in	
  understanding	
  search	
  opportunities	
  
and	
  requirements:	
  
                  •     What	
  data	
  and	
  documents	
  are	
  you	
  searching?	
  	
  
                  •     Who	
  needs	
  the	
  results	
  and	
  why?	
  	
  
                  •     Where	
  is	
  search	
  integrated	
  with	
  IT	
  Infrastructure?	
  	
             	
  	
  
                  •     How	
  is	
  the	
  search	
  interface	
  presented	
  to	
  the	
  user?	
  	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                          Page 2
                                                                    	
                                                                      	
  
	
                                                                     	
                                                                      	
  
	
                                                                     	
                                                                      	
  
	
                                                                     	
                                                                      	
  
	
                                                                     	
                                                                      	
  
	
                                                                     	
                                                                      	
  
	
                                                                     	
                                                                      	
  
	
                                                                     	
                                                                      	
              	
  




What Data and Documents Are You Searching?
Business	
  today	
  is	
  driven	
  more	
  than	
  ever	
  by	
  the	
  end-­‐users’	
  creation	
  and	
  consumption	
  of	
  real-­‐time	
  
information.	
  A	
  key	
  differentiating	
  capability	
  of	
  search	
  technology	
  is	
  ingesting	
  a	
  broad	
  range	
  of	
  
content	
  types	
  and	
  processing	
  large	
  collections	
  of	
  diverse	
  data	
  in	
  real	
  time	
  in	
  order	
  to	
  deliver	
  
actionable	
  information.	
  Two	
  aspects	
  to	
  consider:	
  
       •   Types	
  of	
  Content	
  
           Content	
  comes	
  in	
  multiple	
  formats:	
  HTML	
  pages,	
  XML	
  files,	
  PDFs,	
  images,	
  PowerPoint	
  
           presentations,	
  Excel	
  spreadsheets,	
  Word	
  documents,	
  log	
  files,	
  multimedia	
  content,	
  and	
  
           more.	
  Content	
  resides	
  in	
  various	
  repositories,	
  including	
  databases,	
  file	
  servers,	
  content	
  
           management	
  systems,	
  archiving	
  systems,	
  collaboration	
  applications,	
  and	
  employee	
  
           desktops	
  and	
  laptops.	
  Search	
  technology	
  must	
  be	
  able	
  to	
  locate,	
  organize,	
  and	
  aggregate	
  
           data	
  whatever	
  its	
  form	
  or	
  location.	
  	
  
       •   Frequency	
  of	
  Updating	
  Content	
  
           Organizations	
  update	
  content	
  at	
  varying	
  intervals,	
  driven	
  by	
  differing	
  business	
  processes	
  
           and	
  models—social	
  media	
  or	
  news	
  applications	
  have	
  real-­‐time	
  content	
  need,	
  whereas	
  an	
  e-­‐
           commerce	
  application	
  might	
  re-­‐index	
  in	
  response	
  to	
  new	
  inventory	
  on	
  a	
  batch	
  basis	
  and	
  a	
  
           research	
  institution	
  might	
  add	
  to	
  its	
  collection	
  less	
  often	
  still.	
  Search	
  applications	
  need	
  to	
  be	
  
           adaptable	
  to	
  the	
  differences	
  in	
  content	
  change	
  frequency.	
  


Who Needs the Results and Why?
Business	
  search	
  puts	
  a	
  high	
  priority	
  on	
  end	
  user	
  experience	
  and	
  results	
  in	
  which	
  the	
  searched	
  
content	
  is	
  tuned	
  to	
  the	
  unique	
  needs	
  of	
  each	
  user.	
  Because,	
  after	
  all,	
  the	
  human	
  dimension—the	
  
usefulness	
  of	
  results	
  and	
  the	
  efficacy	
  of	
  interaction—is	
  the	
  acid	
  test	
  of	
  a	
  search	
  application.	
  Internet	
  
search	
  applications	
  like	
  Google,	
  Yahoo,	
  and	
  Bing	
  are	
  now	
  common	
  and	
  mature.	
  They	
  have	
  raised	
  
user	
  expectations	
  about	
  key	
  qualities	
  of	
  the	
  search	
  experience...but	
  they	
  solve	
  a	
  very	
  different	
  
problem.	
  	
  
While	
  Internet	
  searches	
  can	
  produce	
  millions	
  of	
  results	
  in	
  milliseconds,	
  they	
  rely	
  on	
  measures	
  like	
  
website	
  popularity	
  or	
  URLs	
  and	
  domain	
  names—not	
  relevant	
  and	
  not	
  generally	
  applicable	
  to	
  
purpose-­‐built	
  applications	
  for	
  businesses.	
  What’s	
  more,	
  they	
  rely	
  on	
  generalizing	
  relevancy	
  for	
  a	
  
global	
  population	
  of	
  all	
  Internet	
  users,	
  without	
  being	
  tied	
  to	
  business	
  rules,	
  or	
  business	
  process	
  
logic,	
  or	
  the	
  opportunity	
  cost	
  of	
  improved	
  precision	
  for	
  a	
  specific	
  set	
  of	
  data	
  or	
  search	
  users.	
  
Business	
  search	
  applications	
  cannot	
  rely	
  on	
  such	
  brute	
  force	
  coarse	
  approaches	
  to	
  tune	
  their	
  
results.	
  They	
  need	
  far	
  more	
  control	
  and	
  precision.	
  They	
  have	
  to	
  be	
  able	
  to	
  deliver	
  highly	
  useful	
  
results	
  while	
  matching,	
  if	
  not	
  exceeding,	
  the	
  levels	
  of	
  user	
  experience	
  that	
  people	
  have	
  come	
  to	
  
expect	
  by	
  virtue	
  of	
  their	
  daily	
  interactions	
  with	
  commercial	
  search	
  engines.	
  Key	
  points	
  of	
  
consideration	
  from	
  a	
  business	
  perspective	
  are:	
  

The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                     Page 3
                                                                     	
                                                                        	
  
	
                                                                      	
                                                                        	
  
	
                                                                      	
                                                                        	
  
	
                                                                      	
                                                                        	
  
	
                                                                      	
                                                                        	
  
	
                                                                      	
                                                                        	
  
	
                                                                      	
                                                                        	
  
	
                                                                      	
                                                                        	
              	
  



       •   Relevance	
  
           Relevance	
  is	
  entirely	
  a	
  factor	
  of	
  the	
  goals	
  of	
  the	
  search	
  application’s	
  users.	
  The	
  application	
  
           must	
  have	
  the	
  mechanisms	
  to	
  recognize	
  the	
  subjective	
  needs	
  of	
  users	
  and	
  tune	
  results	
  
           accordingly.	
  It	
  must	
  also	
  provide	
  easier	
  ways	
  to	
  narrow	
  search	
  criteria	
  without	
  requiring	
  
           users	
  to	
  come	
  up	
  with	
  perfect	
  query	
  terms.	
  Flexibility	
  for	
  drilling	
  deeper	
  will	
  make	
  results	
  
           richer	
  and	
  valuable.	
  Mechanisms	
  to	
  apply	
  filters,	
  proximity	
  values,	
  and	
  sorting	
  parameters	
  
           to	
  narrow	
  search	
  scope	
  can	
  also	
  lead	
  to	
  a	
  richer	
  set	
  of	
  more	
  useful	
  results,	
  with	
  less	
  time	
  
           and	
  effort.	
  
       •   Cost	
  of	
  Relevance	
  	
  
           As	
  business	
  goals	
  are	
  driven	
  by	
  revenue	
  opportunities	
  and	
  cost	
  savings,	
  it	
  is	
  critical	
  to	
  tie	
  
           relevance	
  to	
  the	
  economics	
  of	
  the	
  business.	
  For	
  example,	
  a	
  public-­‐facing	
  retail	
  site	
  should	
  
           focus	
  on	
  matching	
  merchandise	
  to	
  search,	
  site	
  stickiness,	
  and	
  customer	
  loyalty.	
  It	
  requires	
  
           search	
  technology	
  that	
  streamlines	
  and	
  simplifies	
  the	
  shopping	
  experience	
  with	
  relevant	
  
           results	
  directly	
  contributing	
  to	
  sales	
  revenue.	
  For	
  knowledge	
  workers,	
  internal	
  search	
  
           applications	
  should	
  help	
  make	
  employees	
  more	
  productive	
  by	
  reducing	
  the	
  amount	
  of	
  time	
  
           and	
  effort	
  to	
  find	
  documents	
  they	
  need	
  to	
  do	
  their	
  jobs.	
  Multiple	
  studies	
  show	
  that	
  
           information	
  workers	
  can	
  spend	
  20–30%	
  of	
  their	
  time	
  searching	
  for	
  information.	
  
       •   Precision	
  Ranking	
  
           Result	
  accuracy,	
  sorted	
  by	
  attributes	
  like	
  relevance,	
  date,	
  field,	
  or	
  any	
  document	
  property	
  
           feature,	
  makes	
  the	
  search	
  process	
  better.	
  End	
  users	
  generally	
  abandon	
  a	
  search	
  before	
  
           tackling	
  the	
  fine	
  points	
  of	
  Boolean	
  logic	
  or	
  scrolling	
  for	
  a	
  result	
  buried	
  too	
  far	
  down.	
  	
  
       •   Query	
  Response	
  Speed	
  
           Today,	
  5–7	
  seconds	
  is	
  the	
  typical	
  threshold	
  for	
  end-­‐user	
  patience.	
  Too	
  much	
  wait	
  time	
  for	
  
           search	
  results	
  frustrates	
  users,	
  and	
  causes	
  them	
  to	
  abandon	
  pages.	
  Fast,	
  relevant	
  results	
  
           cannot	
  be	
  limited	
  by	
  search	
  technology	
  hamstrung	
  by	
  data	
  influx	
  or	
  query	
  overload.	
  Query	
  
           response	
  time	
  should	
  also	
  work	
  hand-­‐in-­‐hand	
  with	
  the	
  refinement	
  of	
  multiple	
  search	
  
           attributes,	
  so	
  that	
  increasingly	
  complex	
  queries	
  do	
  not	
  extract	
  a	
  performance	
  penalty.	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                        Page 4
                                                                   	
                                                                      	
  
	
                                                                    	
                                                                      	
  
	
                                                                    	
                                                                      	
  
	
                                                                    	
                                                                      	
  
	
                                                                    	
                                                                      	
  
	
                                                                    	
                                                                      	
  
	
                                                                    	
                                                                      	
  
	
                                                                    	
                                                                      	
              	
  




Where Is Search Integrated with IT Infrastructure?
Useful,	
  valuable	
  search	
  technology	
  rarely	
  exists	
  in	
  isolation.	
  Searched	
  data	
  is	
  transformed	
  into	
  
actionable	
  information	
  when	
  it	
  is	
  integrated	
  with	
  the	
  organization’s	
  information	
  infrastructure:	
  
business	
  process	
  to	
  business	
  intelligence	
  to	
  content	
  management	
  systems.	
  A	
  robust	
  search	
  
technology	
  must	
  be	
  customizable	
  to	
  integrate	
  with	
  the	
  existing	
  systems	
  seamlessly.	
  	
  
       •     Application	
  Integration	
  
             A	
  key	
  requirement	
  for	
  a	
  search	
  application	
  is	
  its	
  extensibility	
  for	
  integration	
  with	
  existing	
  
             infrastructure	
  and	
  applications	
  like	
  content	
  management	
  systems,	
  databases,	
  and	
  the	
  full	
  
             range	
  of	
  business	
  processes	
  and	
  applications.	
  It	
  should	
  have	
  interfaces	
  that	
  support	
  
             ingestion	
  of	
  data	
  as	
  well	
  as	
  delivery	
  of	
  results	
  in	
  readily	
  consumable	
  formats—because	
  in	
  
             many	
  cases,	
  results	
  are	
  consumed	
  by	
  other	
  applications,	
  not	
  a	
  human.	
  
       •      Scalability	
  
              We	
  can	
  assume	
  that	
  data	
  will	
  change	
  and	
  grow.	
  So	
  scalability	
  is	
  a	
  key	
  factor	
  for	
  search	
  
              application.	
  Applications	
  should	
  grow	
  to	
  address	
  future	
  needs	
  without	
  penalties	
  for	
  the	
  
              breadth	
  of	
  data	
  or	
  for	
  the	
  count	
  of	
  documents	
  indexed.	
  The	
  search	
  application	
  should	
  be	
  
              able	
  to	
  grow	
  with	
  the	
  requirements	
  of	
  the	
  organization,	
  without	
  needing	
  additional	
  large	
  
              investments	
  in	
  hardware	
  to	
  match	
  the	
  pace	
  of	
  growth.	
  Proprietary	
  search	
  vendors	
  often	
  
              charge	
  for	
  search	
  by	
  the	
  number	
  of	
  documents	
  indexed.	
  In	
  a	
  world	
  where	
  constantly	
  
              expanding	
  content	
  growth	
  is	
  the	
  norm,	
  such	
  costs	
  can	
  be	
  a	
  real	
  and	
  substantial	
  drag	
  on	
  
              the	
  cost	
  of	
  ownership	
  for	
  search	
  applications,	
  many	
  times	
  resulting	
  in	
  negative	
  return.	
  	
  
       •      Security	
  
              Every	
  organization	
  has	
  its	
  own	
  security	
  requirements	
  and	
  access	
  controls.	
  Search	
  
              technologies	
  need	
  to	
  comply	
  with	
  the	
  security	
  policies	
  of	
  the	
  enterprise,	
  controlling	
  
              results	
  that	
  have	
  restricted	
  access.	
  The	
  search	
  technology	
  should	
  also	
  be	
  able	
  to	
  make	
  use	
  
              of	
  document-­‐level	
  security	
  from	
  other	
  sources.	
  	
  


How Is the Search Interface Presented to the User?
The	
  user	
  interface	
  is	
  where	
  search	
  delivers	
  on	
  findability	
  and	
  presents	
  actionable	
  results.	
  The	
  
search	
  application	
  is	
  only	
  as	
  good	
  as	
  the	
  convenience	
  of	
  submitting	
  queries,	
  reviewing	
  and	
  refining	
  
results,	
  and	
  finding	
  information.	
  Key	
  aspects	
  to	
  consider:	
  	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                    Page 5
                                                               	
                                                                            	
  
	
                                                                	
                                                                            	
  
	
                                                                	
                                                                            	
  
	
                                                                	
                                                                            	
  
	
                                                                	
                                                                            	
  
	
                                                                	
                                                                            	
  
	
                                                                	
                                                                            	
  
	
                                                                	
                                                                            	
              	
  



       •   Navigation	
  
           Users	
  benefit	
  from	
  guidance	
  that	
  makes	
  their	
  queries	
  more	
  productive.	
  Techniques	
  such	
  as	
  
           faceted	
  search	
  with	
  result	
  clustering,	
  advance	
  hinting	
  (“did	
  you	
  mean”),	
  “more	
  like	
  this,”	
  
           and	
  drop	
  down	
  menus	
  for	
  setting	
  search	
  scope	
  help	
  users	
  achieve	
  desired	
  results	
  faster,	
  
           making	
  a	
  search	
  application	
  both	
  user-­‐	
  and	
  information-­‐friendly.	
  It	
  is	
  also	
  important	
  to	
  
           allow	
  users	
  to	
  draw	
  associative	
  connections	
  between	
  results—using	
  the	
  technology	
  to	
  
           uncover	
  relationships	
  and	
  discover	
  more	
  about	
  what	
  they	
  were	
  seeking	
  than	
  they	
  knew	
  at	
  
           the	
  outset.	
  
           	
  


                                                                                                                  The	
  NetFlix	
  search	
  
                                                                                                                  application	
  is	
  powered	
  
                                                                                                                  by	
  Solr;	
  it	
  adds	
  the	
  fuzzy	
  
                                                                                                                  dimension	
  to	
  search,	
  
                                                                                                                  with	
  auto-­completion	
  of	
  
                                                                                                                  movie	
  names,	
  correction	
  
                                                                                                                  of	
  misspelled	
  names	
  of	
  
                                                                                                                  actors,	
  and	
  suggests	
  
                                                                                                                  titles	
  closest	
  to	
  the	
  
                                                                                                                  query.	
  As	
  a	
  result,	
  85%	
  
                                                                                                                  of	
  users	
  have	
  found	
  the	
  
                                                                                                                  movie	
  they	
  were	
  looking	
  
                                                                                                                  for	
  ranked	
  at	
  the	
  #1	
  spot	
  
                                                                                                                  in	
  the	
  results.	
  
                                                                                                           	
  

                                                                                                            	
  
              	
  
       •   Discovery	
  
           Search	
  application	
  functionality	
  should	
  extend	
  beyond	
  the	
  generic	
  presentation	
  of	
  a	
  result	
  
           list	
  of	
  documents	
  that	
  contain	
  a	
  keyword.	
  Highlighting	
  keywords	
  in	
  searched	
  results,	
  
           expanding	
  searches	
  with	
  synonyms	
  and	
  spell	
  checking,	
  and	
  offering	
  users	
  ways	
  to	
  learn	
  a	
  
           bit	
  more	
  about	
  documents	
  in	
  the	
  results	
  without	
  having	
  to	
  load	
  the	
  document	
  are	
  great	
  
           ways	
  to	
  significantly	
  improve	
  usability.	
  	
  
	
  
       •   Intuitive	
  Intelligence	
  
           Search	
  applications	
  must	
  go	
  beyond	
  keyword	
  search	
  to	
  help	
  users	
  retrieve	
  accurate	
  
           information	
  even	
  when	
  they	
  are	
  not	
  sure	
  of	
  the	
  best	
  keywords.	
  Additionally,	
  they	
  should	
  
           reduce	
  misinterpretations	
  where	
  homonyms,	
  spelling	
  errors,	
  and	
  ambiguous	
  keywords	
  are	
  
           involved	
  (e.g.,	
  is	
  “apple”	
  a	
  fruit	
  or	
  a	
  computer	
  company?).	
  

The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                      Page 6
                                                           	
                                                             	
  
	
                                                            	
                                                             	
  
	
                                                            	
                                                             	
  
	
                                                            	
                                                             	
  
	
                                                            	
                                                             	
  
	
                                                            	
                                                             	
  
	
                                                            	
                                                             	
  
	
                                                            	
                                                             	
              	
  




The Real World: Applications and Case Studies
With	
  an	
  understanding	
  of	
  the	
  fundamentals	
  of	
  search	
  business	
  applications	
  in	
  hand,	
  it	
  is	
  
helpful	
  to	
  gain	
  additional	
  context	
  on	
  business	
  usage	
  through	
  a	
  survey	
  of	
  organizations	
  that	
  
have	
  successfully	
  used	
  Lucene/Solr	
  for	
  powerful	
  search	
  applications.	
  	
  
All	
  of	
  these	
  cases	
  were	
  built	
  on	
  the	
  capability	
  of	
  Lucene/Solr	
  to	
  provide	
  innovative,	
  high-­‐
performance,	
  cross-­‐platform,	
  feature-­‐rich	
  search	
  technology	
  suitable	
  for	
  nearly	
  every	
  
application.	
  By	
  powering	
  diverse	
  search	
  applications	
  for	
  thousands	
  of	
  organizations	
  such	
  
as	
  AT&T,	
  Zappos,	
  McClatchy,	
  Smithsonian,	
  MTV	
  Networks,	
  LinkedIn,	
  MySpace,	
  Comcast,	
  
Monster,	
  Netflix,	
  and	
  many	
  more,	
  Lucene/Solr	
  has	
  provided	
  mission	
  critical	
  capability	
  that	
  
turns	
  search	
  into	
  a	
  robust	
  competitive	
  advantage.	
  	
  
For	
  these	
  organizations,	
  Lucene/Solr	
  solutions	
  regularly	
  index	
  and	
  search	
  hundreds	
  of	
  
millions	
  of	
  documents	
  with	
  subsecond	
  response	
  time,	
  unencumbered	
  by	
  costly	
  licensing	
  or	
  
vendor	
  lock-­‐in.	
  Together	
  they	
  represent	
  a	
  compelling	
  argument	
  for	
  the	
  broad	
  applicability	
  
of	
  Lucene/Solr	
  across	
  the	
  full	
  range	
  of	
  business	
  opportunities	
  and	
  search	
  needs.	
  Business	
  
use	
  case	
  studies	
  we’ll	
  review	
  include:	
  
       •   Yellow	
  Pages,	
  Local	
  Search,	
  and	
  Searching	
  Classifieds	
  
       •   Media	
  
       •   E-­‐commerce	
  	
  
       •   Job	
  and	
  Career	
  Sites	
  	
  
       •   Libraries,	
  Archives,	
  and	
  Museums	
  (LAMs)	
  Search	
  	
  
       •   Social	
  Media	
  Search	
  	
  
       •   Enterprise	
  (Intranet)	
  Search	
  	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                   Page 7
                                                                                        	
                                                                           	
  
	
                                                                                         	
                                                                           	
  
	
                                                                                         	
                                                                           	
  
	
                                                                                         	
                                                                           	
  
	
                                                                                         	
                                                                           	
  
	
                                                                                         	
                                                                           	
  
	
                                                                                         	
                                                                           	
  
	
                                                                                         	
                                                                           	
              	
  




Yellow Pages, Local Search, and Searching                                                                                               Requirements	
  	
  
Classifieds
In	
  the	
  business	
  of	
  online	
  local	
  search,	
  geographic-­‐based	
  (location)	
                                         •      Intelligent	
  results	
  going	
  
                                                                                                                                               beyond	
  keyword	
  search	
  
relevance	
  generates	
  competitive	
  advantage.	
  Online	
  directories	
  
need	
  to	
  provide	
  a	
  rich,	
  interactive	
  search	
  experience	
  to	
  users	
  to	
                                       •      Deeper,	
  faceted	
  
increase	
  site	
  views	
  and	
  stickiness,	
  which	
  in	
  turn	
  translates	
  into	
                                                 navigation	
  
increased	
  advertising	
  revenue.	
  Simplified	
  location-­‐based	
  search,	
                                                     •      Seamless	
  integration	
  
                                                                                                                                               with	
  latest	
  Web	
  2.0	
  
intuitive	
  faceted	
  query	
  response,	
  and	
  data	
  mashups	
  are	
  a	
  few	
  
features	
  that	
  define	
  search	
  functionality	
  for	
  an	
  online	
  directory.	
                                                   tools	
  
                                                                                                                                        •      Lower	
  IT-­‐related	
  costs	
  
Lucene/Solr	
  solutions	
  offer	
  accurate	
  search	
  results,	
  factoring	
  in	
                                                •      Geocentric	
  user	
  
location,	
  users’	
  reviews,	
  and	
  ratings,	
  alongside	
  paid	
  advertising.	
  By	
                                                experience	
  
taking	
  advantage	
  of	
  Solr’s	
  open	
  source	
  model—with	
  search	
                                                         •      Search	
  numeric	
  values	
  
algorithms	
  that	
  are	
  completely	
  transparent—companies	
  can	
  invest	
                                                     	
  
in	
  configuring	
  their	
  search	
  solutions	
  to	
  match	
  their	
  business	
  logic,	
  
                                                                                                                                        Solr	
  Solution	
  
rather	
  than	
  trying	
  to	
  infer	
  or	
  pay	
  for	
  exposure	
  proprietary	
  back-­‐
end	
  logic.	
  	
                                                                                                                     •      Customizable	
  Search	
  
                                                                                                                                               Index	
  which	
  can	
  be	
  
	
  
                                                                                                                                               tuned	
  transparently	
  to	
  
	
                                           Internet	
  Yellow	
  pages	
  and	
  local	
                                                     account	
  for	
  key	
  
	
                                           online	
  search	
  is	
  forecast	
  to	
                                                        findability	
  drivers	
  
                                                                                                                                        •      Drop	
  down	
  filters	
  for	
  
                                             grow	
  to	
  $27.8	
  billion	
  in	
  2011.	
  
	
                                                                                                                                             narrowing	
  or	
  widening	
  
	
                                                                              The	
  Kelsey	
  Report1	
                                     the	
  scope	
  of	
  search	
  
                                                                                                                                        •      Seamless	
  integration	
  
Success	
  Stories	
                                                                                                                           with	
  existing	
  
                                                                                                                                               technologies	
  
       •       YP.com,	
  a	
  division	
  of	
  AT&T	
  Interactive	
  
                                                                                                                                        •      Native	
  numeric	
  
       •       Zvents.com,	
  local	
  event	
  search	
  service	
  	
  
                                                                                                                                               encoding	
  and	
  search	
  
       •       Yelp.com,	
  the	
  community	
  local	
  search	
  site	
  
                                                                                                                                               capabilities	
  
	
                                           M                                                                                          •      Reduced	
  server	
  
	
                                                                                                                                             footprint	
  for	
  lower	
  TCO	
  
                                             	
                                                                                                than	
  most	
  commercial	
  
	
  
                                                                                                                                               vendors	
  	
  
	
                                                                                                                                      	
  
1The	
  Kelsey	
  Group’s	
  Global	
  Print	
  Yellow	
  Pages,	
  Internet	
  Yellow	
  Pages	
  and	
  Local	
  Search	
  Five	
  
                                                                                                                                        	
  
Year	
  Outlook	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                                              Page 8
                                                                             	
                                                                              	
  
	
                                                                              	
                                                                              	
  
	
                                                                              	
                                                                              	
  
	
                                                                              	
                                                                              	
  
	
                                                                              	
                                                                              	
  
	
                                                                              	
                                                                              	
  
	
                                                                              	
                                                                              	
  
	
                                                                              	
                                                                              	
              	
  




                                                                                             	
  	
  	
  	
                                              	
  
Case	
  Study	
  1	
  
	
  
       yp.com	
  by	
  AT&T	
  Interactive	
  	
  
	
  
       AT&T	
  Interactive	
  is	
  an	
  online	
  and	
  mobile	
  search	
  and	
  advertising	
  company.	
  Their	
  leading-­‐edge	
  portal,	
  yp.com—an	
  
	
   online	
  business	
  listing	
  and	
  advertising	
  site—was	
  originally	
  implemented	
  with	
  a	
  commercial	
  proprietary	
  search	
  
	
   application.	
  It	
  faced	
  issues	
  of	
  scalability,	
  vendor	
  lock-­‐in,	
  and	
  performance.	
  With	
  help	
  from	
  Lucid	
  Imagination,	
  AT&T	
  
       successfully	
  migrated	
  to	
  a	
  Solr-­‐based	
  search	
  solution	
  that	
  leveraged	
  the	
  flexibility	
  of	
  open	
  source	
  without	
  
       compromising	
  features	
  and	
  functionality.	
  	
  And	
  they	
  did	
  so	
  with	
  a	
  much	
  smaller	
  budget.	
  	
  
       Business	
  Needs	
  
             •   Addressing	
  the	
  need	
  to	
  factor	
  in	
  location	
  to	
  support	
  geographic	
  search,	
  and	
  include	
  relevant	
  comments	
  
             •   Striking	
  a	
  balance	
  between	
  organic	
  search	
  and	
  advertised	
  content	
  
             •   Indexing	
  highly	
  unstructured	
  content	
  such	
  as	
  user	
  comments	
  	
  
             •   Increasing	
  relevancy	
  of	
  results	
  and	
  boosting	
  paid	
  search	
  results	
  for	
  preferential	
  placement	
  of	
  advertisers	
  
             •   Linguistic	
  support	
  to	
  enable	
  search	
  experience,	
  such	
  as	
  spellchecking,	
  synonyms,	
  find-­‐similar,	
  etc.	
  
             •   Integrating	
  with	
  latest	
  Web	
  2.0	
  tools	
  
             •   Reducing	
  server	
  footprint	
  
                 	
  
       The	
  Solr	
  Solution	
  	
  
             •     Context-­‐specific	
  relevancy,	
  geographic	
  proximity,	
  ad	
  placement,	
  and	
  user	
  comments	
  
             •     Faceting,	
  drop	
  down	
  filters	
  to	
  narrow/widen	
  the	
  scope	
  of	
  search	
  	
  
             •     Functional	
  support	
  for	
  creating	
  new	
  features	
  	
  
             •     Spell-­‐correction,	
  and	
  location-­‐optimized	
  search	
  results	
  to	
  show	
  users	
  businesses	
  nearest	
  to	
  them	
  first	
  
             •     Seamless	
  integration	
  with	
  many	
  Web	
  2.0	
  tools	
  to	
  create	
  innovative	
  features	
  and	
  mashups	
  
             •     Lowers	
  TCO	
  by	
  reducing	
  the	
  number	
  of	
  search	
  servers	
  from	
  120	
  to	
  two	
  dozen	
  servers	
  	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                                      Page 9
                                                                      	
                                                                	
  
	
                                                                       	
                                                                	
  
	
                                                                       	
                                                                	
  
	
                                                                       	
                                                                	
  
	
                                                                       	
                                                                	
  
	
                                                                       	
                                                                	
  
	
                                                                       	
                                                                	
  
	
                                                                       	
                                                                	
         	
  



	
  


Media
Brand	
  reinforcement,	
  premium	
  content,	
  and	
  easy	
  accessibility	
  
are	
  the	
  main	
  business	
  motivators	
  for	
  online	
  media	
  and	
                          Requirements	
  
publishing	
  companies.	
  Relevant	
  information	
  improves	
  time	
  on	
                          •      Real-­‐time	
  indexing	
  of	
  
the	
  site	
  and	
  encourages	
  users	
  to	
  explore	
  related	
  content,	
                             petabytes	
  of	
  structured	
  
boosting	
  subscription	
  rates	
  and	
  site	
  views.	
  These	
  translate	
  into	
  a	
                 and	
  unstructured	
  data	
  	
  
virtuous	
  cycle	
  of	
  additional	
  revenue	
  generation.	
                                        •      Deeper	
  search	
  capability	
  
                                                                                                         •      Improved	
  query	
  
Given	
  that	
  content	
  is	
  the	
  business,	
  the	
  need	
  for	
  a	
  robust	
  search	
  
                                                                                                                response	
  time	
  
application	
  ties	
  directly	
  to	
  competitive	
  advantage.	
  	
  
                                                                                                         •      Reduced	
  	
  infrastructure	
  
Lucene/Solr	
  provides	
  a	
  customized,	
  function	
  rich	
  solution	
  for	
  the	
                     and	
  customization	
  costs	
  
media	
  and	
  publishing	
  industry.	
  It	
  addresses	
  dynamic	
  challenges	
                    	
  
of	
  content	
  diversity,	
  content	
  freshness,	
  and	
  content	
  acquisition	
  ,	
             Solr	
  Solution	
  
and	
  gives	
  companies	
  a	
  platform	
  on	
  which	
  	
  to	
  build	
  a	
  world-­‐class	
     • Reverse	
  indexing	
  
innovative	
  search	
  experience	
  to	
  differentiate	
  themselves	
  in	
  a	
                     • Intelligent,	
  faceted	
  search	
  
highly	
  competitive	
  marketplace.	
  	
                                                                  to	
  enable	
  contextual	
  and	
  
                                                                                                             linguistic	
  relevance	
  
	
  
                                                                                                         • Easy	
  configuration	
  for	
  
	
                              “Solr	
  has	
  done	
  wonders	
  for	
  us.	
                              parsing	
  structured	
  and	
  
	
                              It	
  is	
  easy	
  to	
  understand	
  and	
                                unstructured	
  data	
  
                                deploy,	
  and	
  has	
  reduced	
  our	
                                • Easy	
  and	
  seamless	
  
	
                                                                                                           installation	
  for	
  lower	
  
                                costs	
  drastically.”	
                                                     TCO	
  
	
  
	
                                                         Doug	
  Steigerwald,	
                        • Customization	
  with	
  open	
  
                                                                                                             source	
  code	
  
	
                                                   	
  McClatchy	
  Interactive	
                             	
  
	
  
                                                                                                         	
  
	
  
Success	
  Stories	
  
       •    McClatchy	
  Newspapers	
  
       •    Netflix	
  	
  
       •    Comcast	
  Interactive	
  
       •    MTV	
  Networks,	
  a	
  division	
  of	
  Viacom	
  
                              M
       •    The	
  Motley	
  Fool,	
  fool.com	
  	
  
       •    Fanfeedr.com,	
  personalized	
  sports	
  aggregator	
  
                                	
  


The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                         Page 10
                                                                         	
                                                                          	
  
	
                                                                          	
                                                                          	
  
	
                                                                          	
                                                                          	
  
	
                                                                          	
                                                                          	
  
	
                                                                          	
                                                                          	
  
	
                                                                          	
                                                                          	
  
	
                                                                          	
                                                                          	
  
	
                                                                          	
                                                                          	
           	
  



	
  




                                                                                                                                       	
  
       Case	
  Study	
  2	
  
       	
  
         McClatchy—Leading	
  Newspaper	
  Publisher	
  
         The	
  third	
  largest	
  newspaper	
  publisher	
  in	
  the	
  United	
  States,	
  McClatchy	
  Company	
  owns	
  30	
  daily	
  
         newspapers	
  in	
  29	
  markets	
  across	
  the	
  country.	
  To	
  win	
  online,	
  McClatchy	
  knew	
  it	
  had	
  to	
  have	
  a	
  robust	
  
         search	
  solution,	
  to	
  empower	
  the	
  McClatchy	
  audience	
  with	
  the	
  information	
  they	
  wanted	
  and	
  secure	
  
         loyalty	
  from	
  readers	
  and	
  sponsorships	
  from	
  advertisers.	
  Working	
  with	
  Lucid	
  Imagination,	
  McClatchy	
  
         migrated	
  from	
  proprietary	
  search	
  software	
  to	
  open	
  source	
  and	
  chose	
  Solr	
  for	
  its	
  high	
  performance,	
  
         comprehensive	
  capabilities,	
  and	
  superior	
  value	
  	
  
         Requirements	
  
             • Proliferating	
  content	
  and	
  data	
  sources	
  (text,	
  videos,	
  audios,	
  images),	
  with	
  real-­‐time	
  
                   streaming	
  	
  
             • Empowering	
  end	
  users	
  with	
  ease	
  of	
  use	
  
             • Supporting	
  peak	
  traffic	
  and	
  popular	
  search	
  spikes	
  with	
  consistent	
  performance	
  
             • Providing	
  scalability	
  for	
  a	
  database	
  growing	
  by	
  orders	
  of	
  magnitude	
  annually	
  
             • Providing	
  flexibility	
  to	
  support	
  customization	
  
             • Controlling	
  IT	
  costs	
  while	
  exceeding	
  performance	
  benchmarks	
  of	
  competition	
  
                   	
  
         The	
  Lucene/Solr	
  Solution	
  	
  
             • Deeper	
  content	
  by	
  indexing	
  both	
  structured	
  and	
  unstructured	
  data	
  in	
  real	
  time,	
  effortlessly	
  
             • Indexes	
  millions	
  of	
  documents,	
  with	
  search	
  results	
  delivered	
  in	
  milliseconds	
  	
  
             • User-­‐friendly	
  navigation	
  with	
  drop	
  down	
  filters,	
  faceted	
  navigation,	
  linguistic	
  corrections,	
  
                   etc.	
  	
  	
  
             • Excellent	
  performance,	
  even	
  in	
  peak	
  hours,	
  by	
  load-­‐balancing	
  search	
  requests	
  across	
  servers	
  	
  
             • Scalability	
  without	
  impact	
  on	
  performance	
  	
  
             • High	
  degree	
  of	
  customization,	
  since	
  it’s	
  open	
  source	
  
             • Integration	
  with	
  existing	
  IT	
  infrastructure	
  and	
  eliminates	
  associated	
  license	
  fees	
  to	
  cut	
  costs	
  
             • 8-­‐fold	
  reduction	
  in	
  server	
  footprint	
  	
  



The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                       Page 11
                                                                                                                                                                                                                                           	
                                                       	
  
	
                                                                                                                                                                                                                                            	
                                                       	
  
	
                                                                                                                                                                                                                                            	
                                                       	
  
	
                                                                                                                                                                                                                                            	
                                                       	
  
	
                                                                                                                                                                                                                                            	
                                                       	
  
	
                                                                                                                                                                                                                                            	
                                                       	
  
	
                                                                                                                                                                                                                                            	
                                                       	
  
	
                                                                                                                                                                                                                                            	
                                                       	
        	
  




E-commerce
	
  	
  
E-­‐commerce	
  businesses	
  must	
  provide	
  a	
  compelling	
  shopping	
  experience	
                                                                                                                                                                                    Requirements	
  
in	
  order	
  to	
  maintain	
  brand	
  equity	
  and	
  thrive	
  in	
  a	
  very	
  highly	
  competitive	
                                                                                                                                                                 •      Multidimensional,	
  
market	
  landscape.	
  By	
  reducing	
  the	
  time	
  and	
  effort	
  required	
  to	
  navigate	
                                                                                                                                                                                 dynamic	
  search	
  
available	
  merchandise	
  and	
  find	
  what	
  they	
  want,	
  superior	
  search	
                                                                                                                                                                                        •      Faster	
  results	
  
contributes	
  directly	
  to	
  a	
  satisfying	
  buying	
  experience	
  for	
  customers.	
                                                                                                                                                                                 •      Real-­‐time	
  indexing	
  
Search	
  then	
  translates	
  directly	
  into	
  higher	
  revenues	
  and	
  customer	
                                                                                                                                                                                            of	
  products	
  
loyalty.	
  Instant	
  results,	
  intuitively	
  organized,	
  advanced	
  faceting	
  for	
  easy	
                                                                                                                                                                           •      Faceting	
  and	
  
browsing,	
  synchronizing	
  results	
  with	
  images,	
  and	
  integration	
  with	
  user	
                                                                                                                                                                                       browsing	
  
ratings	
  are	
  among	
  the	
  must	
  have	
  features	
  of	
  an	
  e-­‐commerce	
  search	
                                                                                                                                                                                     capabilities	
  
application.	
                                                                                                                                                                                                                                                                  •      Seamless	
  
Lucene/Solr	
  gives	
  companies	
  the	
  ability	
  to	
  build	
  their	
  sites	
  around	
  the	
                                                                                                                                                                                integration	
  with	
  
concept	
  of	
  “searchendizing”—putting	
  the	
  desired	
  merchandise	
  at	
  the	
  top	
                                                                                                                                                                                       existing	
  IT	
  
of	
  the	
  results	
  list—which	
  can	
  make	
  the	
  difference	
  between	
  sales	
  made	
                                                                                                                                                                                   infrastructure	
  
and	
  sales	
  lost.	
  Faceting,	
  database	
  integration,	
  real-­‐time	
  indexing,	
  and	
                                                                                                                                                                             	
  
query	
  monitoring	
  all	
  enable	
  users	
  to	
  find	
  products	
  they	
  want,	
  driving	
                                                                                                                                                                           Solr	
  Solution	
  
conversion	
  rates	
  and	
  enabling	
  a	
  winning	
  online	
  experience.	
  2	
  	
  
                                                                                                                                                                                                                                                                                •      Faceted	
  search	
  for	
  
	
                                                                                                                                                                                                                                                                                     deeper	
  drill	
  down	
  
	
                                                                                                                                                                                                                                 Online	
  retail	
  sales	
  in	
  the	
            and	
  browsing	
  	
  
                                                                                                                                                                                                                                   B2C	
  market	
  are	
  expected	
           •      Intuitive	
  search	
  
	
  
                                                                                                                                                                                                                                                                                       capabilities	
  for	
  
Success	
  Stories	
  
                                                                                                                                                                                                                                   to	
  reach	
  $340	
  billion	
  by	
              cross-­‐channel	
  
                                                                                                                                                                                                                                   201321	
                                            shopping	
  
                           •                          Buy.com	
  
                           •                          Sears.com	
  
                                                                                                                                                                                                                                   	
                                                  experience	
  	
  
                                                                                                                                                                                                                                            Forrester	
  Research	
             •      System	
  
                           •                          Macys.com	
  
                                                                                                                                                                                                                                                                                       administration	
  tools	
  
                           •                          Zappos.com	
  
                                                                                                                                                                                                                                                                                       for	
  data	
  loading,	
  
                           •                          Advanceautoparts.com	
  
                                                                                                                                                                                                                                                                                       index	
  replication,	
  
                           •                          Dollardays.com	
  
                                                                                                                                                                                                                                                                                       monitoring,	
  logging,	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
                                                         and	
  cache	
  
                                                                                                                                                                                                                                                                                       management	
  
	
                                                                                                                                                                                                                                                                              •      Query	
  monitoring	
  
2	
  “Consumers	
  will	
  spend	
  more	
  than	
  $340	
  billion	
  online	
  by	
  2013,	
  says	
  Forrester,”	
                                                                                                                                                                  for	
  better	
  
	
  Internet	
  Retailer,	
  27	
  November	
  2009,	
  http://www.internetretailer.com/dailyNews.asp?id=32630.	
                                                                                                                                                                      highlighting	
  of	
  
                                                                                                                                                                                                                                                                                       popular	
  products	
  	
  
                                                                                                                                                                                                                                                                                	
  
The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                                                                                                                                                                      Page 12
                                                                              	
                                                                               	
  
	
                                                                               	
                                                                               	
  
	
                                                                               	
                                                                               	
  
	
                                                                               	
                                                                               	
  
	
                                                                               	
                                                                               	
  
	
                                                                               	
                                                                               	
  
	
                                                                               	
                                                                               	
  
	
                                                                               	
                                                                               	
            	
  




                                                                                          	
  	
  	
  	
  	
  	
  	
  	
                         	
  

Case	
  Study	
  3	
  
       Zappos	
  
       Zappos	
  is	
  the	
  premier	
  destination	
  for	
  online	
  shoe	
  shopping.	
  At	
  Zappos,	
  the	
  mission	
  is	
  excellent	
  online	
  customer	
  
       service—customers	
  should	
  be	
  able	
  to	
  browse	
  shoe	
  styles,	
  sizes,	
  shapes,	
  and	
  colors	
  more	
  easily	
  than	
  any	
  other	
  shoe	
  
       store,	
  on	
  or	
  offline.	
  To	
  achieve	
  this,	
  Zappos	
  wanted	
  a	
  robust,	
  flexible,	
  multifunctional	
  search	
  solution/application.	
  
       After	
  evaluating	
  many	
  commercial	
  search	
  technologies,	
  Zappos	
  zeroed	
  in	
  on	
  Solr,	
  working	
  with	
  Lucid	
  Imagination	
  to	
  
       ensure	
  continued,	
  successful	
  deployment.	
  
       Requirements	
  
             •   Simplified,	
  attractive	
  user	
  experience	
  that	
  makes	
  it	
  easy	
  to	
  find	
  and	
  buy	
  
             •   Relevant	
  results,	
  fast	
  
             •   Navigation	
  across	
  attributes,	
  such	
  as	
  size,	
  color,	
  and	
  style	
  for	
  broader	
  and	
  deeper	
  results	
  
             •   Indexing	
  products	
  as	
  they	
  were	
  entered	
  in	
  the	
  catalogs	
  
             •   Cross-­‐functional	
  navigation	
  to	
  give	
  customers	
  a	
  realistic	
  shopping	
  experience	
  
             •   Intuitive	
  intelligence	
  to	
  provide	
  alternate	
  suggestions	
  
             •   Analytical	
  capabilities	
  to	
  drive	
  business	
  strategy	
  
             •   Facilitating	
  control	
  on	
  results	
  
             •   Integration	
  with	
  existing	
  IT	
  infrastructure	
  
                         	
  
       The	
  Solr	
  Solution	
  
             •     Search	
  results	
  in	
  subseconds,	
  across	
  categories	
  
             •     Faceting,	
  for	
  easy	
  browsing	
  and	
  discovery	
  and	
  a	
  compelling	
  user	
  experience	
  	
  
             •     Real-­‐time	
  indexing	
  of	
  products	
  
             •     Synchronization	
  of	
  visuals,	
  specs,	
  filters,	
  and	
  promotions	
  to	
  make	
  shopping	
  experience	
  true	
  to	
  life	
  
             •     Information	
  on	
  user	
  activity	
  to	
  help	
  build	
  strategy	
  on	
  product	
  promotions	
  
             •     Controls	
  to	
  rank	
  	
  popular	
  or	
  high-­‐stock	
  products	
  in	
  results	
  	
  where	
  users	
  are	
  more	
  likely	
  to	
  buy	
  them	
  
             •     Facilitates	
  integration	
  with	
  heterogeneous	
  open	
  source	
  environment	
  




The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                                 Page 13
                                                                    	
                                                                   	
  
	
                                                                     	
                                                                   	
  
	
                                                                     	
                                                                   	
  
	
                                                                     	
                                                                   	
  
	
                                                                     	
                                                                   	
  
	
                                                                     	
                                                                   	
  
	
                                                                     	
                                                                   	
  
	
                                                                     	
                                                                   	
         	
  



	
  
	
  


Job and Career Sites                                                                                                Requirements	
  
	
  
                                                                                                                    •      Linguistic	
  
Job	
  portals	
  are	
  countercyclical	
  to	
  the	
  economy.	
  When	
  the	
  economy	
                              intelligence	
  for	
  
flourishes,	
  posted	
  jobs	
  grow	
  in	
  number;	
  when	
  it	
  sags,	
  candidates	
  flock	
  in	
               more	
  relevant	
  
to	
  post	
  their	
  resumes.	
  Success	
  for	
  an	
  online	
  job	
  portal	
  is	
  tied	
  to	
  the	
            results	
  
efficiency	
  of	
  its	
  search	
  capability—matching	
  résumés	
  to	
  job	
  listings	
  and	
               •      Control	
  search	
  
vice	
  versa—so	
  both	
  employers	
  and	
  prospective	
  employees	
  can	
  zero	
  in	
                            results	
  to	
  maintain	
  
on	
  just	
  the	
  right	
  opportunity.	
                                                                               privacy	
  
For	
  example,	
  an	
  employer	
  may	
  want	
  to	
  navigate	
  through	
  filters	
  to	
                    •      Deeper	
  search	
  
narrow	
  the	
  scope	
  of	
  a	
  candidate	
  search,	
  such	
  as	
  education,	
  previous	
                        capability	
  
employer,	
  salary	
  history,	
  skillsets,	
  etc.;	
  a	
  job	
  seeker	
  may	
  want	
  to	
  expose	
       •      Numeric	
  search	
  
these	
  attributes,	
  but	
  keep	
  a	
  current	
  employer’s	
  name	
  confidential.	
  A	
  job-­‐           •      Faster	
  query	
  
seeker	
  may	
  want	
  to	
  apply	
  to	
  jobs	
  within	
  a	
  particular	
  geographic	
  area.	
                   response	
  
                                                                                                                    •      Reduced	
  
Lucene/Solr	
  not	
  only	
  provides	
  such	
  flexibility	
  but	
  also	
  addresses	
  other	
  
                                                                                                                           infrastructure	
  and	
  
complexities	
  of	
  this	
  industry	
  by	
  enabling	
  linguistic	
  intelligence	
  (such	
  as	
  
                                                                                                                           customization	
  costs	
  
identical	
  acronyms	
  that	
  correspond	
  to	
  different	
  entities;	
  variations	
  in	
                          	
  
spelling,	
  imperfectly	
  constructed	
  search	
  queries);	
  indexing	
  unstructured	
                        Solr	
  Solution	
  
data	
  (résumés);	
  and	
  managing	
  ever-­‐growing	
  data.	
                                                  • Intelligent,	
  faceted	
  
	
                                                                                                                      search	
  to	
  enable	
  
                                                                                                                        contextual	
  and	
  
	
                                        “I	
  think	
  the	
  breakthrough	
  was	
                                   linguistic	
  relevance	
  
	
                                        when	
  we	
  tried	
  it,	
  and	
  we	
                                 • Easy	
  configuration	
  
                                          realized,	
  wow,	
  this	
  thing	
  could	
                                 for	
  parsing	
  
	
                                                                                                                      structured	
  and	
  
                                          really	
  scale.”	
                                                           unstructured	
  data	
  
	
  
                                          	
                                                                        • Easy	
  and	
  seamless	
  
	
                                                  Peter	
  Keegan,	
  Monster.com	
                                   installation	
  for	
  	
  
Success	
  Stories	
                                                                                                    lower	
  TCO	
  
                                                                                                                    • Business	
  process	
  
       •   Monster	
                                                                                                    integration	
  and	
  
       •   The	
  Big	
  Jobs	
                                                                                         Customization	
  with	
  
       •   eBharatJobs	
                                                                                                open	
  source	
  code	
  	
  
       •   Careerjet	
                                                                                                     	
  
                                                                                                                    	
  
                           M
The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                          Page 14
                                          	
  
                                                                                    	
                                                                                 	
  
	
                                                                                     	
                                                                                 	
  
	
                                                                                     	
                                                                                 	
  
	
                                                                                     	
                                                                                 	
  
	
                                                                                     	
                                                                                 	
  
	
                                                                                     	
                                                                                 	
  
	
                                                                                     	
                                                                                 	
  
	
                                                                                     	
                                                                                 	
            	
  




                                                                                                                                        	
  
                                                                                              	
  
                                                                                              	
  

       Monster.com	
  
       Monster	
  is	
  the	
  largest	
  job	
  search	
  engine	
  in	
  the	
  world,	
  with	
  over	
  a	
  million	
  jobs	
  posted	
  at	
  any	
  one	
  time.	
  By	
  2008	
  it	
  had	
  
       150	
  million	
  résumés	
  in	
  its	
  database,	
  serving	
  over	
  63	
  million	
  job	
  seekers	
  per	
  month,	
  now	
  running	
  on	
  average	
  300	
  to	
  
       400	
  queries	
  per	
  second	
  with	
  an	
  average	
  response	
  time	
  of	
  40	
  milliseconds.	
  To	
  provide	
  the	
  highest	
  level	
  of	
  service	
  
       and	
  support	
  to	
  their	
  customers—both	
  employers	
  and	
  job	
  seekers—Monster	
  has	
  an	
  unmatched	
  marketplace	
  for	
  
       employment	
  opportunities,	
  with	
  Lucene-­‐based	
  search	
  at	
  the	
  heart	
  of	
  its	
  business	
  model.	
  
       	
  
       The	
  Requirements	
  	
  
              •  Managing	
  high	
  volumes	
  of	
  data,	
  continually	
  increasing	
  by	
  double	
  digit	
  percentages	
  annually	
  
              •  Maintaining	
  constant	
  inventory	
  updates	
  and	
  providing	
  faster	
  results	
  
              •  Removing	
  technological	
  barriers	
  that	
  limit	
  the	
  scope	
  of	
  information	
  
              •  Enabling	
  end	
  users	
  to	
  refine	
  search	
  and	
  drill	
  deeper	
  without	
  any	
  performance	
  impact	
  
              •  Providing	
  security	
  controls	
  to	
  ensure	
  end	
  user	
  privacy	
  
              •  Facilitating	
  scalability	
  and	
  flexibility	
  in	
  tandem	
  with	
  company’s	
  vision	
  and	
  growth	
  plans	
  
                 	
  
       The	
  Lucene	
  Solution	
  	
  
              •     High	
  volumes	
  of	
  data	
  by	
  clustering	
  data	
  to	
  reduce	
  the	
  index	
  size	
  	
  
              •     Real-­‐time	
  indexing	
  for	
  fresher,	
  faster	
  query	
  results	
  	
  
              •     Intuitive	
  search	
  to	
  enable	
  in-­‐depth	
  cross-­‐functional	
  job	
  and	
  résumé	
  browsing	
  
              •     Faceted	
  search	
  and	
  ‘single	
  click’	
  filters	
  for	
  search	
  refinement	
  	
  
              •     Security	
  controls	
  to	
  manage	
  user	
  information	
  
              •     Unlimited	
  scalability	
  and	
  customization	
  leveraging	
  open	
  source	
  licensing	
  

       	
  
The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010 	
                                                                                                                        Page 15
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications
Open Source Search Applications

Contenu connexe

En vedette

Updated: Marketing your Technology
Updated: Marketing your TechnologyUpdated: Marketing your Technology
Updated: Marketing your TechnologyMarty Kaszubowski
 
Coterie 9 11
Coterie 9 11Coterie 9 11
Coterie 9 11LaRue
 
Hellosong
HellosongHellosong
Hellosongtanica
 
Pangaea providing access to geoscientific data using apache lucene java
Pangaea   providing access to geoscientific data using apache lucene javaPangaea   providing access to geoscientific data using apache lucene java
Pangaea providing access to geoscientific data using apache lucene javaLucidworks (Archived)
 
Ingles haiti
Ingles haitiIngles haiti
Ingles haititanica
 
Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Lucidworks (Archived)
 
"A Study of I/O and Virtualization Performance with a Search Engine based on ...
"A Study of I/O and Virtualization Performance with a Search Engine based on ..."A Study of I/O and Virtualization Performance with a Search Engine based on ...
"A Study of I/O and Virtualization Performance with a Search Engine based on ...Lucidworks (Archived)
 
Integrating Advanced Text Analytics into Solr
Integrating Advanced Text Analytics into SolrIntegrating Advanced Text Analytics into Solr
Integrating Advanced Text Analytics into SolrLucidworks (Archived)
 
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks Lucidworks (Archived)
 
Discover the new techniques about search application
Discover the new techniques about search applicationDiscover the new techniques about search application
Discover the new techniques about search applicationLucidworks (Archived)
 
Azure と世間様
Azure と世間様Azure と世間様
Azure と世間様彰 村地
 

En vedette (19)

Updated: Marketing your Technology
Updated: Marketing your TechnologyUpdated: Marketing your Technology
Updated: Marketing your Technology
 
What’s New in Apache Lucene 3.0
What’s New in Apache Lucene 3.0What’s New in Apache Lucene 3.0
What’s New in Apache Lucene 3.0
 
Coterie 9 11
Coterie 9 11Coterie 9 11
Coterie 9 11
 
Hellosong
HellosongHellosong
Hellosong
 
Pangaea providing access to geoscientific data using apache lucene java
Pangaea   providing access to geoscientific data using apache lucene javaPangaea   providing access to geoscientific data using apache lucene java
Pangaea providing access to geoscientific data using apache lucene java
 
Ingles haiti
Ingles haitiIngles haiti
Ingles haiti
 
Cmd Training Institute - New Premises
Cmd Training Institute - New PremisesCmd Training Institute - New Premises
Cmd Training Institute - New Premises
 
Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"Solr Cluster installation tool "Anuenue"
Solr Cluster installation tool "Anuenue"
 
"A Study of I/O and Virtualization Performance with a Search Engine based on ...
"A Study of I/O and Virtualization Performance with a Search Engine based on ..."A Study of I/O and Virtualization Performance with a Search Engine based on ...
"A Study of I/O and Virtualization Performance with a Search Engine based on ...
 
Integrating Advanced Text Analytics into Solr
Integrating Advanced Text Analytics into SolrIntegrating Advanced Text Analytics into Solr
Integrating Advanced Text Analytics into Solr
 
Lucene rev preso cisco gannu
Lucene rev preso cisco gannuLucene rev preso cisco gannu
Lucene rev preso cisco gannu
 
E learning At The Library
E learning At The LibraryE learning At The Library
E learning At The Library
 
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
Lucene/Solr Revolution 2013: Paul Doscher Opening Remarks
 
Solr & Lucene at Etsy
Solr & Lucene at EtsySolr & Lucene at Etsy
Solr & Lucene at Etsy
 
Search Analytics What? Why? How?
Search Analytics What? Why? How?Search Analytics What? Why? How?
Search Analytics What? Why? How?
 
Short Presentation
Short PresentationShort Presentation
Short Presentation
 
Discover the new techniques about search application
Discover the new techniques about search applicationDiscover the new techniques about search application
Discover the new techniques about search application
 
Azure と世間様
Azure と世間様Azure と世間様
Azure と世間様
 
корея
кореякорея
корея
 

Similaire à Open Source Search Applications

E commerce search strategies
E commerce search strategiesE commerce search strategies
E commerce search strategiesRoger Xia
 
Creating A Lean Business System
Creating A Lean Business SystemCreating A Lean Business System
Creating A Lean Business SystemJillWhinfrey
 
Lean Business System
Lean Business SystemLean Business System
Lean Business Systemgrogans
 
Lean Business System
Lean Business SystemLean Business System
Lean Business Systemmunroc
 
Creatingaleanbusinesssystem 12747094244253 Phpapp01
Creatingaleanbusinesssystem 12747094244253 Phpapp01Creatingaleanbusinesssystem 12747094244253 Phpapp01
Creatingaleanbusinesssystem 12747094244253 Phpapp01tanergokalp
 
Boom startup overview
Boom startup overviewBoom startup overview
Boom startup overviewbjb84
 
Marcom Buzz September- October, 2012
Marcom Buzz September- October, 2012Marcom Buzz September- October, 2012
Marcom Buzz September- October, 2012marcombuzz
 
Software Quality Analysis with Alitheia Core
Software Quality Analysis with Alitheia CoreSoftware Quality Analysis with Alitheia Core
Software Quality Analysis with Alitheia CoreGeorgios Gousios
 
200804 loma resource_customer_centricservice
200804 loma resource_customer_centricservice200804 loma resource_customer_centricservice
200804 loma resource_customer_centricserviceSteven Callahan
 
A 4 A Prods. And Services1lgmc
A 4 A Prods. And Services1lgmcA 4 A Prods. And Services1lgmc
A 4 A Prods. And Services1lgmcjollyroll59
 
PhD Defense - Awareness Support for Knowledge Workers in Research Networks
PhD Defense - Awareness Support for Knowledge Workers in Research NetworksPhD Defense - Awareness Support for Knowledge Workers in Research Networks
PhD Defense - Awareness Support for Knowledge Workers in Research NetworksWolfgang Reinhardt
 
Social Innovation & New Media: 1. Class Introduction
Social Innovation & New Media: 1. Class IntroductionSocial Innovation & New Media: 1. Class Introduction
Social Innovation & New Media: 1. Class IntroductionNam-ho Park
 
Workforce Needs of the California Solar Industry
Workforce Needs of the California Solar IndustryWorkforce Needs of the California Solar Industry
Workforce Needs of the California Solar IndustryJoel West
 
110917 트리즈 강의 서면_pdf
110917 트리즈 강의 서면_pdf110917 트리즈 강의 서면_pdf
110917 트리즈 강의 서면_pdf형희 김
 
Prime Minister Rajoy on Spanish banks bailout; ransom or loan?
Prime Minister Rajoy on Spanish banks bailout; ransom or loan?Prime Minister Rajoy on Spanish banks bailout; ransom or loan?
Prime Minister Rajoy on Spanish banks bailout; ransom or loan?Aleix Cuberes i Diaz
 
Spanish Prime Minister Rajoy statement on the "ransom" or the "loan" for the ...
Spanish Prime Minister Rajoy statement on the "ransom" or the "loan" for the ...Spanish Prime Minister Rajoy statement on the "ransom" or the "loan" for the ...
Spanish Prime Minister Rajoy statement on the "ransom" or the "loan" for the ...ingenia_pro
 
Success foundations
Success foundationsSuccess foundations
Success foundationsConfidential
 
North Canton Master Plan Presentation 2
North Canton Master Plan Presentation 2North Canton Master Plan Presentation 2
North Canton Master Plan Presentation 2ksuCUDC
 

Similaire à Open Source Search Applications (20)

E commerce search strategies
E commerce search strategiesE commerce search strategies
E commerce search strategies
 
Creating A Lean Business System
Creating A Lean Business SystemCreating A Lean Business System
Creating A Lean Business System
 
Lean Business System
Lean Business SystemLean Business System
Lean Business System
 
Lean Business System
Lean Business SystemLean Business System
Lean Business System
 
Creatingaleanbusinesssystem 12747094244253 Phpapp01
Creatingaleanbusinesssystem 12747094244253 Phpapp01Creatingaleanbusinesssystem 12747094244253 Phpapp01
Creatingaleanbusinesssystem 12747094244253 Phpapp01
 
Boom startup overview
Boom startup overviewBoom startup overview
Boom startup overview
 
Marcom Buzz September- October, 2012
Marcom Buzz September- October, 2012Marcom Buzz September- October, 2012
Marcom Buzz September- October, 2012
 
Software Quality Analysis with Alitheia Core
Software Quality Analysis with Alitheia CoreSoftware Quality Analysis with Alitheia Core
Software Quality Analysis with Alitheia Core
 
200804 loma resource_customer_centricservice
200804 loma resource_customer_centricservice200804 loma resource_customer_centricservice
200804 loma resource_customer_centricservice
 
A 4 A Prods. And Services1lgmc
A 4 A Prods. And Services1lgmcA 4 A Prods. And Services1lgmc
A 4 A Prods. And Services1lgmc
 
PhD Defense - Awareness Support for Knowledge Workers in Research Networks
PhD Defense - Awareness Support for Knowledge Workers in Research NetworksPhD Defense - Awareness Support for Knowledge Workers in Research Networks
PhD Defense - Awareness Support for Knowledge Workers in Research Networks
 
Chap Drive 1
Chap Drive 1Chap Drive 1
Chap Drive 1
 
Social Innovation & New Media: 1. Class Introduction
Social Innovation & New Media: 1. Class IntroductionSocial Innovation & New Media: 1. Class Introduction
Social Innovation & New Media: 1. Class Introduction
 
Workforce Needs of the California Solar Industry
Workforce Needs of the California Solar IndustryWorkforce Needs of the California Solar Industry
Workforce Needs of the California Solar Industry
 
110917 트리즈 강의 서면_pdf
110917 트리즈 강의 서면_pdf110917 트리즈 강의 서면_pdf
110917 트리즈 강의 서면_pdf
 
Prime Minister Rajoy on Spanish banks bailout; ransom or loan?
Prime Minister Rajoy on Spanish banks bailout; ransom or loan?Prime Minister Rajoy on Spanish banks bailout; ransom or loan?
Prime Minister Rajoy on Spanish banks bailout; ransom or loan?
 
Spanish Prime Minister Rajoy statement on the "ransom" or the "loan" for the ...
Spanish Prime Minister Rajoy statement on the "ransom" or the "loan" for the ...Spanish Prime Minister Rajoy statement on the "ransom" or the "loan" for the ...
Spanish Prime Minister Rajoy statement on the "ransom" or the "loan" for the ...
 
Success foundations
Success foundationsSuccess foundations
Success foundations
 
Volume example
Volume exampleVolume example
Volume example
 
North Canton Master Plan Presentation 2
North Canton Master Plan Presentation 2North Canton Master Plan Presentation 2
North Canton Master Plan Presentation 2
 

Plus de Lucidworks (Archived)

Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Lucidworks (Archived)
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and SolrLucidworks (Archived)
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessLucidworks (Archived)
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceLucidworks (Archived)
 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineLucidworks (Archived)
 
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchChicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchLucidworks (Archived)
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrLucidworks (Archived)
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchLucidworks (Archived)
 
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Lucidworks (Archived)
 
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...Lucidworks (Archived)
 
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Lucidworks (Archived)
 
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCBig Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCLucidworks (Archived)
 
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCWhat's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCLucidworks (Archived)
 
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCSolr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCLucidworks (Archived)
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCLucidworks (Archived)
 
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCTest Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCLucidworks (Archived)
 
Building a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKBuilding a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKLucidworks (Archived)
 

Plus de Lucidworks (Archived) (20)

Integrating Hadoop & Solr
Integrating Hadoop & SolrIntegrating Hadoop & Solr
Integrating Hadoop & Solr
 
The Data-Driven Paradigm
The Data-Driven ParadigmThe Data-Driven Paradigm
The Data-Driven Paradigm
 
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
Downtown SF Lucene/Solr Meetup - September 17: Thoth: Real-time Solr Monitori...
 
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
SFBay Area Solr Meetup - July 15th: Integrating Hadoop and Solr
 
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for BusinessSFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
SFBay Area Solr Meetup - June 18th: Box + Solr = Content Search for Business
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
 
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search EngineChicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
Chicago Solr Meetup - June 10th: This Ain't Your Parents' Search Engine
 
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with SearchChicago Solr Meetup - June 10th: Exploring Hadoop with Search
Chicago Solr Meetup - June 10th: Exploring Hadoop with Search
 
What's new in solr june 2014
What's new in solr june 2014What's new in solr june 2014
What's new in solr june 2014
 
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache SolrMinneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
Minneapolis Solr Meetup - May 28, 2014: eCommerce Search with Apache Solr
 
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com SearchMinneapolis Solr Meetup - May 28, 2014: Target.com Search
Minneapolis Solr Meetup - May 28, 2014: Target.com Search
 
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
Exploration of multidimensional biomedical data in pub chem, Presented by Lia...
 
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...Unstructured   Or: How I Learned to Stop Worrying and Love the xml, Presented...
Unstructured Or: How I Learned to Stop Worrying and Love the xml, Presented...
 
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
Building a Lightweight Discovery Interface for Chinese Patents, Presented by ...
 
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DCBig Data Challenges, Presented by Wes Caldwell at SolrExchage DC
Big Data Challenges, Presented by Wes Caldwell at SolrExchage DC
 
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DCWhat's New  in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
What's New in Lucene/Solr Presented by Grant Ingersoll at SolrExchage DC
 
Solr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DCSolr At AOL, Presented by Sean Timm at SolrExchage DC
Solr At AOL, Presented by Sean Timm at SolrExchage DC
 
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
 
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DCTest Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
Test Driven Relevancy, Presented by Doug Turnbull at SolrExchage DC
 
Building a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLKBuilding a data driven search application with LucidWorks SiLK
Building a data driven search application with LucidWorks SiLK
 

Dernier

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 

Dernier (20)

DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 

Open Source Search Applications

  • 1.                                                                     The  Case  for  Lucene/Solr:     A  Manager’s  Guide     to  Real  World     Open  Source     Search  Applications           By  Lucid  Imagination    
  • 2.                                                     Abstract   In  today’s  information-­‐driven  environment,  search  is  a  critical  solution  to  problems  when  it  slashes   the  time  and  effort  separating  end  users  from  the  data  they  value.  Search  spans  the  range  of   business  models  and  use  cases—from  driving  direct  customer  sales,  to  analytics  and  business   intelligence,  employee  productivity,  and  reduced  administrative  overhead.  Making  the  best  use  of   search  requires  two  perspectives:  both  a  look  at  the  business  requirements  for  a  search  application   and  a  view  to  new  business  opportunities  created  by  using  search  to  leverage  the  organization’s   content  resources.       Thousands  of  organizations  across  different  sectors  and  business  models  have  harnessed  Apache   Lucene/Solr  to  search  their  rapidly  growing  and  diversifying  content  resources.  Underlying  this   broad  adoption  is  the  extraordinary  power,  scalability,  and  versatility  of  open  source  search   technologies.       This  paper  provides  an  overview  of  both  the  requirements  and  the  opportunities  for  search   applications.  It  then  explores  how  real  world  organizations  are  successfully  using  Lucene/Solr   search  applications  to  meet  those  opportunities,  presenting  how  the  technology  is  used  for  specific   business  models  and  use  cases  across  industries.  In  addition,  it  offers  a  baseline  for  setting  search   requirements  that  managers  and  architects  can  use  to  adopt  Lucene/Solr,  and  adapt  this  open   source  search  technology  to  the  unique  needs  of  their  business.                       ©  2010,  Lucid  Imagination   The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page ii
  • 3.                                                     Table  of  Contents   Introduction ............................................................................................................................................................... 1   Understanding  Search  Opportunities  and  Requirements ...................................................................... 2   What  Data  and  Documents  Are  You  Searching? ................................................................................ 3   Who  Needs  the  Results  and  Why? ........................................................................................................... 3   Where  Is  Search  Integrated  with  IT  Infrastructure? ....................................................................... 5   How  Is  the  Search  Interface  Presented  to  the  User?........................................................................ 5   The  Real  World:  Applications  and  Case  Studies ......................................................................................... 7   Yellow  Pages,  Local  Search,  and  Searching  Classifieds........................................................................ 8   Media .......................................................................................................................................................................10   E-­‐commerce..........................................................................................................................................................12   Job  and  Career  Sites ..........................................................................................................................................14   Libraries,  Archives,  and  Museums  (LAMs)  Search ..............................................................................16   Social  Media  Search...........................................................................................................................................18   Enterprise  (Intranet)  Search.........................................................................................................................21   Business  Use  Case  Matrix ...................................................................................................................................23   Appendix:  Lucene/Solr  Features  and  Benefits..........................................................................................24     The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page iii
  • 4.                                                   Introduction As  fast  as  companies,  communities,  and  consumers  produce  data—about  each  other,  products,   opinions,  research,  and  everything  else  imaginable—they  need  faster,  more  versatile  search   capabilities  to  find  the  information  they  need  to  create  opportunities  for  competitive  advantage.  In   today’s  information-­‐driven  environment,  search  addresses  the  critical  problems  created  by  the   explosive  growth  of  content  by  slashing  the  time  and  effort  users  expend  in  finding  data  they  value.   Search  spans  the  range  of  business  models  and  use  cases:  from  driving  direct  customer  sales,  to   analytics  and  business  intelligence,  employee  productivity,  and  reduced  administrative  overhead.     Apache  Lucene/Solr1  open  source  search  technology  has  been  implemented  across  the  broadest   range  of  applications  and  business  models—and  likely  in  ways  that  can  fit  the  needs  of  your   organization.  In  successful  operation  today  at  thousands  of  enterprises,  Lucene/Solr  technology   scales  from  tens  of  thousands  to  hundreds  and  billions  of  documents;  searches  data  that  is   structured,  unstructured,  and  in  combination;  data  inside  and  outside  the  firewall;  and  ranges  in   use  from  a  simple  website  search  box  through  sophisticated  faceted  navigation.  It  addresses  equally   diverse  business  processes  and  mission  critical  applications.  Across  the  spectrum,  Lucene/Solr   helps  users  find,  make  sense  of,  and  act  upon  information  quickly  and  efficiently.   In  this  white  paper,  we’ll  review  real-­‐world  case  studies  for  Lucene/Solr  functionality  across   business  sectors  to  demonstrate  its  versatility  and  varied  applicability.  The  diversity  of  examples   provides  strong  evidence  of  Lucene/Solr’s  flexibility  and  power  as  a  search  technology.  The   examples  also  attest  to  the  innovation  and  transparency  inherent  to  the  open  source  development   model.  Our  focus  is  on  familiarizing  the  audience  of  business  managers  and  application  owners  with   existing  Lucene/Solr  applications;  the  substantial  technical  advantages  to  developers  are  covered   elsewhere.                                                                                                                     1 Lucene and Solr are complementary technologies that offer very similar underlying capabilities; Solr is the Lucene Search Server. Since Lucene serves as the core of Solr’s search capabilities, this paper refers to the two as Lucene/Solr. For more information, see the Appendix. The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 1
  • 5.                                                   We’ll  first  survey  the  key  requirements  and  business  use  cases  of  search  and  then  look  at  where   they  are  built  into  search  applications.  Our  objective  is  to  provide  business  managers  and   application  owners  with  a  broad  perspective  on  how  Lucene/Solr  search  technology  is  used  to  build   solutions  to  compelling  business  problems.  In  the  Appendix,  we  provide  an  overview  of   Lucene/Solr’s  key  features  and  benefits,  with  a  basic  outline  of  the  capabilities  offered  to  meet  the   broadest  range  of  business  needs.     Understanding Search Opportunities and Requirements Search  technology  has  come  a  long  way  from  its  roots  in  matching  keywords  with  appearance  in   documents  and  obtaining  undifferentiated  results.  Search  today  empowers  users  by  delivering   actionable  information  quickly  and  efficiently,  across  multiple,  diverse  sources  of  data.  The   business  use  cases  range  from  executing  mission  critical  commercial  transactions  (e.g.,  e-­‐commerce   sites)  to  unlocking  employee  and  end-­‐user  productivity  in  the  search  for  a  single  relevant  document   (e.g.,  enterprise  search).     Given  the  breadth  of  capability  of  the  problem  domain,  it’s  useful  to  look  at  search  and  ask  two   fundamental  questions:  “How  it  can  it  solve  my  business  problems?”  and  “What  new  business   opportunities  can  search  solve  for?”   In  considering  how  search  technology  solves  business  problems,  it  is  useful  to  start  with  an   elucidation  of  the  requirements  you’ll  need  to  consider  for  your  search  application.  At  the  same   time,  be  sure  to  look  more  broadly  at  the  capabilities  that  Lucene/Solr  offers,  as  it  can  help  open  up   new  frontiers  for  incorporating  search  and  leveraging  more  value  from  data  repositories.     Starting  with  some  basic  questions—what,  who,  how,  and  where—you  can  clarify  the  high-­‐level   business  requirements  specific  to  your  business  needs,  which  in  turn  allow  you  to  make  the  best   decisions  for  your  search  application.  The  process  of  looking  at  the  fundamentals  also  raises  new   questions  about  how  and  where  the  search  technology  offered  by  Lucene  and  Solr  can  create  new   business  opportunities.   Let’s  look  at  four  fundamental  questions  you  should  address  in  understanding  search  opportunities   and  requirements:   • What  data  and  documents  are  you  searching?     • Who  needs  the  results  and  why?     • Where  is  search  integrated  with  IT  Infrastructure?         • How  is  the  search  interface  presented  to  the  user?     The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 2
  • 6.                                                   What Data and Documents Are You Searching? Business  today  is  driven  more  than  ever  by  the  end-­‐users’  creation  and  consumption  of  real-­‐time   information.  A  key  differentiating  capability  of  search  technology  is  ingesting  a  broad  range  of   content  types  and  processing  large  collections  of  diverse  data  in  real  time  in  order  to  deliver   actionable  information.  Two  aspects  to  consider:   • Types  of  Content   Content  comes  in  multiple  formats:  HTML  pages,  XML  files,  PDFs,  images,  PowerPoint   presentations,  Excel  spreadsheets,  Word  documents,  log  files,  multimedia  content,  and   more.  Content  resides  in  various  repositories,  including  databases,  file  servers,  content   management  systems,  archiving  systems,  collaboration  applications,  and  employee   desktops  and  laptops.  Search  technology  must  be  able  to  locate,  organize,  and  aggregate   data  whatever  its  form  or  location.     • Frequency  of  Updating  Content   Organizations  update  content  at  varying  intervals,  driven  by  differing  business  processes   and  models—social  media  or  news  applications  have  real-­‐time  content  need,  whereas  an  e-­‐ commerce  application  might  re-­‐index  in  response  to  new  inventory  on  a  batch  basis  and  a   research  institution  might  add  to  its  collection  less  often  still.  Search  applications  need  to  be   adaptable  to  the  differences  in  content  change  frequency.   Who Needs the Results and Why? Business  search  puts  a  high  priority  on  end  user  experience  and  results  in  which  the  searched   content  is  tuned  to  the  unique  needs  of  each  user.  Because,  after  all,  the  human  dimension—the   usefulness  of  results  and  the  efficacy  of  interaction—is  the  acid  test  of  a  search  application.  Internet   search  applications  like  Google,  Yahoo,  and  Bing  are  now  common  and  mature.  They  have  raised   user  expectations  about  key  qualities  of  the  search  experience...but  they  solve  a  very  different   problem.     While  Internet  searches  can  produce  millions  of  results  in  milliseconds,  they  rely  on  measures  like   website  popularity  or  URLs  and  domain  names—not  relevant  and  not  generally  applicable  to   purpose-­‐built  applications  for  businesses.  What’s  more,  they  rely  on  generalizing  relevancy  for  a   global  population  of  all  Internet  users,  without  being  tied  to  business  rules,  or  business  process   logic,  or  the  opportunity  cost  of  improved  precision  for  a  specific  set  of  data  or  search  users.   Business  search  applications  cannot  rely  on  such  brute  force  coarse  approaches  to  tune  their   results.  They  need  far  more  control  and  precision.  They  have  to  be  able  to  deliver  highly  useful   results  while  matching,  if  not  exceeding,  the  levels  of  user  experience  that  people  have  come  to   expect  by  virtue  of  their  daily  interactions  with  commercial  search  engines.  Key  points  of   consideration  from  a  business  perspective  are:   The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 3
  • 7.                                                   • Relevance   Relevance  is  entirely  a  factor  of  the  goals  of  the  search  application’s  users.  The  application   must  have  the  mechanisms  to  recognize  the  subjective  needs  of  users  and  tune  results   accordingly.  It  must  also  provide  easier  ways  to  narrow  search  criteria  without  requiring   users  to  come  up  with  perfect  query  terms.  Flexibility  for  drilling  deeper  will  make  results   richer  and  valuable.  Mechanisms  to  apply  filters,  proximity  values,  and  sorting  parameters   to  narrow  search  scope  can  also  lead  to  a  richer  set  of  more  useful  results,  with  less  time   and  effort.   • Cost  of  Relevance     As  business  goals  are  driven  by  revenue  opportunities  and  cost  savings,  it  is  critical  to  tie   relevance  to  the  economics  of  the  business.  For  example,  a  public-­‐facing  retail  site  should   focus  on  matching  merchandise  to  search,  site  stickiness,  and  customer  loyalty.  It  requires   search  technology  that  streamlines  and  simplifies  the  shopping  experience  with  relevant   results  directly  contributing  to  sales  revenue.  For  knowledge  workers,  internal  search   applications  should  help  make  employees  more  productive  by  reducing  the  amount  of  time   and  effort  to  find  documents  they  need  to  do  their  jobs.  Multiple  studies  show  that   information  workers  can  spend  20–30%  of  their  time  searching  for  information.   • Precision  Ranking   Result  accuracy,  sorted  by  attributes  like  relevance,  date,  field,  or  any  document  property   feature,  makes  the  search  process  better.  End  users  generally  abandon  a  search  before   tackling  the  fine  points  of  Boolean  logic  or  scrolling  for  a  result  buried  too  far  down.     • Query  Response  Speed   Today,  5–7  seconds  is  the  typical  threshold  for  end-­‐user  patience.  Too  much  wait  time  for   search  results  frustrates  users,  and  causes  them  to  abandon  pages.  Fast,  relevant  results   cannot  be  limited  by  search  technology  hamstrung  by  data  influx  or  query  overload.  Query   response  time  should  also  work  hand-­‐in-­‐hand  with  the  refinement  of  multiple  search   attributes,  so  that  increasingly  complex  queries  do  not  extract  a  performance  penalty.   The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 4
  • 8.                                                   Where Is Search Integrated with IT Infrastructure? Useful,  valuable  search  technology  rarely  exists  in  isolation.  Searched  data  is  transformed  into   actionable  information  when  it  is  integrated  with  the  organization’s  information  infrastructure:   business  process  to  business  intelligence  to  content  management  systems.  A  robust  search   technology  must  be  customizable  to  integrate  with  the  existing  systems  seamlessly.     • Application  Integration   A  key  requirement  for  a  search  application  is  its  extensibility  for  integration  with  existing   infrastructure  and  applications  like  content  management  systems,  databases,  and  the  full   range  of  business  processes  and  applications.  It  should  have  interfaces  that  support   ingestion  of  data  as  well  as  delivery  of  results  in  readily  consumable  formats—because  in   many  cases,  results  are  consumed  by  other  applications,  not  a  human.   • Scalability   We  can  assume  that  data  will  change  and  grow.  So  scalability  is  a  key  factor  for  search   application.  Applications  should  grow  to  address  future  needs  without  penalties  for  the   breadth  of  data  or  for  the  count  of  documents  indexed.  The  search  application  should  be   able  to  grow  with  the  requirements  of  the  organization,  without  needing  additional  large   investments  in  hardware  to  match  the  pace  of  growth.  Proprietary  search  vendors  often   charge  for  search  by  the  number  of  documents  indexed.  In  a  world  where  constantly   expanding  content  growth  is  the  norm,  such  costs  can  be  a  real  and  substantial  drag  on   the  cost  of  ownership  for  search  applications,  many  times  resulting  in  negative  return.     • Security   Every  organization  has  its  own  security  requirements  and  access  controls.  Search   technologies  need  to  comply  with  the  security  policies  of  the  enterprise,  controlling   results  that  have  restricted  access.  The  search  technology  should  also  be  able  to  make  use   of  document-­‐level  security  from  other  sources.     How Is the Search Interface Presented to the User? The  user  interface  is  where  search  delivers  on  findability  and  presents  actionable  results.  The   search  application  is  only  as  good  as  the  convenience  of  submitting  queries,  reviewing  and  refining   results,  and  finding  information.  Key  aspects  to  consider:     The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 5
  • 9.                                                   • Navigation   Users  benefit  from  guidance  that  makes  their  queries  more  productive.  Techniques  such  as   faceted  search  with  result  clustering,  advance  hinting  (“did  you  mean”),  “more  like  this,”   and  drop  down  menus  for  setting  search  scope  help  users  achieve  desired  results  faster,   making  a  search  application  both  user-­‐  and  information-­‐friendly.  It  is  also  important  to   allow  users  to  draw  associative  connections  between  results—using  the  technology  to   uncover  relationships  and  discover  more  about  what  they  were  seeking  than  they  knew  at   the  outset.     The  NetFlix  search   application  is  powered   by  Solr;  it  adds  the  fuzzy   dimension  to  search,   with  auto-­completion  of   movie  names,  correction   of  misspelled  names  of   actors,  and  suggests   titles  closest  to  the   query.  As  a  result,  85%   of  users  have  found  the   movie  they  were  looking   for  ranked  at  the  #1  spot   in  the  results.         • Discovery   Search  application  functionality  should  extend  beyond  the  generic  presentation  of  a  result   list  of  documents  that  contain  a  keyword.  Highlighting  keywords  in  searched  results,   expanding  searches  with  synonyms  and  spell  checking,  and  offering  users  ways  to  learn  a   bit  more  about  documents  in  the  results  without  having  to  load  the  document  are  great   ways  to  significantly  improve  usability.       • Intuitive  Intelligence   Search  applications  must  go  beyond  keyword  search  to  help  users  retrieve  accurate   information  even  when  they  are  not  sure  of  the  best  keywords.  Additionally,  they  should   reduce  misinterpretations  where  homonyms,  spelling  errors,  and  ambiguous  keywords  are   involved  (e.g.,  is  “apple”  a  fruit  or  a  computer  company?).   The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 6
  • 10.                                                   The Real World: Applications and Case Studies With  an  understanding  of  the  fundamentals  of  search  business  applications  in  hand,  it  is   helpful  to  gain  additional  context  on  business  usage  through  a  survey  of  organizations  that   have  successfully  used  Lucene/Solr  for  powerful  search  applications.     All  of  these  cases  were  built  on  the  capability  of  Lucene/Solr  to  provide  innovative,  high-­‐ performance,  cross-­‐platform,  feature-­‐rich  search  technology  suitable  for  nearly  every   application.  By  powering  diverse  search  applications  for  thousands  of  organizations  such   as  AT&T,  Zappos,  McClatchy,  Smithsonian,  MTV  Networks,  LinkedIn,  MySpace,  Comcast,   Monster,  Netflix,  and  many  more,  Lucene/Solr  has  provided  mission  critical  capability  that   turns  search  into  a  robust  competitive  advantage.     For  these  organizations,  Lucene/Solr  solutions  regularly  index  and  search  hundreds  of   millions  of  documents  with  subsecond  response  time,  unencumbered  by  costly  licensing  or   vendor  lock-­‐in.  Together  they  represent  a  compelling  argument  for  the  broad  applicability   of  Lucene/Solr  across  the  full  range  of  business  opportunities  and  search  needs.  Business   use  case  studies  we’ll  review  include:   • Yellow  Pages,  Local  Search,  and  Searching  Classifieds   • Media   • E-­‐commerce     • Job  and  Career  Sites     • Libraries,  Archives,  and  Museums  (LAMs)  Search     • Social  Media  Search     • Enterprise  (Intranet)  Search     The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 7
  • 11.                                                   Yellow Pages, Local Search, and Searching Requirements     Classifieds In  the  business  of  online  local  search,  geographic-­‐based  (location)   • Intelligent  results  going   beyond  keyword  search   relevance  generates  competitive  advantage.  Online  directories   need  to  provide  a  rich,  interactive  search  experience  to  users  to   • Deeper,  faceted   increase  site  views  and  stickiness,  which  in  turn  translates  into   navigation   increased  advertising  revenue.  Simplified  location-­‐based  search,   • Seamless  integration   with  latest  Web  2.0   intuitive  faceted  query  response,  and  data  mashups  are  a  few   features  that  define  search  functionality  for  an  online  directory.   tools   • Lower  IT-­‐related  costs   Lucene/Solr  solutions  offer  accurate  search  results,  factoring  in   • Geocentric  user   location,  users’  reviews,  and  ratings,  alongside  paid  advertising.  By   experience   taking  advantage  of  Solr’s  open  source  model—with  search   • Search  numeric  values   algorithms  that  are  completely  transparent—companies  can  invest     in  configuring  their  search  solutions  to  match  their  business  logic,   Solr  Solution   rather  than  trying  to  infer  or  pay  for  exposure  proprietary  back-­‐ end  logic.     • Customizable  Search   Index  which  can  be     tuned  transparently  to     Internet  Yellow  pages  and  local   account  for  key     online  search  is  forecast  to   findability  drivers   • Drop  down  filters  for   grow  to  $27.8  billion  in  2011.     narrowing  or  widening     The  Kelsey  Report1   the  scope  of  search   • Seamless  integration   Success  Stories   with  existing   technologies   • YP.com,  a  division  of  AT&T  Interactive   • Native  numeric   • Zvents.com,  local  event  search  service     encoding  and  search   • Yelp.com,  the  community  local  search  site   capabilities     M • Reduced  server     footprint  for  lower  TCO     than  most  commercial     vendors         1The  Kelsey  Group’s  Global  Print  Yellow  Pages,  Internet  Yellow  Pages  and  Local  Search  Five     Year  Outlook   The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 8
  • 12.                                                             Case  Study  1     yp.com  by  AT&T  Interactive       AT&T  Interactive  is  an  online  and  mobile  search  and  advertising  company.  Their  leading-­‐edge  portal,  yp.com—an     online  business  listing  and  advertising  site—was  originally  implemented  with  a  commercial  proprietary  search     application.  It  faced  issues  of  scalability,  vendor  lock-­‐in,  and  performance.  With  help  from  Lucid  Imagination,  AT&T   successfully  migrated  to  a  Solr-­‐based  search  solution  that  leveraged  the  flexibility  of  open  source  without   compromising  features  and  functionality.    And  they  did  so  with  a  much  smaller  budget.     Business  Needs   • Addressing  the  need  to  factor  in  location  to  support  geographic  search,  and  include  relevant  comments   • Striking  a  balance  between  organic  search  and  advertised  content   • Indexing  highly  unstructured  content  such  as  user  comments     • Increasing  relevancy  of  results  and  boosting  paid  search  results  for  preferential  placement  of  advertisers   • Linguistic  support  to  enable  search  experience,  such  as  spellchecking,  synonyms,  find-­‐similar,  etc.   • Integrating  with  latest  Web  2.0  tools   • Reducing  server  footprint     The  Solr  Solution     • Context-­‐specific  relevancy,  geographic  proximity,  ad  placement,  and  user  comments   • Faceting,  drop  down  filters  to  narrow/widen  the  scope  of  search     • Functional  support  for  creating  new  features     • Spell-­‐correction,  and  location-­‐optimized  search  results  to  show  users  businesses  nearest  to  them  first   • Seamless  integration  with  many  Web  2.0  tools  to  create  innovative  features  and  mashups   • Lowers  TCO  by  reducing  the  number  of  search  servers  from  120  to  two  dozen  servers     The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 9
  • 13.                                                     Media Brand  reinforcement,  premium  content,  and  easy  accessibility   are  the  main  business  motivators  for  online  media  and   Requirements   publishing  companies.  Relevant  information  improves  time  on   • Real-­‐time  indexing  of   the  site  and  encourages  users  to  explore  related  content,   petabytes  of  structured   boosting  subscription  rates  and  site  views.  These  translate  into  a   and  unstructured  data     virtuous  cycle  of  additional  revenue  generation.   • Deeper  search  capability   • Improved  query   Given  that  content  is  the  business,  the  need  for  a  robust  search   response  time   application  ties  directly  to  competitive  advantage.     • Reduced    infrastructure   Lucene/Solr  provides  a  customized,  function  rich  solution  for  the   and  customization  costs   media  and  publishing  industry.  It  addresses  dynamic  challenges     of  content  diversity,  content  freshness,  and  content  acquisition  ,   Solr  Solution   and  gives  companies  a  platform  on  which    to  build  a  world-­‐class   • Reverse  indexing   innovative  search  experience  to  differentiate  themselves  in  a   • Intelligent,  faceted  search   highly  competitive  marketplace.     to  enable  contextual  and   linguistic  relevance     • Easy  configuration  for     “Solr  has  done  wonders  for  us.   parsing  structured  and     It  is  easy  to  understand  and   unstructured  data   deploy,  and  has  reduced  our   • Easy  and  seamless     installation  for  lower   costs  drastically.”   TCO       Doug  Steigerwald,   • Customization  with  open   source  code      McClatchy  Interactive           Success  Stories   • McClatchy  Newspapers   • Netflix     • Comcast  Interactive   • MTV  Networks,  a  division  of  Viacom   M • The  Motley  Fool,  fool.com     • Fanfeedr.com,  personalized  sports  aggregator     The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 10
  • 14.                                                       Case  Study  2     McClatchy—Leading  Newspaper  Publisher   The  third  largest  newspaper  publisher  in  the  United  States,  McClatchy  Company  owns  30  daily   newspapers  in  29  markets  across  the  country.  To  win  online,  McClatchy  knew  it  had  to  have  a  robust   search  solution,  to  empower  the  McClatchy  audience  with  the  information  they  wanted  and  secure   loyalty  from  readers  and  sponsorships  from  advertisers.  Working  with  Lucid  Imagination,  McClatchy   migrated  from  proprietary  search  software  to  open  source  and  chose  Solr  for  its  high  performance,   comprehensive  capabilities,  and  superior  value     Requirements   • Proliferating  content  and  data  sources  (text,  videos,  audios,  images),  with  real-­‐time   streaming     • Empowering  end  users  with  ease  of  use   • Supporting  peak  traffic  and  popular  search  spikes  with  consistent  performance   • Providing  scalability  for  a  database  growing  by  orders  of  magnitude  annually   • Providing  flexibility  to  support  customization   • Controlling  IT  costs  while  exceeding  performance  benchmarks  of  competition     The  Lucene/Solr  Solution     • Deeper  content  by  indexing  both  structured  and  unstructured  data  in  real  time,  effortlessly   • Indexes  millions  of  documents,  with  search  results  delivered  in  milliseconds     • User-­‐friendly  navigation  with  drop  down  filters,  faceted  navigation,  linguistic  corrections,   etc.       • Excellent  performance,  even  in  peak  hours,  by  load-­‐balancing  search  requests  across  servers     • Scalability  without  impact  on  performance     • High  degree  of  customization,  since  it’s  open  source   • Integration  with  existing  IT  infrastructure  and  eliminates  associated  license  fees  to  cut  costs   • 8-­‐fold  reduction  in  server  footprint     The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 11
  • 15.                                                   E-commerce     E-­‐commerce  businesses  must  provide  a  compelling  shopping  experience   Requirements   in  order  to  maintain  brand  equity  and  thrive  in  a  very  highly  competitive   • Multidimensional,   market  landscape.  By  reducing  the  time  and  effort  required  to  navigate   dynamic  search   available  merchandise  and  find  what  they  want,  superior  search   • Faster  results   contributes  directly  to  a  satisfying  buying  experience  for  customers.   • Real-­‐time  indexing   Search  then  translates  directly  into  higher  revenues  and  customer   of  products   loyalty.  Instant  results,  intuitively  organized,  advanced  faceting  for  easy   • Faceting  and   browsing,  synchronizing  results  with  images,  and  integration  with  user   browsing   ratings  are  among  the  must  have  features  of  an  e-­‐commerce  search   capabilities   application.   • Seamless   Lucene/Solr  gives  companies  the  ability  to  build  their  sites  around  the   integration  with   concept  of  “searchendizing”—putting  the  desired  merchandise  at  the  top   existing  IT   of  the  results  list—which  can  make  the  difference  between  sales  made   infrastructure   and  sales  lost.  Faceting,  database  integration,  real-­‐time  indexing,  and     query  monitoring  all  enable  users  to  find  products  they  want,  driving   Solr  Solution   conversion  rates  and  enabling  a  winning  online  experience.  2     • Faceted  search  for     deeper  drill  down     Online  retail  sales  in  the   and  browsing     B2C  market  are  expected   • Intuitive  search     capabilities  for   Success  Stories   to  reach  $340  billion  by   cross-­‐channel   201321   shopping   • Buy.com   • Sears.com     experience     Forrester  Research   • System   • Macys.com   administration  tools   • Zappos.com   for  data  loading,   • Advanceautoparts.com   index  replication,   • Dollardays.com   monitoring,  logging,                                                                                                                   and  cache   management     • Query  monitoring   2  “Consumers  will  spend  more  than  $340  billion  online  by  2013,  says  Forrester,”   for  better    Internet  Retailer,  27  November  2009,  http://www.internetretailer.com/dailyNews.asp?id=32630.   highlighting  of   popular  products       The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 12
  • 16.                                                                     Case  Study  3   Zappos   Zappos  is  the  premier  destination  for  online  shoe  shopping.  At  Zappos,  the  mission  is  excellent  online  customer   service—customers  should  be  able  to  browse  shoe  styles,  sizes,  shapes,  and  colors  more  easily  than  any  other  shoe   store,  on  or  offline.  To  achieve  this,  Zappos  wanted  a  robust,  flexible,  multifunctional  search  solution/application.   After  evaluating  many  commercial  search  technologies,  Zappos  zeroed  in  on  Solr,  working  with  Lucid  Imagination  to   ensure  continued,  successful  deployment.   Requirements   • Simplified,  attractive  user  experience  that  makes  it  easy  to  find  and  buy   • Relevant  results,  fast   • Navigation  across  attributes,  such  as  size,  color,  and  style  for  broader  and  deeper  results   • Indexing  products  as  they  were  entered  in  the  catalogs   • Cross-­‐functional  navigation  to  give  customers  a  realistic  shopping  experience   • Intuitive  intelligence  to  provide  alternate  suggestions   • Analytical  capabilities  to  drive  business  strategy   • Facilitating  control  on  results   • Integration  with  existing  IT  infrastructure     The  Solr  Solution   • Search  results  in  subseconds,  across  categories   • Faceting,  for  easy  browsing  and  discovery  and  a  compelling  user  experience     • Real-­‐time  indexing  of  products   • Synchronization  of  visuals,  specs,  filters,  and  promotions  to  make  shopping  experience  true  to  life   • Information  on  user  activity  to  help  build  strategy  on  product  promotions   • Controls  to  rank    popular  or  high-­‐stock  products  in  results    where  users  are  more  likely  to  buy  them   • Facilitates  integration  with  heterogeneous  open  source  environment   The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 13
  • 17.                                                       Job and Career Sites Requirements     • Linguistic   Job  portals  are  countercyclical  to  the  economy.  When  the  economy   intelligence  for   flourishes,  posted  jobs  grow  in  number;  when  it  sags,  candidates  flock  in   more  relevant   to  post  their  resumes.  Success  for  an  online  job  portal  is  tied  to  the   results   efficiency  of  its  search  capability—matching  résumés  to  job  listings  and   • Control  search   vice  versa—so  both  employers  and  prospective  employees  can  zero  in   results  to  maintain   on  just  the  right  opportunity.   privacy   For  example,  an  employer  may  want  to  navigate  through  filters  to   • Deeper  search   narrow  the  scope  of  a  candidate  search,  such  as  education,  previous   capability   employer,  salary  history,  skillsets,  etc.;  a  job  seeker  may  want  to  expose   • Numeric  search   these  attributes,  but  keep  a  current  employer’s  name  confidential.  A  job-­‐ • Faster  query   seeker  may  want  to  apply  to  jobs  within  a  particular  geographic  area.   response   • Reduced   Lucene/Solr  not  only  provides  such  flexibility  but  also  addresses  other   infrastructure  and   complexities  of  this  industry  by  enabling  linguistic  intelligence  (such  as   customization  costs   identical  acronyms  that  correspond  to  different  entities;  variations  in     spelling,  imperfectly  constructed  search  queries);  indexing  unstructured   Solr  Solution   data  (résumés);  and  managing  ever-­‐growing  data.   • Intelligent,  faceted     search  to  enable   contextual  and     “I  think  the  breakthrough  was   linguistic  relevance     when  we  tried  it,  and  we   • Easy  configuration   realized,  wow,  this  thing  could   for  parsing     structured  and   really  scale.”   unstructured  data       • Easy  and  seamless     Peter  Keegan,  Monster.com   installation  for     Success  Stories   lower  TCO   • Business  process   • Monster   integration  and   • The  Big  Jobs   Customization  with   • eBharatJobs   open  source  code     • Careerjet       M The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 14  
  • 18.                                                         Monster.com   Monster  is  the  largest  job  search  engine  in  the  world,  with  over  a  million  jobs  posted  at  any  one  time.  By  2008  it  had   150  million  résumés  in  its  database,  serving  over  63  million  job  seekers  per  month,  now  running  on  average  300  to   400  queries  per  second  with  an  average  response  time  of  40  milliseconds.  To  provide  the  highest  level  of  service   and  support  to  their  customers—both  employers  and  job  seekers—Monster  has  an  unmatched  marketplace  for   employment  opportunities,  with  Lucene-­‐based  search  at  the  heart  of  its  business  model.     The  Requirements     • Managing  high  volumes  of  data,  continually  increasing  by  double  digit  percentages  annually   • Maintaining  constant  inventory  updates  and  providing  faster  results   • Removing  technological  barriers  that  limit  the  scope  of  information   • Enabling  end  users  to  refine  search  and  drill  deeper  without  any  performance  impact   • Providing  security  controls  to  ensure  end  user  privacy   • Facilitating  scalability  and  flexibility  in  tandem  with  company’s  vision  and  growth  plans     The  Lucene  Solution     • High  volumes  of  data  by  clustering  data  to  reduce  the  index  size     • Real-­‐time  indexing  for  fresher,  faster  query  results     • Intuitive  search  to  enable  in-­‐depth  cross-­‐functional  job  and  résumé  browsing   • Faceted  search  and  ‘single  click’  filters  for  search  refinement     • Security  controls  to  manage  user  information   • Unlimited  scalability  and  customization  leveraging  open  source  licensing     The Case for Lucene/Solr: Real World Search Applications A Lucid Imagination White Paper • January 2010   Page 15