"Thousands of organizations around the world, including AT&T, Sears, Ford, Verizon, The Guardian, Elsevier, Cisco, Macy’s and more have found their solution: Lucene/Solr open source, the world’s most popular search technology. Our new white paper “A Manager’s Guide to Real World Open Source Search Applications” provides numerous case studies across various industries and business models to show how real-world businesses have turned Lucene/Solr open source search into competitive advantage.http://www.lucidimagination.com/files/file/whitepaper/LIWP_LuceneSolrRealWorldSearch.pdf
"
Table of Contents
Introduction
Understanding Search Opportunities and Requirements
  What Data and Documents Are You Searching?
  Who Needs the Results and Why?
  Where Is Search Integrated with IT Infrastructure?
  How Is the Search Interface Presented to the User?
The Real World: Applications and Case Studies
  Yellow Pages, Local Search, and Searching Classifieds
  Media
  E-commerce
  Job and Career Sites
  Libraries, Archives, and Museums (LAMs) Search
  Social Media Search
  Enterprise (Intranet) Search
Business Use Case Matrix
Appendix: Lucene/Solr Features and Benefits
The Case for Lucene/Solr: Real World Search Applications
A Lucid Imagination White Paper • January 2010
Introduction
As fast as companies, communities, and consumers produce data (about each other, products, opinions, research, and everything else imaginable), they need faster, more versatile search capabilities to find the information they need to create opportunities for competitive advantage.

In today’s information-driven environment, search addresses the critical problems created by the explosive growth of content by slashing the time and effort users expend in finding data they value. Search spans the range of business models and use cases: from driving direct customer sales, to analytics and business intelligence, employee productivity, and reduced administrative overhead.

Apache Lucene/Solr¹ open source search technology has been implemented across the broadest range of applications and business models, and likely in ways that can fit the needs of your organization. In successful operation today at thousands of enterprises, Lucene/Solr technology scales from tens of thousands to hundreds of millions and billions of documents; searches data that is structured, unstructured, and in combination; reaches data inside and outside the firewall; and ranges in use from a simple website search box through sophisticated faceted navigation. It addresses equally diverse business processes and mission-critical applications. Across the spectrum, Lucene/Solr helps users find, make sense of, and act upon information quickly and efficiently.

In this white paper, we’ll review real-world case studies for Lucene/Solr functionality across business sectors to demonstrate its versatility and varied applicability. The diversity of examples provides strong evidence of Lucene/Solr’s flexibility and power as a search technology. The examples also attest to the innovation and transparency inherent to the open source development model. Our focus is on familiarizing the audience of business managers and application owners with existing Lucene/Solr applications; the substantial technical advantages to developers are covered elsewhere.
¹ Lucene and Solr are complementary technologies that offer very similar underlying capabilities; Solr is the Lucene Search Server. Since Lucene serves as the core of Solr’s search capabilities, this paper refers to the two as Lucene/Solr. For more information, see the Appendix.
We’ll first survey the key requirements and business use cases of search and then look at where they are built into search applications. Our objective is to provide business managers and application owners with a broad perspective on how Lucene/Solr search technology is used to build solutions to compelling business problems. In the Appendix, we provide an overview of Lucene/Solr’s key features and benefits, with a basic outline of the capabilities offered to meet the broadest range of business needs.
Understanding Search Opportunities and Requirements

Search technology has come a long way from its roots in matching keywords against their appearance in documents and returning undifferentiated results. Search today empowers users by delivering actionable information quickly and efficiently, across multiple, diverse sources of data. The business use cases range from executing mission-critical commercial transactions (e.g., e-commerce sites) to unlocking employee and end-user productivity in the search for a single relevant document (e.g., enterprise search).

Given the breadth of the problem domain, it’s useful to look at search and ask two fundamental questions: “How can it solve my business problems?” and “What new business opportunities can search address?”

In considering how search technology solves business problems, it is useful to start by spelling out the requirements you’ll need to consider for your search application. At the same time, be sure to look more broadly at the capabilities that Lucene/Solr offers, as it can help open up new frontiers for incorporating search and leveraging more value from data repositories.

Starting with some basic questions (what, who, how, and where), you can clarify the high-level requirements specific to your business needs, which in turn allows you to make the best decisions for your search application. The process of looking at the fundamentals also raises new questions about how and where the search technology offered by Lucene and Solr can create new business opportunities.

Let’s look at four fundamental questions you should address in understanding search opportunities and requirements:
• What data and documents are you searching?
• Who needs the results and why?
• Where is search integrated with IT infrastructure?
• How is the search interface presented to the user?
What Data and Documents Are You Searching?
Business today is driven more than ever by end users’ creation and consumption of real-time information. A key differentiating capability of search technology is ingesting a broad range of content types and processing large collections of diverse data in real time in order to deliver actionable information. Two aspects to consider:

• Types of Content
Content comes in multiple formats: HTML pages, XML files, PDFs, images, PowerPoint presentations, Excel spreadsheets, Word documents, log files, multimedia content, and more. Content resides in various repositories, including databases, file servers, content management systems, archiving systems, collaboration applications, and employee desktops and laptops. Search technology must be able to locate, organize, and aggregate data whatever its form or location.

• Frequency of Updating Content
Organizations update content at varying intervals, driven by differing business processes and models: social media or news applications have real-time content needs, whereas an e-commerce application might re-index in response to new inventory on a batch basis, and a research institution might add to its collection less often still. Search applications need to be adaptable to these differences in content change frequency.
Who Needs the Results and Why?
Business search puts a high priority on the end-user experience and on results in which the searched content is tuned to the unique needs of each user. After all, the human dimension (the usefulness of results and the efficacy of interaction) is the acid test of a search application.

Internet search applications like Google, Yahoo, and Bing are now common and mature. They have raised user expectations about key qualities of the search experience, but they solve a very different problem. While Internet searches can produce millions of results in milliseconds, they rely on measures like website popularity or URLs and domain names, which are neither relevant nor generally applicable to purpose-built applications for businesses. What’s more, they rely on generalizing relevancy for a global population of all Internet users, without being tied to business rules, business process logic, or the opportunity cost of improved precision for a specific set of data or search users.

Business search applications cannot rely on such coarse, brute-force approaches to tune their results. They need far more control and precision. They have to be able to deliver highly useful results while matching, if not exceeding, the levels of user experience that people have come to expect by virtue of their daily interactions with commercial search engines. Key points of consideration from a business perspective are:
• Relevance
Relevance is entirely a factor of the goals of the search application’s users. The application must have the mechanisms to recognize the subjective needs of users and tune results accordingly. It must also provide easier ways to narrow search criteria without requiring users to come up with perfect query terms. Flexibility for drilling deeper will make results richer and more valuable. Mechanisms to apply filters, proximity values, and sorting parameters to narrow search scope can also lead to a richer set of more useful results, with less time and effort.

• Cost of Relevance
As business goals are driven by revenue opportunities and cost savings, it is critical to tie relevance to the economics of the business. For example, a public-facing retail site should focus on matching merchandise to search, site stickiness, and customer loyalty. It requires search technology that streamlines and simplifies the shopping experience, with relevant results directly contributing to sales revenue. For knowledge workers, internal search applications should help make employees more productive by reducing the amount of time and effort needed to find the documents they need to do their jobs. Multiple studies show that information workers can spend 20–30% of their time searching for information.

• Precision Ranking
Result accuracy, with results sorted by attributes like relevance, date, field, or any other document property, makes the search process better. End users generally abandon a search before tackling the fine points of Boolean logic or scrolling for a result buried too far down.

• Query Response Speed
Today, 5–7 seconds is the typical threshold for end-user patience. Too much wait time for search results frustrates users and causes them to abandon pages. Fast, relevant results cannot be delivered by search technology hamstrung by data influx or query overload. Query response time should also work hand-in-hand with the refinement of multiple search attributes, so that increasingly complex queries do not incur a performance penalty.
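
The filters, proximity values, and sorting parameters mentioned under Relevance correspond to standard Solr request parameters (`q` with a phrase slop, `fq`, and `sort`). The Python sketch below only assembles such a request URL and does not contact a server. The host, the `products` core, and the field names `brand` and `price` are hypothetical placeholders, not taken from any deployment described in this paper.

```python
from urllib.parse import urlencode

def build_search_url(base_url, phrase, slop, filters, sort):
    """Assemble a Solr select URL from a phrase, filter queries, and a sort order."""
    # "phrase"~N is Lucene query syntax for a proximity (sloppy phrase) match.
    params = [("q", f'"{phrase}"~{slop}')]
    # Each fq narrows the result set without changing relevance scores.
    params += [("fq", f) for f in filters]
    params.append(("sort", sort))
    return f"{base_url}?{urlencode(params)}"

# Hypothetical example: shoes from one brand in a price band, cheapest first.
url = build_search_url(
    "http://localhost:8983/solr/products/select",
    "running shoes",
    3,
    ["brand:acme", "price:[50 TO 100]"],
    "price asc",
)
print(url)
```

Keeping the narrowing criteria in `fq` rather than in `q` is one common design choice: the user's words drive ranking, while the filters only restrict which documents are eligible.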
Where Is Search Integrated with IT Infrastructure?
Useful, valuable search technology rarely exists in isolation. Searched data is transformed into actionable information when it is integrated with the organization’s information infrastructure, from business processes to business intelligence to content management systems. A robust search technology must be customizable to integrate seamlessly with existing systems.

• Application Integration
A key requirement for a search application is its extensibility for integration with existing infrastructure and applications like content management systems, databases, and the full range of business processes and applications. It should have interfaces that support ingestion of data as well as delivery of results in readily consumable formats, because in many cases results are consumed by other applications, not by a human.

• Scalability
We can assume that data will change and grow, so scalability is a key factor for a search application. Applications should grow to address future needs without penalties for the breadth of data or for the count of documents indexed. The search application should be able to grow with the requirements of the organization, without needing additional large investments in hardware to match the pace of growth. Proprietary search vendors often charge for search by the number of documents indexed. In a world where constantly expanding content is the norm, such costs can be a real and substantial drag on the cost of ownership for search applications, many times resulting in negative return.

• Security
Every organization has its own security requirements and access controls. Search technologies need to comply with the security policies of the enterprise, controlling results that have restricted access. The search technology should also be able to make use of document-level security from other sources.
How Is the Search Interface Presented to the User?
The user interface is where search delivers on findability and presents actionable results. The search application is only as good as the convenience of submitting queries, reviewing and refining results, and finding information. Key aspects to consider:
• Navigation
Users benefit from guidance that makes their queries more productive. Techniques such as faceted search with result clustering, advance hinting (“did you mean”), “more like this,” and drop-down menus for setting search scope help users achieve desired results faster, making a search application both user- and information-friendly. It is also important to allow users to draw associative connections between results, using the technology to uncover relationships and discover more about what they were seeking than they knew at the outset.

The Netflix search application is powered by Solr; it adds a fuzzy dimension to search, with auto-completion of movie names, correction of misspelled actor names, and suggestions of the titles closest to the query. As a result, 85% of users have found the movie they were looking for ranked at the #1 spot in the results.

• Discovery
Search application functionality should extend beyond the generic presentation of a result list of documents that contain a keyword. Highlighting keywords in search results, expanding searches with synonyms and spell checking, and offering users ways to learn a bit more about documents in the results without having to load them are great ways to significantly improve usability.

• Intuitive Intelligence
Search applications must go beyond keyword search to help users retrieve accurate information even when they are not sure of the best keywords. Additionally, they should reduce misinterpretations where homonyms, spelling errors, and ambiguous keywords are involved (e.g., is “apple” a fruit or a computer company?).
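
Two of the interface techniques above, facet counts and “did you mean” hinting, can be illustrated with a few lines of standard-library Python. This is a toy sketch over an in-memory list, not how Solr implements either feature: the sample movie data is invented, and `difflib` string similarity stands in for a real spell-checking component.

```python
from collections import Counter
from difflib import get_close_matches

# Invented sample catalog for illustration only.
MOVIES = [
    {"title": "The Matrix", "genre": "sci-fi", "decade": "1990s"},
    {"title": "Alien", "genre": "sci-fi", "decade": "1970s"},
    {"title": "Heat", "genre": "crime", "decade": "1990s"},
]

def facet_counts(docs, field):
    """Count how many documents fall under each value of a field."""
    return dict(Counter(doc[field] for doc in docs))

def did_you_mean(query, docs):
    """Suggest the known title most similar to a possibly misspelled query."""
    titles = {doc["title"].lower(): doc["title"] for doc in docs}
    close = get_close_matches(query.lower(), list(titles), n=1, cutoff=0.6)
    return titles[close[0]] if close else None

print(facet_counts(MOVIES, "genre"))
print(did_you_mean("the matrx", MOVIES))
```

The facet counts are what a drop-down or sidebar would display next to each filter value, so users can see how many results remain before they click.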
The Real World: Applications and Case Studies
With an understanding of the fundamentals of search business applications in hand, it is helpful to gain additional context on business usage through a survey of organizations that have successfully used Lucene/Solr for powerful search applications.

All of these cases were built on the capability of Lucene/Solr to provide innovative, high-performance, cross-platform, feature-rich search technology suitable for nearly every application. By powering diverse search applications for thousands of organizations such as AT&T, Zappos, McClatchy, Smithsonian, MTV Networks, LinkedIn, MySpace, Comcast, Monster, Netflix, and many more, Lucene/Solr has provided mission-critical capability that turns search into a robust competitive advantage.

For these organizations, Lucene/Solr solutions regularly index and search hundreds of millions of documents with subsecond response time, unencumbered by costly licensing or vendor lock-in. Together they represent a compelling argument for the broad applicability of Lucene/Solr across the full range of business opportunities and search needs.

Business use case studies we’ll review include:
• Yellow Pages, Local Search, and Searching Classifieds
• Media
• E-commerce
• Job and Career Sites
• Libraries, Archives, and Museums (LAMs) Search
• Social Media Search
• Enterprise (Intranet) Search
Yellow Pages, Local Search, and Searching Classifieds

In the business of online local search, geographic (location-based) relevance generates competitive advantage. Online directories need to provide a rich, interactive search experience to users to increase site views and stickiness, which in turn translates into increased advertising revenue. Simplified location-based search, intuitive faceted query response, and data mashups are a few of the features that define search functionality for an online directory.

Lucene/Solr solutions offer accurate search results, factoring in location, users’ reviews, and ratings, alongside paid advertising. By taking advantage of Solr’s open source model, in which search algorithms are completely transparent, companies can invest in configuring their search solutions to match their business logic, rather than trying to infer, or pay for exposure of, proprietary back-end logic.

Internet Yellow Pages and local online search is forecast to grow to $27.8 billion in 2011. (The Kelsey Report¹)

Requirements
• Intelligent results going beyond keyword search
• Deeper, faceted navigation
• Seamless integration with latest Web 2.0 tools
• Lower IT-related costs
• Geocentric user experience
• Search numeric values

Solr Solution
• Customizable search index which can be tuned transparently to account for key findability drivers
• Drop-down filters for narrowing or widening the scope of search
• Seamless integration with existing technologies
• Native numeric encoding and search capabilities
• Reduced server footprint for lower TCO than most commercial vendors

Success Stories
• YP.com, a division of AT&T Interactive
• Zvents.com, local event search service
• Yelp.com, the community local search site

¹ The Kelsey Group’s Global Print Yellow Pages, Internet Yellow Pages and Local Search Five Year Outlook
Case Study 1: yp.com by AT&T Interactive

AT&T Interactive is an online and mobile search and advertising company. Their leading-edge portal, yp.com, an online business listing and advertising site, was originally implemented with a commercial proprietary search application. It faced issues of scalability, vendor lock-in, and performance. With help from Lucid Imagination, AT&T successfully migrated to a Solr-based search solution that leveraged the flexibility of open source without compromising features and functionality. And they did so with a much smaller budget.

Business Needs
• Factoring in location to support geographic search, and including relevant comments
• Striking a balance between organic search and advertised content
• Indexing highly unstructured content such as user comments
• Increasing relevancy of results and boosting paid search results for preferential placement of advertisers
• Linguistic support to enrich the search experience, such as spellchecking, synonyms, find-similar, etc.
• Integrating with latest Web 2.0 tools
• Reducing server footprint

The Solr Solution
• Context-specific relevancy, geographic proximity, ad placement, and user comments
• Faceting and drop-down filters to narrow or widen the scope of search
• Functional support for creating new features
• Spell-correction and location-optimized search results that show users the businesses nearest to them first
• Seamless integration with many Web 2.0 tools to create innovative features and mashups
• Lower TCO by reducing the number of search servers from 120 to two dozen
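
The idea of location-optimized results that show users the nearest businesses first can be sketched as a distance sort. The Python below is an illustration only: it uses the haversine great-circle formula over an in-memory list, whereas the paper does not describe how yp.com computes proximity. The listing names and coordinates are invented.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest_first(user, listings):
    """Sort listings by distance from the user's (lat, lon) position."""
    return sorted(
        listings,
        key=lambda b: haversine_km(user[0], user[1], b["lat"], b["lon"]),
    )

# Invented listings around the San Francisco Bay Area.
LISTINGS = [
    {"name": "Pizzeria A", "lat": 37.80, "lon": -122.27},
    {"name": "Pizzeria B", "lat": 37.78, "lon": -122.41},
    {"name": "Pizzeria C", "lat": 37.34, "lon": -121.89},
]

user = (37.77, -122.42)
ranked = [b["name"] for b in nearest_first(user, LISTINGS)]
print(ranked)
```

A production directory would typically blend distance with text relevance, ratings, and paid placement rather than sorting on distance alone.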
Media
Brand reinforcement, premium content, and easy accessibility are the main business motivators for online media and publishing companies. Relevant information improves time on the site and encourages users to explore related content, boosting subscription rates and site views. These translate into a virtuous cycle of additional revenue generation.

Given that content is the business, the need for a robust search application ties directly to competitive advantage. Lucene/Solr provides a customized, function-rich solution for the media and publishing industry. It addresses dynamic challenges of content diversity, content freshness, and content acquisition, and gives companies a platform on which to build a world-class innovative search experience to differentiate themselves in a highly competitive marketplace.

“Solr has done wonders for us. It is easy to understand and deploy, and has reduced our costs drastically.”
Doug Steigerwald, McClatchy Interactive

Requirements
• Real-time indexing of petabytes of structured and unstructured data
• Deeper search capability
• Improved query response time
• Reduced infrastructure and customization costs

Solr Solution
• Reverse indexing
• Intelligent, faceted search to enable contextual and linguistic relevance
• Easy configuration for parsing structured and unstructured data
• Easy and seamless installation for lower TCO
• Customization with open source code

Success Stories
• McClatchy Newspapers
• Netflix
• Comcast Interactive
• MTV Networks, a division of Viacom
• The Motley Fool, fool.com
• Fanfeedr.com, personalized sports aggregator
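
The “reverse indexing” item in the Solr Solution list most likely refers to the inverted index, the data structure at the core of Lucene that maps each term to the documents containing it. A minimal Python sketch of the idea follows; the whitespace tokenizer and sample headlines are simplifying assumptions, and real Lucene indexes add analysis, positions, and scoring on top.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search_all_terms(index, query):
    """Return ids of documents containing every term in the query (AND semantics)."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()

# Invented sample headlines.
DOCS = {
    1: "city council approves budget",
    2: "council debates transit budget",
    3: "local team wins final",
}

index = build_inverted_index(DOCS)
print(search_all_terms(index, "council budget"))
```

Because a query only touches the posting sets for its own terms, lookup cost depends on how many documents match those terms rather than on the size of the whole collection.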
Case Study 2: McClatchy, Leading Newspaper Publisher

The third largest newspaper publisher in the United States, McClatchy Company owns 30 daily newspapers in 29 markets across the country. To win online, McClatchy knew it had to have a robust search solution to empower the McClatchy audience with the information they wanted and to secure loyalty from readers and sponsorships from advertisers. Working with Lucid Imagination, McClatchy migrated from proprietary search software to open source and chose Solr for its high performance, comprehensive capabilities, and superior value.

Requirements
• Proliferating content and data sources (text, video, audio, images), with real-time streaming
• Empowering end users with ease of use
• Supporting peak traffic and popular search spikes with consistent performance
• Providing scalability for a database growing by orders of magnitude annually
• Providing flexibility to support customization
• Controlling IT costs while exceeding performance benchmarks of competition

The Lucene/Solr Solution
• Deeper content by indexing both structured and unstructured data in real time, effortlessly
• Indexes millions of documents, with search results delivered in milliseconds
• User-friendly navigation with drop-down filters, faceted navigation, linguistic corrections, etc.
• Excellent performance, even in peak hours, by load-balancing search requests across servers
• Scalability without impact on performance
• High degree of customization, since it’s open source
• Integrates with existing IT infrastructure and eliminates associated license fees to cut costs
• 8-fold reduction in server footprint
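
Load-balancing search requests across servers, as mentioned in the solution list, can take many forms. The paper does not say which policy McClatchy uses, so the sketch below assumes the simplest one, round-robin rotation, with hypothetical server names.

```python
from itertools import cycle

class RoundRobinRouter:
    """Hand each incoming query to the next search server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one search server is required")
        self._servers = cycle(servers)

    def route(self, query):
        # Returns (server, query); a real router would forward the request.
        return next(self._servers), query

# Hypothetical replica names.
router = RoundRobinRouter(["search-1", "search-2", "search-3"])
assignments = [router.route(q)[0] for q in ["election", "weather", "sports", "traffic"]]
print(assignments)
```

Round-robin spreads load evenly only when queries cost roughly the same; policies based on server health or open connections are common alternatives.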
E-commerce
E-commerce businesses must provide a compelling shopping experience in order to maintain brand equity and thrive in a highly competitive market landscape. By reducing the time and effort required to navigate available merchandise and find what they want, superior search contributes directly to a satisfying buying experience for customers. Search then translates directly into higher revenues and customer loyalty. Instant results, intuitively organized, advanced faceting for easy browsing, synchronizing results with images, and integration with user ratings are among the must-have features of an e-commerce search application.

Lucene/Solr gives companies the ability to build their sites around the concept of “searchendizing” (putting the desired merchandise at the top of the results list), which can make the difference between sales made and sales lost. Faceting, database integration, real-time indexing, and query monitoring all enable users to find the products they want, driving conversion rates and enabling a winning online experience.

Online retail sales in the B2C market are expected to reach $340 billion by 2013.² (Forrester Research)

Requirements
• Multidimensional, dynamic search
• Faster results
• Real-time indexing of products
• Faceting and browsing capabilities
• Seamless integration with existing IT infrastructure

Solr Solution
• Faceted search for deeper drill-down and browsing
• Intuitive search capabilities for cross-channel shopping experience
• System administration tools for data loading, index replication, monitoring, logging, and cache management
• Query monitoring for better highlighting of popular products

Success Stories
• Buy.com
• Sears.com
• Macys.com
• Zappos.com
• Advanceautoparts.com
• Dollardays.com

² “Consumers will spend more than $340 billion online by 2013, says Forrester,” Internet Retailer, 27 November 2009, http://www.internetretailer.com/dailyNews.asp?id=32630.
Case Study 3: Zappos

Zappos is the premier destination for online shoe shopping. At Zappos, the mission is excellent online customer service: customers should be able to browse shoe styles, sizes, shapes, and colors more easily than at any other shoe store, online or offline. To achieve this, Zappos wanted a robust, flexible, multifunctional search solution. After evaluating many commercial search technologies, Zappos zeroed in on Solr, working with Lucid Imagination to ensure continued, successful deployment.

Requirements
• Simplified, attractive user experience that makes it easy to find and buy
• Relevant results, fast
• Navigation across attributes, such as size, color, and style, for broader and deeper results
• Indexing products as they are entered in the catalogs
• Cross-functional navigation to give customers a realistic shopping experience
• Intuitive intelligence to provide alternate suggestions
• Analytical capabilities to drive business strategy
• Facilitating control over results
• Integration with existing IT infrastructure

The Solr Solution
• Subsecond search results across categories
• Faceting, for easy browsing and discovery and a compelling user experience
• Real-time indexing of products
• Synchronization of visuals, specs, filters, and promotions to make the shopping experience true to life
• Information on user activity to help build strategy on product promotions
• Controls to rank popular or high-stock products in results where users are more likely to buy them
• Facilitates integration with heterogeneous open source environment
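
The controls for ranking popular or high-stock products higher can be pictured as a business boost applied on top of the text relevance score. The sketch below is a hypothetical scoring rule, not Zappos’ actual formula: the weights, field names, and sample hits are all invented for illustration.

```python
def rerank(hits, popularity_weight=0.5, stock_weight=0.25):
    """Order hits by text relevance multiplied by a business boost.

    Each hit carries a text_score plus popularity and in_stock_ratio
    signals, both assumed to be normalized to the range 0..1.
    """
    def score(hit):
        boost = (
            1.0
            + popularity_weight * hit["popularity"]
            + stock_weight * hit["in_stock_ratio"]
        )
        return hit["text_score"] * boost

    return sorted(hits, key=score, reverse=True)

# Invented sample hits for the query "boots".
HITS = [
    {"sku": "boot-01", "text_score": 2.0, "popularity": 0.1, "in_stock_ratio": 0.0},
    {"sku": "boot-02", "text_score": 1.8, "popularity": 0.9, "in_stock_ratio": 1.0},
    {"sku": "boot-03", "text_score": 1.0, "popularity": 1.0, "in_stock_ratio": 1.0},
]

order = [hit["sku"] for hit in rerank(HITS)]
print(order)
```

Using a multiplicative boost keeps text relevance in charge: a popular product that barely matches the query (boot-03 here) still ranks below closer matches.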
Job and Career Sites

Job portals are countercyclical to the economy. When the economy flourishes, posted jobs grow in number; when it sags, candidates flock in to post their résumés. Success for an online job portal is tied to the efficiency of its search capability (matching résumés to job listings and vice versa), so both employers and prospective employees can zero in on just the right opportunity.

For example, an employer may want to navigate through filters to narrow the scope of a candidate search, such as education, previous employer, salary history, skillsets, etc.; a job seeker may want to expose these attributes, but keep a current employer’s name confidential. A job seeker may want to apply to jobs within a particular geographic area.

Lucene/Solr not only provides such flexibility but also addresses other complexities of this industry by enabling linguistic intelligence (such as identical acronyms that correspond to different entities, variations in spelling, and imperfectly constructed search queries); indexing unstructured data (résumés); and managing ever-growing data.

“I think the breakthrough was when we tried it, and we realized, wow, this thing could really scale.”
Peter Keegan, Monster.com

Requirements
• Linguistic intelligence for more relevant results
• Control search results to maintain privacy
• Deeper search capability
• Numeric search
• Faster query response
• Reduced infrastructure and customization costs

Solr Solution
• Intelligent, faceted search to enable contextual and linguistic relevance
• Easy configuration for parsing structured and unstructured data
• Easy and seamless installation for lower TCO
• Business process integration and customization with open source code

Success Stories
• Monster
• The Big Jobs
• eBharatJobs
• Careerjet
Monster.com

Monster is the largest job search engine in the world, with over a million jobs posted at any one time. By 2008 it had 150 million résumés in its database and was serving over 63 million job seekers per month; it now runs an average of 300 to 400 queries per second with an average response time of 40 milliseconds. To provide the highest level of service and support to their customers, both employers and job seekers, Monster has an unmatched marketplace for employment opportunities, with Lucene-based search at the heart of its business model.

The Requirements
• Managing high volumes of data, continually increasing by double-digit percentages annually
• Maintaining constant inventory updates and providing faster results
• Removing technological barriers that limit the scope of information
• Enabling end users to refine search and drill deeper without any performance impact
• Providing security controls to ensure end-user privacy
• Facilitating scalability and flexibility in tandem with the company’s vision and growth plans

The Lucene Solution
• Handles high volumes of data by clustering data to reduce the index size
• Real-time indexing for fresher, faster query results
• Intuitive search to enable in-depth cross-functional job and résumé browsing
• Faceted search and “single click” filters for search refinement
• Security controls to manage user information
• Unlimited scalability and customization leveraging open source licensing
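
The throughput and latency figures quoted for Monster (300 to 400 queries per second at a 40 millisecond average response time) can be turned into a rough concurrency estimate with Little’s law: average requests in flight equals arrival rate times average time in the system. The short Python calculation below assumes steady-state traffic and is a back-of-the-envelope reading of the published numbers, not a description of Monster’s architecture.

```python
def queries_in_flight(queries_per_second, avg_response_ms):
    """Little's law: average concurrent queries = arrival rate x average latency."""
    return queries_per_second * avg_response_ms / 1000

low = queries_in_flight(300, 40)
high = queries_in_flight(400, 40)
print(f"Roughly {low:.0f} to {high:.0f} queries are being processed at any instant")
```

Under these assumptions only about a dozen or so queries are in progress at once, which illustrates how low per-query latency keeps the required concurrency, and hence the server footprint, small.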