Searching for products is a key operation for eCommerce sites, where both speed and flexibility are needed. Experience how Solr's error-tolerant search helps the customers of House of Sound find their products.
2. WHAT ARE THE STAKES?
INTERNAL SEARCH ENGINE IS ESSENTIAL.
Considering:
- One user in two is a searcher
one in two will use the internal search engine
- Searchers convert more often than other visitors
- Searchers are less patient when browsing
they need to find products quickly, otherwise they leave for another shop
SEARCH -> FIND -> ADD TO CART -> PAY
3. INTRODUCTION TO SOLR.
SOLR PROJECT.
• Open source enterprise search server
Initiated by CNET in 2004
Openly published the source code in 2006
• Built on Apache Lucene, the underlying engine
• Independent server using standards to communicate
such as HTTP / XML / JSON
usable on every web project
such as those based on Magento
5. INTRODUCTION TO SOLR.
FEATURES OFFERED BY SOLR.
Indexing data
- Index the whole site (including files, …)
- Tolerance (stemmings, synonyms, …)
Scalability
Searching data
- Layered navigation
- Customizable relevance calculation
- Predictive search (different kinds)
- Stemming, Plurals, Synonyms, Stop words, …
Admin tools
- Display more statistics (most frequent requests or search with no answer)
6. INDEXING DATA.
FEATURES OFFERED BY SOLR.
Indexing data
- Index the whole site (including files, …)
- Tolerance (stemmings, synonyms, …)
7. INDEXING DATA.
SCHEMA & TEXT ANALYSIS.
Schema
Define how to handle structured data
sent by Magento (no crawler such as Nutch)
Typing data
price & weight are floats, product name is a string, …
Structured data in Solr allows faceted search
to filter by price range for example
Types are determined by the intended search behavior
if we need to filter by price range
-> prices have to be stored as floats, not strings, to stay comparable
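To see why typing matters for range filters, here is a minimal Python sketch. The product data and the `in_price_range` helper are hypothetical and not part of Solr or Magento. Compared as strings, prices are ordered character by character, so a range filter misses products; compared as floats, the filter behaves as expected.

```python
# Hypothetical catalog: prices as they might arrive from the shop, as text.
products = [
    {"name": "Headphones", "price": "99.90"},
    {"name": "Amplifier", "price": "250.00"},
    {"name": "Cable", "price": "9.50"},
]

def in_price_range(items, low, high, as_float):
    """Return the names of items whose price lies in [low, high]."""
    convert = float if as_float else str
    lo, hi = convert(low), convert(high)
    return [p["name"] for p in items if lo <= convert(p["price"]) <= hi]

# String comparison is lexicographic: "5" sorts after "100",
# so the range "5".."100" cannot match anything.
as_strings = in_price_range(products, "5", "100", as_float=False)

# Numeric comparison keeps prices comparable.
as_floats = in_price_range(products, "5", "100", as_float=True)

print(as_strings)
print(as_floats)
```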
Text analysis
Text is split into terms, which are then processed to apply stemming, synonyms, …
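The text analysis step can be pictured as a small pipeline. The sketch below is only an illustration in Python: the stop list, the synonym map and the plural-stripping `stem` function are toy assumptions, much cruder than the analyzers Solr actually provides.

```python
import re

STOP_WORDS = {"the", "a", "an", "and", "for", "of", "with"}  # toy stop list
SYNONYMS = {"earphone": "headphone", "hifi": "stereo"}       # toy synonym map

def stem(term):
    """Very naive plural stripping; real stemmers are far more careful."""
    if term.endswith("ies") and len(term) > 4:
        return term[:-3] + "y"
    if term.endswith("s") and not term.endswith("ss") and len(term) > 3:
        return term[:-1]
    return term

def analyze(text):
    """Split text into terms, drop stop words, stem, then map synonyms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    terms = []
    for token in tokens:
        if token in STOP_WORDS:
            continue
        term = stem(token)
        terms.append(SYNONYMS.get(term, term))
    return terms

print(analyze("The Earphones for HiFi stereos"))
```

When the same analysis runs at indexing time and at query time, "Earphones" in a product description and "headphone" in a search box end up as the same term, which is what makes the search tolerant.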
9. INDEXING DATA.
INDEXING FILES.
Solr generally indexes structured data
e.g. products
It can also index binary formats
such as PDF, MS Office, images or music files
using the Solr Cell interface,
which is an adapter to Apache Tika
Apache Tika is a toolkit to detect and
extract metadata and text content from various documents
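As a rough illustration of how a file reaches Solr Cell, the sketch below builds the URL a client could POST a binary file to. The `/update/extract` path and the `literal.id` parameter follow the usual Solr Cell setup, but the host, the core name `products`, the document id and the `build_extract_url` helper are assumptions; the actual path depends on how the handler is configured on your server.

```python
from urllib.parse import urlencode

def build_extract_url(base_url, core, doc_id, commit=True):
    """Build the URL for posting a binary file to Solr Cell."""
    params = {"literal.id": doc_id}   # id stored with the extracted document
    if commit:
        params["commit"] = "true"     # make the document searchable right away
    return f"{base_url}/solr/{core}/update/extract?{urlencode(params)}"

# Hypothetical server, core and document id.
url = build_extract_url("http://localhost:8983", "products", "manual-42")
print(url)
```

The file itself (a PDF manual, for example) would then be sent as the body of an HTTP POST to this URL; Tika extracts the text and metadata on the server side.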
11. SCALABILITY.
DURABLE SOLUTION.
Scalable: suitably efficient and practical
when applied to large situations
With a bigger data index or more visitors
searches get slower!
Testing Solr performance with SolrMeter
Solutions to keep good performance with more data:
1. Scale up: Optimizing a single Solr server
2. Scale horizontally: Moving to multiple Solr servers with replication
3. Scale deep: Combining replication and sharding (for distributed search)
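The difference between replication and sharding can be sketched in a few lines of Python. This is a conceptual toy, not Solr's implementation: the `Shard` class, the replica names and the hash-based `shard_for` routing are all assumptions made for illustration. Sharding splits the index so that each server holds only part of the catalog; replication copies each part so that queries can be spread over several servers.

```python
import zlib
from itertools import cycle

class Shard:
    """One slice of the index, served by several identical replicas."""
    def __init__(self, replica_names):
        self.docs = {}                        # doc_id -> text (shared by replicas here)
        self._replicas = cycle(replica_names)

    def search(self, term):
        replica = next(self._replicas)        # round-robin spreads the query load
        hits = [doc_id for doc_id, text in self.docs.items() if term in text.split()]
        return replica, hits

def shard_for(doc_id, shards):
    """Pick a shard from a stable hash of the document id."""
    return shards[zlib.crc32(doc_id.encode("utf-8")) % len(shards)]

def distributed_search(term, shards):
    """Ask every shard, then merge the partial results."""
    merged = []
    for shard in shards:
        _, hits = shard.search(term)
        merged.extend(hits)
    return sorted(merged)

shards = [Shard(["a1", "a2"]), Shard(["b1", "b2"])]
catalog = {"sku-1": "stereo amplifier", "sku-2": "wireless headphone",
           "sku-3": "stereo headphone", "sku-4": "speaker cable"}
for doc_id, text in catalog.items():
    shard_for(doc_id, shards).docs[doc_id] = text

print(distributed_search("stereo", shards))
```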
12. SEARCHING DATA.
FEATURES OFFERED BY SOLR.
Searching data
- Layered navigation
- Customizable relevance calculation
- Predictive search (different kinds)
- Stemming, Plurals, Synonyms,
Stop words, …
14. SEARCHING DATA.
SEARCH RELEVANCY.
Factors influencing score:
1. Term frequency
2. Inverse document frequency
the rarer a term is in the whole index, the higher its score is.
3. Co-ordination factor
the more query clauses match a document, the higher its score is.
4. Field length
the shorter the matching field is, the greater the matching document's score is.
5. Boosting
customized mathematical rules to increase score.
In Magento, based on attribute weights
E.g. name 5 -> manufacturer 4 -> sku 3 -> price 2 -> meta_keywords 1
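The five factors above can be combined in a toy scoring function. The sketch below is a simplified Python illustration, loosely inspired by classic TF-IDF scoring; it is not the exact formula Solr uses, and the sample `index` and the `score` function are hypothetical. Only the attribute weights are taken from the example above.

```python
import math

# Attribute weights from the example above (name 5 -> ... -> meta_keywords 1).
BOOSTS = {"name": 5, "manufacturer": 4, "sku": 3, "price": 2, "meta_keywords": 1}

def score(query_terms, doc, index):
    """Toy relevance score combining the five factors listed above."""
    if not query_terms:
        return 0.0
    total, matched = 0.0, set()
    for field, text in doc.items():
        terms = text.lower().split()
        if not terms:
            continue
        length_norm = 1 / math.sqrt(len(terms))       # 4. field length
        for q in query_terms:
            freq = terms.count(q)
            if freq == 0:
                continue
            matched.add(q)
            tf = math.sqrt(freq)                      # 1. term frequency
            doc_freq = sum(
                1 for d in index
                if any(q in t.lower().split() for t in d.values())
            )
            idf = 1 + math.log(len(index) / (doc_freq + 1))  # 2. inverse document frequency
            total += tf * idf * length_norm * BOOSTS.get(field, 1)  # 5. boosting
    coord = len(matched) / len(query_terms)           # 3. co-ordination factor
    return total * coord

# Hypothetical three-product index.
index = [
    {"name": "Studio headphone", "meta_keywords": "audio"},
    {"name": "Cable", "meta_keywords": "headphone extension cable audio"},
    {"name": "Wireless studio headphone with microphone", "meta_keywords": "audio"},
]

for doc in index:
    print(doc["name"], round(score(["headphone"], doc, index), 3))
```

For the query "headphone", the first product ranks highest: the term matches in the heavily weighted `name` field, and that field is short. The cable matches only in `meta_keywords`, which carries the lowest weight.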
15. ADMIN TOOLS.
FEATURES OFFERED BY SOLR.
Admin tools
Display more statistics
(most frequent requests
or search with no answer)
16. ADMIN TOOLS.
ADMIN FEATURES.
1) An admin tool is available in Solr, but it is developer-oriented
To check schema, index, general config, Solr server availability, to view
technical statistics…
2) Prefer to use the Magento backend
To check frequent requests or requests with no answer
Very helpful for analysing user expectations and then improving the catalog
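The kind of statistics mentioned in point 2 can be derived from a plain search log. The Python sketch below is only an illustration: the log format, the sample queries and the `summarize` helper are hypothetical and do not reflect how the Magento backend stores its data.

```python
from collections import Counter

# Hypothetical search log: (query typed by the visitor, number of results).
search_log = [
    ("headphone", 12), ("Headphone", 12), ("turntable", 0),
    ("amplifier", 5), ("turntable", 0), ("headphone ", 12), ("vinyl cleaner", 0),
]

def summarize(log, top=3):
    """Return the most frequent queries and the queries that found nothing."""
    all_queries, no_answer = Counter(), Counter()
    for query, hits in log:
        normalized = query.strip().lower()   # group trivially different spellings
        all_queries[normalized] += 1
        if hits == 0:
            no_answer[normalized] += 1
    return all_queries.most_common(top), no_answer.most_common(top)

frequent, unanswered = summarize(search_log)
print(frequent)
print(unanswered)
```

Queries that come back often with no answer point to products missing from the catalog or to synonyms that should be configured.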
17. CONCLUSION.
INTEGRATE SOLR IN YOUR PROJECT.
Steps:
1. Install and configure Solr
single or multiple servers
single or multiple languages, …
2. Adapt the standard Magento product schema
to your project context
3. Define additional customized data to index
such as other tables, files, …
4. Influence search relevance
by defining attribute weights
5. Integrate into the Magento frontend
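For step 4, one way attribute weights can reach Solr is through the `qf` (query fields) parameter of its DisMax-style query parsers, where each field is written as `field^boost`. The Python sketch below builds such parameters from the weights used as an example earlier. It is an assumption about how the weights could be passed, not a description of what the Magento integration does internally, and `build_query_params` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Example weights: name 5 -> manufacturer 4 -> sku 3 -> price 2 -> meta_keywords 1
ATTRIBUTE_WEIGHTS = {"name": 5, "manufacturer": 4, "sku": 3, "price": 2, "meta_keywords": 1}

def build_query_params(user_query, weights):
    """Turn attribute weights into Solr query parameters with boosted fields."""
    qf = " ".join(f"{field}^{weight}" for field, weight in weights.items())
    return {"q": user_query, "defType": "edismax", "qf": qf}

params = build_query_params("studio headphone", ATTRIBUTE_WEIGHTS)
query_string = urlencode(params)
print(params["qf"])
print(query_string)
```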
18. CONCLUSION.
COMPARISONS.
Features                                      Magento Basic SE   Magento with Solr
Product indexing                              ▲                  ▲
Document indexing                                                ▲
Synonyms                                      ▲                  ▲
Stemming                                                         ▲
Stop words                                                       ▲
Faceted search                                ▲                  ▲
Relevance calculation                         ▲                  ▲
Customizable relevance calculation                               ▲
Scalability                                                      ▲
Predictive search                                                ▲
Admin tools (frequent requests, no answer…)   ▲                  ▲
No extra time needed to integrate             ▲
19. CONCLUSION.
Remember: 1 user in 2 is a searcher!
SOLR clearly improves user experience,
which increases your conversion rate