Apache SOLR
Enterprise Search Solution
        (overview)
Enterprise Search Server

The criteria ...
•Fast

•Flexible

•Powerful

•Scalable

•Relevant Results

•Production ready & Easy deployment
Why SOLR

•   Greater control over your website search
•   Caching, Replication, Distributed search
•   Really fast Indexing/Searching, Indexes can
    be merged/optimized (Index compaction)
•   Great admin interface can be used over
    HTTP
•   Awesome community support
•   Support for integration with various other
    products
SOLR Powered

http://wiki.apache.org/solr/PublicServers/

 •   whitehouse.gov        •   eBay
 •   Instagram             •   The Guardian
 •   Apple                 •   Netflix
 •   NASA                  •   Shopper
 •   CISCO                 •   News.com
 •   Disney                •   digg
 •   Sears                 •   AOL
What is SOLR?

•   Very fast full-text search engine:
    http://lucene.apache.org/solr/
•   Based on Apache Lucene, a high-performance,
    full-featured text search engine library written
    entirely in Java.

    In brief, Apache Solr exposes Lucene's Java API as
    REST-like APIs that can be called over HTTP
    from any programming language or platform.
Features
•   Full Text Search
•   Faceted navigation
•   More items like this (recommendation) /
    related searches
•   Spell Suggest/Auto-Complete
•   Custom document ranking/ordering
•   Snippet generation/highlighting
•   Geospatial Search
Spell Suggest/Auto-Complete
Faceted navigation, paging
Geospatial Search
More Features ...

•   Database integration
•   Rich document (Word, PDF) handling
•   REST-like HTTP/XML, JSON APIs (so,
    you can code virtually in any language)
•   Flexible configuration
•   Extensive Plugin architecture for advanced
    customization
•   Scalable distributed search, dynamic
    clustering, index replication
App Server Support

•   Apache Tomcat
•   Jetty
•   Resin
•   WebLogic™
•   WebSphere™
•   GlassFish
•   dmServer™
•   JBoss™... and many more
SOLR History

•   Developed at CNET Networks by
    Yonik Seeley
•   Donated to ASF (Apache Software
    Foundation) in early 2006
•   Incubation period ended in January
    2007 (v1.2 released)
•   Solr is now maintained as a
    subproject of Lucene
Solr

•   Only one table (documents). No joins.
•   Each row is a document
•   A document can have multiple fields and
    fields can have multiple values
    – e.g. Tags, Categories, ...

•   Fast for search (finding the documents)
•   Slow when returning large sets of data
•   Can scale to many millions of documents
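
To make the "one table, multi-valued fields" model concrete, here is a minimal sketch of a single document written as JSON. The field names (id, title, tags, categories) are assumptions for illustration only; the real fields depend on your schema.

```python
import json

# One "row" is one document. Field names are hypothetical.
document = {
    "id": "post-1",
    "title": "Getting started with search",
    # Multi-valued fields are simply lists of values.
    "tags": ["search", "lucene"],
    "categories": ["tutorial", "java"],
}

# A batch of documents is a flat list: no joins, no related tables.
payload = json.dumps([document], indent=2)
print(payload)
```

Anything that would be a joined table in a relational database (tags, categories) is flattened into the document itself.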
Solr Architecture

• Servlet container: Jetty, Tomcat, ... any :)
– Handles HTTP

• Solr
– Connects the servlet container to Lucene

• Lucene
– Full-text search framework
SOLR Workflow
How Lucene Works

•   Regular indexes repeat index data for each row

    key      ID
    banana   1
    banana   2
    banana   3
    cat      2
    cat      3
    dog      1
    dog      3

•   Inverted indexes reference the term once and then the
    matching documents

    Term     IDs
    banana   1,2,3
    cat      2,3
    dog      1,3
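
The idea can be sketched in a few lines of Python. This is only an illustration of the data structure, not Lucene's implementation; the three document texts are assumed so that they produce the term/ID table shown on the slide.

```python
from collections import defaultdict

# Hypothetical documents chosen to match the slide's table.
docs = {
    1: "banana dog",
    2: "banana cat",
    3: "banana cat dog",
}

def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs containing it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        # set() so a term repeated within one document is listed once.
        for term in set(docs[doc_id].split()):
            index[term].append(doc_id)
    return dict(index)

index = build_inverted_index(docs)
for term in sorted(index):
    print(term, index[term])
```

Each term is stored once, and a lookup returns all matching document IDs directly.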
Inverted Index Matching
Term     IDs
banana   1,2,3
cat      2,3
dog      1,3

•   Lucene uses bit vectors to quickly find all documents
    with terms

Query: cat banana
Document   1   2   3
cat        0   1   1
banana     1   1   1
Match      0   1   1

Query: dog cat
Document   1   2   3
dog        1   0   1
cat        0   1   1
Match      0   0   1
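
The matching step above can be sketched with Python integers standing in for bit vectors. This is a conceptual illustration under that assumption, not Lucene's actual data structure; the function names are made up for the example.

```python
# Postings from the slide: term -> document IDs.
postings = {"banana": [1, 2, 3], "cat": [2, 3], "dog": [1, 3]}
NUM_DOCS = 3

def to_bitvector(doc_ids):
    """Set bit (doc_id - 1) for every document containing the term."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << (doc_id - 1)
    return bits

def match_all(terms):
    """Return the IDs of documents that contain every term (bitwise AND)."""
    result = (1 << NUM_DOCS) - 1  # start with all documents set
    for term in terms:
        result &= to_bitvector(postings.get(term, []))
    return [d for d in range(1, NUM_DOCS + 1) if (result >> (d - 1)) & 1]

print(match_all(["cat", "banana"]))
print(match_all(["dog", "cat"]))
```

The two calls correspond to the two "Match" rows on the slide: documents 2 and 3 for "cat banana", and document 3 only for "dog cat".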
Scoring
•   Now that the documents are found, in what order should
    they be shown?
•   Lucene uses TF-IDF (Term Frequency-Inverse
    Document Frequency) to score the
    documents

    Term {IDF}               IDs {term frequency}
    banana {1.28}            1 {2}, 2 {5}, 3 {1}
    cat {1.60}               2 {4}, 3 {2}
    dog {1.60}               1 {1}, 3 {6}
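
A rough sketch of TF-IDF ranking, using the term frequencies from the table. The IDF formula here (1 + ln(N / df)) is a common textbook variant chosen for illustration; it is an assumption, it is not Lucene's exact formula, and it will not reproduce the {1.28} and {1.60} values shown on the slide.

```python
import math

# term -> {doc_id: term frequency}, taken from the slide's table.
term_freqs = {
    "banana": {1: 2, 2: 5, 3: 1},
    "cat": {2: 4, 3: 2},
    "dog": {1: 1, 3: 6},
}
NUM_DOCS = 3

def idf(term):
    """Textbook inverse document frequency: rarer terms weigh more."""
    df = len(term_freqs.get(term, {}))
    if df == 0:
        return 0.0
    return 1.0 + math.log(NUM_DOCS / df)

def rank(query_terms):
    """Score each document as the sum of tf * idf, best match first."""
    scores = {}
    for term in query_terms:
        weight = idf(term)
        for doc_id, tf in term_freqs.get(term, {}).items():
            scores[doc_id] = scores.get(doc_id, 0.0) + tf * weight
    return sorted(scores, key=lambda d: scores[d], reverse=True)

print(rank(["dog", "cat"]))
print(rank(["banana"]))
```

Documents that mention the query terms more often rank higher, and a term that appears in fewer documents contributes more per occurrence.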
Scoring Notes

The goal of scoring is:
•To boost the importance of documents where
 the word is mentioned often
•To boost the importance of rare words (that
 don’t appear in many documents)

Solr also supports term boosts to increase the
importance of one term over another.
Stemming, Stopwords, Synonyms

•  Terms are trimmed of suffixes (stemming)
   trimmed -> trim
   stemming -> stem
•  Stopwords (common words that are not
   important) are removed
   the, and, for, it, ...
•  This is done with both the words in the
   document and the query terms
•  Solr supports search by a predefined
   synonym list
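
The three steps can be sketched as a tiny analysis chain. This is a toy: the suffix stripper below is a deliberately naive stand-in (not the stemmer Solr or Lucene ships), and the synonym mapping is a hypothetical example.

```python
STOPWORDS = {"the", "and", "for", "it"}
SYNONYMS = {"tv": "television"}  # hypothetical synonym list

def toy_stem(token):
    """Strip 'ing'/'ed', then undo a doubled final consonant (trimm -> trim)."""
    for suffix in ("ing", "ed"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            stem = token[: -len(suffix)]
            if stem[-1] == stem[-2] and stem[-1] not in "aeiou":
                stem = stem[:-1]
            return stem
    return token

def analyze(text):
    """Lowercase, drop stopwords, map synonyms, then stem each token."""
    tokens = text.lower().split()
    tokens = [t for t in tokens if t not in STOPWORDS]
    tokens = [SYNONYMS.get(t, t) for t in tokens]
    return [toy_stem(t) for t in tokens]

print(analyze("The trimmed TV"))
print(analyze("stemming for it"))
```

Running the same function over both the indexed text and the query text is what lets a search for "trim" find a document that says "trimmed".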
Configuring Solr

•   schema.xml - Contains all of the details
    about document structure, index-time and
    query-time processing
•   solrconfig.xml - Contains most of the
    parameters for configuring Solr itself
QUERY SYNTAXES (RDBMS)

SELECT * FROM post WHERE
 (topic LIKE '%apache%' OR author LIKE '%bambr%')
 OR (topic LIKE '%solr%' OR author LIKE '%frank%')
ORDER BY id DESC

QUERY SYNTAXES (SOLR)
 topic:"The Right Way" AND author:WrongGuy
Querying Solr 1

•   Plain text search
    q=text:"I love android"
•   Expanding the search to more fields
    q=title:android AND type:review AND price:[* TO 500]
•   Add facets
    facet=true&facet.field=product&facet.field=rating
•   Ordering results
    sort=score desc, price asc
Querying Solr 2

•   Add facets for range queries
    facet.query=price:[* TO 100]
       &facet.query=price:[100 TO 200]
       &facet.query=price:[500 TO *]
•   Limiting results
    rows=15
•   Paginating results
    start=25&rows=10
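
Since start is a zero-based offset rather than a page number, a small helper is handy when paginating. A minimal sketch; the function name and the one-based page convention are assumptions for illustration.

```python
def paging_params(page, rows=10):
    """Translate a 1-based page number into Solr's start/rows parameters."""
    if page < 1 or rows < 1:
        raise ValueError("page and rows must be positive")
    return {"start": (page - 1) * rows, "rows": rows}

print(paging_params(1))
print(paging_params(3, rows=15))
```

The first page starts at offset 0, and page 3 with 15 rows per page starts at offset 30.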
Querying Solr 3

Advanced query parameters:
•   fq: filter query, restricts results without affecting the score
    fq=type:review&fq=price:[* TO 500]
•   fl: restricts the fields to be returned
    fl=id,title,text
•   hl: highlights matches and generates snippets
    hl=true&hl.fl=title,text
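
Putting the parameters together, a request is just an HTTP GET with a query string. The sketch below only builds the URL with Python's standard library; the host, port and path in BASE_URL are assumptions and will differ per installation.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; adjust to your own Solr installation.
BASE_URL = "http://localhost:8983/solr/select"

def build_select_url(base_url, params):
    """URL-encode the parameters; repeated keys (e.g. fq) are allowed."""
    return base_url + "?" + urlencode(params)

params = [
    ("q", 'text:"I love android"'),
    ("fq", "type:review"),
    ("fq", "price:[* TO 500]"),
    ("fl", "id,title,text"),
    ("hl", "true"),
    ("hl.fl", "title,text"),
    ("sort", "score desc, price asc"),
    ("start", "0"),
    ("rows", "15"),
]

url = build_select_url(BASE_URL, params)
print(url)

# Decoding the query string again shows what the server would receive.
decoded = parse_qs(urlsplit(url).query)
print(decoded["fq"])
```

Using a list of pairs rather than a dict keeps both fq values, since a dict would silently drop the duplicate key.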
Solr Caching

•   External caching: Memcached, etc.
•   Internal caching
    Different types of cache:
    1) filterCache: used by filter queries (fq),
       sometimes for faceting too
    2) queryResultCache: used for results
       returned by generic queries
    3) documentCache: used for stored document fields
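
To illustrate why a filter cache helps, here is a conceptual sketch of a small least-recently-used cache keyed by the filter query string. It is not Solr's cache implementation; the class, its eviction policy and the document ID sets are assumptions for illustration.

```python
from collections import OrderedDict

class LRUCache:
    """Keep at most max_size entries, evicting the least recently used."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # drop the oldest entry

# Filter query string -> set of matching document IDs (made-up values).
filter_cache = LRUCache(max_size=2)
filter_cache.put("type:review", {1, 3})
filter_cache.put("price:[* TO 500]", {2, 3})
filter_cache.get("type:review")        # touch: now most recently used
filter_cache.put("type:news", {4})     # evicts "price:[* TO 500]"

print(filter_cache.get("type:review"))
print(filter_cache.get("price:[* TO 500]"))
```

A repeated filter such as type:review is answered from memory, so only the main query has to be evaluated again.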
Books
Skype: dgolovko
dimtkg@gmail.com

Enterprise Search Solution: Apache SOLR. What's available and why it's so cool

  • 1. Apache SOLR Enterprise Search Solution (overview)
  • 2. Enterprise Search Server The criteria ... •Fast •Flexible •Powerful •Scalable •Relevant Results •Production ready & Easy deployment
  • 3. Why SOLR • Greater control over your website search • Caching, Replication, Distributed search • Really fast Indexing/Searching, Indexes can be merged/optimized (Index compaction) • Great admin interface can be used over HTTP • Awesome community support • Support for integration with various other products
  • 4. SOLR Powered http://wiki.apache.org/solr/PublicServers/ • whitehouse.gov • eBay • Instagram • The Guardian • Apple • Netflix • NASA • Shopper • CISCO • News.com • Disney • digg • Sears • AOL
  • 5. What is SOLR? • Very fast full-text search engine http://lucene.apache.org/solr/ • Based on Apache Lucene, a high-performance, full-featured text search engine library written entirely in Java. In brief, Apache Solr exposes Lucene's Java API as REST-like APIs which can be called over HTTP from any programming language/platform
  • 6. Features • Full Text Search • Faceted navigation • More items like this(Recommendation)/ Related searches • Spell Suggest/Auto-Complete • Custom document ranking/ordering • Snippet generation/highlighting • Geospatial Search
  • 10. More Features ... • Database integration • Rich document (Word, PDF) handling • REST-like HTTP/XML, JSON APIs (so you can code in virtually any language) • Flexible configuration • Extensive plugin architecture for advanced customization • Scalable distributed search, dynamic clustering, index replication
  • 11. App Server Support • Apache Tomcat • Jetty • Resin • WebLogic™ • WebSphere™ • GlassFish • dmServer™ • JBoss™ ... and many more
  • 12. SOLR History • Developed at CNET Networks by Yonik Seeley • Donated to ASF (Apache Software Foundation) in early 2006 • Incubation period ended in January 2007 (v1.2 released) • Solr is now maintained as a subproject of Lucene
  • 13. Solr • Only one table (documents). No joins. • Each row is a document • A document can have multiple fields and fields can have multiple values – e.g. Tags, Categories, ... • Fast for search (finding the documents) • Slow when returning large sets of data • Can scale to many millions of documents
  • 14. Solr Architecture • Servlet container: Jetty, Tomcat ... any :) – Handles HTTP • Solr – Connectivity between the servlet container and Lucene • Lucene – Full-text search framework
  • 16. How Lucene Works
      • Regular indexes repeat index data for each row

        key     ID
        banana  1
        banana  2
        banana  3
        cat     2
        cat     3
        dog     1
        dog     3

      • Inverted indexes reference the term once and then the matching documents

        Term    IDs
        banana  1,2,3
        cat     2,3
        dog     1,3
  • 17. Inverted Index Matching
      • Lucene uses bit vectors to quickly find all documents with terms

        Term    IDs
        banana  1,2,3
        cat     2,3
        dog     1,3

        cat banana
        Document  1 2 3
        cat       0 1 1
        banana    1 1 1
        Match     0 1 1

        dog cat
        Document  1 2 3
        dog       1 0 1
        cat       0 1 1
        Match     0 0 1
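
The matching step on this slide can be sketched in a few lines of Python. This is a toy illustration, not Lucene's actual implementation: the three sample documents are invented so that their postings reproduce the slide's table, and the names `build_inverted_index`, `to_bits`, and `match_all` are hypothetical. Python integers stand in for bit vectors.

```python
from collections import defaultdict

# Toy corpus (an assumption): chosen so the postings match the slide's table.
DOCS = {
    1: "banana dog",
    2: "banana cat",
    3: "banana cat dog",
}

def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def to_bits(doc_ids):
    """Encode a postings list as an int bit vector (bit i set = doc i matches)."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits

def match_all(index, terms):
    """AND the bit vectors of all terms and decode the surviving doc IDs."""
    if not terms:
        return []
    result = to_bits(index.get(terms[0], []))
    for term in terms[1:]:
        result &= to_bits(index.get(term, []))
    return [i for i in range(result.bit_length()) if (result >> i) & 1]

INDEX = build_inverted_index(DOCS)
print(match_all(INDEX, ["cat", "banana"]))  # [2, 3]
print(match_all(INDEX, ["dog", "cat"]))     # [3]
```

The two printed results correspond to the "Match" rows of the slide's two tables: documents 2 and 3 for "cat banana", and document 3 only for "dog cat".
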
  • 18. Scoring
      • Now that the documents are found, in what order should they be shown?
      • Lucene uses TF-IDF (Term Frequency-Inverse Document Frequency) to score the documents

        Term           IDs
        banana {1.28}  1 {2}, 2 {5}, 3 {1}
        cat {1.60}     2 {4}, 3 {2}
        dog {1.60}     1 {1}, 3 {6}
  • 19. Scoring Notes • The goal of scoring is: • To boost the importance of documents where the word is mentioned often • To boost the importance of rare words (that don’t appear in many documents) • Solr also supports term boosts to increase the importance of one term over another
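
A minimal sketch of the two scoring goals, using the per-document term counts from the Scoring slide. The slides do not say which formula produced the weights shown there, so this sketch assumes the classic Lucene-style definitions (tf = square root of the count, idf = 1 + ln(N / (df + 1))) and will not reproduce those exact numbers. The function names and the `boosts` argument are hypothetical.

```python
import math

# Term counts per document, copied from the Scoring slide:
# term -> {doc_id: occurrences}
POSTINGS = {
    "banana": {1: 2, 2: 5, 3: 1},
    "cat": {2: 4, 3: 2},
    "dog": {1: 1, 3: 6},
}
NUM_DOCS = 3

def idf(term):
    """Rare terms weigh more (assumed formula: 1 + ln(N / (df + 1)))."""
    doc_freq = len(POSTINGS.get(term, {}))
    return 1.0 + math.log(NUM_DOCS / (doc_freq + 1))

def tf(term, doc_id):
    """Frequent mentions weigh more, dampened by a square root (assumed)."""
    return math.sqrt(POSTINGS.get(term, {}).get(doc_id, 0))

def score(query_terms, doc_id, boosts=None):
    """Sum tf * idf over the query terms, with optional per-term boosts."""
    boosts = boosts or {}
    return sum(tf(t, doc_id) * idf(t) * boosts.get(t, 1.0) for t in query_terms)

def rank(query_terms, boosts=None):
    """Return the IDs of documents matching any query term, best first."""
    doc_ids = set()
    for term in query_terms:
        doc_ids |= set(POSTINGS.get(term, {}))
    return sorted(doc_ids, key=lambda d: score(query_terms, d, boosts), reverse=True)

print(rank(["cat", "dog"]))  # [3, 2, 1]
```

Document 3 ranks first because it contains both terms, and "banana", which appears in every document, gets a lower idf than "cat" or "dog".
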
  • 20. Stemming, Stopwords, Synonyms • Terms are trimmed of suffixes: trimmed -> trim, stemming -> stem • Stopword filtering removes common words that carry little meaning: the, and, for, it, ... • This is done with both the words in the document and the query terms • Solr supports searching with a predefined list of synonyms
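
The analysis steps on this slide can be illustrated with a toy pipeline. This is only a sketch: real Solr analyzers use proper stemming algorithms and configurable word lists, whereas the suffix rule, the stopword set, and the synonym map below are simplified assumptions, and the function names are hypothetical. The same `analyze` function would be applied to document text and to query text so that both produce the same terms.

```python
import re

STOPWORDS = {"the", "and", "for", "it"}  # the examples listed on the slide
SYNONYMS = {"kitty": "cat"}              # hypothetical synonym list

def stem(word):
    """Naive stemmer: strip "ing"/"ed", then undouble a trailing consonant."""
    for suffix in ("ing", "ed"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # "trimm" -> "trim", "stemm" -> "stem"
            if word[-1] == word[-2] and word[-1] not in "aeiou":
                word = word[:-1]
            break
    return word

def analyze(text):
    """Lowercase, tokenize, drop stopwords, map synonyms, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t not in STOPWORDS]
    return [stem(SYNONYMS.get(t, t)) for t in kept]

print(analyze("The cat trimmed it and kept stemming"))
# ['cat', 'trim', 'kept', 'stem']
```
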
  • 21. Configuring Solr • schema.xml – Contains all of the details about document structure, index-time and query-time processing • solrconfig.xml – Contains most of the parameters for configuring Solr itself
  • 22. Query Syntaxes
      • RDBMS: SELECT * FROM post WHERE (topic LIKE '%apache%' OR author LIKE '%bambr%') OR (topic LIKE '%solr%' OR author LIKE '%frank%') ORDER BY id DESC
      • Solr: Topic:"The Right Way" AND author:WrongGuy
  • 23. Querying Solr 1 • Plain text search: q=text:"I love android" • Expanding search to more fields: q=title:android AND type:review AND price:[* TO 500] • Add facets: facet.field=product&facet.field=rating • Ordering results: sort=score desc, price asc
  • 24. Querying Solr 2 • Add facets for range queries: facet.query=price:[* TO 100]&facet.query=price:[100 TO 200]&facet.query=price:[500 TO *] • Limiting results: rows=15 • Paginating on results: start=25&rows=10
  • 25. Querying Solr 3 Advanced query parameters: • fq: Filter query: fq=type:review AND price:[* TO 500] • fl: Restrict fields to be returned: fl=id,title,text • hl: Highlighting matches in snippets, snippet generation etc.: hl=true&hl.fl=title,text
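
The parameters from the three "Querying Solr" slides can be combined into one request URL. The sketch below only builds the URL and does not contact a server. The endpoint `http://localhost:8983/solr/select` and the helper name `build_query_url` are assumptions for illustration; the field names are the ones used on the slides. `facet=true` is added because Solr needs it to enable faceting, and `wt=json` asks for a JSON response.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port, and core path for a real install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"

def build_query_url(text, page=1, rows=10):
    """Build a select URL with filters, facets, sorting, paging, highlighting."""
    params = [
        ("q", f'text:"{text}"'),                  # phrase search (text must not contain quotes)
        ("fq", "type:review"),                    # filter queries narrow the result set
        ("fq", "price:[* TO 500]"),
        ("fl", "id,title,text"),                  # fields to return
        ("sort", "score desc,price asc"),
        ("start", (page - 1) * rows),             # zero-based offset of the first result
        ("rows", rows),                           # page size
        ("facet", "true"),                        # enables the facet.* parameters below
        ("facet.field", "product"),
        ("facet.field", "rating"),
        ("facet.query", "price:[* TO 100]"),
        ("facet.query", "price:[100 TO 200]"),
        ("facet.query", "price:[500 TO *]"),
        ("hl", "true"),                           # highlighting
        ("hl.fl", "title,text"),
        ("wt", "json"),                           # response format
    ]
    # urlencode percent-encodes spaces, quotes, colons, and brackets.
    return SOLR_SELECT_URL + "?" + urlencode(params)

print(build_query_url("I love android", page=3))
```

Passing the parameters as a list of pairs rather than a dict keeps repeated keys such as `fq`, `facet.field`, and `facet.query`, which Solr accepts multiple times in one request.
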
  • 26. Solr Caching • External caching: Memcached, etc. • Internal caching, with different types of cache: 1) FilterCache: used by filter queries (fq), sometimes for faceting too 2) QueryResultCache: used for results returned by generic queries 3) DocumentCache
  • 27. Books