Leveraging Publisher’s Search Engines to Deliver Relevant Results to Users
Presented by Abe Lederman, President and CTO
Deep Web Technologies, LLC

28th Annual Scholarly Publishing Meeting – Virginia – June 9, 2006
Abe’s Background
• Earned B.S. and M.S. Computer Science degrees, MIT
• 18 years of experience developing sophisticated
  information retrieval applications
• Cofounded Verity, 1988
• Consulted to LANL, 1994-2000
• Deployed first “federated search” portal in the Federal
  government, 1999
• Founded Deep Web Technologies (DWT), 2002
DWT is a New Mexico-based company focused on providing
state-of-the-art software solutions that search, retrieve,
aggregate, and analyze content from web-based databases.
The Problem:
Searching a large number of sources can lead to a flood of results
Relevance ranking begins as soon as the user clicks the Search button
Ranking Recipe
INGREDIENTS
 Source Selection
 Query Language
 Search Conductor
 Ranking Algorithms

MIX WELL AND SERVE UP
RELEVANT RESULTS
Source Selection Optimizer
[Diagram: Search Conductor (top); Source Selection Optimizer (middle); Source Descriptions and Previous Results (bottom)]
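A minimal sketch of how a source selection optimizer might work, assuming (as the diagram's boxes suggest) that it draws on source descriptions and previous results. The `Source` fields, the scoring formula, and the sample sources are all hypothetical illustrations, not DWT's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Source:
    name: str
    description: str        # free-text source description (hypothetical field)
    past_searches: int = 0  # earlier searches sent to this source
    past_good: int = 0      # how many of those brought back good results


def select_sources(query: str, sources: List[Source], limit: int = 3) -> List[str]:
    """Pick the sources most likely to be worth searching for this query."""
    terms = set(query.lower().split())

    def score(src: Source) -> float:
        words = set(src.description.lower().split())
        overlap = len(terms & words) / len(terms) if terms else 0.0
        # Assumption: sources with no history get a neutral 0.5 success rate.
        history = src.past_good / src.past_searches if src.past_searches else 0.5
        return overlap * (0.5 + 0.5 * history)

    scored = [(score(s), s.name) for s in sources]
    # Drop sources whose description shares no term with the query.
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: p[0], reverse=True)
    return [name for _, name in ranked[:limit]]


sources = [
    Source("PhysicsJournals", "physics optics laser research", 10, 8),
    Source("MedAbstracts", "medicine clinical trials", 10, 9),
    Source("PatentDB", "patents laser engineering"),
]
selected = select_sources("laser optics", sources, limit=2)
```

Here the medical source is skipped entirely because its description has nothing in common with the query, which is the kind of unnecessary search the optimizer is meant to avoid.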
Powerful Query Language
• Takes advantage of search capabilities of
  each source
• Supports full Boolean operators where
  possible
• Supports fielded search
• Translates natural language questions into
  query syntax
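The first three bullets can be sketched as a per-source translation step. This is only an illustration under stated assumptions: `SourceSyntax`, its fields, and the two example syntaxes are hypothetical, and the natural-language translation in the last bullet is not covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceSyntax:
    """Hypothetical description of what one source's search form accepts."""
    supports_boolean: bool
    field_template: str    # how a fielded search is written for this source
    and_op: str = " AND "


def translate(field: str, terms: List[str], syntax: SourceSyntax) -> str:
    """Render one fielded query in the syntax of a particular source."""
    if syntax.supports_boolean:
        joined = syntax.and_op.join(terms)   # use Boolean operators where possible
    else:
        joined = " ".join(terms)             # fall back to plain keywords
    return syntax.field_template.format(field=field, terms=joined)


source_a = SourceSyntax(supports_boolean=True, field_template="{field}:({terms})")
source_b = SourceSyntax(supports_boolean=False, field_template="{terms}")

query_a = translate("title", ["federated", "search"], source_a)
query_b = translate("title", ["federated", "search"], source_b)
```

The same user query becomes `title:(federated AND search)` for the first source and the plain keywords `federated search` for the second, which supports neither fields nor Boolean operators.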
Search Conductor
• Select sources to search
• Perform Search
• Enough good results?
  YES: Deliver results to user
  NO: Can I get more results from “good” sources?
    YES: Get Next Results
    NO
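The flowchart can be read as a loop. The sketch below is one possible reading, not the presenter's code: each source is modeled as an iterator of result pages, each result is reduced to a relevance score, and the names `conduct_search`, `enough`, and `good_threshold` are hypothetical. It also assumes that when no "good" source has more to offer, whatever has been gathered is delivered.

```python
from typing import Dict, Iterator, List


def conduct_search(
    sources: Dict[str, Iterator[List[float]]],
    enough: int = 5,
    good_threshold: float = 0.5,
) -> List[float]:
    """Search all sources, then keep paging only those returning good results."""
    good: List[float] = []
    active = dict(sources)  # sources still worth asking for more results

    while active:
        still_good = {}
        for name, pages in active.items():
            page = next(pages, None)      # perform search / get next results
            if page is None:
                continue                  # this source is exhausted
            hits = [score for score in page if score >= good_threshold]
            good.extend(hits)
            if hits:                      # only "good" sources are paged again
                still_good[name] = pages
        if len(good) >= enough:           # enough good results?
            break
        active = still_good               # more results from "good" sources?

    return sorted(good, reverse=True)     # deliver results to user


pages_a = iter([[0.9, 0.8], [0.7, 0.6], [0.95]])
pages_b = iter([[0.1, 0.2], [0.99]])
delivered = conduct_search({"A": pages_a, "B": pages_b}, enough=4)
```

Source B returns nothing useful on its first page, so it is never asked for a second one; source A is paged until four good results are in hand.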
Challenges in Organizing and Ranking Results
• Multi-tier Relevance Ranking
• User-driven Ranking
• Clustering of Results
Multi-tier Relevance Ranking
• QuickRank – Ranks results based
  on occurrence of search terms in
  title, author, and snippet

• MetaRank – Ranks results utilizing
  custom algorithms applied to meta-data

• DeepRank – Downloads and indexes
  full-text documents (HEAVY LIFTING REQUIRED!)
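A minimal sketch of the first tier only, assuming QuickRank amounts to counting search-term occurrences in title, author, and snippet. The field weights are hypothetical, and matching is on whitespace-separated words, so punctuation attached to a word prevents a match.

```python
from typing import Dict, List, Optional


def quick_rank(
    query: str,
    results: List[Dict[str, str]],
    weights: Optional[Dict[str, float]] = None,
) -> List[Dict[str, str]]:
    """Order results by weighted term occurrences in title, author, snippet."""
    # Assumed weights: a hit in the title counts more than one in the snippet.
    weights = weights or {"title": 3.0, "author": 2.0, "snippet": 1.0}
    terms = query.lower().split()

    def score(result: Dict[str, str]) -> float:
        total = 0.0
        for field, weight in weights.items():
            words = result.get(field, "").lower().split()
            total += weight * sum(words.count(term) for term in terms)
        return total

    return sorted(results, key=score, reverse=True)


results = [
    {"title": "Cooking pasta", "author": "A. Chef",
     "snippet": "search for flavor"},
    {"title": "Federated search basics", "author": "B. Search",
     "snippet": "federated search explained"},
]
ranked = quick_rank("federated search", results)
```

Because it needs only the fields already present in a result list, a tier like this can run as soon as results arrive, without downloading any documents.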
User-driven Ranking

• Credibility of source
• Date range
• Document length
• Document type
• Geographic proximity
• Popularity of document
• Reading level
• Relevance

Desired: Blending (weighting) of above criteria
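One simple way to blend criteria is a weighted average. This sketch assumes each criterion has already been normalized to a 0 to 1 score; the function name, the criterion keys, and the sample values are hypothetical.

```python
from typing import Dict


def blended_score(criteria: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-criterion scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight == 0:
        return 0.0
    weighted = sum(w * criteria.get(name, 0.0) for name, w in weights.items())
    return weighted / total_weight


doc_a = {"relevance": 0.9, "credibility": 0.4, "popularity": 0.2}
doc_b = {"relevance": 0.6, "credibility": 0.9, "popularity": 0.8}

# Two users expressing different priorities over the same criteria.
relevance_first = {"relevance": 3.0, "credibility": 1.0}
credibility_first = {"relevance": 1.0, "credibility": 3.0}

a_for_relevance = blended_score(doc_a, relevance_first)
b_for_relevance = blended_score(doc_b, relevance_first)
a_for_credibility = blended_score(doc_a, credibility_first)
b_for_credibility = blended_score(doc_b, credibility_first)
```

With the same two documents, the user who favors relevance sees document A first, while the user who favors credibility sees document B first.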
Clustering
Attributes of Successful Federated Search
• Powerful query language that takes
  advantage of publisher search capabilities
• Source selection optimizer will reduce
  unnecessary searches
• Search conductor gets more results from
  sources bringing back good results
• A tool that highlights best search results
• Caching of search results
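The caching bullet could be realized in many ways; the sketch below assumes a small in-memory cache keyed on source and query, with entries expiring after a fixed time. `ResultCache`, its methods, and the five-minute default are hypothetical choices for illustration.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class ResultCache:
    """Keeps result lists per (source, query) for a limited time."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._store: Dict[Tuple[str, str], Tuple[float, List[str]]] = {}

    @staticmethod
    def _key(source: str, query: str) -> Tuple[str, str]:
        # Normalize so "Laser Optics " and "laser optics" share one entry.
        return (source, query.strip().lower())

    def put(self, source: str, query: str, results: List[str]) -> None:
        self._store[self._key(source, query)] = (self.clock(), results)

    def get(self, source: str, query: str) -> Optional[List[str]]:
        key = self._key(source, query)
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, results = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[key]  # expired: force a fresh search
            return None
        return results


fake_now = [0.0]
cache = ResultCache(ttl_seconds=10.0, clock=lambda: fake_now[0])
cache.put("PhysicsJournals", "Laser Optics ", ["doc1", "doc2"])
fresh = cache.get("PhysicsJournals", "laser optics")
fake_now[0] = 11.0
expired = cache.get("PhysicsJournals", "laser optics")
```

A repeated query within the time limit is answered from memory instead of being sent to the publisher's search engine again.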
Advice for Publishers
• Use good search engines with good
  relevance ranking
• Return 100 or more results at a time
• Return meta-data (author, journal, snippet)
  as part of result list
• Provide access to your content through
  XML Gateway or Web Services
• Speed up search time
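To show why returning meta-data through an XML gateway helps a federated search client, here is a sketch that parses a result list with Python's standard `xml.etree.ElementTree`. The XML layout, element names, and sample records are invented for illustration and do not represent any publisher's actual gateway format.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Made-up response in a hypothetical gateway format.
SAMPLE_RESPONSE = """
<results>
  <result>
    <title>Federated search in practice</title>
    <author>J. Doe</author>
    <journal>Journal of Examples</journal>
    <snippet>An overview of federated search.</snippet>
  </result>
  <result>
    <title>Ranking without full text</title>
    <author>R. Roe</author>
    <journal>Example Letters</journal>
  </result>
</results>
"""

FIELDS = ("title", "author", "journal", "snippet")


def parse_results(xml_text: str) -> List[Dict[str, str]]:
    """Turn an XML result list into dictionaries, one per result."""
    root = ET.fromstring(xml_text.strip())
    return [
        # Missing elements become empty strings rather than errors.
        {field: (item.findtext(field) or "").strip() for field in FIELDS}
        for item in root.findall("result")
    ]


records = parse_results(SAMPLE_RESPONSE)
```

With structured fields like these, the client can rank on title, author, and snippet directly instead of scraping them out of an HTML results page.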
Thank You!


Abe Lederman
301 N Guadalupe, Ste 201
Santa Fe, NM 87501
abe@deepwebtech.com
www.deepwebtech.com
