SlideShare une entreprise Scribd logo
1  sur  43
Télécharger pour lire hors ligne
Federated Search in a Disparate
                                 Environment




                                  PREPARED FOR:
                                 SLA Webinar Series
                         Evidence-Based Practice in Libraries
2040 Corbett Rd
Monkton, Md 21111

(410.472.4631
                               Helen L. Mitchell Curtis
* hmitchell5@gmail.com
                          Principal, Enterprising Solutions


                                  September 9, 2009
Enterprising Solutions

                                                           Biography
                             Helen L. Mitchell Curtis – Principal, Enterprising
                               Solutions

                              32+ years at FDA leading one of the largest
                               enterprise search implementations among Civilian
                               Federal Agencies
                              Develop enterprise-wide search strategies &
                               solutions
                              Integrate search technologies across IT
                               applications and disparate document repositories
                              Build governance, management and end user
                               buy-in
                              Promote collaboration, standards, findability and
                               improved organization of data and document
                               assets
                              Passion – to help clients to reduce costs, improve
                               quality and efficiency, reduce 'pain points' and
                               achieve a positive search experience

2
Enterprising Solutions

                                                    Polling Question

    • What is Your Role? (select all that apply, if group participants)

       •   CIO, Executive Director
       •   Library Director (Corporate, Gov’t, Academia, Solo)
       •   Librarian/Information Management Professional
       •   IT Professional or Consultant
       •   Project/Product Manager
       •   Sales/Marketing/Communications
       •   End User (i.e., Scientist, Researcher, Engineering Professional)
       •   Federated Search Vendor
       •   Other




3
Enterprising Solutions

                                        Agenda

    1. Terms Clarified
    2. Types of Federated Search (FS)
    3. FS Challenges & Benefits
    4. FDA Case Study
    5. FS Evaluation Criteria
    6. Examples of FS Solutions
    7. Live Federated Search Demo
    8. Best Practices
    9. Future Vision
    10. Questions & Answers


4
1.   Definition by AIIM Market IQ
                                  2.   Definition by CMS Watch
       Enterprising Solutions

                                                                                        Clarify Terms
                                  3.   A Federated Search Primer – Part II
                                  4.   Deep Web Technologies
                                  5.   Federated Search Rpt & Toolkit-Jill Hurst-Wahl



                                • Reliable and complete retrieval of content based on user need,
                                  i.e. everything relevant is recalled (recall) while simultaneously
    Findability                   returning only that content relevant to the user’s focus
                                  (precision), thus eliminating the review of irrelevant content by the
                                        1
                                  user.

                                • Systems…within an organization…seeking information held
    Enterprise                    internally…in a variety of formats and locations, including
     Search                       databases, document management systems, and other
                                               2
                                  repositories. Content is pre-indexed, simultaneously searched,
       (ES)                       and displayed to authorized users.

                                • The process of performing a simultaneous real-time search of
     Federated                    multiple diverse and distributed sources from a single search
                                                                                                 3
                                  page, with the federated search engine acting as intermediary.
    Search (FS)
                                • The set of web-sites and their documents that cannot be accessed
                                  via crawler-type search engines such as Google. Deep web content
    Deep Web                      typically lives inside of databases, and is accessed through search
                                          4
                                  forms. It is also referred to as the Hidden or Invisible Web.


                                • SW written to access a content source that must know the URL of
    Connector                     the source, how to send search commands, its search syntax, &
                                                                                            5
                                  how to process the search results returned from a source.
5
Enterprising Solutions

                                               Polling Question
     Information Accessibility (select all that apply)

           1. I can easily find information to do my job
           2. Less than 50% of our organization’s info is searchable online
           3. More than 50% of our organization's info is searchable online
           4. I reference less than 5 systems (info sources) in any given
              week
           5. I reference 5 or more systems (info sources) in any given
              week




6
Enterprising Solutions

                                                           Findability Issues
     AIIM Market IQ Research on Findability (of 528 end users):
         50% believe Findability in their organization is ―Worse to Much Worse‖
          than their consumer-facing web sites
         49% have no formal goal for Enterprise Findability within their
          organizations
         49% ―Agreed or Strongly Agreed‖ that finding the information to do their
          job is difficult and time consuming
         69% believe less than 50% of their organization's information is
          searchable online
         36% reference five or more systems in any given week




7

                                  Source: AIIM Market Intelligence, 2008
Enterprising Solutions

                                  Why Use Federated Search

    To increase findability to better accomplish business objectives.

    To issue a single query across multiple content sources through a common
    search interface.

    When not feasible to re-index all of the content available from large public
    sites like PubMed.

    To increase user awareness of all content sources such as deep web for
    scientific, technical and business content.

    To eliminate using multiple database search protocols & passwords.


    When don‘t have the rights to index the content (e.g. subscription sites).

    Real-time search: for content constantly being updated & impractical to
8   keep the data as timely as it needs to be.
Federated Search Sources
           Enterprising Solutions



                                                                        (examples)



    Reason                               Corporate   Academic   Gov’t      Public
                                                                           Library
    Subscription Databases                  X           X        X             X
    Internal or External Repositories       X                    X
    Library Catalog(s)                      X           X        X             X
    News                                                X        X
    Digitized Material                                  X        X             X
    Blogs & Wikis                                       X        X             X
    Intranet/Internet Sites                             X        X
    Industry Specific Sources                                    X
    DB‘s available to customers                                  X             X
    Historical Collections                                                     X



9
Enterprising Solutions

                              Typical Non-Federated Search




10                                             Courtesy of MuseGlobal, Inc.
Enterprising Solutions

                              Typical Federated Search




11
                                           Courtesy of MuseGlobal, Inc.
Enterprising Solutions

                                Federated „Master Index‟ Search
      Index multiple data sources content into a single master index
           Queries & results come from that one master index

      Many Enterprise Search products integrate FS via ‗connectors‘ to
       accomplish this (ex., FAST, Autonomy, Endeca)




12                                                  Source: New Idea Engineering, Inc.
Enterprising Solutions

                                Federated „Data Silos‟ Search
     ‗Search Federator‘ processes queries for each data source silo
     Transforms search terms to match each content source requirements
     Submits query to each of the sources simultaneously
     Merges each source‘s results together - single look & feel
     Maintains no indices of its own, relies on linked systems capabilities




13
                                                   Source: New Idea Engineering, Inc.
Enterprising Solutions

                                      Surface vs. Deep Web Search

        Popular search engines (Google, Yahoo…) ―crawl‖ surface web

        FS can drill down to the deep web where specialized content (i.e.,
        scientific and technical databases) reside


     Deep Web FS Examples:
     www.completeplanet.com -
     70,000+ searchable DBs & specialty
     search engines
     www.science.gov- federates U.S.
     federal agency science info
     http://imlsdcc.grainger.uiuc.edu/ -
     Institute of Museum & Library
     Services (IMLS) - Digital Collections
     & Content w/descriptions of digital
     resources developed by IMLS
     grantees


14
                                              Source: Juanico-Environmental Consultants, Ltd.
Enterprising Solutions

                                              Vertical Search Engine

      Closely related to Deep Web – searches for a particular niche i.e.,
       a specific industry, topic, type of content (e.g., scientific research,
       travel, movies, images, blogs)
           Example: www.vetseek.info - is a search engine focusing on veterinary science and
              related topics




15
Enterprising Solutions

                                           Polling Question

     Federated Search Solutions (select one)

     1. We are currently conducting an evaluation to procure a
        Federated Search Product
     2. We currently have a Federated Search Solution installed that
        satisfies our requirements
     3. We have a Federated Search Solution by are considering
        replacing it or enhancing its capabilities & features




16
Enterprising Solutions

                                                                     Challenges

                         Authentication
                               Showing each record‘s branding and copyright information
                               Licensed or subscription databases
                         True De-duplication
                               Virtually impossible because DBs return 10-20 results at a
                                time
                               Vendors usually just de-dupe the first results set returned
                         Security
                               Mapping user credentials and access rights to each
                                repository security model
                         Speed
                               Limited by slowest search engine‘s performance




17
Enterprising Solutions

                                                      Challenges   (continued)




      Lack of data standardization
         Each source has a unique access method & needs
          translation
         Metadata mapping between FSS and underlying systems
      Access methods to sources may change
         Requires an interface rewrite or modification
      Rules for error handling
         Ex. Query term not available—exclude the query, the
          repository, or proceed without the term?
         Ex. Timeouts or connection problem
      Complex searches usually not available
         Fielded searches
      Known Items, i.e. Article Name
         Best to directly search database
18
Enterprising Solutions

                                                             Challenges               (continued)




                          Relevancy scores
                               Can‘t identify a single relevancy ranking model
                                   Relevancy rankings for repository‘s results refers to its own
                               May be not be useful when comparing the results with
                                those from another system
                          Access to content stored in a variety of
                           places
                               Results page may not let user obtain identified documents
                                   This may involve a built-in viewer or invoking the owning
                                    product‘s interface.

                          Combining navigators from each result set
                               i.e., faceted search, taxonomies and auto-generate
                                clusters
                          Selecting the right FS engine
                               Depends on business goals, type of content sources –
                                structured vs. unstructured, licensed/subscriptions
19
Enterprising Solutions

                                                                             Benefits

     • Single master index
          •   Quicker response times
          •   No need to access original data sources
          •   Relevancy algorithms applied uniformly
          •   Dynamic navigators are available for all documents
     • Time savings
          • Searches many sources at one time
          • Combines results into a single results page
     • Quality of results
          • Client selects the sources to search
     • Minimum impact on the data silos
          • Only accessed when a user performs a query
                  • Eliminates increased load crawling/indexing the data source


20
Enterprising Solutions

                                                               Benefits        (continued)




     • Improve productivity
        • Reduces number of searches executed to find relevant results
        • Save, reuse, schedule, and share effective search queries
     • Leverage security controls at queried source
        • Access repositories secured against crawls but can be accessed
          by search queries
     • Reduce costs
        • No additional capacity requirements for content index since its
          not crawled by search server
     • Most current content
        • Real time searches - as soon as the source is updated, the info is
          available to the searcher on the very next query
     • Increase awareness
        • Identify most relevant sources to search based on # of results
          each source produced

21
Enterprising Solutions
                                    FDA Case Study Success
                                        (Federated „Master Index‟ Search System)

     ACTIONS                         RESULT
     Started small with high ‘pain Increased productivity & popularity.
     points’.
     Modified business processes. Standardized nomenclature improved
                                  efficiencies.
     Users across organization       Produced more timely & QUALITY
     could find content in silos.    work products.
     Indexed structured &            Grew from 1 repository of 500 docs
     unstructured content with       to 50 with 30 million docs. Accessed
     document level security.        on ‘need to know’ basis.
     Introduced standardized         Reduced development time & costs.
     search web services into        Increased mgmt & user acceptance.
     applications.                   Integrated in more applications.
     Increased user awareness        Used more & content added. Search
     with training, newsletters &    requirements now captured at
     meetings.                       BEGINNING of project development.

22
Enterprising Solutions

                                Evaluation Criteria Overview

                               Identify Goals
                               Create an Effective Search
                                Strategy
                               Collect Business Requirements
                               Conduct needs assessment
                               Work Closely with User
                                Community

23
Evaluation Criteria Overview
      Enterprising Solutions


                                                        (continued)




      Define Features and Functions
          Eliminate emotional decisions re: product,
           company or others using the product
      High Precision
          Return content relevant to user‘s focus
      High Recall
          Recall everything relevant to user‘s need
      Thoroughly Research
       Products, Users & Product
       Reviewers
24
Enterprising Solutions

                                            Sample Evaluation Criteria
     Rating Criteria                   Importance   Product #1   Product #1         Product #2   Product #2
                                       (Rank 1-5)   Score        Weighted Score     Score        Weighted Score
                                                    (0-100)      (Rank x Score)     (0-100)      (Rank x Score)

     Ease of Use                           5           85              425              70             350
     Ability to Customize UI               1           80               80              65             65
     Speed                                 5           90              450              85             425
     De-duplication                        4           75              300              75             300
     Clustering                            4           85              340              80             320
     Help Functionality                    3           70              210              0               0
     Alerts                                4           90              360              50             200
     # of Searchable Sources               3           90              270              80             240
     Save Selections/Citations             2           85              170              0               0
     Security                              4           90              360              85             340
     Product Cost                          5           75              375              85             425
     Vendor Credibility                    4           95              380              85             340

     Total Weighted Score                             1010             3720            760            3005

25
                                                                 -Courtesy of Federated Search Report & Tool Kit
FSS Example
     Enterprising Solutions


                                                     (uses FAST ESP – Vertical Search)
                              Features of Interest




26
FSS Example
     Enterprising Solutions


                                                         (uses MS & Vivisimo)




                              Features of Interest




27
Enterprising Solutions
                                      FSS Example
                                      (uses Deep Web Technologies)



     Features of Interest




28
Enterprising Solutions
                                                     FSS Example
                                                          (uses Webfeat)



                              Features of Interest




29
Digital Library FSS Example
     Enterprising Solutions


                                      http://www.calisphere.universityofcalifornia.edu/




                                                               Features of Interest




30
Digital Library FSS Example
     Enterprising Solutions


                                       http://www.calisphere.universityofcalifornia.edu


                  1                                 2




                                                                   3




31
Enterprising Solutions
                                                        FSS Example
                              (LibraryFind® developed by Oregon State Univ Libraries)




                                                                Features of Interest




32
Enterprising Solutions
                                  Semantic Federated Search
                                          (prototype by Collexis & Deep Web Technologies)




                                                                         SOURCES:

                                                                         •PubMed
                                                                         •NCI=Nat‘l Cancer Inst
     DeepWeb Technologies (a federated search provider) and              •DTIC=Defense Tech. Info Ctr
                                                                         •PMC=PubMed Central
     Collexis (a developer of semantic search & knowledge                •ScrDOEIB=DOE Info Bridge
     discovery solutions) teamed up to deliver the world’s first         •Eurekalert=Science News

     semantic federated search.                                          THESAURI Used:

                                                                         •MeSH
                                                                         •DTIC=Defense Tech. Info Ctr


     •How does semantic federated search work?
         •All results from your initial query are processed
         through one or more thesauri. (i.e., MeSH & DTIC.)
         •The system then returns terms that are found both in
         the top results and in the thesauri.
33
Enterprising Solutions

                                 Collexis & Deep Web Technologies
                                                         (Search Results – screenshot 1)

     Unlike clustering, which
     simply lumps together
     words that are
     frequently found near
     each other, these terms
     are being suggested
     from an expert-
     developed thesaurus
     (taxonomy) in which                                       2429 hits
     terms are meaningfully
     & consistently
     organized.


                                        The longer the
      Semantic terms.                   blue bar, the
                                        more semantic
                                        evidence found
                                        for that term.




34
Enterprising Solutions

                        Collexis & Deep Web Technologies
                                                                       (Search Results – screenshot 2)


                                                                                      •Clicking on term
                                                                                      “Mental Recall” from
                                                                                      prior screen added
                                                                                      term to search, reduced
                                                                                      relevant hits to 3; &
                                                                                      terms suggested are
                                                                                      organized.




                                                                                    •Thesaurus-based search will
                                                                                    consistently suggest terms in
                                                                                    the same organized way.
                                                                                    •Clustering changes the way it
                                                                                    organizes suggestions with
                                                                                    every query.
                                                                                    • Clustering tends to be useful
                                                                                    for very broad, general or
                                                                                    unpredictable content.

                              *Thesaurus-based semantic search tends to be better
                              when you are working consistently in knowledge
                              domains, such as medicine, physics or electronics.
35
Enterprising Solutions

                                                          Best Practices

       Strategically plan how to deliver your
       mission and just DO IT!


                Do proof of concept – demos can be
                deceiving


                      Establish common set of standards &
                      governance model


                              Measure results by establishing key
                              performance indicators

                                 Leverage lessons learned to reduce
                                 project cycles, increase trust and
                                 empower communities
36
Enterprising Solutions

                                                             Future Vision

     Personalized Search
     • A simple, persistent box on a users‘ browser, cell, or entertainment screen
       that initiates a search based on what the user was doing, their previous
       keystrokes, & perhaps using historical data.


     Better Quality of Search Results
     • Number of results retrieved, Relevance Ranking, De-Duplication


     Enterprise Mashups

     • Combine real-time searching with social networking tools, maps, etc.


     Users build the index by their searching

     • Know Web pages people display, what‘s on them & what apps are
       showing up on users' computers
37
Enterprising Solutions

                                                 Future Vision (continued)

        Query analysis & predictive modeling on the fly

        • Business users expect to access info behind company firewalls &
          from the larger web world using the same tools and consistency

         Improved Navigators, Facets, Clustering

        • Filter result sets dynamically for more relevant results


         Web of Interconnected Data

        • Automate analysis of database structures and cross-reference
          results. Ex.- Health site cross-references data from pharmaceutical
           companies with the latest findings from medical researchers


         Visualization Technologies

38
        • Enable extreme-scale knowledge discovery
Enterprising Solutions

                                                                  Resources
     1. Great resource for many Federated Search topics:
         www.federatedsearchblog.com – Author: Sol Lederman

     2. Open Source & commercial search components & tools list:
         http://tinyurl.com/l3w8of

     3. Federated Search Vendors: http://tinyurl.com/92s8qv

     4. Deep Web Databases: http://tinyurl.com/yam3sw

     5. Deep Web resources: http://www.internettutorials.net/deepweb.asp

     6. Digital Image Resources on the Deep Web: http://tinyurl.com/46vcqp

     7. Info on Vertical Search Engines: http://tinyurl.com/lpcufw

     8. 50 Niche Search Engines: http://tinyurl.com/lukxwx

     9. Library of Congress FS Portal Products/Vendors list:
         http://tinyurl.com/l6mdy8

     10. Resources to Research & Mine the Deep Web: http://tinyurl.com/6g5768
39
Enterprising Solutions

                                                                                  References
     1)   ―What‟s in a Name: Federated Search‖ – Miles Kehoe, New Idea Engineering, Inc,Vol. 4 No.4 8/07
     2)   “Federated Search Engine Article” - Online (Weston, Conn.) 28 no2 16-19 Mr/Ap 2004 (Reprint of
          article by Donna Fryer www.SearchitRight.com )
     3)   “Growing Up With Federated Search” - by Walt Warnick, OSTI
     4)   “Sophisticated Yet Simple - The Technology Behind OSTI's E-print Network: Part 3” – Walt Warnick,
          OSTI
     5)   “Vertical Search Engines & the Deep Web” - Laura B. Cohen http://www.internettutorials.net/
     6)   Blog: www.federatedsearchblog.com – by Sol Lederman
     7)   “Exploring a „Deep Web‟ that Google can‟t Grasp” - NYT 2-23-09 http://tinyurl.com/mvt42f
     8)   “Federated Search Primer, Part I-III” – by Sol Lederman
     9)   www.searchdoneright.com – by Vivisimo –Raoul – CEO & Cofounder
     10) “Enterprise Search Grows Up‟”- Podcast from BizTalk
     11) “Federation: Big Need, Still a Challenge” – Stephen Arnold, 4/25/08
     12) “The Future of Federated Search or What Will the World Look Like in 10 Years” – Rich Turner
     13) “Federated Search Report & Tool Kit” – Jill Hurst-Wahl, 10/08, © Free Pint Limited 2008




40
Enterprising Solutions

                              QUESTIONS




41
Enterprising Solutions




                                   THANK YOU!

                              Helen L. Mitchell Curtis
                              Principal
                              Enterprising Solutions


                              hmitchell5@gmail.com

                              410-472-4631(w)
                              410-259-7766(m)




42
42
Enterprising Solutions




                 Enterprising Solutions
     “Results Driven…Exceeding Expectations”




43

Contenu connexe

Tendances

InfoFusion Overview And Roadmap
InfoFusion Overview And RoadmapInfoFusion Overview And Roadmap
InfoFusion Overview And RoadmapMarten den Haring
 
Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...Peter Conradie
 
How Search 2.0 Has Been Redefined by Enterprise 2.0
How Search 2.0 Has Been Redefined by Enterprise 2.0How Search 2.0 Has Been Redefined by Enterprise 2.0
How Search 2.0 Has Been Redefined by Enterprise 2.0Enterprise 2.0 Conference
 
Everything Self-Service:Linked Data Applications with the Information Workbench
Everything Self-Service:Linked Data Applications with the Information WorkbenchEverything Self-Service:Linked Data Applications with the Information Workbench
Everything Self-Service:Linked Data Applications with the Information WorkbenchPeter Haase
 
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...Olaf Hartig
 
Needs for Data Management & Citation Throughout the Information Lifecycle
Needs for Data Management & Citation Throughout  the Information LifecycleNeeds for Data Management & Citation Throughout  the Information Lifecycle
Needs for Data Management & Citation Throughout the Information LifecycleMicah Altman
 

Tendances (9)

InfoFusion Overview And Roadmap
InfoFusion Overview And RoadmapInfoFusion Overview And Roadmap
InfoFusion Overview And Roadmap
 
Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...
 
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data ServicesNISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
NISO Forum, Denver, Sept. 24, 2012: DataCite and Campus Data Services
 
How Search 2.0 Has Been Redefined by Enterprise 2.0
How Search 2.0 Has Been Redefined by Enterprise 2.0How Search 2.0 Has Been Redefined by Enterprise 2.0
How Search 2.0 Has Been Redefined by Enterprise 2.0
 
Everything Self-Service:Linked Data Applications with the Information Workbench
Everything Self-Service:Linked Data Applications with the Information WorkbenchEverything Self-Service:Linked Data Applications with the Information Workbench
Everything Self-Service:Linked Data Applications with the Information Workbench
 
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
Towards a Data-Centric Notion of Trust in the Semantic Web (A Position Statem...
 
Needs for Data Management & Citation Throughout the Information Lifecycle
Needs for Data Management & Citation Throughout  the Information LifecycleNeeds for Data Management & Citation Throughout  the Information Lifecycle
Needs for Data Management & Citation Throughout the Information Lifecycle
 
Provenance and Trust
Provenance and TrustProvenance and Trust
Provenance and Trust
 
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
NISO Forum, Denver, Sept. 24, 2012: Scientific discovery and innovation in an...
 

Similaire à Federated Search in Disparate Environments

Enterprise Search White Paper: Increase Your Competitiveness - Make a Knowled...
Enterprise Search White Paper: Increase Your Competitiveness - Make a Knowled...Enterprise Search White Paper: Increase Your Competitiveness - Make a Knowled...
Enterprise Search White Paper: Increase Your Competitiveness - Make a Knowled...Findwise
 
KMWorld Martin Briefing
KMWorld Martin BriefingKMWorld Martin Briefing
KMWorld Martin Briefingmartingarland
 
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint SummitSearch Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint SummitJoel Oleson
 
Configuring share point 2010 just do it
Configuring share point 2010   just do itConfiguring share point 2010   just do it
Configuring share point 2010 just do itMarianne Sweeny
 
Taxonomies And Search Aiim Mn
Taxonomies And Search Aiim MnTaxonomies And Search Aiim Mn
Taxonomies And Search Aiim MnAIIM Minnesota
 
Enterprise Search, Simple, Complex and Powerful
Enterprise Search, Simple, Complex and PowerfulEnterprise Search, Simple, Complex and Powerful
Enterprise Search, Simple, Complex and PowerfulFindwise
 
How to Get Enterprise Search Right Webinar
How to Get Enterprise Search Right WebinarHow to Get Enterprise Search Right Webinar
How to Get Enterprise Search Right WebinarConcept Searching, Inc
 
Federated Search in a Disparate Environment
Federated Search in a Disparate EnvironmentFederated Search in a Disparate Environment
Federated Search in a Disparate EnvironmentHelen Mitchell
 
Session 0.0 poster minutes madness
Session 0.0   poster minutes madnessSession 0.0   poster minutes madness
Session 0.0 poster minutes madnesssemanticsconference
 
Aiim Webinar Helen Mitchell Unified Search Final 7 21 2010
Aiim Webinar Helen Mitchell  Unified Search Final 7 21 2010Aiim Webinar Helen Mitchell  Unified Search Final 7 21 2010
Aiim Webinar Helen Mitchell Unified Search Final 7 21 2010Helen Mitchell
 
Falling in and out and in love with Information Architecture
Falling in and out and in love with Information ArchitectureFalling in and out and in love with Information Architecture
Falling in and out and in love with Information ArchitectureLouis Rosenfeld
 
Search Analytics for Content Strategists
Search Analytics for Content StrategistsSearch Analytics for Content Strategists
Search Analytics for Content StrategistsLouis Rosenfeld
 
AMCTO presentation on moving from records managment to information management
AMCTO presentation on moving from records managment to information managementAMCTO presentation on moving from records managment to information management
AMCTO presentation on moving from records managment to information managementChristopher Wynder
 
Optimising Your Content for Findability
Optimising Your Content for FindabilityOptimising Your Content for Findability
Optimising Your Content for FindabilityFindwise
 
04 share pointday2012_fast search
04 share pointday2012_fast search04 share pointday2012_fast search
04 share pointday2012_fast searchPablo Peris
 
Elqano - Where Knowledge Finds People
Elqano - Where Knowledge Finds PeopleElqano - Where Knowledge Finds People
Elqano - Where Knowledge Finds PeopleGuillermo Garcia
 
Introduction to Enterprise Search
Introduction to Enterprise SearchIntroduction to Enterprise Search
Introduction to Enterprise SearchFindwise
 

Similaire à Federated Search in Disparate Environments (20)

Enterprise Search White Paper: Increase Your Competitiveness - Make a Knowled...
Enterprise Search White Paper: Increase Your Competitiveness - Make a Knowled...Enterprise Search White Paper: Increase Your Competitiveness - Make a Knowled...
Enterprise Search White Paper: Increase Your Competitiveness - Make a Knowled...
 
KMWorld Martin Briefing
KMWorld Martin BriefingKMWorld Martin Briefing
KMWorld Martin Briefing
 
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint SummitSearch Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
Search Strategy for Enterprise SharePoint 2013 - Vancouver SharePoint Summit
 
Configuring share point 2010 just do it
Configuring share point 2010   just do itConfiguring share point 2010   just do it
Configuring share point 2010 just do it
 
Taxonomies And Search Aiim Mn
Taxonomies And Search Aiim MnTaxonomies And Search Aiim Mn
Taxonomies And Search Aiim Mn
 
Enterprise Search, Simple, Complex and Powerful
Enterprise Search, Simple, Complex and PowerfulEnterprise Search, Simple, Complex and Powerful
Enterprise Search, Simple, Complex and Powerful
 
How to Get Enterprise Search Right Webinar
How to Get Enterprise Search Right WebinarHow to Get Enterprise Search Right Webinar
How to Get Enterprise Search Right Webinar
 
Federated Search in a Disparate Environment
Federated Search in a Disparate EnvironmentFederated Search in a Disparate Environment
Federated Search in a Disparate Environment
 
Session 0.0 poster minutes madness
Session 0.0   poster minutes madnessSession 0.0   poster minutes madness
Session 0.0 poster minutes madness
 
Aiim Webinar Helen Mitchell Unified Search Final 7 21 2010
Aiim Webinar Helen Mitchell  Unified Search Final 7 21 2010Aiim Webinar Helen Mitchell  Unified Search Final 7 21 2010
Aiim Webinar Helen Mitchell Unified Search Final 7 21 2010
 
Falling in and out and in love with Information Architecture
Falling in and out and in love with Information ArchitectureFalling in and out and in love with Information Architecture
Falling in and out and in love with Information Architecture
 
FAST Search-webinar-06-29-2010
FAST Search-webinar-06-29-2010FAST Search-webinar-06-29-2010
FAST Search-webinar-06-29-2010
 
Search Analytics for Content Strategists
Search Analytics for Content StrategistsSearch Analytics for Content Strategists
Search Analytics for Content Strategists
 
AMCTO presentation on moving from records managment to information management
AMCTO presentation on moving from records managment to information managementAMCTO presentation on moving from records managment to information management
AMCTO presentation on moving from records managment to information management
 
Optimising Your Content for Findability
Optimising Your Content for FindabilityOptimising Your Content for Findability
Optimising Your Content for Findability
 
AKM PPT C4 ASSET FORMATION
AKM PPT C4 ASSET FORMATIONAKM PPT C4 ASSET FORMATION
AKM PPT C4 ASSET FORMATION
 
04 share pointday2012_fast search
04 share pointday2012_fast search04 share pointday2012_fast search
04 share pointday2012_fast search
 
Digital Science
Digital ScienceDigital Science
Digital Science
 
Elqano - Where Knowledge Finds People
Elqano - Where Knowledge Finds PeopleElqano - Where Knowledge Finds People
Elqano - Where Knowledge Finds People
 
Introduction to Enterprise Search
Introduction to Enterprise SearchIntroduction to Enterprise Search
Introduction to Enterprise Search
 

Dernier

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 

Dernier (20)

A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 

Federated Search in Disparate Environments

  • 1. Federated Search in a Disparate Environment PREPARED FOR: SLA Webinar Series Evidence-Based Practice in Libraries 2040 Corbett Rd Monkton, Md 21111 (410.472.4631 Helen L. Mitchell Curtis * hmitchell5@gmail.com Principal, Enterprising Solutions September 9, 2009
  • 2. Enterprising Solutions Biography Helen L. Mitchell Curtis – Principal, Enterprising Solutions  32+ years at FDA leading one of the largest enterprise search implementations among Civilian Federal Agencies  Develop enterprise-wide search strategies & solutions  Integrate search technologies across IT applications and disparate document repositories  Build governance, management and end user buy-in  Promote collaboration, standards, findability and improved organization of data and document assets  Passion – to help clients to reduce costs, improve quality and efficiency, reduce 'pain points' and achieve a positive search experience 2
  • 3. Enterprising Solutions Polling Question • What is Your Role? (select all that apply, if group participants) • CIO, Executive Director • Library Director (Corporate, Gov’t, Academia, Solo) • Librarian/Information Management Professional • IT Professional or Consultant • Project/Product Manager • Sales/Marketing/Communications • End User (i.e., Scientist, Researcher, Engineering Professional) • Federated Search Vendor • Other 3
  • 4. Enterprising Solutions Agenda 1. Terms Clarified 2. Types of Federated Search (FS) 3. FS Challenges & Benefits 4. FDA Case Study 5. FS Evaluation Criteria 6. Examples of FS Solutions 7. Live Federated Search Demo 8. Best Practices 9. Future Vision 10. Questions & Answers 4
  • 5. 1. Definition by AIIM Market IQ 2. Definition by CMS Watch Enterprising Solutions Clarify Terms 3. A Federated Search Primer – Part II 4. Deep Web Technologies 5. Federated Search Rpt & Toolkit-Jill Hurst-Wahl • Reliable and complete retrieval of content based on user need, i.e. everything relevant is recalled (recall) while simultaneously Findability returning only that content relevant to the user’s focus (precision), thus eliminating the review of irrelevant content by the 1 user. • Systems…within an organization…seeking information held Enterprise internally…in a variety of formats and locations, including Search databases, document management systems, and other 2 repositories. Content is pre-indexed, simultaneously searched, (ES) and displayed to authorized users. • The process of performing a simultaneous real-time search of Federated multiple diverse and distributed sources from a single search 3 page, with the federated search engine acting as intermediary. Search (FS) • The set of web-sites and their documents that cannot be accessed via crawler-type search engines such as Google. Deep web content Deep Web typically lives inside of databases, and is accessed through search 4 forms. It is also referred to as the Hidden or Invisible Web. • SW written to access a content source that must know the URL of Connector the source, how to send search commands, its search syntax, & 5 how to process the search results returned from a source. 5
  • 6. Enterprising Solutions Polling Question  Information Accessibility (select all that apply) 1. I can easily find information to do my job 2. Less than 50% of our organization’s info is searchable online 3. More than 50% of our organization's info is searchable online 4. I reference less than 5 systems (info sources) in any given week 5. I reference 5 or more systems (info sources) in any given week 6
  • 7. Enterprising Solutions Findability Issues  AIIM Market IQ Research on Findability (of 528 end users):  50% believe Findability in their organization is ―Worse to Much Worse‖ than their consumer-facing web sites  49% have no formal goal for Enterprise Findability within their organizations  49% ―Agreed or Strongly Agreed‖ that finding the information to do their job is difficult and time consuming  69% believe less than 50% of their organization's information is searchable online  36% reference five or more systems in any given week 7 Source: AIIM Market Intelligence, 2008
  • 8. Enterprising Solutions Why Use Federated Search To increase findability to better accomplish business objectives. To issue a single query across multiple content sources through a common search interface. When not feasible to re-index all of the content available from large public sites like PubMed. To increase user awareness of all content sources such as deep web for scientific, technical and business content. To eliminate using multiple database search protocols & passwords. When don‘t have the rights to index the content (e.g. subscription sites). Real-time search: for content constantly being updated & impractical to 8 keep the data as timely as it needs to be.
  • 9. Federated Search Sources Enterprising Solutions (examples) Reason Corporate Academic Gov’t Public Library Subscription Databases X X X X Internal or External Repositories X X Library Catalog(s) X X X X News X X Digitized Material X X X Blogs & Wikis X X X Intranet/Internet Sites X X Industry Specific Sources X DB‘s available to customers X X Historical Collections X 9
  • 10. Enterprising Solutions Typical Non-Federated Search 10 Courtesy of MuseGlobal, Inc.
  • 11. Enterprising Solutions Typical Federated Search 11 Courtesy of MuseGlobal, Inc.
  • 12. Enterprising Solutions Federated „Master Index‟ Search  Index multiple data sources content into a single master index  Queries & results come from that one master index  Many Enterprise Search products integrate FS via ‗connectors‘ to accomplish this (ex., FAST, Autonomy, Endeca) 12 Source: New Idea Engineering, Inc.
  • 13. Enterprising Solutions Federated „Data Silos‟ Search ‗Search Federator‘ processes queries for each data source silo Transforms search terms to match each content source requirements Submits query to each of the sources simultaneously Merges each source‘s results together - single look & feel Maintains no indices of its own, relies on linked systems capabilities 13 Source: New Idea Engineering, Inc.
  • 14. Enterprising Solutions Surface vs. Deep Web Search Popular search engines (Google, Yahoo…) ―crawl‖ surface web FS can drill down to the deep web where specialized content (i.e., scientific and technical databases) reside Deep Web FS Examples: www.completeplanet.com - 70,000+ searchable DBs & specialty search engines www.science.gov- federates U.S. federal agency science info http://imlsdcc.grainger.uiuc.edu/ - Institute of Museum & Library Services (IMLS) - Digital Collections & Content w/descriptions of digital resources developed by IMLS grantees 14 Source: Juanico-Environmental Consultants, Ltd.
  • 15. Enterprising Solutions Vertical Search Engine  Closely related to Deep Web – searches for a particular niche i.e., a specific industry, topic, type of content (e.g., scientific research, travel, movies, images, blogs)  Example: www.vetseek.info - is a search engine focusing on veterinary science and related topics 15
  • 16. Enterprising Solutions Polling Question Federated Search Solutions (select one) 1. We are currently conducting an evaluation to procure a Federated Search Product 2. We currently have a Federated Search Solution installed that satisfies our requirements 3. We have a Federated Search Solution by are considering replacing it or enhancing its capabilities & features 16
  • 17. Enterprising Solutions Challenges  Authentication  Showing each record‘s branding and copyright information  Licensed or subscription databases  True De-duplication  Virtually impossible because DBs return 10-20 results at a time  Vendors usually just de-dupe the first results set returned  Security  Mapping user credentials and access rights to each repository security model  Speed  Limited by slowest search engine‘s performance 17
  • 18. Enterprising Solutions Challenges (continued)  Lack of data standardization  Each source has a unique access method & needs translation  Metadata mapping between FSS and underlying systems  Access methods to sources may change  Requires an interface rewrite or modification  Rules for error handling  Ex. Query term not available—exclude the query, the repository, or proceed without the term?  Ex. Timeouts or connection problem  Complex searches usually not available  Fielded searches  Known Items, i.e. Article Name  Best to directly search database 18
  • 19. Enterprising Solutions Challenges (continued)  Relevancy scores  Can‘t identify a single relevancy ranking model  Relevancy rankings for repository‘s results refers to its own  May be not be useful when comparing the results with those from another system  Access to content stored in a variety of places  Results page may not let user obtain identified documents  This may involve a built-in viewer or invoking the owning product‘s interface.  Combining navigators from each result set  i.e., faceted search, taxonomies and auto-generate clusters  Selecting the right FS engine  Depends on business goals, type of content sources – structured vs. unstructured, licensed/subscriptions 19
  • 20. Enterprising Solutions Benefits • Single master index • Quicker response times • No need to access original data sources • Relevancy algorithms applied uniformly • Dynamic navigators are available for all documents • Time savings • Searches many sources at one time • Combines results into a single results page • Quality of results • Client selects the sources to search • Minimum impact on the data silos • Only accessed when a user performs a query • Eliminates increased load crawling/indexing the data source 20
  • 21. Enterprising Solutions Benefits (continued) • Improve productivity • Reduces number of searches executed to find relevant results • Save, reuse, schedule, and share effective search queries • Leverage security controls at queried source • Access repositories secured against crawls but can be accessed by search queries • Reduce costs • No additional capacity requirements for content index since its not crawled by search server • Most current content • Real time searches - as soon as the source is updated, the info is available to the searcher on the very next query • Increase awareness • Identify most relevant sources to search based on # of results each source produced 21
  • 22. Enterprising Solutions FDA Case Study Success (Federated „Master Index‟ Search System) ACTIONS RESULT Started small with high ‘pain Increased productivity & popularity. points’. Modified business processes. Standardized nomenclature improved efficiencies. Users across organization Produced more timely & QUALITY could find content in silos. work products. Indexed structured & Grew from 1 repository of 500 docs unstructured content with to 50 with 30 million docs. Accessed document level security. on ‘need to know’ basis. Introduced standardized Reduced development time & costs. search web services into Increased mgmt & user acceptance. applications. Integrated in more applications. Increased user awareness Used more & content added. Search with training, newsletters & requirements now captured at meetings. BEGINNING of project development. 22
  • 23. Enterprising Solutions Evaluation Criteria Overview  Identify Goals  Create an Effective Search Strategy  Collect Business Requirements  Conduct needs assessment  Work Closely with User Community 23
  • 24. Evaluation Criteria Overview Enterprising Solutions (continued)  Define Features and Functions  Eliminate emotional decisions re: product, company or others using the product  High Precision  Return content relevant to user‘s focus  High Recall  Recall everything relevant to user‘s need  Thoroughly Research Products, Users & Product Reviewers 24
  • 25. Enterprising Solutions Sample Evaluation Criteria Rating Criteria Importance Product #1 Product #1 Product #2 Product #2 (Rank 1-5) Score Weighted Score Score Weighted Score (0-100) (Rank x Score) (0-100) (Rank x Score) Ease of Use 5 85 425 70 350 Ability to Customize UI 1 80 80 65 65 Speed 5 90 450 85 425 De-duplication 4 75 300 75 300 Clustering 4 85 340 80 320 Help Functionality 3 70 210 0 0 Alerts 4 90 360 50 200 # of Searchable Sources 3 90 270 80 240 Save Selections/Citations 2 85 170 0 0 Security 4 90 360 85 340 Product Cost 5 75 375 85 425 Vendor Credibility 4 95 380 85 340 Total Weighted Score 1010 3720 760 3005 25 -Courtesy of Federated Search Report & Tool Kit
  • 26. FSS Example Enterprising Solutions (uses FAST ESP – Vertical Search) Features of Interest 26
  • 27. FSS Example Enterprising Solutions (uses MS & Vivisimo) Features of Interest 27
  • 28. Enterprising Solutions FSS Example (uses Deep Web Technologies) Features of Interest 28
  • 29. Enterprising Solutions FSS Example (uses Webfeat) Features of Interest 29
  • 30. Digital Library FSS Example Enterprising Solutions http://www.calisphere.universityofcalifornia.edu/ Features of Interest 30
  • 31. Digital Library FSS Example Enterprising Solutions http://www.calisphere.universityofcalifornia.edu 1 2 3 31
  • 32. Enterprising Solutions FSS Example (LibraryFind® developed by Oregon State Univ Libraries) Features of Interest 32
  • 33. Enterprising Solutions Semantic Federated Search (prototype by Collexis & Deep Web Technologies) SOURCES: •PubMed •NCI=Nat‘l Cancer Inst DeepWeb Technologies (a federated search provider) and •DTIC=Defense Tech. Info Ctr •PMC=PubMed Central Collexis (a developer of semantic search & knowledge •ScrDOEIB=DOE Info Bridge discovery solutions) teamed up to deliver the world’s first •Eurekalert=Science News semantic federated search. THESAURI Used: •MeSH •DTIC=Defense Tech. Info Ctr •How does semantic federated search work? •All results from your initial query are processed through one or more thesauri. (i.e., MeSH & DTIC.) •The system then returns terms that are found both in the top results and in the thesauri. 33
  • 34. Enterprising Solutions Collexis & Deep Web Technologies (Search Results – screenshot 1) Unlike clustering, which simply lumps together words that are frequently found near each other, these terms are being suggested from an expert- developed thesaurus (taxonomy) in which 2429 hits terms are meaningfully & consistently organized. The longer the Semantic terms. blue bar, the more semantic evidence found for that term. 34
  • 35. Enterprising Solutions Collexis & Deep Web Technologies (Search Results – screenshot 2) •Clicking on term “Mental Recall” from prior screen added term to search, reduced relevant hits to 3; & terms suggested are organized. •Thesaurus-based search will consistently suggest terms in the same organized way. •Clustering changes the way it organizes suggestions with every query. • Clustering tends to be useful for very broad, general or unpredictable content. *Thesaurus-based semantic search tends to be better when you are working consistently in knowledge domains, such as medicine, physics or electronics. 35
  • 36. Enterprising Solutions Best Practices Strategically plan how to deliver your mission and just DO IT! Do proof of concept – demos can be deceiving Establish common set of standards & governance model Measure results by establishing key performance indicators Leverage lessons learned to reduce project cycles, increase trust and empower communities 36
  • 37. Enterprising Solutions Future Vision Personalized Search • A simple, persistent box on a users‘ browser, cell, or entertainment screen that initiates a search based on what the user was doing, their previous keystrokes, & perhaps using historical data. Better Quality of Search Results • Number of results retrieved, Relevance Ranking, De-Duplication Enterprise Mashups • Combine real-time searching with social networking tools, maps, etc. Users build the index by their searching • Know Web pages people display, what‘s on them & what apps are showing up on users' computers 37
  • 38. Enterprising Solutions Future Vision (continued) Query analysis & predictive modeling on the fly • Business users expect to access info behind company firewalls & from the larger web world using the same tools and consistency Improved Navigators, Facets, Clustering • Filter result sets dynamically for more relevant results Web of Interconnected Data • Automate analysis of database structures and cross-reference results. Ex.- Health site cross-references data from pharmaceutical companies with the latest findings from medical researchers Visualization Technologies 38 • Enable extreme-scale knowledge discovery
  • 39. Enterprising Solutions Resources 1. Great resource for many Federated Search topics: www.federatedsearchblog.com – Author: Sol Lederman 2. Open Source & commercial search components & tools list: http://tinyurl.com/l3w8of 3. Federated Search Vendors: http://tinyurl.com/92s8qv 4. Deep Web Databases: http://tinyurl.com/yam3sw 5. Deep Web resources: http://www.internettutorials.net/deepweb.asp 6. Digital Image Resources on the Deep Web: http://tinyurl.com/46vcqp 7. Info on Vertical Search Engines: http://tinyurl.com/lpcufw 8. 50 Niche Search Engines: http://tinyurl.com/lukxwx 9. Library of Congress FS Portal Products/Vendors list: http://tinyurl.com/l6mdy8 10. Resources to Research & Mine the Deep Web: http://tinyurl.com/6g5768 39
  • 40. Enterprising Solutions References 1) ―What‟s in a Name: Federated Search‖ – Miles Kehoe, New Idea Engineering, Inc,Vol. 4 No.4 8/07 2) “Federated Search Engine Article” - Online (Weston, Conn.) 28 no2 16-19 Mr/Ap 2004 (Reprint of article by Donna Fryer www.SearchitRight.com ) 3) “Growing Up With Federated Search” - by Walt Warnick, OSTI 4) “Sophisticated Yet Simple - The Technology Behind OSTI's E-print Network: Part 3” – Walt Warnick, OSTI 5) “Vertical Search Engines & the Deep Web” - Laura B. Cohen http://www.internettutorials.net/ 6) Blog: www.federatedsearchblog.com – by Sol Lederman 7) “Exploring a „Deep Web‟ that Google can‟t Grasp” - NYT 2-23-09 http://tinyurl.com/mvt42f 8) “Federated Search Primer, Part I-III” – by Sol Lederman 9) www.searchdoneright.com – by Vivisimo –Raoul – CEO & Cofounder 10) “Enterprise Search Grows Up‟”- Podcast from BizTalk 11) “Federation: Big Need, Still a Challenge” – Stephen Arnold, 4/25/08 12) “The Future of Federated Search or What Will the World Look Like in 10 Years” – Rich Turner 13) “Federated Search Report & Tool Kit” – Jill Hurst-Wahl, 10/08, © Free Pint Limited 2008 40
  • 41. Enterprising Solutions QUESTIONS 41
  • 42. Enterprising Solutions THANK YOU! Helen L. Mitchell Curtis Principal Enterprising Solutions hmitchell5@gmail.com 410-472-4631(w) 410-259-7766(m) 42 42
  • 43. Enterprising Solutions Enterprising Solutions “Results Driven…Exceeding Expectations” 43