How can library materials be ranked in the
OPAC?
Prof. Dr. Dirk Lewandowski
University of Applied Sciences Hamburg
dirk.lewandowski@haw-hamburg.de


9th International Bielefeld Conference
Bielefeld, 4 February 2009
Agenda




 The state of the OPAC and the importance of relevance ranking

 Ranking factors

 The composition of results lists

 Conclusions




1 | Dirk Lewandowski
What’s wrong with library catalogues?




 • Catalogues are incomplete
     – Items from journal article collections, abstracting and indexing databases

 • “Electronic card catalogue”?

 • User behaviour changed
    – Short queries, fast results, one set of results
    – Search engines strongly influence users’ demands

 • Known item vs. topic-based search
    – OPACs should accommodate both.




Some ideas to improve the OPAC (“catalogue 2.0”)




 • Let users participate
    – Write reviews
    – Rate titles

 • Enrich bibliographic data
    – Add reviews
    – Add TOC

 • Improve navigation
     – Drill-down menus on results pages to combine searching and browsing

 • Extend the database
    – Federated search

Core of all search appliances: Relevance ranking




 • While Web 2.0 features add value to the catalogue, search is still the core.

 • “Search must work”

 • Users’ needs
    – Users want results quickly.
    – Users are not willing to think too much about formulating their queries.
    – Users are not willing to search for the right database before conducting their
      search.
    – Users are only willing to view a few results on the first results page before
      deciding to continue.




Misconceptions about relevance ranking




 • A clear sorting criterion is better than relevance ranking.
    – Ranking does not reduce the number of results, but puts them in a certain order.
    – Other ordering options can be given.

 • Library catalogues do not apply any form of ranking.
     – Even conventional OPACs rank the results (according to publication date).

 • Relevance ranking is useless because it simply doesn’t work.
    – “Relevance” is hard to determine and depends on the context and on the
      individual user. However, a good relevance ranking can at least produce
      satisfactory results lists.

 • Ranking is not that complicated. One must only apply standard measures such
   as TF/IDF.
     – For a good ranking, text matching alone is insufficient.

Ranking factors in web search engines



 • Text matching
    – Measures matching between query and document.
    – Term frequency, position of search terms within the documents, etc.
    – Text from document fulltexts, anchor texts.

 • Popularity
    – Measures popularity of the document (overall popularity or topic-based)
    – Link popularity (PageRank etc.), click popularity.

 • Freshness
     – Fresh documents can sometimes be very useful.
     – Derived from documents or from structural data (e.g., linkage)

 • Locality
    – Mainly expressed in differing rankings for country-specific search interfaces.

Text matching




 • Factors
    – Term frequency, inverse document frequency
    – Fields: Title, subject headings, author, etc.

 • Availability of text elements as a ranking factor
    – Fulltext, TOC, reviews, user comments

 • Problems with text matching
    – Not enough text in metadata.
    – Amount of text varies considerably (from mere bibliographic data to hundreds of
      pages of fulltext).
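
A minimal sketch of field-weighted text matching, assuming records are plain dictionaries of field name to text. The field weights, the smoothed IDF variant, and the function names are illustrative assumptions, not taken from any particular OPAC. Dividing term counts by field length is one simple way to soften the problem that the amount of text varies so much between records.

```python
import math
from collections import Counter

# Hypothetical field weights: a title match counts more than a fulltext match.
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "author": 1.5, "fulltext": 1.0}

def tokenize(text):
    return text.lower().split()

def idf(term, records):
    """Smoothed inverse document frequency over all fields of all records."""
    df = sum(
        1 for rec in records
        if any(term in tokenize(value) for value in rec.values())
    )
    return math.log((1 + len(records)) / (1 + df)) + 1.0

def score(query, record, records):
    """Sum of weight * length-normalised term frequency * IDF per field."""
    total = 0.0
    for term in tokenize(query):
        term_idf = idf(term, records)
        for field, weight in FIELD_WEIGHTS.items():
            tokens = tokenize(record.get(field, ""))
            if not tokens:
                continue  # field missing in this record (e.g., no fulltext)
            tf = Counter(tokens)[term] / len(tokens)
            total += weight * tf * term_idf
    return total

records = [
    {"title": "web search engines", "subject": "information retrieval"},
    {"title": "library management",
     "fulltext": "a chapter on search engines and a chapter on budgets"},
]
ranked = sorted(records, key=lambda r: score("search engines", r, records),
                reverse=True)
```

With these toy records, the item that matches in the title ranks above the item that only mentions the terms somewhere in its fulltext.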




Popularity




 • Popularity of
    – Item
    – Author/editor
    – Publisher
    – Book series

 • Measures
    – Number of items (by author, publisher, etc.)
    – Usage (circulation rate, download requests)
    – Average user rating
    – Citations
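
One possible way to fold several of these measures into a single popularity value is sketched below. The weights, the 1 to 5 rating scale, and the dictionary keys (`loans`, `citations`, `rating`) are assumptions made for illustration only.

```python
import math

def popularity(item, max_loans, max_citations):
    """Return a popularity score in [0, 1] from usage, rating, and citations."""
    # Log damping keeps a few heavily borrowed titles from dominating.
    usage = math.log1p(item["loans"]) / math.log1p(max_loans) if max_loans else 0.0
    cites = (math.log1p(item["citations"]) / math.log1p(max_citations)
             if max_citations else 0.0)
    # Ratings assumed on a 1-5 scale; unrated items get a neutral 0.5.
    rating = (item["rating"] - 1) / 4 if item.get("rating") is not None else 0.5
    # Hypothetical weights, chosen only for illustration.
    return 0.5 * usage + 0.3 * rating + 0.2 * cites

bestseller = {"loans": 100, "citations": 10, "rating": 5}
unused = {"loans": 0, "citations": 0, "rating": None}
scores = [popularity(it, max_loans=100, max_citations=10)
          for it in (bestseller, unused)]
```

The neutral default for unrated items is a design choice: it avoids punishing new acquisitions simply because nobody has rated them yet.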




Freshness




 • Freshness is the most-used ranking criterion in catalogues today.

 • It is often difficult to determine whether fresh items will be relevant to a certain
   query.

 • Need for fresh items can be derived from
    – Circulation rate for the individual item
    – Circulation rates for items from a certain group (from broad disciplines to specific
      subject headings)
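
The idea of deriving the need for fresh items from circulation rates could be sketched as follows. The five-year window, the half-life decay, and the input format (pairs of publication year and loan count per group) are assumptions, not something the slides specify.

```python
def freshness_need(loans, current_year, window=5):
    """Share of a group's loans that went to items at most `window` years old.

    `loans` is a list of (publication_year, loan_count) pairs for one group,
    for example all items under one subject heading.
    """
    total = sum(count for _, count in loans)
    if total == 0:
        return 0.0
    recent = sum(count for year, count in loans if current_year - year <= window)
    return recent / total

def freshness_boost(publication_year, current_year, need, half_life=5.0):
    """Exponentially decaying recency, scaled by how much the group values it."""
    age = max(0, current_year - publication_year)
    return need * 0.5 ** (age / half_life)

# Toy data: users of one group mostly borrow recent items, the other group not.
computing = [(2008, 90), (1995, 10)]
history = [(2008, 10), (1950, 90)]
need_computing = freshness_need(computing, current_year=2009)
need_history = freshness_need(history, current_year=2009)
```

In this toy data, a new item would receive a strong freshness boost in the first group and almost none in the second.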




Locality




 • Availability of item
    – from the local library; within a certain distance.
    – Item currently available.

 • Physical location of the user
    – At home (electronic items strongly preferred)
    – At the library
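
A hedged sketch of how availability and user location might adjust a score. The boost values and the item keys (`electronic`, `in_local_branch`, `available_now`) are hypothetical, as is the assumption that the user's location is already known as `"home"` or `"library"`.

```python
def locality_boost(item, user_location):
    """Return a multiplicative boost; all factor values are hypothetical."""
    boost = 1.0
    if user_location == "home":
        # Remote users can use electronic items immediately.
        if item["electronic"]:
            boost *= 1.5
    elif user_location == "library":
        # On-site users benefit from copies on the shelf at this branch.
        if item["in_local_branch"] and item["available_now"]:
            boost *= 1.3
    if not item["electronic"] and not item["available_now"]:
        # Checked-out print copies are still shown, only lower.
        boost *= 0.7
    return boost

ebook = {"electronic": True, "in_local_branch": False, "available_now": True}
print_on_shelf = {"electronic": False, "in_local_branch": True, "available_now": True}
print_on_loan = {"electronic": False, "in_local_branch": True, "available_now": False}
```

Using a multiplier rather than a filter keeps unavailable items in the list, which matches the earlier point that ranking orders results without reducing their number.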




Other ranking factors




 • Size of item (no. of pages)

 • Document types
    – Monograph, edited book, proceedings, etc.
    – Article vs. Book
    – Physical vs. online materials

 • User group
    – Professor, undergraduate student, graduate student, etc.

 • Personalization
    – Individual usage data
    – Click-stream data from navigation

Data needed




 • Data from the catalogue

 • Circulation data
    – Anonymous

 • Location data
    – From IP ranges

 • User data

 • Data from remote resources
    – Abstracts (and fulltexts) from publishers.
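
Deriving location data from IP ranges can be sketched with Python's standard `ipaddress` module. The network ranges below are reserved documentation addresses standing in for real library and campus networks, and the three location labels are assumptions.

```python
import ipaddress

# Hypothetical ranges (reserved documentation networks, not real campus data).
LIBRARY_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
CAMPUS_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]

def user_location(ip):
    """Classify a client IP address as 'library', 'campus', or 'remote'."""
    address = ipaddress.ip_address(ip)
    if any(address in net for net in LIBRARY_NETWORKS):
        return "library"
    if any(address in net for net in CAMPUS_NETWORKS):
        return "campus"
    return "remote"
```

Only the coarse location class is kept here, not the address itself, which fits the requirement that usage data stay anonymous.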


Collections and databases




 • Library controlled
     – catalogue
     – Local digital repositories
     – Course management systems
     – Institutional web sites

 • External collections
    – A&I databases
    – E-journal collections




Mixed results lists




 • Ranking algorithms prefer “more of the same”. This does not satisfy users’
   needs for a variety of results.

 • Example for a broad query
    – Reference works (from subject headings + items from reference collection)
    – Text books
    – Relevant databases
    – Some current items
    – Relevant journals
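
One simple way to counter the "more of the same" effect is to rank each result type separately and then let the types take turns. The sketch below assumes the per-type lists are already ranked; the type names and the round-robin strategy are illustrative choices, not the method proposed in the talk.

```python
from itertools import zip_longest

def mix_results(ranked_by_type, order, limit=10):
    """Interleave per-type ranked lists so the first page shows variety.

    `ranked_by_type` maps a result type to its items, best first.
    `order` lists the types in the sequence they should take turns.
    """
    columns = [ranked_by_type.get(kind, []) for kind in order]
    mixed = []
    for row in zip_longest(*columns):
        # Types that have run out of items contribute None and are skipped.
        mixed.extend(item for item in row if item is not None)
    return mixed[:limit]

ranked_by_type = {
    "reference": ["Encyclopedia A"],
    "textbook": ["Textbook A", "Textbook B"],
    "database": ["Database A"],
    "journal": ["Journal A", "Journal B"],
}
order = ["reference", "textbook", "database", "current", "journal"]
page = mix_results(ranked_by_type, order)
```

Here the first results page opens with one reference work, one textbook, one database, and one journal before any type appears a second time.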




“Universal Search”


 [Figure: annotated universal search example. Callouts: "Additional databases"; "One box results (e.g., news or images)"]




Conclusions




 • Search is the core of the library catalogue.
    – However, other elements must be considered, too:
          – Usability
          – User guidance
          – Spelling corrections
          – etc.

 • A good ranking is always a mixture of ranking factors

 • In addition, results lists should be mixed.
     – Items from different collections.
     – Mixture of direct results and pointers to other collections.

 • Future: Catalogues will become more like search engines.
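
The point that a good ranking is always a mixture of ranking factors can be illustrated with a weighted sum. The factor names follow the four groups discussed in the talk; the weights are hypothetical, and each factor score is assumed to be normalised to the range 0 to 1 beforehand.

```python
# Hypothetical weights; in practice they would need tuning and evaluation.
WEIGHTS = {"text": 0.5, "popularity": 0.2, "freshness": 0.15, "locality": 0.15}

def combined_score(factors, weights=WEIGHTS):
    """Weighted sum of factor scores, each assumed to lie in [0, 1]."""
    return sum(weights[name] * factors.get(name, 0.0) for name in weights)

def rank(items):
    """Sort items by their combined score, best first."""
    return sorted(items, key=lambda it: combined_score(it["factors"]),
                  reverse=True)

items = [
    {"id": "a", "factors": {"text": 0.9, "popularity": 0.1,
                            "freshness": 0.1, "locality": 0.0}},
    {"id": "b", "factors": {"text": 0.7, "popularity": 0.9,
                            "freshness": 0.8, "locality": 1.0}},
]
order_of_ids = [it["id"] for it in rank(items)]
```

Item "a" is the better text match, yet item "b" ranks first once popularity, freshness, and locality are taken into account, which is the behaviour a pure text-matching approach cannot produce.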
Thank you for your attention.
Prof. Dr.
Dirk Lewandowski

Hamburg University of Applied Sciences
Department Information
Berliner Tor 5
D - 20099 Hamburg
Germany



www.bui.haw-hamburg.de/lewandowski.html
E-Mail: dirk.lewandowski@haw-hamburg.de

How can library materials be ranked in the OPAC?

  • 1. How can library materials be ranked in the OPAC? Prof. Dr. Dirk Lewandowski University of Applied Sciences Hamburg dirk.lewandowski@haw-hamburg.de 9th International Bielefeld Conference Bielefeld, 4 February 2009
  • 2. Agenda The state of the OPAC and the importance of relevance ranking Ranking factors The composition of results lists Conclusions 1 | Dirk Lewandowski
  • 3. Agenda The state of the OPAC and the importance of relevance ranking Ranking factors The composition of results lists Conclusions 2 | Dirk Lewandowski
  • 4. What’s wrong with library catalogues? • Catalogues are incomplete – Items from journal article collections, abstracting and indexing databases • “Electronic card catalogue”? • User behaviour changed – Short queries, fast results, one set of results – Search engines strongly influence users’ demands • Known item vs. topic-based search – OPACs should accommodate both.
  • 5. Some ideas to improve the OPAC (“catalogue 2.0”) • Let users participate – Write reviews – Rate titles • Enrich bibliographic data – Add reviews – Add TOC • Improve navigation – Drill-down menus on results pages to combine searching and browsing • Extend the database – Federated search
  • 6. Core of all search appliances: Relevance ranking • While Web 2.0 features add value to the catalogue, search is still the core. • “Search must work” • Users’ needs – Users want results quickly. – Users are not willing to think too much about formulating their queries. – Users are not willing to search for the right database before conducting their search. – Users are only willing to view a few results on the first results page before deciding to continue.
  • 7. Misconceptions about relevance ranking • A clear sorting criterion is better than relevance ranking. – Ranking does not reduce the number of results, but puts them in a certain order. – Other ordering options can be given. • Library catalogues do not apply any form of ranking. – Even conventional OPACs rank the results (according to publication date). • Relevance ranking is useless because it simply doesn’t work. – “Relevance” is hard to determine and depends on the context and on the individual user. However, a good relevance ranking can at least produce sufficient results lists. • Ranking is not that complicated. One must only apply standard measures such as TF/IDF. – For a good ranking, text matching alone is insufficient.
  • 8. Agenda • The state of the OPAC and the importance of relevance ranking • Ranking factors • The composition of results lists • Conclusions
  • 9. Ranking factors in web search engines • Text matching – Measures matching between query and document. – Term frequency, position of search terms within the documents, etc. – Text from document fulltexts, anchor texts. • Popularity – Measures popularity of the document (overall popularity or topic-based) – Link popularity (PageRank etc.), click popularity. • Freshness – Fresh documents can sometimes be very useful. – Derived from documents or from structural data (e.g., linkage) • Locality – Mainly expressed in differing rankings for country-specific search interfaces.
  • 10. Text matching • Factors – Term frequency, inverse document frequency – Fields: Title, subject headings, author, etc. • Availability of text elements as a ranking factor – Fulltext, TOC, reviews, user comments • Problems with text matching – Not enough text in metadata. – Amount of text varies considerably (from mere bibliographic data to hundreds of pages of fulltext).
  • 11. Popularity • Popularity of – Item – Author/editor – Publisher – Book series • Measures – Number of items (by author, publisher, etc.) – Usage (circulation rate, download requests) – Average user rating – Citations
  • 12. Freshness • Freshness is the most-used ranking criterion in catalogues today. • It is often difficult to determine whether fresh items will be relevant to a certain query. • Need for fresh items can be derived from – Circulation rate for the individual item – Circulation rates for items from a certain group (from broad disciplines to specific subject headings)
  • 13. Locality • Availability of item – from the local library; within a certain distance. – Item currently available. • Physical location of the user – At home (electronic items strongly preferred) – At the library
  • 14. Other ranking factors • Size of item (no. of pages) • Document types – Monograph, edited book, proceedings, etc. – Article vs. Book – Physical vs. online materials • User group – Professor, undergraduate student, graduate student, etc. • Personalization – Individual usage data – Click-stream data from navigation
  • 15. Agenda • The state of the OPAC and the importance of relevance ranking • Ranking factors • The composition of results lists • Conclusions
  • 16. Data needed • Data from the catalogue • Circulation data – Anonymous • Location data – From IP ranges • User data • Data from remote resources – Abstracts (and fulltexts) from publishers.
  • 17. Collections and databases • Library controlled – Catalogue – Local digital repositories – Course management systems – Institutional web sites • External collections – A&I databases – E-journal collections
  • 18. Mixed results lists • Ranking algorithms prefer “more of the same”. This does not satisfy users’ needs for a variety of results. • Example for a broad query – Reference works (from subject headings + items from reference collection) – Text books – Relevant databases – Some current items – Relevant journals
  • 19. “Universal Search” • Additional databases • One box results (e.g., news or images)
  • 20. Agenda • The state of the OPAC and the importance of relevance ranking • Ranking factors • The composition of results lists • Conclusions
  • 21. Conclusions • Search is the core of the library catalogue. – However, other elements must be considered, too: – Usability – User guidance – Spelling corrections – etc. • A good ranking is always a mixture of ranking factors. • In addition, results lists should be mixed. – Items from different collections. – Mixture of direct results and pointers to other collections. • Future: the catalogue will become more like a search engine.
  • 22. Thank you for your attention. Prof. Dr. Dirk Lewandowski Hamburg University of Applied Sciences Department Information Berliner Tor 5 D - 20099 Hamburg Germany www.bui.haw-hamburg.de/lewandowski.html E-Mail: dirk.lewandowski@haw-hamburg.de
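
The conclusion that a good ranking is always a mixture of ranking factors can be illustrated with a small sketch. It is not taken from the talk: the field weights, the mixing weights, the freshness formula and the `Item` record are all assumptions made up for illustration. It combines a field-weighted TF/IDF text score with popularity (circulation count), freshness (publication year) and locality (local availability).

```python
import math
from dataclasses import dataclass

# Hypothetical field weights: a title hit counts more than a subject-heading hit.
FIELD_WEIGHTS = {"title": 3.0, "subjects": 2.0, "author": 1.0}


@dataclass
class Item:
    title: str
    subjects: str
    author: str
    year: int
    loans: int               # anonymous circulation count
    available_locally: bool  # held by the local library


def tokens(text):
    return text.lower().split()


def all_tokens(item):
    return set(tokens(item.title) + tokens(item.subjects) + tokens(item.author))


def text_score(query, item, catalogue):
    """Field-weighted TF/IDF over the short metadata fields."""
    score = 0.0
    for term in tokens(query):
        # Document frequency: items containing the term in any field.
        df = sum(1 for other in catalogue if term in all_tokens(other))
        if df == 0:
            continue
        idf = math.log(1 + len(catalogue) / df)
        for field, weight in FIELD_WEIGHTS.items():
            tf = tokens(getattr(item, field)).count(term)
            score += weight * tf * idf
    return score


def rank(query, catalogue, current_year, mix=(0.6, 0.2, 0.1, 0.1)):
    """Return matching items, best first.

    `mix` weights text, popularity, freshness and locality (assumed values).
    """
    w_text, w_pop, w_fresh, w_local = mix
    scored = [(item, text_score(query, item, catalogue)) for item in catalogue]
    hits = [(item, s) for item, s in scored if s > 0]
    if not hits:
        return []
    max_text = max(s for _, s in hits)
    max_pop = math.log1p(max(item.loans for item in catalogue)) or 1.0
    results = []
    for item, s in hits:
        popularity = math.log1p(item.loans) / max_pop
        freshness = 1.0 / (1 + max(0, current_year - item.year))
        locality = 1.0 if item.available_locally else 0.0
        total = (w_text * s / max_text + w_pop * popularity
                 + w_fresh * freshness + w_local * locality)
        results.append((total, item))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in results]


# Invented sample records, for illustration only.
catalogue = [
    Item("Information retrieval", "search engines", "Smith", 2008, 40, True),
    Item("Library management", "information retrieval", "Jones", 1995, 2, False),
    Item("Cooking basics", "food", "Lee", 2009, 100, True),
]
ranked = rank("information retrieval", catalogue, current_year=2009)
print([item.title for item in ranked])
```

Only items with a text match are ranked at all, so the heavily borrowed but off-topic title does not appear; popularity, freshness and locality only reorder items that already match. A real catalogue would have to tune the weights against usage data, and the slides' remaining factors (document type, user group, personalization) are left out here.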