The Mosaic Search Engine
Mark van Harmelen, Hedtek Ltd
markvanharmelen@gmail.com
hedtek.com

Aim
Provide a proof of concept that:
  • Users can have personalised search results according to their place and stage of studies
  • Users can adopt other personas or points of view to explore academic resources
  • We can exploit ‘mass’ attention data as revealed by library circulation information
So far we are only working with ISBN-identified books.

[Architecture diagram. Labels: HEI circulation data (three HEIs, each followed by an anonymise step); reading lists; build Solr index; partial Copac records annotated with use and reading list data; Solr; front-end.]

Anonymisation
  • Level 1: Current prototype, enables faceting
  • Level 2: With extra information, enables “people who borrowed this also borrowed” and “people who borrowed this went on to borrow”
  • Anonymisation utility provided
  • DPA compliant, can also use fair processing agreements

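As a rough illustration of what Level 2 data makes possible (this is not the project's anonymisation utility): if each loan keeps a pseudonymous borrower token, "people who borrowed this also borrowed" reduces to counting co-borrowed items. The salted hash, the function names and the sample loans below are all assumptions made for this sketch, and a salted hash on its own should not be assumed to satisfy the DPA.

```python
import hashlib
from collections import Counter, defaultdict


def pseudonymise(borrower_id: str, salt: str) -> str:
    """Replace a borrower id with a salted one-way token (hypothetical scheme)."""
    digest = hashlib.sha256((salt + borrower_id).encode("utf-8")).hexdigest()
    return digest[:16]


def also_borrowed(loans, isbn, top_n=3):
    """Rank the other items borrowed by the people who borrowed `isbn`.

    `loans` is an iterable of (borrower_token, item_id) pairs.
    """
    items_by_person = defaultdict(set)
    for person, item in loans:
        items_by_person[person].add(item)

    counts = Counter()
    for items in items_by_person.values():
        if isbn in items:
            counts.update(items - {isbn})
    return counts.most_common(top_n)


# Made-up loans: (raw borrower id, placeholder item id).
raw_loans = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "C"),
]
# Only the tokens, never the raw ids, would leave the institution.
loans = [(pseudonymise(user, "demo-salt"), item) for user, item in raw_loans]

print(also_borrowed(loans, "A"))
```

"People who borrowed this went on to borrow" would additionally need a loan date or sequence number on each pair, which this sketch leaves out.
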
Augmenting Solr’s index
  • Solr’s search index is loaded with items and any associated use information
  • Use information is:
      • institution
      • course
      • progression level
      • year of use
      • count of uses in that year
  • Use information enables faceting
  • Reading list information is also added to items

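One way to picture the use information is as a count per (institution, course, progression level, year) attached to each item before it is loaded into the index. The sketch below aggregates anonymised loan rows into that shape. The field names, the record layout and the sample ISBNs are placeholders assumed for illustration, not the project's actual schema.

```python
from collections import Counter


def build_use_docs(loans):
    """Aggregate anonymised loan rows into one record per item.

    Each input row is a dict with the keys isbn, institution, course,
    level and year. Each output record carries a list of use entries,
    one per distinct (institution, course, level, year) combination.
    """
    counts = Counter(
        (row["isbn"], row["institution"], row["course"], row["level"], row["year"])
        for row in loans
    )
    docs = {}
    for (isbn, institution, course, level, year), n in sorted(counts.items()):
        doc = docs.setdefault(isbn, {"isbn": isbn, "uses": []})
        doc["uses"].append({
            "institution": institution,
            "course": course,
            "level": level,
            "year": year,
            "count": n,
        })
    return list(docs.values())


# Made-up rows with placeholder ISBNs.
sample_loans = [
    {"isbn": "9780000000001", "institution": "HEI-A", "course": "Computing", "level": 1, "year": 2008},
    {"isbn": "9780000000001", "institution": "HEI-A", "course": "Computing", "level": 1, "year": 2008},
    {"isbn": "9780000000001", "institution": "HEI-A", "course": "Computing", "level": 2, "year": 2009},
    {"isbn": "9780000000002", "institution": "HEI-B", "course": "History", "level": 1, "year": 2008},
]
use_docs = build_use_docs(sample_loans)
for doc in use_docs:
    print(doc)
```

How such records map onto Solr fields (for example, flattened multi-valued fields versus nested documents) is a separate design decision that the slides do not describe.
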
[Diagram. Labels: Solr; OPAC; client-side front-end (browser); query; result set; item query; item data.]

Narrowing and broadening
  • Thoughts (NB, ‘thoughts’) about narrowing of choice led to two features to broaden choice
  • We don’t believe that the Mosaic demo in itself narrows choice when used for browsing
  • Broadening features:
      • ‘More like this’ link
      • Reading lists

The Harry Potter ‘problem’ and scale
  • The Harry Potter ‘problem’: Balderdash! We can control this using Library of Congress subject categories and Dewey Decimal shelfmarks
  • Paul Miller raises questions of scale
  • Dave Pattern has shown successful use of use data at a single (small) institution
  • We want to leverage reasonably large scale: 3.5-4M students in HE, over, say, the last five years

User context and attention
  • It has been relatively simple to parameterise an open source search engine with user context:
      • Institution, course, progression level, academic year
  • This is only part of the user context; we can add:
      • Location
      • Attention data, e.g., search history
      • Further social search information

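A minimal sketch of what parameterising a search with user context could look like, assuming the context is passed to Solr as filter queries. `q`, `fq`, `wt`, `facet` and `facet.field` are standard Solr request parameters; the field names (`institution`, `course`, `level`, `year`) and the helper itself are assumptions for illustration, not the Mosaic implementation.

```python
from urllib.parse import urlencode, parse_qs


def context_query(terms, context,
                  facet_fields=("institution", "course", "level", "year")):
    """Build a Solr query string restricted to a user's (or persona's) context.

    `context` maps hypothetical index field names to values; entries whose
    value is None are skipped, which broadens the search.
    """
    params = [("q", terms), ("wt", "json")]
    for field, value in context.items():
        if value is None:
            continue
        # Quote the value and escape backslashes and embedded double quotes.
        text = str(value).replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{text}"'))
    params.append(("facet", "true"))
    for field in facet_fields:
        params.append(("facet.field", field))
    return urlencode(params)


# Adopting another point of view just means passing a different context.
second_year_computing = {
    "institution": "HEI-A",
    "course": "Computing",
    "level": 2,
    "year": None,  # no restriction on academic year
}
query_string = context_query("operating systems", second_year_computing)
print(query_string)
print(parse_qs(query_string)["fq"])
```

The resulting string would be appended to a Solr select URL. Location, search history or social signals could be added in the same way, or used to boost rather than filter results.
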
Disclaimer
  • The next slide is independent of any decisions on a pure data approach
  • Could be a pure data approach in there
  • Or maybe not

Where is this going? A personal view
  • Bind together: [...]

  • Mosaic search: personalised/point-of-view search
  • Massively parallel search for blindingly fast response times
  • Data mining for library ‘stewardship’
We have prototypes for the first two, and we’re about to start experimenting with parallel search using Hadoop+Lucene.

Building institutional contributions
  • Propose union-cat-local:
      • Search in local library
      • Mosaic-like search utilises local loan data if it is available
  • Two ways to encourage library contribution of loan data (thoughts in progress):
      • Narrow: Libraries which contribute loan data to the pool get Mosaic search over the pool
      • Broad: Offer the contextual/PoV search everywhere; users will agitate if they don’t see local data

This is a Just Do It moment
  • A national union catalogue with contextual search and local library interfaces
  • Relatively cheap to do
  • Potentially massive gains for learners, teachers and researchers
  • Portends the development of shared services across the library domain and large cost savings
  • Doesn’t preclude, and is agnostic on, an open data approach
  • Could incorporate a pure data service approach and/or a centralised service
