Metadata:
The Three-Legged Dance
Tim Spalding
ALA NISO-BISG Forum | June 23, 2017
tim@librarything.com
@LibraryThingTim
Who am I?
Book-lover, ex-scholar, programmer
LibraryThing (2005)
LibraryThing for Libraries (2007)
TinyCat (2016)
Syndetics Unbound (2016)
At the Intersection Of…
Readers
Collectors
Libraries
Academic, Public, School, "Tiny"
Online booksellers
Bookstores
Publishers
Authors
Also: archives, scholars, famous dead people with books, music and movie lovers
Data is Good
Everyone their data
Every data its glorious purpose
Every data its data that makes it better
My Approach to Data Is…
Loving
Respectful
Flexible
Statistical
Optimistic as to what librarians can do…
The Three-Legged Stool
Professional data
User data
Content data
(a very, very simplified framework)
Professional Data
Library cataloging (MARC, BIBFRAME)
Publisher/bookseller (ONIX, Amazon, Bowker)
Classification (DDC, LCC, BIC, BISAC, LCSH)
Professional reviews
Bibliographies and guides (LibGuides, bibliographic monographs)
Reading levels (Lexile, AR, F&P)
User Data
Intentional
User reviews
Ratings
Tags
Annotations
Lists
Discussions
User book recommendations
Implicit
Purchase patterns
Ownership patterns
Checkout patterns
Reading patterns
Popularity
Content Data
Text of book
Samples, quotes, etc.
Tables of contents
Indexes
Word and phrase statistics
In-text references and footnotes
One-Legged Stools: "Recommendations," "Similar Books," etc.
Recommendations by bibliographic data alone
Recommendations by subject alone
Recommendations by statistics alone
Recommendations by content alone
Recommendations driven too much by statistics?
Boring
Repetitive
Keep people in their bubble
No serendipity or surprise
No taste!
Solution: Add a leg or two…
Let users act like professionals
Use statistics on classification
"Everyone a Librarian"
Improved
Author disambiguation
(1,741,282)
Edition/work control
(5,544,233)
Canonical book titles
Series
Author name variants
Created
Work relationships
(contained in, commentary on, parody of, etc.)
Awards
Places, characters, events
Author picture
Author information
(education, family, occupation, nationality, etc.)
The Dewmoji!
174.3 = 💭 🚎 🙈 ⚖
1 💭 Philosophy and Psychology
7 🚎 Ethics
4 🙈 Professional Ethics
.3 ⚖ Lawyers
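The breakdown above reads a Dewey number one digit at a time, with each longer prefix getting its own emoji. A minimal sketch of that lookup, assuming a hypothetical table `DEWMOJI` that holds only the four entries shown on the slide; the function names are illustrative and not LibraryThing's code.

```python
# Hypothetical lookup table: only the four prefixes shown on the slide.
DEWMOJI = {
    "1": ("💭", "Philosophy and Psychology"),
    "17": ("🚎", "Ethics"),
    "174": ("🙈", "Professional Ethics"),
    "174.3": ("⚖", "Lawyers"),
}


def dewey_prefixes(number: str) -> list[str]:
    """Return successively longer prefixes of a Dewey number, one per digit."""
    prefixes = []
    current = ""
    for ch in number:
        current += ch
        if ch.isdigit():  # the decimal point extends the prefix but adds no level
            prefixes.append(current)
    return prefixes


def dewmoji(number: str) -> str:
    """Join the emoji of every prefix that has an entry in the table."""
    return " ".join(DEWMOJI[p][0] for p in dewey_prefixes(number) if p in DEWMOJI)


print(dewmoji("174.3"))
```

Prefixes with no table entry are skipped, so a partial table still yields a partial Dewmoji.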
"Everyone's a librarian?"
Ha. Add ANOTHER leg.
Librarians at LibraryThing vet USER DATA:
Tag approval (LibraryThing has 135m tags; 75% belong to 30,000 unique tags)
Series approval
Award approval
Picture approval
Review approval
Solution: Add a leg or two…
Let users act like professionals.
Use user statistics on professional data
Does that classification map to user/usage data?
DDC against
"people who have X have Y"
Clusters well — high "salience"
618.4 — Birthing books
668.1 — Soapmaking
638.1 — Beekeeping
Clusters terribly — low "salience"
All literature in DDC
796.1 — Miscellaneous games
225.6 — New Testament > Hermeneutics, Exegesis
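The slide does not define "salience", so the following is only one possible reading: for each class, the share of co-ownership pairs involving one of its books whose other book is in the same class. The libraries, book identifiers, and class assignments below are invented toy data, and `class_salience` is a hypothetical name.

```python
from collections import defaultdict
from itertools import combinations


def class_salience(libraries, book_class):
    """Share of co-owned pairs that stay inside a class (an assumed definition)."""
    same = defaultdict(int)   # pair endpoints whose partner shares the class
    total = defaultdict(int)  # all pair endpoints, per class
    for books in libraries:
        for a, b in combinations(sorted(set(books)), 2):
            ca, cb = book_class[a], book_class[b]
            total[ca] += 1
            total[cb] += 1
            if ca == cb:
                same[ca] += 1
                same[cb] += 1
    return {c: same[c] / total[c] for c in total}


# Invented toy data: four small personal libraries.
libraries = [
    ["bees1", "bees2", "bees3"],
    ["bees1", "bees2", "novel1"],
    ["novel1", "games1", "novel3"],
    ["novel2", "bees3", "games1"],
]
book_class = {
    "bees1": "638.1", "bees2": "638.1", "bees3": "638.1",
    "novel1": "800", "novel2": "800", "novel3": "800",
    "games1": "796.1",
}

salience = class_salience(libraries, book_class)
for dewey, score in sorted(salience.items(), key=lambda kv: -kv[1]):
    print(dewey, round(score, 2))
```

In this toy data the beekeeping class scores highest because its books tend to be owned together, while the broad literature class is scattered across libraries.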
How we do Recommendations
Basic Factors
"People who have X have Y" statistics
Three different statistical approaches
Shared tags
Reorder and Drop
Ratings
Reviews
User recommendations
User up and down votes
LT Popularity curves
Library popularity curves
Tag "salience"
Tag approval
Tag-to-author
Classification systems
Classification salience
Series
Series order
Series-order importance
Author clustering
In-house algorithmic genre system
Crosswalks from genre to tag, etc.
Final factor: TASTE!
Mix of authors, popularities, genres, etc.
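The first basic factor, "people who have X have Y" statistics, can be sketched as co-occurrence counting over personal libraries. The slide mentions three statistical approaches without naming them, so this example uses lift (observed co-ownership divided by what popularity alone would predict) purely as a stand-in. The function name `also_have` and the toy libraries are assumptions, not LibraryThing's method or data.

```python
from collections import Counter


def also_have(libraries, target, top_n=3):
    """Rank books co-owned with `target` by lift (an assumed scoring choice)."""
    owners = Counter()    # how many libraries hold each book
    together = Counter()  # how many libraries hold the book and the target
    for books in libraries:
        unique = set(books)
        owners.update(unique)
        if target in unique:
            together.update(unique - {target})
    n = len(libraries)
    scores = {}
    for other, both in together.items():
        # lift = P(X and Y) / (P(X) * P(Y)); values above 1 mean "more than chance"
        scores[other] = (both / n) / ((owners[target] / n) * (owners[other] / n))
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Invented toy data: "Bestseller" is in almost every library.
libraries = [
    {"Dune", "Foundation", "Bestseller"},
    {"Dune", "Foundation"},
    {"Dune", "Bestseller"},
    {"Bestseller", "Cookbook"},
    {"Bestseller"},
]

for title, lift in also_have(libraries, "Dune"):
    print(title, round(lift, 2))
```

Dividing by each book's overall popularity keeps a title that everyone owns from dominating every list, which is one way raw statistics alone become boring and repetitive.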
Steal Someone's Leg
Users do stuff to Professional data
Users add and improve bibliographic information
Professionals do stuff to user data
Professional curation of tags, reviews
Professionals pretend to be users
Publishers suggest similar books
Random Hortatory Slogans
Use all the data you can
Free your data
Use data by others, even distant others
Be flexible
Use statistics
Don't be afraid of users
But don't let them run rampant either…
Cede ground …
… Take ground
Add professional value to non-professional data
Thank you!
tim@librarything.com
@LibraryThingTim
Idea:
What's the best shelf-order system?
Lay out an entire "typical" library in one long line by classification
Take data on non-library clustering (e.g., people who have X have Y)
Calculate the average distance you'd have to travel
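The calculation proposed above can be sketched as follows, assuming each book takes one position on the shelf and that "distance" means the gap in positions between two co-owned books. The function name, shelf orders, and libraries are invented for illustration.

```python
from itertools import combinations


def average_shelf_distance(shelf_order, libraries):
    """Mean gap in shelf positions between books that are owned together."""
    position = {book: i for i, book in enumerate(shelf_order)}
    total = 0
    pairs = 0
    for books in libraries:
        # Ignore books that are not on the shelf at all.
        shelved = sorted(b for b in set(books) if b in position)
        for a, b in combinations(shelved, 2):
            total += abs(position[a] - position[b])
            pairs += 1
    return total / pairs if pairs else 0.0


# Invented toy data: two candidate shelf orders for the same four books.
shelf_a = ["bees1", "bees2", "novel1", "games1"]
shelf_b = ["bees1", "novel1", "games1", "bees2"]
libraries = [
    ["bees1", "bees2"],
    ["bees1", "bees2", "novel1"],
    ["novel1", "games1"],
]

print(average_shelf_distance(shelf_a, libraries))
print(average_shelf_distance(shelf_b, libraries))
```

Under this measure the shelf order with the lower average keeps co-owned books closer together, so competing classification systems could be compared by running the same co-ownership data against each one's ordering.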
