Extract and Analyze Culture/Trend Data
from SJPL Digital Collection
Theme: Helping San Jose Public Library figure out how to make
the California Room Digital Collections more open, engaging, hackable,
linkable, browsable, tag-able, map-able and responsive.
Open Data Hack SJ
Saturday, February 21, 2015 from 9:30 AM to 5:00 PM (PST)
San Jose, CA
Hiroyuki Sato @sa2hi
OPEN DATA
• SJPL DIGITAL COLLECTIONS
• California Room
• School Yearbooks
• http://www.sjpl.org/yearbooks
• Digital data is available for San Jose High School yearbooks (1902-1929)
• http://digitalcollections.sjlibrary.org/cdm/landingpage/collection/sjplyb
What I wanted to do / What I did
• Can we see a culture/trend in the Digital Collection?
• Extract the data related to athletic teams
• Manually count the number of people on each sports team in the yearbooks…
• https://docs.google.com/spreadsheets/d/1GhCA-I6mRZ1rs-ORHNt1ktfrd3qn9OB4h7JeqJlaYn4/edit#gid=0
• Visualize the data
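The counting and visualization steps above could be sketched roughly as follows, assuming the spreadsheet rows are exported as (year, sport, members) records. The records below are made-up placeholders, not counts taken from the yearbooks, and the names `pivot_counts` and `text_chart` are hypothetical helpers introduced only for this illustration.

```python
from collections import defaultdict

# Placeholder records (year, sport, members); NOT real yearbook counts.
records = [
    (1905, "Football", 11),
    (1905, "Baseball", 9),
    (1910, "Football", 15),
    (1910, "Basketball", 7),
]

def pivot_counts(rows):
    """Turn (year, sport, members) rows into {sport: {year: members}}."""
    table = defaultdict(dict)
    for year, sport, members in rows:
        # Sum in case one sport has several team photos in the same year.
        table[sport][year] = table[sport].get(year, 0) + members
    return dict(table)

def text_chart(table):
    """Return one line per sport and year, drawing one '#' per member."""
    lines = []
    for sport in sorted(table):
        for year in sorted(table[sport]):
            n = table[sport][year]
            lines.append(f"{sport:<12}{year} {'#' * n} {n}")
    return lines

counts = pivot_counts(records)
for line in text_chart(counts):
    print(line)
```

A plotting library could replace the text bars; the plain-text version is used here only to keep the sketch free of dependencies.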
An Example of Original Data (1)
An Example of Original Data (2)
[Chart: Change in the number of team members for each sport; x-axis: year, 1905 to 1925]
Issues
• Many years are missing
• Need more metadata
• Need technologies for automated, detailed metadata extraction from pictures and text
• Need the total school population for each year so that data from different years can be compared
• Need digital data from other schools
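Two of the issues above, the missing years and the lack of school-wide totals, could be handled along these lines once enrollment figures become available. This is a minimal sketch under stated assumptions: the enrollment numbers and team counts are placeholders, not real data, and `participation_rate` is a hypothetical helper.

```python
def participation_rate(members_by_year, enrollment_by_year, years):
    """Return {year: share of students on the team, or None if data is missing}."""
    rates = {}
    for year in years:
        members = members_by_year.get(year)
        enrolled = enrollment_by_year.get(year)
        if members is None or not enrolled:
            # Keep the gap visible instead of guessing a value.
            rates[year] = None
        else:
            rates[year] = members / enrolled
    return rates

# Placeholder values for illustration only; NOT real yearbook or school data.
football = {1905: 11, 1910: 15}
enrollment = {1905: 400, 1910: 600, 1915: 800}

rates = participation_rate(football, enrollment, range(1905, 1920, 5))
for year, rate in rates.items():
    print(year, "missing" if rate is None else f"{rate:.2%}")
```

Dividing by enrollment makes years comparable even when the school grew, and returning None for absent years avoids drawing a trend line through data that was never collected.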
