JSTOR
    Advanced Technology Research



    Denver
    25th January 2009
    John Burns
    Clare Llewellyn




Today we will introduce a public beta of our Data for
      Research service and show you some of the other
      services that JSTOR’s advanced technology group
      is working on.

    Mission: Working with other researchers on large-
      scale text and data mining initiatives with an eye
      toward beneficial applications for scholars and
      students.




What is Data Mining?

“Data mining is the process of extracting hidden patterns from data”
                                                  Lyman and Varian 2003

“As data sets and the information extracted from them have grown in
   size and complexity, direct hands-on data analysis has increasingly
   been supplemented and augmented with indirect, automatic data
   processing using more complex and sophisticated tools, methods and
   models”
                                                        Kantardzic 2002

Example:
  Data mining is using consumer purchasing patterns to predict which
  products are bought together (gas and flights)
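
A minimal sketch of the purchasing-pattern example, assuming each purchase is available as a list of product names. The baskets below are invented for illustration, and `pair_counts` is a hypothetical helper rather than part of any JSTOR tool.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how often each unordered pair of products shares a basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicates and gives each pair one canonical order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

# Invented example data, for illustration only
baskets = [
    ["gas", "flights", "hotel"],
    ["gas", "flights"],
    ["gas", "snacks"],
    ["flights", "gas", "snacks"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Pairs with high counts are candidates for "bought together" predictions; a real analysis would also compare each count against how often the two products sell separately.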




What is Text Mining?


“In text mining the patterns are extracted from natural language text
    rather than from structured databases of facts”
                                                      Marti Hearst 2003

“Text mining attempts to discover new, previously unknown
   information by applying techniques from information retrieval,
   natural language processing and data mining”
                                      National Text Mining Center, UK

Example:
  Looking at which words co-occur in articles in order to predict
  interactions (magnesium and migraines)
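
The co-occurrence idea can be sketched as follows, assuming article text is available as plain strings. The sample sentences are invented, and `cooccurrence` is a hypothetical helper, not a JSTOR API.

```python
import re

def tokenize(text):
    """Return the set of lowercase alphabetic words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def cooccurrence(articles, term_a, term_b):
    """Return (articles with term_a, articles with term_b, articles with both)."""
    n_a = n_b = n_both = 0
    for text in articles:
        words = tokenize(text)
        has_a, has_b = term_a in words, term_b in words
        n_a += has_a
        n_b += has_b
        n_both += has_a and has_b
    return n_a, n_b, n_both

# Invented example texts, for illustration only
articles = [
    "Magnesium deficiency was observed in patients with migraines.",
    "Dietary magnesium and blood pressure.",
    "Migraines respond to magnesium supplements.",
    "Sleep patterns in students.",
]

n_a, n_b, n_both = cooccurrence(articles, "magnesium", "migraines")
print(n_a, n_b, n_both)
```

Two terms that appear together more often than their individual counts would suggest are candidates for a closer look. This sketch matches exact word forms only, so "migraine" and "migraines" count as different terms.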




Advanced Technology at JSTOR




    •  Why are we here
    •  Who we are
    •  What we are doing




Why are we releasing our system here?

Librarians are the point from which innovation is spread throughout the
academy

“New roles and functions for librarians include:
    •  information consultants and producers
    •  information gatekeepers and intermediators
    •  end-user educators
    •  managers and leaders
    •  data analysts in data administration centers
    •  preservers of knowledge
    •  information equalizers”
                                                               Park 1987

A Data Support Role: “Helping students get their hands dirty with the
data”
                                                        Robin Rice 2008
                     2nd DCC / RIN Research Data Management Forum


Who we are - Advanced Technology Research

•  A formal commitment by JSTOR to a pro-active role in technology
   innovation to face new challenges and opportunities
•  Our MO is to collaborate with and aid the scholarly community
•  We are a team of world-class scientists and technologists with a proven
   track record of innovation

Mission Statement

    “The Advanced Technology Research Group is dedicated to creating,
    discovering and using relevant technologies in support of JSTOR and the
    broader scholarly community.”




ATR - Collaborations with the academic community.

For other researchers we provide
•  Access to large well-curated data sets
•  An exposure channel on JSTOR for research results
•  Facilities on JSTOR to expose tools and techniques to users
•  Collaboration opportunities

For JSTOR
•  We evaluate novel techniques
•  We present rapid prototypes to users
•  We develop peer relationships with research institutions
•  We bring new forms of traffic to the JSTOR data
•  We reuse JSTOR data in new and exciting ways




What we are doing - Projects and Partners


    •  University of Washington – Citation Network Analysis
    •  University of Princeton – Topic Analysis
    •  UIUC - Software Environment for the Advancement of Scholarly
       Research (SEASR)
    •  University of Michigan – Linguistic tools
    •  Tufts – Classics Studies
    •  University of Liverpool – OAI-ORE, Text Mining, Data Analysis
    •  University of Queensland - Annotations
    •  Los Alamos National Labs – Annotation Management
    •  DFKI (German Artificial Intelligence Centre) – Document capture
       and reconstruction / remastering.
    •  XRCE (EuroPARC, France) – Scanned Document Analysis
    •  …


Advanced Technology Research - Showcase



    Showcase provides a preview of interesting and useful
      technologies. It allows our research partners to demonstrate
      their tools and gain feedback and it allows JSTOR to assess
      candidate technologies before committing them to the product
      roadmap.




Advanced Technology Research - Showcase


    A place to expose JSTOR data and tools and to encourage new
       research

       •  Provides access to JSTOR datasets
       •  Facility to expose and use tools created by researchers from
          JSTOR and elsewhere.
       •  Explanation of ongoing research
       •  Forum to facilitate connections between groups working with
          JSTOR data

       URL: http://showcase.jstor.org




Data for Research



    •  DFR is a set of web tools designed to allow for the visual
       exploration of large-scale data sets and the download of word
       frequencies in JSTOR articles

    •  Beta Version launched 01/23/09

    •  URL: http://dfr.jstor.org




Why Word Frequencies

    Data requested from JSTOR users in 2008

    [Pie chart; legend: OCR Data, Citation Data, Usage Data, Word Frequency]




What can you do with word counts?


    Real life requests:

    “I would like to request time and word distribution frequencies in
        linguistics (specific movement removed). These sorts of
        frequencies could potentially allow me to better understand and
        delimit the formation of groups, and the underlying impetus
        behind these groups as expressed in linguistic form.”

    “I would like to create subject headings for material, using word
        frequency as a guide to selecting the appropriate terms for the
        headings.”
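
As a rough sketch of the second request, assume an article's word frequencies are available as a mapping from word to count. The counts, the stopword list and the `candidate_terms` helper are all invented for illustration and do not reflect the actual DFR download format.

```python
# A tiny, hypothetical stopword list; a real one would be much longer
STOPWORDS = {"the", "of", "and", "a", "in", "to", "is"}

def candidate_terms(word_counts, n=3, stopwords=STOPWORDS):
    """Return the n most frequent words that are not stopwords."""
    kept = {w: c for w, c in word_counts.items() if w not in stopwords}
    # Sort by descending count, then alphabetically so ties are deterministic
    ranked = sorted(kept.items(), key=lambda item: (-item[1], item[0]))
    return [w for w, _ in ranked[:n]]

# Invented counts, for illustration only
word_counts = {
    "the": 120, "of": 95, "and": 60,
    "soil": 40, "wheat": 33, "farm": 33, "tenancy": 12,
}

print(candidate_terms(word_counts))
```

The output is only a starting list: a cataloguer would still choose the final headings, using the frequent terms as a guide.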




DFR – DEMO!



                  http://dfr.jstor.org




DFR – Front Page




Thefe




Hath Pre - 1900




Hath – post 1900




Chymistry




Download Page




Files Downloaded




[Chart to show the use of the word Chymistry, by year, 1666 to 2005; vertical axis 0 to 8]
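
A chart like this one could be built from per-article word counts along these lines. This is a sketch under stated assumptions: each article is represented as a (year, word counts) pair, the data is invented, and `yearly_counts` is a hypothetical helper.

```python
from collections import defaultdict

def yearly_counts(articles, word):
    """Sum occurrences of `word` per publication year, ordered by year."""
    totals = defaultdict(int)
    for year, counts in articles:
        totals[year] += counts.get(word, 0)
    return dict(sorted(totals.items()))

# Invented (year, word counts) pairs, for illustration only
articles = [
    (1666, {"chymistry": 2, "the": 50}),
    (1666, {"chymistry": 1}),
    (1703, {"chymistry": 4}),
    (1907, {"chemistry": 9}),
]

series = yearly_counts(articles, "chymistry")
print(series)
```

Years in which the word never appears still show up with a count of zero, which keeps gaps visible when the series is plotted.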
3 Journals from 1957




Annals of Mathematics   American Journal of Nursing   Agricultural History




Any questions / feedback?

      Please take a look at the site and tell us what you think.
                       Email: dfr@jstor.org

    Contact details
    Email: clare.llewellyn@jstor.org
    Phone: 609-986-2282





4.11.24 Poverty and Inequality in America.pptx4.11.24 Poverty and Inequality in America.pptx
4.11.24 Poverty and Inequality in America.pptx
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 

Data for Research (DfR) service

  • 1. JSTOR Advanced Technology Research. Denver, 25th January 2008. John Burns, Clare Llewellyn
  • 2. Today we will introduce a public beta of our Data for Research service and show you some of the other services that JSTOR’s advanced technology group is working on. Mission: Working with other researchers on large-scale text and data mining initiatives with an eye toward beneficial applications for scholars and students.
  • 3. What is Data Mining? “Data mining is the process of extracting hidden patterns from data” Lyman and Varian 2003 “As data sets and the information extracted from them have grown in size and complexity, direct hands-on data analysis has increasingly been supplemented and augmented with indirect, automatic data processing using more complex and sophisticated tools, methods and models” Kantardzic 2002 Example: Data mining is using consumer purchasing patterns to predict which products are bought together (gas and flights)
  • 4. What is Text Mining? “In text mining the patterns are extracted from natural language text rather than from structured databases of facts” Marti Hearst 2003 “Text mining attempts to discover new, previously unknown information by applying techniques from information retrieval, natural language processing and data mining” National Text Mining Center, UK Example: Looking at which words co-occur in articles in order to predict interactions (magnesium and migraines)
  • 5. Advanced Technology at JSTOR •  Why are we here •  Who we are •  What we are doing
  • 6. Why are we releasing our system here? Librarians are the point from which innovation is spread throughout the academy “New roles and functions for librarians include: •  information consultants and producers •  information gatekeepers and intermediators •  end-user educators •  managers and leaders •  data analysts in data administration centers •  preservers of knowledge •  information equalizers” Park 1987 A Data Support Role: “Helping students get their hands dirty with the data” Robin Rice 2008 2nd DCC / RIN Research Data Management Forum
  • 7. Who we are - Advanced Technology Research •  A formal commitment by JSTOR to a pro-active role in technology innovation to face new challenges and opportunities •  Our MO is to collaborate with and aid the scholarly community •  We are a team of world-class scientists and technologists with a proven track record of innovation Mission Statement “The Advanced Technology Research Group is dedicated to creating, discovering and using relevant technologies in support of JSTOR and the broader scholarly community.”
  • 8. ATR - Collaborations with the academic community. For other researchers we provide •  Access to large well-curated data sets •  An exposure channel on JSTOR for research results •  Facilities on JSTOR to expose tools and techniques to users •  Collaboration opportunities For JSTOR •  We evaluate novel techniques •  We present rapid prototypes to users •  Develop peer relationships with research institutions •  Bring new forms of traffic to the JSTOR data •  Reuse JSTOR data in new and exciting ways
  • 9. What we are doing - Projects and Partners •  University of Washington – Citation Network Analysis •  University of Princeton – Topic Analysis •  UIUC – Software Environment for the Advancement of Scholarly Research (SEASR) •  University of Michigan – Linguistic tools •  Tufts – Classics Studies •  University of Liverpool – OAI-ORE, Text Mining, Data Analysis •  University of Queensland – Annotations •  Los Alamos National Labs – Annotation Management •  DFKI (German Artificial Intelligence Centre) – Document capture and reconstruction / remastering •  XRCE (EuroPARC, France) – Scanned Document Analysis •  …
  • 10. Advanced Technology Research - Showcase Showcase provides a preview of interesting and useful technologies. It allows our research partners to demonstrate their tools and gain feedback and it allows JSTOR to assess candidate technologies before committing them to the product roadmap.
  • 11. Advanced Technology Research - Showcase A place to expose JSTOR data and tools and to encourage new research •  Provides access to JSTOR datasets •  Facility to expose and use tools created by researchers from JSTOR and elsewhere •  Explanation of ongoing research •  As a forum to facilitate connections between groups working with JSTOR data URL: http://showcase.jstor.org
  • 12. Data for Research •  DFR is a set of web tools designed to allow for the visual exploration of large-scale data sets and the download of word frequencies in JSTOR articles •  Beta Version launched 01/23/09 •  URL: http://dfr.jstor.org
  • 13. Why Word Frequencies. Data requested from JSTOR users in 2008: OCR Data, Citation Data, Usage Data, Word Frequency
  • 14. What can you do with word counts? Real life requests: “I would like to request time and word distribution frequencies in linguistics (specific movement removed). These sorts of frequencies could potentially allow me to better understand and delimit the formation of groups, and the underlying impetus behind these groups as expressed in linguistic form.” “I would like to create subject headings for material, using word frequency as a guide to selecting the appropriate terms for the headings.”
  • 15. DFR – DEMO! http://dfr.jstor.org
  • 16. DFR – Front Page
  • 18. Hath Pre - 1900
  • 19. Hath – post 1900
  • 23. Chart to show the use of the word Chymistry [chart: year axis from 1666 to 2005]
  • 25. 3 Journals from 1957: The Annals Mathematics, American Journal Nursing, Agricultural History
  • 26. Any questions / feedback? Please take a look at the site and tell us what you think. Email: dfr@jstor.org Contact details Email: clare.llewellyn@jstor.org Phone: 609-986-2282
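
The word counts that slides 12 to 14 refer to can be illustrated with a minimal sketch. This is not how DFR computes its data: the tokenization rule, the function name `word_frequencies`, and the sample sentences are all assumptions made up for illustration, and no JSTOR data is used.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each lowercase word occurs in a piece of text.

    Assumption: a "word" is any run of the letters a-z after lowercasing.
    Real services may tokenize differently.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Hypothetical sample texts keyed by year (invented, not JSTOR content).
samples = {
    1700: "The art of chymistry and the chymistry of metals",
    1950: "Modern chemistry replaced the older spelling",
}

# Compare how often each spelling appears in each sample.
for year, text in samples.items():
    counts = word_frequencies(text)
    print(year, counts["chymistry"], counts["chemistry"])
```

Counting one spelling per year in this way is the kind of tally that a chart such as the one for "Chymistry" on slide 23 could be built from.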