1. Metadata and its Relationship
to Journal Workflow
SSP Seminar in Philadelphia, PA
David Yakimischak, Chief Technology Officer
November 18, 2004 www.jstor.org
2. Agenda
• What is JSTOR?
• What is Metadata?
• What is Journal Workflow?
• What is the relationship?
• Discussion
3. JSTOR Mission
• JSTOR is a not-for-profit organization with a
mission to help the scholarly community take
advantage of the advances in information
technology. This includes: (1) building a
reliable and comprehensive archive of core
scholarly journals, and (2) dramatically
improving access to this scholarly material
• In pursuing its mission, JSTOR takes a
system-wide perspective, seeking benefits
for libraries, publishers and scholars
4. JSTOR Today
• 2,160 participating libraries
• 269 participating publishers
• 449 journals online
• 16,379,559 pages scanned (and
counting!)
• Formed an Electronic Archiving
initiative in 2003
5. JSTOR Monthly Usage
[Chart: Meaningful Accesses per Month, plotted quarterly from Jan-97 to Oct-04; vertical axis from 0 to 25,000,000]
6. What is Metadata?
• Data about data
• Internally we outlaw the unqualified
use of the word metadata
• Three broad categories
• Descriptive Metadata
• Technical Metadata
• Administrative Metadata
7. Descriptive Metadata
• Describes the content or underlying asset
• Bibliographic information
• Title, author, journal, date, page, etc.
• Used to create citations
• Used to categorize or organize information
• Creates the browse interface to content
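As a rough illustration of how descriptive metadata can feed both citations and a browse interface, here is a minimal Python sketch. The record layout, the field names, the citation style, and the sample values are all assumptions made for this example; they are not JSTOR's actual schema or format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DescriptiveMetadata:
    """Hypothetical bibliographic record; field names are illustrative only."""
    title: str
    authors: List[str]
    journal: str
    year: int
    volume: int
    first_page: int
    last_page: int


def format_citation(record: DescriptiveMetadata) -> str:
    """Build a simple citation string from the bibliographic fields."""
    authors = ", ".join(record.authors)
    return (
        f'{authors}. "{record.title}." {record.journal} '
        f"{record.volume} ({record.year}): "
        f"{record.first_page}-{record.last_page}."
    )


def browse_index(records: List[DescriptiveMetadata]) -> Dict[str, Dict[int, List[str]]]:
    """Group article titles by journal, then by year, for a browse view."""
    index: Dict[str, Dict[int, List[str]]] = {}
    for r in records:
        index.setdefault(r.journal, {}).setdefault(r.year, []).append(r.title)
    return index


# Invented sample record, used only to exercise the functions.
sample = DescriptiveMetadata(
    "A Sample Article", ["A. Author", "B. Writer"],
    "Example Journal", 1999, 12, 101, 120,
)
print(format_citation(sample))
print(browse_index([sample]))
```

The same record serves two purposes here: one function reads it to produce a citation, the other to build the journal-then-year hierarchy a reader would browse.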
8. Technical Metadata
• Describes the structure or format of an
item
• Allows for verification or validation
• Typically used for internal operations
• A registry can be used to catalog these
• Commonly needs to be migrated,
although this applies to all metadata
• Most suitable for automated processing
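The verification point above can be sketched in a few lines of Python. This is only an assumed shape for a technical record (MIME type, size, SHA-256 checksum); the slides do not say which properties are actually captured.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass
class TechnicalMetadata:
    """Hypothetical technical record for one stored file."""
    mime_type: str
    size_bytes: int
    sha256: str


def describe(data: bytes, mime_type: str) -> TechnicalMetadata:
    """Capture format, size, and checksum at the time the file is stored."""
    return TechnicalMetadata(mime_type, len(data), hashlib.sha256(data).hexdigest())


def validate(data: bytes, expected: TechnicalMetadata) -> List[str]:
    """Return a list of problems; an empty list means the file matches its record."""
    problems = []
    if len(data) != expected.size_bytes:
        problems.append("size mismatch")
    if hashlib.sha256(data).hexdigest() != expected.sha256:
        problems.append("checksum mismatch")
    return problems


page = b"scanned page bytes"          # stand-in for a real page image
record = describe(page, "image/tiff")
print(validate(page, record))         # unchanged file
print(validate(page + b"x", record))  # altered file
```

Because every check is a mechanical comparison against the stored record, this kind of metadata lends itself to automated processing with no human in the loop.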
9. Administrative Metadata
• Generally deals with how an item has
been handled
• Most relevant to the archival records
• Tracks history of change or access
• Closely related to workflow
• Can be gathered automatically and
unobtrusively
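One way to gather handling history automatically and unobtrusively is to wrap each workflow step so that it logs itself. The sketch below assumes an in-memory log and two made-up steps (scan_page, check_page); a production system would persist the events and use its own step names.

```python
import functools
from datetime import datetime, timezone

audit_log = []  # hypothetical in-memory store; a real system would persist this


def tracked(action):
    """Decorator that records which item was handled, how, and when."""
    def wrap(func):
        @functools.wraps(func)
        def inner(item_id, *args, **kwargs):
            result = func(item_id, *args, **kwargs)
            # The event is only recorded if the step completed without raising.
            audit_log.append({
                "item": item_id,
                "action": action,
                "at": datetime.now(timezone.utc).isoformat(),
            })
            return result
        return inner
    return wrap


@tracked("scanned")
def scan_page(item_id):
    return f"{item_id}.tif"


@tracked("quality-checked")
def check_page(item_id):
    return True


def history(item_id):
    """List the actions recorded for one item, in the order they happened."""
    return [event["action"] for event in audit_log if event["item"] == item_id]
```

The people doing the scanning or checking never fill in a form; the history accumulates as a side effect of the work itself.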
10. What is Journal Workflow?
• The process of creating the finished
product
• Can be formal or ad-hoc
• Automated or manual
• Stress-relieving or stress-inducing
11. What is the relationship?
• Metadata can be managed during the journal
workflow process
• Added, ingested, modified, calculated, stored, versioned
• Metadata is used to drive the workflow process
itself
• The intended use must be considered carefully
• Why are we gathering this?
• Who is the audience? When? Why?
• What about statistics, for example?
• Connections into other business process systems
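To make the two directions of the relationship concrete, here is a small Python sketch in which an item's status field decides which step runs next (metadata driving the workflow), and each step in turn updates the item's record (workflow managing the metadata). The status names and step names are invented for illustration.

```python
# Hypothetical workflow: each status maps to the step that should run next.
NEXT_STEP = {
    "received": "scan",
    "scanned": "index",
    "indexed": "quality_check",
    "checked": "release",
}

# The status an item carries once a given step has completed.
STATUS_AFTER = {
    "scan": "scanned",
    "index": "indexed",
    "quality_check": "checked",
    "release": "released",
}


def advance(item):
    """Run one step, chosen purely from the item's own metadata."""
    step = NEXT_STEP.get(item["status"])
    if step is None:
        return item  # no step is defined for this status; nothing left to do
    updated = dict(item)
    updated["status"] = STATUS_AFTER[step]
    updated["history"] = item["history"] + [step]
    return updated


def run_to_completion(item):
    """Keep advancing until the item reaches a status with no next step."""
    while item["status"] in NEXT_STEP:
        item = advance(item)
    return item


issue = {"id": "issue-1", "status": "received", "history": []}
print(run_to_completion(issue))
```

The accumulated history list is also the raw material for statistics, for example counting how many items have passed each step.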
12. JSTOR Observations
• Metadata is created for every word, illustration,
page, article, issue, title
• Authorizable units are described with metadata
• Users are described with metadata
• We have established internal guidelines and
standards
• Related to, but not the same as, external exposure
• Quality assurance and quality control are involved in
oversight
• Formalize workflows early, when it is easier
• Get it right up front; it is expensive to go back