2. • The Collection
the history, the data, the images, the deadline
• The Strategy
monograph compound objects w/a tab-delimited text
file
• The Results
what went well, what didn’t, next steps
• Some Alternatives
considering other digital text collection display
methods
3. Bound volumes of Orleans
Parish School Board meeting
minutes.
Dates Covered: 1841-1996
Includes the Civil War and
Desegregation. Scholars as far
away as Japan consult this
collection on site
Thanks to UNO history
professor Al Kennedy who
rescued many of the documents
from being discarded
4. Indexed by:
• VOLUME
A Board of Regents grant
allowed UNO Midlo
Center for New Orleans
Studies historians to
summarize and index +/- 900 pages of meeting
minutes from just
before, during, and right
after the Civil War.
• MEETING
• Meeting Title
• Meeting Date
• Board Members
present, absent
• Keywords
• Meeting Summary
• PAGE
• Page Summaries
• Page Dates
5. Based on what was indexed as part
of the grant, our data structure would
have to support the following:
• Data
• Volume-Level Metadata (Title:
Municipality, District, and Dates
Covered)
• "Chapter"-Level Metadata (Meeting
Information, Keywords, Dates)
• Page-Level Metadata (Page
Summaries, Dates)
Given CONTENTdm as our
repository tool, how would we make
this happen?
6. "Monograph" is the compound object structure that would allow
us to keep the volume-meeting-page structure and retain the
index data created for all those levels (incl. page level).
7. Ultimately there will be data on 300-600 pages each for dozens of
volumes of minute books. The UNO History Dept. provided data in Excel
for the first three indexed volumes.
Verdict: Convert Excel file into tab-delimited text for import
into CONTENTdm.
What is a tab-delimited text file?
• a plain text file without formatting where data fields (Excel
cells) are separated by a "tab" character
• file is saved with extension ".txt"
• similar to a CSV file where a comma separates the values
instead of a tab
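As a minimal sketch of the format described above, the following Python uses the standard `csv` module with a tab delimiter to write two rows and read them back. The file name `sample.txt` and the sample values are illustrative assumptions, not part of the project's actual data.

```python
import csv
import os
import tempfile

# Illustrative rows: each inner list is one spreadsheet row, each string one cell.
rows = [
    ["TITLE", "CREATOR", "PAGE"],
    ["Page 1", "Orleans Parish School Board", "1"],
]

# Hypothetical output location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Write the cells separated by a tab character, saved with a ".txt" extension.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f, delimiter="\t").writerows(rows)

# Read the raw text back to see the tab characters between fields.
with open(path, encoding="utf-8") as f:
    raw_lines = f.read().splitlines()

# Parse the file again with the same delimiter to recover the original cells.
with open(path, newline="", encoding="utf-8") as f:
    parsed = list(csv.reader(f, delimiter="\t"))

print(repr(raw_lines[1]))
```

Swapping `delimiter="\t"` for `delimiter=","` would produce the CSV variant mentioned in the last bullet.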
9. How do you make a tab-delimited text file from an Excel file?
When saving your Excel spreadsheet, choose "Text (Tab
delimited)" from the "Save as type" drop down box under the file
name
Remember where you save it. You will need to tell CONTENTdm
where to find it later.
More information:
• Microsoft instructions: http://office.microsoft.com/en-us/excelhelp/import-or-export-text-txt-or-csv-filesHP010099725.aspx#BMexport
• CONTENTdm Help: "Using Tab-Delimited Text Files":
http://www.contentdm.org/help6/projectclient/entering5.asp
10. What kind of columns are necessary to tell CONTENTdm
how to structure your "monograph"?
• Which rows are chapters?
• Which rows are pages?
Some terminology:
• Object: Book-level; the entire bound volume of minutes;
contains chapters, etc.
• Item: Page-level; an individual page within a book/object.
• CONTENTdm Field = Excel Column
• CONTENTdm Record = Excel Row
11. Our "Object":
Minute Book Volume 1, City of Lafayette, June 1, 1847 - July
5, 1854
Unique Identifier: op000001
Our "Items":
347 pages
(op000001_0001.jpg, op000001_0002.jpg, etc. etc. etc.)
Our "Chapters":
Meeting, June 1, 1847 (Pages 1-4)
Meeting, June 11, 1847 (Pages 5-10)
Meeting, June 24, 1847 (Pages 11-14)
etc. etc. etc.
12. After creating a column for each field you want to
populate in CONTENTdm (e.g., Title, Creator), you
need two columns at the start of the Excel spreadsheet:
1. CDM_LVL - tells CONTENTdm where you want this row
to fall in the book-chapter-page hierarchy.
2. CDM_LVL_NAME - this is what will display as the title
of this row in the table of contents (e.g., "Chapter 9" or
"Page 135").
13. Some libraries will not add a separate row for the
"Chapter," but since we have metadata at that level, here is
how we assigned levels for the OPSB project:
CDM_LVL | Assigned Level
0 | Book / Object
1 | Meeting / Chapter
2 | Page / Item
NOTE: CONTENTdm will allow up to nine levels in a monograph
compound object.
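The two leading columns and the OPSB level assignments above can be sketched in Python as follows. This is only an illustration: the helper name `add_level_columns` and the row-kind labels ("book", "meeting", "page") are hypothetical, and the metadata values are sample data.

```python
# Level numbers as assigned for the OPSB project on this slide.
LEVELS = {"book": 0, "meeting": 1, "page": 2}


def add_level_columns(kind, display_name, fields):
    """Return a row dict with CDM_LVL and CDM_LVL_NAME placed first.

    `kind` is a hypothetical label ("book", "meeting", or "page");
    `fields` holds the ordinary metadata columns (TITLE, CREATOR, ...).
    """
    if kind not in LEVELS:
        raise ValueError(f"unknown row kind: {kind!r}")
    row = {"CDM_LVL": str(LEVELS[kind]), "CDM_LVL_NAME": display_name}
    row.update(fields)  # metadata columns follow the two structure columns
    return row


row = add_level_columns(
    "page",
    "Page 135",
    {"TITLE": "Page 135", "CREATOR": "Orleans Parish School Board"},
)

# Header line and data line as they would appear in the tab-delimited file.
print("\t".join(row.keys()))
print("\t".join(row.values()))
```

Keeping the level numbers in one dictionary makes it easy to add further levels later, within the nine-level limit noted above.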
14.
CDM_LVL | CDM_LVL_NAME | TITLE | CREATOR | PAGE | DESCRIPTION | KEYWORDS | FILE NAME
 | City of Lafayette Meeting Minutes, 1847-1854 | City of Lafayette Meeting Minutes, 1847-1854 | Orleans Parish School Board | | | |
0 | City of Lafayette Meeting Minutes, 1847-1854 | Front Cover | Orleans Parish School Board | | Public Board of Administrators meeting minutes, 1847-1854 | | op1_0001.jpg
1 | City of Lafayette Meeting Minutes, 1847-1854 | Meeting, June 1, 1847 | Orleans Parish School Board | 1-4 | Discussion of whipping, Superintendent's monthly report, discussion of library, and discussion of attendance rules. | discipline; attendance; expenses | op1_0002.jpg
2 | Meeting, June 1, 1847 | Page 1 | Orleans Parish School Board | 1 | Charges were leveled against Mrs. Smith for severely whipping a student. | | op1_0002.jpg
2 | Meeting, June 1, 1847 | Page 2 | Orleans Parish School Board | 2 | Monthly Superintendent report discussion | | op1_0003.jpg
2 | Meeting, June 1, 1847 | Page 3 | Orleans Parish School Board | 3 | Monthly Superintendent report discussion cont. | | op1_0004.jpg
2 | Meeting, June 1, 1847 | Page 4 | Orleans Parish School Board | 4 | Discussion of attendance rules. | | op1_0005.jpg
1 | City of Lafayette Meeting Minutes, 1847-1854 | Meeting, June 11, 1847 | Orleans Parish School Board | 5-10 | Results of whipping investigation was sole topic of discussion. | Discipline | op1_0006.jpg
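The page rows in the table above follow a regular pattern, so they could be generated rather than typed by hand. The sketch below is one way to do that in Python; the function name `page_rows`, the `prefix` default, and the `file_offset` parameter are assumptions made for illustration (the offset of 1 reflects the table, where the front cover is file 0001 and Page 1 is file 0002). Descriptions and keywords are left blank for the indexers to fill in.

```python
def page_rows(meeting_title, first_page, last_page, creator,
              prefix="op1", file_offset=1):
    """Build one level-2 row per page of a meeting.

    Column order follows the table header: CDM_LVL, CDM_LVL_NAME, TITLE,
    CREATOR, PAGE, DESCRIPTION, KEYWORDS, FILE NAME.
    `file_offset` is an assumption: the front cover occupies file 0001,
    so page 1 maps to file 0002.
    """
    rows = []
    for page in range(first_page, last_page + 1):
        file_name = f"{prefix}_{page + file_offset:04d}.jpg"
        rows.append([
            "2",               # CDM_LVL: page / item
            meeting_title,     # CDM_LVL_NAME, as in the table above
            f"Page {page}",    # TITLE
            creator,           # CREATOR
            str(page),         # PAGE
            "",                # DESCRIPTION (filled in by indexers)
            "",                # KEYWORDS
            file_name,         # FILE NAME
        ])
    return rows


rows = page_rows("Meeting, June 1, 1847", 1, 4, "Orleans Parish School Board")
for r in rows:
    print("\t".join(r))
```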
15. Once you have created a project in project client, add a
compound object:
17. Choose “Monograph” from the list of compound object types.
Yes, we will be using a tab-delimited text file.
18. Browse to find your tab-delimited text file.
Browse to find the directory where your page (item) files are
saved.
NOTE: All image
(page) files for an
object (book) must
be saved in the
same directory.
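Because every page file must sit in one directory, it can help to check the text file against that directory before starting the import. The following Python sketch lists file names that appear in the tab-delimited file but are missing from the image directory. The function name `missing_page_files` is hypothetical, and the column header "FILE NAME" is taken from the sample table in these slides; adjust it to match your own spreadsheet.

```python
import csv
import tempfile
from pathlib import Path


def missing_page_files(text_file, image_dir, column="FILE NAME"):
    """Return file names listed in the tab-delimited file but absent from image_dir."""
    image_dir = Path(image_dir)
    missing = []
    with open(text_file, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            name = (row.get(column) or "").strip()
            # Rows without a file name are skipped.
            if name and not (image_dir / name).is_file():
                missing.append(name)
    return missing


# Small demonstration in a temporary directory with one image present
# and one image missing.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "op1_0001.jpg").write_bytes(b"")
listing = demo_dir / "volume1.txt"
listing.write_text(
    "TITLE\tFILE NAME\n"
    "Front Cover\top1_0001.jpg\n"
    "Page 1\top1_0002.jpg\n",
    encoding="utf-8",
)
print(missing_page_files(listing, demo_dir))  # prints ['op1_0002.jpg']
```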
20. “Label pages using
tab-delimited text file”
will label each page
with its actual title
as opposed to something
like “op000005_0039”…
21. Click through the summaries and click “Finish” to upload
the files to CONTENTdm.
Notice how it is adding more items than you have pages?
“But I only had 347 pages!!!”
22. This is because of all the added structure rows (chapters,
etc.), which CONTENTdm counts as items:
571 rows in Excel = 347 page rows plus all the
chapter/meeting-level rows.
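By the figures on this slide, 571 - 347 = 224 rows are structure rows rather than pages. To predict the item count before uploading, the rows can be tallied by level. The Python sketch below does this for a tab-delimited string; the function name `count_rows_by_level` and the five-row sample are illustrative assumptions, and level "2" is treated as the page level as in the OPSB assignments.

```python
import csv
import io
from collections import Counter


def count_rows_by_level(tab_text, level_column="CDM_LVL"):
    """Count data rows per CDM_LVL value in tab-delimited text."""
    reader = csv.DictReader(io.StringIO(tab_text), delimiter="\t")
    return Counter((row.get(level_column) or "").strip() for row in reader)


# Tiny illustrative sample: two meeting rows and three page rows.
sample = (
    "CDM_LVL\tTITLE\n"
    "1\tMeeting, June 1, 1847\n"
    "2\tPage 1\n"
    "2\tPage 2\n"
    "1\tMeeting, June 11, 1847\n"
    "2\tPage 5\n"
)

counts = count_rows_by_level(sample)
total = sum(counts.values())
pages = counts["2"]

# Total rows, page rows, and structure rows.
print(total, pages, total - pages)  # prints 5 3 2
```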
23. Table of Contents
Navigation is Confusing
• Multiple expansions are
necessary to get to
page links
• "Plus" (+) expansion
icon very tiny. Difficult to
see to get the idea that
it should be clicked on
and hard to hit with the
mouse pointer.
25. Users give up before they find
“Search by Date”
• "Narrow your search by Date"
only gives a few options, which
seem random.
• After "Advanced Search," the user
must find and click another tiny
link to “Search by Date.”
• “Search by Date” returns every
individual page in a date range
- quite a few results, given that
each volume is 300 to 600
pages long. Either need a
better way to filter or need to
take date off page records.
26. • Have since added many more unindexed books to the
original three indexed as part of the grant. We hope there
will be support to index these as well.
• Would like to ask historians or library staff to further index
these by Municipality / District. This information is in the
title but is not split out as data. Complicated because it
changed over time…
• Would like to add CQRs, other search mechanisms to
supplement CDM search and take advantage of rich
data.
• PAGE TURNER!!!!!
• Logical way for users to also download complete PDF of
minute books…
27. TEI Encoding
What it is
Not page images - take the text of a
work, encode it in XML using the TEI
standard, and write a Web app to
output the XML file(s).
In Action:
Folger Digital Texts:
http://www.folgerdigitaltexts.org/
METS
What it is
An XML "wrapper" that builds a
structure around other
metadata records (e.g., Dublin
Core page records). This
structure could include such
levels as chapter, page,
paragraph, sentence, headline,
caption, and much more.
In Action:
The (CUA) Tower Online:
http://tower.lib.cua.edu/
NOTE: You can encode Dublin Core records, TEI transcriptions, and more within a
METS wrapper. CONTENTdm can handle METS through the Flex Loader
(usually via a vendor).
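To make the "wrapper" idea concrete, here is a rough Python sketch that builds only the structural part of a METS document (a `structMap` with nested `div` elements) for a volume, one meeting, and its pages. It is not a complete or validated METS file: the descriptive metadata and file sections are omitted, and the TYPE and LABEL values are illustrative choices, not a prescribed profile.

```python
import xml.etree.ElementTree as ET

# The METS namespace; registering a prefix keeps the output readable.
METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


mets = ET.Element(q("mets"))
struct_map = ET.SubElement(mets, q("structMap"), TYPE="logical")

# Volume > meeting > page, mirroring the hierarchy used in the monograph.
volume = ET.SubElement(struct_map, q("div"),
                       TYPE="volume", LABEL="Minute Book Volume 1")
meeting = ET.SubElement(volume, q("div"),
                        TYPE="meeting", LABEL="Meeting, June 1, 1847")
for page in range(1, 5):
    ET.SubElement(meeting, q("div"),
                  TYPE="page", LABEL=f"Page {page}", ORDER=str(page))

xml_text = ET.tostring(mets, encoding="unicode")
print(xml_text)
```

In a full METS record, each page-level `div` would also point to its image file and its metadata record, which is where Dublin Core or TEI content would be attached.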
28. • Creating Compound Objects (Documents, Monographs,
Postcards, and Picture Cubes):
http://www.contentdm.com/USC/tutorials/compoundwizard.pdf
• Adding Compound Objects with Tab-Delimited Text:
http://www.contentdm.com/help6/objects/adding3a.asp
• Clemson University documentation (more detailed
instruction and uses more levels): http://libraryweb.clemson.edu/wiki/images/9/92/Using_a_tabdelimited_for_mongraphs.pdf
29.
CDM_LVL | CDM_LVL_NAME | TITLE | CREATOR | PAGE | DESCRIPTION | KEYWORDS | FILE NAME
 | A Very Exciting Tale | A Very Exciting Tale | Smith, Joe | | | |
0 | A Very Exciting Tale | Front Cover | Smith, Joe | 1 | Cover of the book | fiction; excitement | js000001_0001.jpg
1 | A Very Exciting Tale | Chapter 1 | Smith, Joe | 2-4 | Our hero wakes up | | js000001_0002.jpg
2 | Chapter 1 | Page 2 | Smith, Joe | 2 | Joe gets out of bed. | | js000001_0002.jpg
2 | Chapter 1 | Page 3 | Smith, Joe | 3 | Joe has breakfast. | | js000001_0003.jpg
2 | Chapter 1 | Page 4 | Smith, Joe | 4 | Joe goes to work. | | js000001_0004.jpg