M.Romanello Ecal Presentation
1. Linking Text References to Relevant Digital Resources Over The Web
Electronic Corpora for Ancient Languages
Prague, November 16th-17th 2007
Matteo Romanello matteo.romanello@yahoo.it
University “Ca' Foscari” of Venice
2. A Microformat for Canonical Texts
References
• Topic: how to link secondary sources to corpora
of ancient-language texts?
• Goal: to give scholars reading the Digital Library's
primary and secondary sources more powerful
research tools and a richer reading experience
• Focus: references to Canonical Texts in XHTML
• Examples' Scope: Classical (Greek and Latin)
literature
Matteo Romanello Electronic Corpora for Ancient Languages - Prague, November 16th -17th 2007 2/32
3. Digital Library on Classics:
the State of the Art
• A few on-line secondary sources (journal
articles and monographs) available as (X)HTML
• A few authoritative, born-digital on-line
journals: e.g. Classics@, published by
Harvard's Center for Hellenic Studies
• Some On-line Text Corpora (Perseus and other
minor scattered collections)
• Some resources and reviews of electronic
resources for humanists, reviews of books...
• Research blogs
4. Current e-scholarship scenarios (1)
Scenario 1
John is a scholar of Greek literature and wants to find all on-line
articles or electronic resources related to the verse he is focusing
on (Hom. Il. 20.249).
He submits a query like 'Hom. Il. 20.249' to Google, and what Google
retrieves is neither pertinent nor interesting.
Ordinary search engines are purely text-based (no semantics, language
dependent, etc.).
John would like to have a more precise or specialized search engine
available, perhaps one capable of understanding the semantics of the
reference he typed in as the query string.
[Figure: the author of the Iliad and Odyssey as named in different languages: Homer, Homère, Omero, ...]
5. Current e-scholarship scenarios (2)
Scenario 2
John's colleague points out to him that Gregory Nagy, in a passage of
the 2nd chapter, mentions the passage John is interested in. John finds
an on-line version of the book and opens it in his browser...
6. Current e-scholarship scenarios (3)
To have a meaningful e-reading experience, John would like to be able to
read the cited verse in its context, compare the text of that verse as
recorded in different manuscripts, and read the same passage in a given
translation or read a commentary on it.
7. New e-scholarship scenarios (1)
• Semantic understanding of text references by
the web browser
• Search for resources pertinent to the author,
the work or the precise text passage referred to
8. New e-scholarship scenarios (2)
• Value added services (VAS) for scholars
– Reference linking
– Related resources
– Targeted and semantic-oriented search
– Different exemplars of a work
• Problems:
1) To build a distributed library
2) To provide VAS linking secondary to primary sources
9. From printed to digital libraries
• Find new constructive paradigms to take
advantage of net's properties
• In a network environment:
– Library universally distributed and with higher
granularity
– Provide reference linking
• Reference linking to primary sources (from
references in secondary sources):
– Ex. move from the citation Hom. Il. 1.1 to all available
translations, comparing critical editions and finding
related resources
10. The evolution of ancient languages
corpora
• TLG (1970s) -> mass storage and rapid retrieval
• Perseus (1980s) -> richer media and higher level
data structures
• DLs + web protocol -> convergence of
– XML related technologies:
• TEI (encoding)
• XML Db (storage of structured data)
• Query capabilities over the HTTP protocol
– Web services communicating in REST style
– Success of a distributed architecture (cf. OAI-PMH)
Which protocol? Canonical Text Services protocol
11. A new paradigm for building on-line
corpora: the CTS protocol (1)
• CTS web protocol:
– new paradigm for building electronic corpora
– gives hierarchical access to works as XML-TEI files
– relies on the model described by FRBR
– developed by Neel Smith et al. at Harvard's CHS
– Built on the Registry Services Protocol (v. 1.0.rc1) ->
authority lists
• Some CTS related projects:
– Perseus' CTS interface
– Multitext Homer
12. A new paradigm for building on-line
corpora: the CTS protocol (2)
• CTS-compliant Text Server
• Texts: XML TEI
• Textgroups and Works are identified by URNs
• Collections described by authority lists
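To illustrate how a URN can identify a textgroup, a work and a passage, here is a minimal sketch that splits a CTS URN into its parts. It assumes the colon-separated layout of the example URN shown later in these slides (urn:cts:greekLit:tlg0012:tlg001:20.131-20.137); the names `CtsUrn`, `parse_cts_urn` and `passage_range` are hypothetical and not part of the CTS specification.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CtsUrn:
    """Components of a CTS URN (hypothetical helper type)."""
    namespace: str           # e.g. "greekLit"
    textgroup: str           # e.g. "tlg0012"
    work: Optional[str]      # e.g. "tlg001"; None if the URN stops at the textgroup
    passage: Optional[str]   # e.g. "20.131-20.137"; None if no passage is given


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a colon-separated CTS URN into its components.

    Assumes the layout urn:cts:<namespace>:<textgroup>[:<work>[:<passage>]].
    """
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    work = parts[4] if len(parts) > 4 else None
    passage = parts[5] if len(parts) > 5 else None
    return CtsUrn(parts[2], parts[3], work, passage)


def passage_range(passage: str) -> Tuple[str, str]:
    """Return (start, end) of a passage; a single citation is its own end."""
    start, _, end = passage.partition("-")
    return start, end or start


if __name__ == "__main__":
    urn = parse_cts_urn("urn:cts:greekLit:tlg0012:tlg001:20.131-20.137")
    print(urn)
    print(passage_range(urn.passage))
```

Keeping the passage as a dotted string (book.line) rather than converting it to numbers leaves room for citation schemes that are not purely numeric.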
13. Reference Linking in the Digital Library
14. Linking primary to secondary sources
on-line: state of the art
• Two very loosely coupled systems
• No born-digital equivalent to printed references
• Most projects use an internal linking system:
– Worthy degree of hypertextuality
– Fairly closed systems of hard-linked resources
• Digital references == strings
– No semantic information
– No semantics-aware information processing
– Disambiguation of abbreviations and implicit
statements is left to the reader
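What "disambiguation left to the reader" means can be made concrete with a small sketch that turns the string 'Hom. Il. 20.249' into a URN. The lookup tables below are hand-written stand-ins for an authority list and cover a single author and work; the function name `resolve_reference` and the accepted reference pattern are assumptions made for illustration only.

```python
import re

# Hypothetical, hand-written lookup tables. In practice these would be
# derived from an authority list rather than hard-coded.
AUTHORS = {"Hom.": "tlg0012"}
WORKS = {("tlg0012", "Il."): "tlg001"}

# Matches "<author abbr.> <work abbr.> <dotted numeric passage>".
REFERENCE = re.compile(
    r"^(?P<author>\S+)\s+(?P<work>\S+)\s+(?P<passage>\d+(?:\.\d+)*)$"
)


def resolve_reference(text: str, namespace: str = "greekLit") -> str:
    """Map an abbreviated reference such as 'Hom. Il. 20.249' to a CTS URN."""
    match = REFERENCE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised reference format: {text!r}")
    try:
        textgroup = AUTHORS[match["author"]]
        work = WORKS[(textgroup, match["work"])]
    except KeyError as exc:
        raise LookupError(f"unknown abbreviation in {text!r}") from exc
    return f"urn:cts:{namespace}:{textgroup}:{work}:{match['passage']}"


if __name__ == "__main__":
    print(resolve_reference("Hom. Il. 20.249"))
```

A plain string search cannot perform this step, which is why a reference stored only as a string carries no machine-usable meaning.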
15. A digital companion to printed canonical
texts references
• Problem: provide a digital companion to printed
references
– to express references in a simple and semantic way
• exploiting the opportunities given by the digital medium
• Separating semantics from presentational matters
• Solution:
– mapping references to requests compliant to the
protocol to build a distributed library (CTS)
– embedding chunks of semantic information within
XHTML docs
• Implementation: Microformats (from Web 2.0)
• Goal: to design a Microformat for Canonical Text
references
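The two halves of the solution above (mapping a reference to a protocol request, and embedding semantic information in XHTML) can be sketched as follows. Everything specific here is an assumption: the base URL `http://example.org/cts` is a placeholder, the query parameters `request` and `urn` are modelled on CTS's GetPassage request without being checked against version 1.1 of the specification, and the class name `ctsref` is hypothetical. The markup borrows the abbr pattern used by existing microformats, with the machine-readable value in the title attribute.

```python
from html import escape
from urllib.parse import urlencode


def passage_request_url(base_url: str, urn: str) -> str:
    """Build a GetPassage-style request URL for a text server.

    The parameter names are assumptions, not taken from the CTS spec.
    """
    # Leave ':' unescaped so the URN stays readable in the query string.
    query = urlencode({"request": "GetPassage", "urn": urn}, safe=":")
    return f"{base_url}?{query}"


def microformatted_reference(label: str, urn: str) -> str:
    """Wrap the visible reference in an <abbr> that carries the URN.

    The class name 'ctsref' is hypothetical.
    """
    return (
        f'<abbr class="ctsref" title="{escape(urn, quote=True)}">'
        f"{escape(label)}</abbr>"
    )


if __name__ == "__main__":
    urn = "urn:cts:greekLit:tlg0012:tlg001:20.249"
    print(passage_request_url("http://example.org/cts", urn))
    print(microformatted_reference("Hom. Il. 20.249", urn))
```

The visible text stays exactly as a printed reference would look, while the URN travels in the attribute, which keeps semantics apart from presentation.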
16. Microformats or RDF?
• MFs = a bottom-up way to the Semantic Web (real
world semantics or lower-case semantic web)
• Used within blogs for friendships, geographical
data, reviews...
• Firefox 3 -> native support for Microformats
(microformatted content display integrated in the
UI)
• Not the only way to embed metadata inside
common tag elements
– RDFa <http://www.w3.org/TR/xhtml-rdfa-primer/>
proposed by W3C
17. Microformats vs RDF
Microformats
18. Microformats or RDF?
19. Microformats or RDF?
20. Microformats: definition
• Microformats are:
– XHTML (POSH) compounds
– A set of design principles for formats
– A set of simple open data formats built upon
existing and widely adopted standards
• Microformats are not:
– A new language
– An attempt to change everyone's current behavior
• Goals:
– Make data reusable and interoperable among
web services and mashup applications
21. Texts references: different use cases
1. Politics
2. like Aristotle claims
3. Politics of Aristotle
4. Arist. Pol. 1304b
5. Line 1 of the first book of Homer's Iliad
6. Hom. Il. I 1
7. Α 1 (== upper-case Alpha 1, Hellenistic book
notation)
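Cases 6 and 7 name the same book in two notations (a Roman numeral and an upper-case Greek letter). A minimal sketch of normalizing such book tokens to a number is shown below; the function names are hypothetical, only the Roman symbols I, V and X are handled (enough for 24 books), and the Greek alphabet is built from code points so that look-alike Latin letters are not mistaken for Greek ones.

```python
# The 24 upper-case Greek letters, U+0391 (Alpha) to U+03A9 (Omega),
# skipping the unassigned code point U+03A2.
GREEK_UPPER = "".join(chr(c) for c in range(0x391, 0x3AA) if c != 0x3A2)

ROMAN = {"I": 1, "V": 5, "X": 10}


def roman_to_int(numeral: str) -> int:
    """Convert a Roman numeral made of I, V and X to an integer."""
    total = 0
    for i, symbol in enumerate(numeral):
        value = ROMAN[symbol]
        following = ROMAN[numeral[i + 1]] if i + 1 < len(numeral) else 0
        # A smaller value before a larger one is subtracted (e.g. IV, IX).
        total += -value if value < following else value
    return total


def book_number(token: str) -> int:
    """Normalize a book token (Greek letter, Roman or Arabic numeral)."""
    if len(token) == 1 and token in GREEK_UPPER:
        return GREEK_UPPER.index(token) + 1
    if token.isdigit():
        return int(token)
    if token and all(symbol in ROMAN for symbol in token):
        return roman_to_int(token)
    raise ValueError(f"unrecognised book token: {token!r}")


if __name__ == "__main__":
    for token in ("\u0391", "I", "1", "XX", "\u03a5"):
        print(repr(token), "->", book_number(token))
```

Once the book is a plain number, both 'Hom. Il. I 1' and 'Α 1' can be mapped to the same passage identifier.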
22. Designing a MF for Canonical Texts
References (1)
• Start from a specific problem (principle #1)
– Problem: link secondary to primary sources on the web
• Reuse building blocks from widely adopted
standards (princ. #4)
– Canonical texts citation scheme widely used among
scholars on Classical Literature
– Canon of Greek Literature provided as authority list
compliant to the Registry Services Protocol
• “Paving the Cowpaths”
– Keep references looking the same as they do now
– In addition, add semantics to references
– Also allow internal linking systems
23. Designing a MF for Canonical Texts
References (2)
• Modularity and embeddability (princ. #5)
1) MF for author
2) MF for works
3) MF for text references
24. Designing a MF for Canonical Texts
References (3)
[Figure: a reference as it appears on the page, and the microformatted content underlying it]
urn:cts:greekLit:tlg0012:tlg001:20.131-20.137
25. The Microformat in action
• Get some valid microformatted references
• Tag resources from a popular review with URNs
instead of simple tags
• Make the browser aware of microformatted
content by adding support for the CTSreference MF to
the Operator extension for Firefox
• Add example actions to perform upon each
MF:
– find pertinent bookmarks on del.icio.us
– search for pertinent research articles on CiteULike
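The first thing a browser extension has to do before offering any action is to find the microformatted references in a page. The sketch below shows that step outside the browser, using Python's standard HTML parser. It assumes a hypothetical markup in which each reference is an element with class `ctsref` carrying its URN in the title attribute; this is not the actual Operator code, and the class name is an assumption.

```python
from html.parser import HTMLParser
from typing import List


class CtsReferenceExtractor(HTMLParser):
    """Collect URNs from elements marked with a given class name."""

    def __init__(self, class_name: str = "ctsref") -> None:
        super().__init__()
        self.class_name = class_name   # hypothetical microformat class
        self.urns: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # The class attribute may hold several space-separated names.
        classes = (attributes.get("class") or "").split()
        if self.class_name in classes and attributes.get("title"):
            self.urns.append(attributes["title"])


def extract_urns(xhtml: str, class_name: str = "ctsref") -> List[str]:
    """Return the URNs of all microformatted references in document order."""
    parser = CtsReferenceExtractor(class_name)
    parser.feed(xhtml)
    parser.close()
    return parser.urns


if __name__ == "__main__":
    page = (
        '<p>See <abbr class="ctsref" '
        'title="urn:cts:greekLit:tlg0012:tlg001:20.249">Hom. Il. 20.249</abbr> '
        'and <abbr title="not a reference">n.b.</abbr>.</p>'
    )
    print(extract_urns(page))
```

Each extracted URN could then be handed to an action, for example as a search key on a bookmarking or bibliography service.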
26. The Microformat in action
[Screenshot labels:]
– Green icon means that Operator is working...
– Recognized microformats
– Available actions
– Some microformatted references
27. The Microformat in action
28. The Microformat in action
29. The Microformat in action
30. Benefits for scholarship on Ancient
Languages
• Citations encoded with a MF express references
in a form that is:
– Cross-language
– Fully semantic, interoperable
– Reusable
• The reference linking system produced is:
– Open (client-side based)
– Independent from specific solutions
• Microformatted references allow:
– targeted search -> more precise Information Retrieval
tools (Pingerati: microformats search engine provided by
developers at Technorati)
31. TODOs
• Discussion on Microformats' mailing lists and wiki
• Advocacy and support by real projects
• Support of a digital library built upon CTS protocol
• URNs as semantic tags and keywords in metadata
description
• Tools for easy authoring
• Web services taking advantage of such a MF:
– An application that manages references and exports them
in several output formats to desktop applications
– A harvester of CTS repositories
32. References
• John Allsopp, Microformats: Empowering Your Markup for
Web 2.0, Berkeley, CA: friends of ED; New York:
distributed to the book trade by Springer-Verlag, 2007.
• Neel Smith, “TextServer: Toward a Protocol for Describing
Libraries”, Classics@ vol. 2, edition of April 3, 2004.
• G. Crane et al., “Beyond digital incunabula: Modeling the
next generation of digital libraries”, Proceedings of the 10th
European Conference on Research and Advanced
Technology for Digital Libraries (ECDL 2006), vol. 4172.
• The Canonical Text Services (CTS) Protocol, current version: 1.1
<http://katoptron.holycross.edu/cocoon/diginc/specs/cts>
• The Registry Services Protocol, current version: 1.0.rc1
<http://katoptron.holycross.edu/cocoon/diginc/specs/registry>