1
DH 2013, Nebraska
Practical Interoperability:
The Map of Early Modern London
and Internet Shakespeare Editions
Janelle Jenstad and Martin Holmes
University of Victoria
mapoflondon.uvic.ca
2
Loose couplings
3
MoEML and ISE
● On UVic servers
● Overlapping teams
● Mutual need
4
The Map of Early Modern London
● Maps streets, sites and
boundaries of London 1560-1640
● Interface based on Agas Map
● Includes (1) gazetteer, (2)
encyclopedia of London people
and places, (3) library of primary
source texts, (4) edition of A
Survey of London
● Pure TEI XML throughout
5
Mapping toponyms
6
Library
7
Dramatic extracts
8
Internet Shakespeare Editions
● Open-source digital anthology
● Also hosts and incubates Queen's
Men's Editions and Digital
Renaissance Editions
● Goal: all plays of Shakespeare and
contemporaries, 1500-1640
● SGML and non-standard XML :-(
9
Frequency of toponyms
The London locations in Richard III on the Agas Map, sized
according to the number of references to them.
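A minimal sketch of how reference counts might be turned into marker sizes, assuming square-root scaling so that marker area tracks the count. The slides do not say how the sizing was actually done; `marker_radii`, `base_radius`, the sample references, and the id `TOWE1` are hypothetical.

```python
import math
from collections import Counter

def marker_radii(references, base_radius=4.0):
    """Map each location id to a radius whose area is proportional to its count."""
    counts = Counter(references)
    # radius = base * sqrt(n), so area (pi * r^2) grows linearly with n
    return {loc: base_radius * math.sqrt(n) for loc, n in counts.items()}

# Hypothetical list of location ids, one entry per reference in a play.
refs = ["TOWE1", "TOWE1", "TOWE1", "TOWE1", "CHEA2"]
radii = marker_radii(refs)
print(radii)
```

A location referenced four times gets twice the radius (four times the area) of one referenced once.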
10
Research questions
● How typical is Shakespeare's invocation of
London?
● How do his characters move through the urban
environment?
● What is the relationship between London and
the court?
● How does this vision compare to other
playwrights and to historians?
11
Integration? Interoperability?
Interchange?
● Is it reasonable to ask editors to revisit their
"finished" work?
● Can we overcome the significant programming
differences?
12
Initial assumptions
● First rule of collaboration: You're on your own.
● The ISE agenda is not MoEML's agenda.
● MoEML can't ask the ISE editors to tag their
texts.
● MoEML can't depend on the ISE programmers to
implement things for us.
● We must beware of making features on our site
dependent on functionality on theirs.
13
Loose coupling
● We take the ISE texts and tag them.
● We generate sets of links based on through
line numbers (TLNs).
● We store the links in our database.
● We only depend on the fact that links to TLNs
on their website work.
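The loose coupling above could be sketched roughly as follows. This is an illustration, not the MoEML code: the URL pattern, the play identifier `R3_M`, the id `TOWE1`, and the TLN values are all invented. The only thing the records rely on is that a link to a TLN on the ISE site resolves.

```python
from dataclasses import dataclass

# Assumption: this URL pattern is made up; the real ISE link format may differ.
ISE_URL_TEMPLATE = "https://ise.example.org/doc/{play}/#tln-{tln}"

@dataclass(frozen=True)
class PlaceLink:
    place_id: str  # MoEML identifier, e.g. "CHEA2"
    play: str      # hypothetical play identifier
    tln: int       # through line number of the reference

    def url(self) -> str:
        """Build the outbound link to the TLN on the ISE site."""
        return ISE_URL_TEMPLATE.format(play=self.play, tln=self.tln)

def build_links(play, tagged_refs):
    """Turn (place_id, tln) pairs from our own tagged copy into link records."""
    return [PlaceLink(place_id, play, tln) for place_id, tln in tagged_refs]

# Invented sample data: these TLNs are not taken from any real edition.
links = build_links("R3_M", [("CHEA2", 1843), ("TOWE1", 1570)])
for link in links:
    print(link.place_id, link.url())
```

Because the records hold only a place id, a play id, and a TLN, nothing here depends on ISE's internal markup or on its programmers.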
14
The (rather unrealistic) plan
15
Manual tagging and NER
24
Results
● 4 plays:
– Richard II and Richard III (modern spelling)
– Henry VIII and Henry VI Part 2 (old spelling)
● 495 placenames marked up
● 95 linked to Map of Early Modern London
placeography
25
Performance of NER engine
26
Difficulties for NER tagger
● <stage>Enter Yorke, Salisbury, and
Warwick.</stage>
● "Was not your husband / In Margaret's battle at
St Albans slain?"
● Spelling variation ("Tower" versus "Towre")
● Capitalization is unhelpful in old-spelling texts.
● Short utterances confuse it:
– Queen Margaret: Richard.
– Richard: <LOC>Ha</LOC>?
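One way to soften the spelling and capitalization problems is a gazetteer that maps every known variant to a single place id. The sketch below assumes a simple dictionary layout; the variant lists and the id `TOWE1` are illustrative, not MoEML's actual data.

```python
# Assumption: ids and variant spellings are examples only.
GAZETTEER = {
    "TOWE1": ["Tower", "Towre"],
    "CHEA2": ["Cheapside", "Cheape side"],
}

def build_lookup(gazetteer):
    """Invert the gazetteer so each case-folded variant points at its place id."""
    return {
        variant.casefold(): place_id
        for place_id, variants in gazetteer.items()
        for variant in variants
    }

def lookup(token, table):
    """Return the place id for a token, or None if it is not a known variant."""
    return table.get(token.casefold())

table = build_lookup(GAZETTEER)
print(lookup("Towre", table), lookup("TOWER", table), lookup("Ha", table))
```

Case-folding sidesteps the unhelpful capitalization of old-spelling texts, but it does nothing for the places-as-people ambiguity (Yorke, Salisbury, Warwick), which still needs context.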
27
The showstopper problem
● Henry VI Part 2:
– 210 placenames in the text
– tagger tagged 109 places, of which 106 were correct
– 29 of these were "England" and 38 "France"
– Among placenames missed:
● 48 were in Britain
● 20 of these were key London locations (Bedlam,
Southwark, London Bridge & Smithfield)
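The figures above translate into precision and recall as follows. The function name is ours, and the England/France share assumes that all 67 of those hits are among the 106 correct ones.

```python
def precision_recall(tagged, correct, total_in_text):
    """Precision = correct / tagged; recall = correct / all placenames in the text."""
    return correct / tagged, correct / total_in_text

# Figures for Henry VI Part 2, taken from the slide.
precision, recall = precision_recall(tagged=109, correct=106, total_in_text=210)
missed = 210 - 106

# Assumption: the 29 "England" and 38 "France" hits are all correct hits.
country_share = (29 + 38) / 106

print(f"precision={precision:.1%} recall={recall:.1%} missed={missed}")
# prints: precision=97.2% recall=50.5% missed=104
print(f"England/France share of correct hits={country_share:.1%}")
# prints: England/France share of correct hits=63.2%
```

High precision with recall near 50%, and most hits being two country names, is what makes the missed London locations the showstopper.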
28
Is it worth using NER for
this?
● No.
● Possibly.
– It can function as a check on manual tagging.
● Yes.
– 75 "city plays" are eventually coming...
29
Happy endings
● Second rule of collaboration: nobody wants to
be left out.
● Now the ISE editors have seen how we're
linking to their plays, they want to tag
placenames for themselves.
● We'll just be able to harvest their tags for
MoEML.
30
Functional interoperability
ISE guidelines:
Internal Links to MoEML's London Locations
We are moving towards interoperability with The Map of Early
Modern London (MoEML). If your play includes references to
London locations, you will identify each London location using
the ilink element and the unique MoEML identifier for the
location....
31
ISE guidelines, cont.
The purpose of this tagging is two-fold: (1) it
will allow us to visualize the London locations in
a play using a MoEML map in the
ISE/DRE/QME environment, and (2) it will
allow MoEML to import London references in
ISE/DRE/QME plays into its database of
literary references (with a link back to the ISE).
32
ISE/DRE/QME tagging
<ilink component="geo"
href="mol:CHEA2">Cheapside</ilink>
ISE will have various instructions in its "geo"
component (England, France, Europe, London, stage
geometry)
All we need is the mol:XXXX# and the TLN
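Harvesting could then look something like the sketch below. It uses a regular expression rather than an XML parser because the ISE sources are SGML and non-standard XML. The TLN milestone form `<tln n="..."/>` and the sample line are assumptions made for illustration; the real ISE markup may differ.

```python
import re

# Assumption: a TLN milestone looks like <tln n="123"/>.
TOKEN = re.compile(
    r'<tln\s+n="(?P<tln>\d+)"\s*/>'
    r'|<ilink\s+component="geo"\s+href="mol:(?P<mol>[^"]+)">(?P<text>.*?)</ilink>',
    re.DOTALL,
)

def harvest(source):
    """Return (mol_id, tln, text) for each geo ilink, using the most recent TLN."""
    current_tln = None
    found = []
    for match in TOKEN.finditer(source):
        if match.group("tln") is not None:
            current_tln = int(match.group("tln"))
        else:
            found.append((match.group("mol"), current_tln, match.group("text")))
    return found

# Invented sample line, not a quotation from any play.
sample = (
    '<tln n="101"/>He dwells in '
    '<ilink component="geo" href="mol:CHEA2">Cheapside</ilink>, my lord.'
)
print(harvest(sample))
# prints: [('CHEA2', 101, 'Cheapside')]
```

Each harvested tuple carries the mol id and the TLN needed for a link record, plus the tagged text.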
33
Should we continue to use NER?
ISE wants to use tags only in modern critical
editions.
ISE editions of 1 Henry IV, 2 Henry IV, and
Henry V are “done.”
500+ plays in DRE


Editor's notes

  1. There is an obvious convergence between the two projects, and we imagine many benefits from interoperability. The gazetteer and mapping features in MoEML would significantly enhance the critical apparatus of the plays, while tying MoEML's placeography into the works of Shakespeare and his contemporaries would reinforce links between the physical geography and the literature.
  2. We cannot simply ask or expect the editors of ISE plays to tag all the placenames for us. They're too busy with other stuff, and they can't see the payoff.
  3. Our original plan involved the creation of a detailed London gazetteer, including all the variant spellings of placenames we know from our own texts, along with a training set of manually tagged plays, to serve as input to the NER process.
  4. Between the first and second plays, I improved the gazetteer substantially by importing a lot of non-London content; and for each play after the first, there's a larger training set, leading to better results. While precision is remarkably good (over 95% for the last two), recall is very low, and improving only slowly. Note: NER did find several placenames I'd missed in the tagging of the plays.
  5. Places are people throughout the history plays. Syntax is frequently convoluted. Spelling in the old-spelling plays is inconsistent within the play. Nouns are frequently capitalized, so capitalization is not a useful clue for the NER engine as it is with modern texts.
  6. The point here is that the placenames we are most interested in are precisely the ones the tagger is least good at finding. It even missed "London" in one case. Note also, though, that despite finding "England" 29 times, it contrived to miss it 14 times.
  7. No, because it&apos;s hopeless at the very thing we care about most; and we only have 10 plays in our Shakespeare set. Possibly, because it&apos;s slowly getting better, and although we have only 10 Shakespeare plays, we have up to 75 &quot;city plays&quot; coming in the future from Digital Renaissance Editions (out of 500 they intend to tag). Yes, because NER did catch a few instances of placenames I&apos;d missed.