
Reusing Legacy data: Irish Historic Vital Registration Data, 1864-1913


  1. Reusing Legacy Data: Irish Historic Vital Registration Data, 1864-1913. Dolores Grant, Dr Ciara Breathnach, Dr Sandra Collins, Rebecca Grant. Irish Record Linkage 1864-1913.
  2. Irish Record Linkage project 1864-1913. Irish Record Linkage is an Irish Research Council funded project running until December 2015. Its aims are to construct a Knowledge Platform by applying semantic technologies to vital-registration data generously shared by the Office of the Registrar General, and to address research queries around infant and maternal mortality rates and patterns in Dublin.
  3. Irish Record Linkage project 1864-1913. A collaboration between the Digital Repository of Ireland at the Royal Irish Academy, the University of Limerick and Insight@NUI Galway. Principal Investigators: Dr Ciara Breathnach (UL), Dr Sandra Collins (DRI), Dr Stefan Decker (Insight). Project Team: Dr Brian Gurrin (UL), Dr Christophe Debruyne (Insight/DRI), Dr Oya Beyan (Insight), Rebecca Grant (DRI), Dolores Grant (DRI).
  4. University of Limerick. Its mission is to promote and advance learning and knowledge through teaching, research and scholarship in an environment which encourages innovation and upholds the principles of free enquiry and expression. The Faculty of Arts, Humanities and Social Sciences prides itself on the quality of its teaching and its commitment to research, and places a strong emphasis on the role of debate and discussion in the development of knowledge and analytical skills.
  5. The Digital Repository of Ireland. Based in the Royal Irish Academy (Ireland's Academy for the Sciences and Humanities), DRI is a trusted digital repository for Humanities and Social Sciences data. It links and preserves the rich data held by Irish institutions, providing a central internet access point and multimedia tools, and is a focal point for the development of national guidelines and policy for digital preservation and access.
  6. INSIGHT@NUI Galway. Insight brings together leading Irish academics from 5 of Ireland's leading research centres (DERI, CLARITY, CLIQUE, 4C, TRIL) in key areas of priority research, including the Semantic Web, Sensors and the Sensor Web, Social Network Analysis, Decision Support and Optimization, and Connected Health.
  7. Irish Historic Vital Registration Data. 1845: the Registration of Marriages Act was introduced to gather official statistics of marriages of the established Church of Ireland. 1864: the first year in which births, deaths and marriages (including Catholic marriages) were registered, following the establishment of a complete Irish civil registration system in 1863. Ireland 1864-1912: 2.9 million birth records, 4.9 million death records, 3.18 million marriages. Dublin 1864-1912: 609,720 birth records, 537,635 death records, 330,605 marriage records (1845-1913).
  8. The Linked Data Concept. A method of publishing structured data on the Web, allowing it to be connected and enriched, and facilitating linking between related resources. A key principle of Linked Data is that HTTP URIs are used to name the semantic elements of the dataset. Linked Data standards such as RDF allow semantic definitions to be applied to information, using statements called ‘triples’ in the form subject, predicate, object.
  9. The Linked Data Concept. This example describes the subject (James Joyce) and his relationship (predicate) to an object (Dublin). By semantically separating the elements of the information (that James Joyce was born in Dublin), datasets stored in this way can be easily queried.
  10. General Register Office Data. Vital registration data: birth, death and marriage records for Dublin. TIFF images of pre-digitised indexes and registers of birth, death and marriage. General Register Office database for these records.
  11. Marriage Records. Columns: Register TIFF, Index TIFF, System 1845-1901, System 1902-c.1912. Fields: Registrar’s District Registration District District District Marriage solemnised at Parish Union County County County Province Province Number in register Entry number When married Year of event Year of event, Date of marriage When registered Returns year Returns year Returns quarter Returns quarter Name and surname Name Forename, Surname Forename, Surname Partner’s surname Age Sex Condition Rank or profession Residence at the time of marriage Father’s name and surname Rank or profession of father Celebrant Witnesses Signature of Registrar Signature of Superintendent Registrar and date Stamp number Stamp number Stamp number Volume number Returns volume number Returns volume number Page number Page number Returns page number Returns page number Stamped number Page ID Page ID 2nd Stamped number Index entry number Index entry number Index page number
  12. Birth Records. Columns: Register TIFF, Index TIFF, System Pre 1900, System Post 1900. Fields: Superintendent Registrar’s District Registrar’s District Registration district District District Union County County County Province Province Number in register Entry number Date & place of birth Year of event Date of birth, year of event Name (if any) Name Forename, Surname Forename, Surname Sex Sex Name, surname & dwelling place of father Name & surname & Mother’s maiden name maiden surname of mother Rank or profession of father Signature, qualification, and residence of informant When Registered Returns year Returns year Returns quarter Returns quarter Signature of Registrar Name & surname & maiden surname of mother Rank or profession of father Signature, qualification, and residence of informant Signature of Registrar Signature of Superintendent Registrar and date Baptismal name if added after registration of birth and date Stamp number Stamp number Stamp number Volume number Returns volume number Returns volume number Page number Page number Returns page number Returns page number Stamped number Page ID 2nd Stamped number Index entry number Index page number
  13. Death Records. Columns: Register TIFF, Index TIFF, System. Fields: Superintendent Registrar’s District Registrar’s District Registration District District District Union County County Province Number in register Date and place of death Year of event Name and surname Name Forename, Surname Sex Condition Age last birthday Age Age at death Rank, profession or occupation Certified cause of death and duration of illness Signature, qualification and residence of informant When registered Returns year Returns quarter Signature of Registrar Signature of Superintendent Registrar and date Stamp number Stamp number Volume number Returns volume number Page number Page number Returns page number Stamped number Page ID 2nd Stamped number Index entry number Index page number
  14. Research Questions. Identifying the record fields that are necessary to maintain the archival authenticity of the records and answer the research questions: • How many women died within 42 days following childbirth due to complications related to labour, and how does that figure correspond with the official reports? • Which women died of causes that can be attributed to maternal death, but for which no corresponding birth certificate exists? • How did various socio-economic conditions affect maternal and infant mortality rates?
  15. Competency questions to construct the Ontology (ID: Competency Question). C01: Women who died within 42 days after giving birth (the date of birth counted as day 1 and day 42 is included). C02: Women who died within 42 days after giving birth AND in whose death certificate ‘complication 1’ is mentioned. C03: Women who died within 42 days after giving birth AND in whose death certificate ‘complication 2’ is mentioned. C04: Women having official maternal death reports including “XXXX”. C05: Women having official maternal death reports including “cause 1”. C06: Women having official maternal death reports including “cause 2 and cause 3 together”. C07: For each record in C04, find the ones with a corresponding birth record (the date of death counted as day 1 and day 42 is included).
  16. Creation of RDF triples. [Diagram labels: GRO Database; extract, transform, load; GRO Triplestore; described by GRO Ontology; Digital Archivist (consulted by, amends/curates).] Storage model: metadata that can be queried declaratively with a W3C standard.
  17. GRO Records annotation vs. Data Analysis. [Diagram labels: GRO Triplestore; Triplestore 2; Data Analysis.] Transformation from one model to another: • SPIN – SPARQL Inference • SWRL / RuleML • SPARQL Construct • … Separation of concerns.
  18. <#B000-001> a irl:BirthRecord; irl:on "1900-08-08"; irl:name "James"; irl:mother "Mary Murphy"; irl:place "Castle Road"; ... <#B010-022> a irl:BirthRecord; irl:on "1902-04-19"; irl:name "Patrick"; irl:mother "Mary Murphy"; irl:place "Castle Road"; ... <#B022-051> a irl:BirthRecord; irl:on "1904-09-20"; irl:name "Agnes"; irl:mother "Mary Murphy"; irl:place "Convent Hill"; ... <#B050-003> a irl:BirthRecord; irl:on "1905-02-18"; irl:name "Michael"; ... [Diagram labels: #1 Mary Murphy owl:sameAs #2 Mary Murphy owl:sameAs #3 Mary Murphy #4 Mary Murphy owl:sameAs; Transformation; Ontology Matching.] All generated triples are stored separately for data analytics.
  19. Data analysis on the generated triples. #1 Mary Murphy: James (1900-08-08), Patrick (1902-04-19), Michael (1905-02-18). Intervals between births: 619 days and 1036 days. Average sibship interval = 827.5 days.
  20. Data Challenges. • Data security: transfer, storage and use by authorised parties • Data protection best practice • Quantity of data • Varying levels of detail, e.g. causes of death • Establishing maternal death, e.g. fever • Archaic medical terms • Variances in record subject names and places • Place names change over time
  21. The Irish Record Linkage Knowledge Platform. • State-of-the-art linked data and ontology-based analysis platform for historical 'big data' • Platform within a secure, closed system • Prepared to allow formulation of the specific research queries • Query interface to allow for the historical analysis of the data • Potential expansion to include additional contextualising datasets. @IRL_Project http://irishrecordlinkage.wordpress.com/ DRI Presentation
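
The sibship-interval arithmetic on slide 19 can be sketched in Python as follows. This is a minimal illustration, not the project's own implementation: the `births` list and the `sibship_intervals` helper are hypothetical names, and the three dates are the ones the slide groups under "#1 Mary Murphy".

```python
from datetime import date

# Birth dates grouped under "#1 Mary Murphy" on slide 19.
# The list structure is an assumption made for this sketch.
births = [
    ("James", date(1900, 8, 8)),
    ("Patrick", date(1902, 4, 19)),
    ("Michael", date(1905, 2, 18)),
]


def sibship_intervals(records):
    """Return the gaps, in days, between consecutive births sorted by date."""
    dates = sorted(d for _, d in records)
    return [(later - earlier).days for earlier, later in zip(dates, dates[1:])]


intervals = sibship_intervals(births)
average_interval = sum(intervals) / len(intervals)

print(intervals)         # [619, 1036]
print(average_interval)  # 827.5
```

Sorting by date before differencing means the records do not need to arrive in chronological order, which matters when linked records come from separate register volumes.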

Editor's notes

  • The resulting platform will provide a powerful research resource to enable historians to study Irish infant and maternal mortality rates and patterns during this period of Irish history. The project aims to provide a comprehensive map of infant and maternal mortality for Dublin.
  • Our project team is cross-disciplinary and team members include knowledge engineers, historians and digital archivists.
  • A marriage register page from 1900
  • The research questions set by Dr Breathnach. Identifying, tracking and interlinking individuals across the registers, through place and time, allows for a granular analysis of these reconstructed virtual households, thus enabling the analysis of Irish historic rates, which have yet to receive thorough treatment from historians.
  • Some context around the records chosen for the project
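
The 42-day window in competency question C01 (slide 15), where the date of birth counts as day 1 and day 42 is included, could be checked as in the sketch below. The function name `died_within_42_days` and the sample dates are hypothetical and are not taken from the GRO records.

```python
from datetime import date, timedelta

WINDOW_DAYS = 42  # window length stated in competency question C01


def died_within_42_days(birth: date, death: date) -> bool:
    """True if the death falls on day 1..42, counting the date of birth as day 1."""
    day_number = (death - birth).days + 1  # the birth date itself is day 1
    return 1 <= day_number <= WINDOW_DAYS


# Hypothetical dates, for illustration only.
birth = date(1900, 8, 8)
print(died_within_42_days(birth, birth))                       # True  (day 1)
print(died_within_42_days(birth, birth + timedelta(days=41)))  # True  (day 42)
print(died_within_42_days(birth, birth + timedelta(days=42)))  # False (day 43)
```

Because day 1 is the birth date, the last qualifying death date is 41 calendar days after the birth, not 42.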
