Creating Knowledge out of Interlinked Data
BIS – 2012/03/01 Leipzig – Page 1                                http://lod2.eu




         A Transparent Formalization of
               Text for Machines

                                         http://nlp2rdf.org



Start: Jan 2009
Tentative End: Summer 2012
                                                              Sebastian Hellmann
                                                              AKSW, Universität Leipzig




          Overview




Introduction to the areas touched
Scientific Core
Evaluation
Plan




The Semantic Gap




The Semantic Gap
     Most problems occurred at the bottom
     Data integration is difficult if the pivots are not well defined
     Questions (in order):
          What structure to use?
          What URIs to use?
          What is a String?
          How can we teach machines to understand Strings
            (Knowledge Representation)?




         Main question




How can we formalize text in a way that is:
     Transparent for machines
     Efficient for NLP Use Cases
     Consistent with the Web architecture




Areas




           Preliminary definition


The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to
    achieve interoperability between Natural Language Processing (NLP) tools,
    language resources and annotations.


     This definition is still limited to RDF and NLP and targets software
        integration via a common exchange format




Scientific core




Scientific core




Scientific core



                    Not transparent for machines




              Scientific core

     The universe of discourse is defined as the words over the alphabet of Unicode
     characters (Unicode Normalization Form C), often written Σ*

URI
http://example.org/sample#offset_0_42  -->  “The city Berlin is the capital of Germany.”




              Scientific core

     The universe of discourse is defined as the words over the alphabet of Unicode
     characters (Unicode Normalization Form C), often written Σ*

URI
http://example.org/sample#offset_0_42  -- context, isString -->  “The city Berlin is the capital of Germany.”
        ^
        | referenceContext
        |
http://example.org/sample#offset_34_41  -- isString -->  “Germany”
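
The offset-based URIs in the diagram can be reproduced with a few lines of Python. This is only a minimal sketch, not the NIF reference implementation: the property names `isString` and `referenceContext` are taken from the slide, while the helper names `offset_uri` and `annotate`, the plain-tuple triples, and counting offsets in Unicode code points after NFC normalization are assumptions made for illustration.

```python
import unicodedata


def offset_uri(document_uri, begin, end):
    """Build a URI of the form <document>#offset_<begin>_<end> (hypothetical helper)."""
    return f"{document_uri}#offset_{begin}_{end}"


def annotate(document_uri, text, begin, end):
    """Return (subject, property, object) triples for a context and one substring."""
    # Normalize to Unicode Normalization Form C before counting offsets.
    context = unicodedata.normalize("NFC", text)
    if not 0 <= begin <= end <= len(context):
        raise ValueError("offsets must lie inside the context")
    context_uri = offset_uri(document_uri, 0, len(context))
    substring_uri = offset_uri(document_uri, begin, end)
    return [
        (context_uri, "isString", context),
        (substring_uri, "isString", context[begin:end]),
        (substring_uri, "referenceContext", context_uri),
    ]


triples = annotate(
    "http://example.org/sample",
    "The city Berlin is the capital of Germany.",
    34,
    41,
)
for triple in triples:
    print(triple)
```

Because the substring URI carries its own offsets and points back to the context, a consumer can recover “Germany” from the context string alone, without any tool-specific tokenization.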




               Scientific core
Define the notion of “Context” and formalize it in OWL:
     Context is similar to the German word “Betrachtungshorizont” (roughly,
        “horizon of consideration”)
     In English, perhaps “inside context”, i.e. the text itself, which serves as a
        reference context for all included substrings.
     Definitely disjoint from groupings such as “Document”, because a “wider
       context” is needed for those.




     An example follows.




Scientific core




              Scientific Core

The goal is to research some of the implications, ...
    but I might not be able to finish this completely.
In scope:
  The property “contextString” is inverse-functional, which means that machines can
      infer automatically that the same context occurs in different documents.
  Show consistency with ambiguity
  Define metrics that compare contexts
  Formalize the interpretation function
  Show interoperability with the internal models of all major NLP frameworks
  (Partial) compatibility with the WWW and the GGG (Giant Global Graph)
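
The inference enabled by an inverse-functional “contextString” can be illustrated in plain Python. This is a hedged stand-in for what an OWL reasoner would derive, not part of NIF itself: the function name `infer_same_context`, the tuple representation of triples, and the example URIs are all hypothetical.

```python
from collections import defaultdict


def infer_same_context(triples):
    """Group subjects that share a contextString value.

    For an inverse-functional property, two subjects with the same value
    must denote the same individual, so each returned group is a set of
    URIs that can be treated as identical contexts.
    """
    by_string = defaultdict(set)
    for subject, prop, value in triples:
        if prop == "contextString":
            by_string[value].add(subject)
    return [sorted(subjects) for subjects in by_string.values() if len(subjects) > 1]


# Hypothetical data: documents a and b contain the same text, c does not.
context_triples = [
    ("http://example.org/a#offset_0_42", "contextString",
     "The city Berlin is the capital of Germany."),
    ("http://example.org/b#offset_0_42", "contextString",
     "The city Berlin is the capital of Germany."),
    ("http://example.org/c#offset_0_13", "contextString", "Hello, world."),
]

same_contexts = infer_same_context(context_triples)
print(same_contexts)
```

Annotations attached to either of the two matching contexts could then be merged, which is the practical benefit of identifying a context by its string.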




                Scientific Core

Out of scope:
  Transition between contexts: do statements from a smaller context hold in a
      broader context?
  Incorporating all layers of the NLP stack; the work is limited to POS tags and
      entity recognition
  Filling in all the question marks in the Venn diagram




Areas




Linguistic Linked Open Data Cloud




Developers study




Areas




           Evaluation


Compare to other models in NLP:
Size (RDF vs. XML), performance, expressivity
Is NIF easy to understand and implement?
Developers study: the release of the specification had quite an impact; people
   started to create extensions and use the format. There are 50 people on the
   mailing list.
How can Web Service integration or consistency with the Web architecture be
   evaluated? If the way strings are represented is transparent and formalized,
   do I need to do an experimental evaluation to show the benefits?




Q&A




             Thank you for your attention


       Standing on the shoulders of giants
