Creating Knowledge out of Interlinked Data
BIS – 2012/03/01 Leipzig – Page 1        http://lod2.eu




         A Transparent Formalization of
               Text for Machines

                                         http://nlp2rdf.org



Start: Jan 2009
Tentative End: Summer 2012
                                                              Sebastian Hellmann
                                                              AKSW, Universität Leipzig




          Overview




Introduction to the areas touched
Scientific Core
Evaluation
Plan








The Semantic Gap
Most problems occurred at the bottom
Data integration is difficult if the pivots are not well defined
Questions (in order):
     What structure to use?
     What URIs to use?
     What is a String?
     How can we teach machines to understand Strings
       (Knowledge Representation)?




         Main question




How can we formalize text in a way that is:
     Transparent for machines
     Efficient for NLP Use Cases
     Consistent with the Web architecture




Areas




           Preliminary definition


The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to
    achieve interoperability between Natural Language Processing (NLP) tools,
    language resources and annotations.


     This definition is still limited to RDF and NLP and targets software
        integration via a common exchange format












Scientific core



                    Not transparent for machines








              Scientific core

     Universe of discourse is defined as the words over the alphabet of Unicode
     characters (Unicode Normal Form C), often called Σ*

     URI: http://example.org/sample#offset_0_42   (context)
          isString: “The city Berlin is the capital of Germany.”

     URI: http://example.org/sample#offset_34_41
          referenceContext: http://example.org/sample#offset_0_42
          isString: “Germany”
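
The offset-based URIs in the diagram above can be reproduced with a short sketch. This is an illustration only, not the NIF reference implementation: the function names are hypothetical, and it assumes that offsets count Unicode code points of the NFC-normalized text, starting at zero, with an exclusive end offset.

```python
import unicodedata


def offset_uri(document_uri: str, begin: int, end: int) -> str:
    """Build a fragment URI of the form <document>#offset_<begin>_<end>."""
    return f"{document_uri}#offset_{begin}_{end}"


def annotate(document_uri: str, text: str, substring: str) -> dict:
    """Describe a substring relative to its context (the whole text).

    Assumption: offsets are counted in code points after NFC normalization.
    """
    context = unicodedata.normalize("NFC", text)
    sub = unicodedata.normalize("NFC", substring)
    begin = context.index(sub)  # first occurrence; raises ValueError if absent
    end = begin + len(sub)
    return {
        "uri": offset_uri(document_uri, begin, end),
        "isString": context[begin:end],
        "referenceContext": offset_uri(document_uri, 0, len(context)),
    }


text = "The city Berlin is the capital of Germany."
context_uri = offset_uri("http://example.org/sample", 0, len(text))
germany = annotate("http://example.org/sample", text, "Germany")

print(context_uri)      # http://example.org/sample#offset_0_42
print(germany["uri"])   # http://example.org/sample#offset_34_41
```

Because the substring URI is derived purely from the context string and two integers, any tool that sees the same text can compute the same identifier without coordination.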




               Scientific core
Define the notion of “Context” and formalize it in OWL:
     Context is similar to the German word “Betrachtungshorizont” (roughly,
        “horizon of consideration”)
     In English, perhaps “inside context”, i.e. the text itself, which serves as
        the reference context for all substrings it contains.
     Definitely disjoint with groupings such as “Document”, because such
        groupings need a “wider context”.




     Example following...




Scientific core




              Scientific Core

The goal is to research some of the implications,
    though I might not be able to finish this completely.
In scope:
  The property “contextString” is inverse-functional, which means that machines can
      automatically infer that the same context occurs in different documents.
  Show consistency with ambiguity
  Define metrics that compare contexts
  Formalize the interpretation function
  Show interoperability with the internal models of all major NLP frameworks
  (Partial) compatibility with the WWW and the GGG (Giant Global Graph)
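
The inference enabled by an inverse-functional property can be sketched as follows. This is a minimal illustration under stated assumptions, not how an OWL reasoner is implemented: the triples, the example.org URIs, and the function name are hypothetical, and only the property name “contextString” comes from the slide. The rule applied is the standard one: if two subjects share the same value for an inverse-functional property, they denote the same resource.

```python
from collections import defaultdict
from itertools import combinations


def infer_same_contexts(triples, inverse_functional="contextString"):
    """Return pairs of subjects that share a value for the given property."""
    subjects_by_value = defaultdict(set)
    for subject, predicate, value in triples:
        if predicate == inverse_functional:
            subjects_by_value[value].add(subject)

    same = set()
    for subjects in subjects_by_value.values():
        # Every pair of subjects with an identical value is inferred equal.
        for a, b in combinations(sorted(subjects), 2):
            same.add((a, b))
    return same


# Hypothetical data: the same text appears in two different documents.
sentence = "The city Berlin is the capital of Germany."
triples = [
    ("http://example.org/doc1#offset_0_42", "contextString", sentence),
    ("http://example.org/doc2#offset_0_42", "contextString", sentence),
    ("http://example.org/doc3#offset_0_7", "contextString", "Germany"),
]

pairs = infer_same_contexts(triples)
for a, b in sorted(pairs):
    print(f"{a} sameAs {b}")
```

Grouping by value keeps the sketch linear in the number of triples; only subjects that actually share a string are compared pairwise.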




                Scientific Core

Out of scope:
  Transition between contexts: do statements from a smaller context hold in a
      broader context?
  Incorporating all layers of the NLP stack; the work is limited to POS tags and
      entity recognition
  Filling all the question marks in the Venn diagram




Areas




Linguistic Linked Open Data Cloud




Developers study




Areas




           Evaluation


Compare to other models in NLP:
     Size (RDF vs. XML), performance, expressivity
Is NIF easy to understand and implement?
     Developer study; the release of the specification had quite an impact: people
        started to create extensions and use the format. 50 people on the mailing
        list.
How to evaluate Web Service integration or consistency with the Web architecture?
     If the way strings are represented is transparent and formalized, do I need an
        experimental evaluation to show the benefits?




Q&A




             Thank you for your attention


       Standing on the shoulders of giants
