Explaining Conclusions from Diverse Knowledge Sources
J. William Murdock (1), Deborah McGuinness (2), Paulo Pinheiro da Silva (3), Chris Welty (1), David Ferrucci (1)
(1) IBM Research, (2) Stanford, (3) U. Texas El Paso
Core Ideas
Lots of important information is currently unstructured (e.g., natural language text on an HTML page).
Motivating Example
Question: Who manages the Mississippi automated data infrastructure?
Answer: Major Julian Allen
Proof: Major Julian Allen managerOf Mississippi Automated Systems Project; Mississippi Automated Systems Project managerOf Mississippi automated data infrastructure; transitivity of managerOf
Sources: pressrelease/1107628109.html (“Major Julian Allen, Ph.D., director of the Automated System Project”); kb1.owl
Extraction components: EntityAnnotator1, EntityAnnotator2, OrganizationalRelationAnnotator, CoreferenceResolver
User questions shown on the diagram: Why should I believe this? Why should I believe these? Why should I believe that the unstructured text says that?
Pre-Existing UIMA Technology
Pre-Existing Inference Web Technology
Taxonomy of Extraction Methods
Entity Recognition: in “Major Julian Allen, Ph.D., director of the Automated System Project.”, the span “Major Julian Allen” is labeled Person.
Relation Argument Identification: in the same sentence, the Person annotation is connected to the managerOf relation through the role subject.
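The two inference types on this slide can be sketched as operations over labeled text spans. This is only an illustration under stated assumptions: the `Annotation` and `RelationArgument` classes and the `annotate` helper are hypothetical names invented here, not UIMA or Inference Web APIs, and the string-matching "annotator" stands in for a real analysis component.

```python
from dataclasses import dataclass

# Sentence from the slide; everything else below is a hypothetical illustration.
TEXT = "Major Julian Allen, Ph.D., director of the Automated System Project."


@dataclass(frozen=True)
class Annotation:
    """A labeled span of text, addressed by character offsets."""
    begin: int
    end: int
    label: str

    def covered_text(self, text: str) -> str:
        return text[self.begin:self.end]


@dataclass(frozen=True)
class RelationArgument:
    """Links an entity annotation to a relation annotation through a role."""
    relation: Annotation
    entity: Annotation
    role: str


def annotate(text: str, phrase: str, label: str) -> Annotation:
    """Label the first occurrence of `phrase` (stand-in for a real annotator)."""
    begin = text.index(phrase)
    return Annotation(begin, begin + len(phrase), label)


# Entity Recognition: label a span of text with an entity type.
person = annotate(TEXT, "Major Julian Allen", "Person")

# Relation Recognition: label the span that expresses the relation.
manager_of = annotate(
    TEXT,
    "Major Julian Allen, Ph.D., director of the Automated System Project",
    "managerOf",
)

# Relation Argument Identification: attach the entity to the relation via a role.
subject = RelationArgument(relation=manager_of, entity=person, role="subject")

print(subject.entity.covered_text(TEXT), "is the", subject.role, "of", subject.relation.label)
```

Keeping the role link as its own record, separate from the span labels, mirrors the slide's distinction between recognizing an entity and identifying it as a relation argument.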
Motivating Example: Details
Extraction
Direct assertion from pressrelease/1107628109.html: “Major Julian Allen, Ph.D., director of the Automated System Project”
Entity Recognition (IBM EAnnotator): Major Julian Allen [Person], Ph.D., director of the Automated System Project [Organization]
Entity Identification (IBM Coreference): Major Julian Allen [Person] [refers to MJAllen1], Ph.D., director of the Automated System Project [Organization] [refers to MASProject1]
Relation Recognition (IBM Relation Detector): Major Julian Allen, Ph.D., director of the Automated System Project [managerOf]
Relation Argument Identification (IBM Relation Detector): Major Julian Allen [subject], Ph.D., director of the Automated System Project [object]
Relation Identification (IBM Coreference): (managerOf MJAllen1 MASProject1)
Theorem Proving
Direct assertion from KB1.owl: (managerOf MASProject1 MissDataInfrastructure1)
Direct assertion from KB1.owl: (transitiveProperty managerOf)
Transitive Property Inference (JTP, Java Theorem Prover): (managerOf MJAllen1 MissDataInfrastructure1)
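The final theorem-proving step of this trace can be sketched as a small proof tree in which every conclusion keeps pointers to the steps it was derived from. This is a minimal illustration, not JTP or Inference Web code: the `Step` class and the `apply_transitivity` and `explain` functions are hypothetical names, and the extraction steps that lead up to the Relation Identification conclusion are collapsed into a single leaf for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """One node of a provenance graph (hypothetical structure)."""
    conclusion: Tuple[str, ...]
    rule: str
    source: str  # engine or document that produced the conclusion
    antecedents: Tuple["Step", ...] = ()


def apply_transitivity(axiom: Step, first: Step, second: Step, source: str) -> Step:
    """Derive (p a c) from (p a b), (p b c) and (transitiveProperty p)."""
    predicate, a, b = first.conclusion
    predicate2, b2, c = second.conclusion
    if axiom.conclusion != ("transitiveProperty", predicate):
        raise ValueError("axiom does not declare the predicate transitive")
    if predicate != predicate2 or b != b2:
        raise ValueError("the two assertions do not chain")
    return Step((predicate, a, c), "Transitive Property Inference", source,
                (axiom, first, second))


def explain(step: Step, depth: int = 0) -> List[str]:
    """Render the proof tree, one indented line per step."""
    lines = ["  " * depth + f"({' '.join(step.conclusion)}) <- {step.rule} [{step.source}]"]
    for antecedent in step.antecedents:
        lines.extend(explain(antecedent, depth + 1))
    return lines


# Leaves taken from the slide; the extracted fact's own antecedents are omitted here.
extracted = Step(("managerOf", "MJAllen1", "MASProject1"),
                 "Relation Identification", "IBM Coreference")
asserted = Step(("managerOf", "MASProject1", "MissDataInfrastructure1"),
                "Direct assertion", "KB1.owl")
axiom = Step(("transitiveProperty", "managerOf"), "Direct assertion", "KB1.owl")

answer = apply_transitivity(axiom, extracted, asserted, "JTP")
print("\n".join(explain(answer)))
```

Because extracted and hand-coded facts share the same `Step` shape, a user can drill down from the answer through both kinds of evidence with one traversal, which is the uniformity the slide is meant to show.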
 
 
Abridged PML (Proof Markup Language) Example
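The slide's PML listing did not survive extraction. According to the speaker notes for this slide, the step it showed names an inference engine, an inference rule, three antecedents identified by URI, and a conclusion encoded in KIF. The sketch below models only those fields as a plain record. It is not the PML schema: the `InferenceStep` class, its field names, the `example.org` URIs, and the spelling of the entity identifiers are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import List
from urllib.parse import urlparse


@dataclass
class InferenceStep:
    """Hypothetical record of one provenance step; field names are assumptions."""
    engine: str
    rule: str
    antecedents: List[str]  # URIs of the steps this one depends on
    conclusion: str
    language: str


step = InferenceStep(
    engine="IBM statistical ACE annotator",
    rule="Relation Identification",
    # Placeholder URIs: the real ones are not recoverable from the slide.
    antecedents=[
        "http://a.example.org/proofs/antecedent1",
        "http://b.example.org/proofs/antecedent2",
        "http://c.example.org/proofs/antecedent3",
    ],
    # Assumed KIF rendering of "entity 184 is the manager of entity 199".
    conclusion="(managerOf entity184 entity199)",
    language="KIF",
)


def antecedent_hosts(s: InferenceStep) -> List[str]:
    """List the hosts that would have to be contacted to fetch the antecedents."""
    return sorted({urlparse(uri).netloc for uri in s.antecedents})


serialized = json.dumps(asdict(step), indent=2)
print(serialized)
print(antecedent_hosts(step))
```

Referring to antecedents by URI rather than embedding them is what lets the pieces of a proof live in different locations, as the notes point out.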
Conclusions


Editor's Notes

  1. This presentation covers work performed as a collaboration between IBM Research and Stanford; one of the participants is now at U. Texas El Paso. The authors greatly appreciate the nomination of this paper for a best paper award.
  2. The context of this work is relatively common. There is a lot of important information out there that is not structured. We want to extract that information, combine it with formal knowledge, and reason about it. In this talk we are focusing on coherent explanations of end-to-end systems that perform these steps.
  3. For example, a user may make some request for information and get some result. In some cases, the user may be satisfied with that result as it is. However, in other cases, the user may want to know why the answer should be believed. A traditional solution to that problem is to provide some sort of logical proof that shows how facts and axioms combine to establish the result. However, in some cases the user will want to drill down even further. The user may want to know where the facts and axioms came from. Some may be directly asserted in some hand-coded knowledge base, but others may have been automatically extracted from documents. The user may wish to find out what text the fact was derived from, how that text was annotated, and even which components were responsible for each part of the extraction.
  4. One part of the background of this work is UIMA. UIMA is an architecture for analyzing unstructured information such as text or video. The architecture is undergoing standardization through OASIS. A reference implementation of UIMA is available as open source. UIMA provides shared programming interfaces and data structures for analysis; this makes it possible to develop generic tools that are not specific to a particular analysis component because they operate at the level of the structures defined by the architecture. For example, it is possible to record provenance for analysis without having to instrument individual components by developing the recording mechanisms at the level of the architecture and framework.
  5. Another part of the background of this work is Inference Web. Inference Web provides infrastructure for storing and browsing provenance. It encodes process descriptions as graphs of inferences. It has been applied to a variety of technologies that naturally lend themselves to a formal inference perspective. In this work we use Inference Web to record provenance for knowledge extraction. We show that it is possible to view extraction as a form of inference.
  6. Specifically, we have identified nine types of extraction inferences. Six of these involve the analysis of the unstructured sources and three involve integrating the analyses into a target ontology. Here we show two of the inference types. Entity Recognition involves labeling a span of text with an entity type such as person. Relation Argument Identification involves connecting text labeled as an entity to text labeled as a relationship via a role such as “subject.”
  7. Let’s revisit our motivating example, looking more closely at how the result was produced. The end-to-end system began with some text and some assertions in a knowledge base. Analysis of text begins by labeling spans of text with entity types and relation types. Given those labels, it is possible to assign arguments to relation annotations and to perform coreference over entities. All that information in combination allows us to conclude a formal logical assertion. That assertion can be combined with other assertions to draw a conclusion via theorem proving. I would like to emphasize that this trace spans two distinct kinds of technology: extraction and inference. We can look at these as two distinct modules, but the provenance shown here has a consistent form throughout the end-to-end system.
  8. This is one of the graphical interfaces that Inference Web provides for browsing provenance. Steps in the process can be viewed a level at a time...
  9. ... or they can be expanded out to see a more complete view. The interface is highly interactive. For example, a user can click on a button on each node to see a description of the component that performed the inference.
  10. This is an example of the OWL-based representation that Inference Web is based on. The inference engine responsible for this step in the process was IBM’s statistical ACE annotator. The step had three antecedents, which are identified by URIs, so they could potentially be distributed across different locations. The inference rule that was used in this step is Relation Identification. The conclusion of this step is that entity 184 is the manager of entity 199. The language used to encode that conclusion is KIF.
  11. Our main result here is that we provide coherent provenance for an end-to-end system that reasons over both hand-coded and extracted knowledge. To that end we have represented extraction as a form of inference. UIMA has supported this work by making it possible to work with analysis components in terms of what they do instead of being forced to dig into the internal technical details of each component separately. Inference Web has supported this work by providing a formal interlingua for encoding provenance and an interface that allows us to view that provenance for complex end-to-end systems that include extraction and logical deduction.