
Open IE tutorial 2018

Knowledge Graphs


  1. 1. André Freitas Open Information Extraction for QA OKBQA 2018
  2. 2. André Freitas Building Knowledge Graphs for QA OKBQA 2018
  3. 3. Organisation • Goals of this Tutorial: – Try to answer the question: What is a knowledge graph? – Depict open IE as a foundation to build KGs. – Establish how different semantic perspectives (including open IE) can be integrated into a unified Knowledge Representation model. • Method: – Focus on the contemporary and emerging perspectives. – Sampling exemplar approaches and infrastructures on each of these emerging perspectives (not an exhaustive survey). – Emphasis on Open IE.
  4. 4. “On our best behaviour” “We need to return to our roots in Knowledge Representation and Reasoning for language and from language.” Levesque, 2013 “We should not treat English text as a monolithic source of information. Instead, we should carefully study how simple knowledge bases might be used to make sense of the simple language needed to build slightly more complex knowledge bases.”
  5. 5. Outline 1. What is a KG? 2. Building a KG (with Open IE) 3. Querying KGs 4. Inferences over KGs 5. Uses of KGs
  6. 6. What is a Knowledge Graph?
  7. 7. Some Perspectives on “What” “The Knowledge Graph is a knowledge base used by Google to enhance its search engine's search results with semantic-search information gathered from a wide variety of sources.” “A Knowledge Graph (i) mainly describes real-world entities and their interrelations, organized in a graph, (ii) defines possible classes and relations of entities in a schema, (iii) allows potentially interrelating arbitrary entities with each other…” [Paulheim H.] “We define a Knowledge Graph as an RDF graph consisting of a set of RDF triples where each RDF triple (s,p,o) is an ordered set of the following RDF terms…” [Pujara J. et al.] KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  8. 8. • Open-world representation of information. • Every entry point has equal cost. • Underpins Cortana, Google Assistant, Siri and Alexa. • Typically (but doesn’t have to be) expressed in RDF. • No longer a solution in search of a problem! Dan Bennett, Thomson Reuters Some Perspectives on “What”
  9. 9. Defining KG by Example KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  10. 10. • “Knowledge is Power” Hypothesis (the Knowledge Principle): “If a program is to perform a complex task well, it must know a great deal about the world in which it operates.” • The Breadth Hypothesis: “To behave intelligently in unexpected situations, an agent must be capable of falling back on increasingly general knowledge.” KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014 Some Perspectives on “Why”
  11. 11. • We’re surrounded by entities, which are connected by relations. • We need to store them somehow, e.g., using a DB or a graph. • Graphs can be processed efficiently and offer a convenient abstraction. Some Perspectives on “Why” KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014
  12. 12. Some Perspectives on “Why” • Knowledge models such as Linked Data and many problems in machine learning have a natural representation as relational data. • Relations between entities are often more important for a prediction task than attributes. • For instance, it can be easier to predict the party of a vice-president from the party of his president than from his own attributes. [Koopman, 2010]
  13. 13. Schema on Write • Fixed data model • Slow to change • Strong enforcement Schema on Read • Capture everything • Apply logic (schema) on read • No standards The Data Management Perspective Dan Bennett, Thomson Reuters
  14. 14. From Closed to Open Communication
  15. 15. From Closed to Open Communication
  16. 16. Intuitions on the Connection Between KGs and AI
  17. 17. Coping with the long tail of data variety [Chart: frequency of use vs. number of entities and attributes, ranging over relational, NoSQL, schema-less and unstructured data; annotated with consumption (querying, software), formalization, semiotic breakdown, and the exponential growth of vocabulary size.]
  18. 18. From Text to Structure
  19. 19. From Text to Structure
  20. 20. From Text to Structure
  21. 21. From Text to Structure
  22. 22. Regularities in Natural Language
  23. 23. Regularities in Natural Language
  24. 24. Regularities in Natural Language
  25. 25. Structural/Logical Form
  26. 26. Structural/Logical Form
  27. 27. Rephrasing it
  28. 28. Now we can answer this query
  29. 29. It computes!
  30. 30. Extrapolating
  31. 31. Breaking away from the linear order imposed by the medium
  32. 32. What did we get? • By integrating terms (mapping to a canonical form) we reduced complexity. – ‘Atomization’ of knowledge allows integration. • Entities, attributes and relationships are now explicit and interconnected. – Predicate-argument structures. • Ability to focus (“query”) and operate on specific sets and relations. • Entities are organized into hierarchies of sets. – With that we can express knowledge at different abstraction levels (make generalizations).
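The points on this slide can be made concrete with a toy sketch (all names are illustrative, not from any particular KG): facts as predicate-argument triples, pattern queries over specific relations, and generalization along an is_a hierarchy of sets.

```python
# Hypothetical mini-KG: atomized knowledge as predicate-argument triples.
facts = [
    ("Socrates", "is_a", "human"),
    ("human", "is_a", "mortal"),
    ("Socrates", "teacher_of", "Plato"),
]

def query(s=None, p=None, o=None):
    """Return all triples matching the pattern (None acts as a wildcard)."""
    return [(ts, tp, to) for (ts, tp, to) in facts
            if (s is None or ts == s)
            and (p is None or tp == p)
            and (o is None or to == o)]

def ancestors(entity):
    """Walk the is_a hierarchy upwards (generalization across levels)."""
    out, frontier = [], [entity]
    while frontier:
        node = frontier.pop()
        for (_, _, parent) in query(s=node, p="is_a"):
            if parent not in out:
                out.append(parent)
                frontier.append(parent)
    return out
```

Because entities and relations are explicit and interconnected, "focusing" on a set is just a pattern query, and abstraction is a walk over the is_a edges.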
  33. 33. Building Knowledge Graphs
  34. 34. Requirements for Open IE • Banko et al. (2007) identified three major challenges for Open IE systems: – Automation. Open IE systems must rely on unsupervised extraction strategies. – Transportability. As Open IE systems are intended for domain-independent usage, they must work across domains without hand-tailored, domain-specific rules or training data. – Efficiency. In order to readily scale to large amounts of text, Open IE systems must be computationally efficient.
  35. 35. Four Types of Systems • Learning-based Systems • Rule-based Systems • Clause-based Systems • Systems Capturing Inter-Proposition Relationships
  36. 36. Learning-based and Rule-based Systems
  37. 37. Clause-based Systems • Incorporate a sentence re-structuring stage. • Transform complex sentences, where relations are spread over several clauses, into a set of syntactically simplified independent clauses that are easy to segment into Open IE tuples.
  38. 38. Systems Capturing Inter-Proposition Relationships • Previous approaches usually ignore the context under which a proposition is complete and correct. • E.g., they do not distinguish between information asserted in a sentence and information that is only hypothetical or conditionally true. <the earth; is the center of; the universe> “Early scientists believed that the earth is the center of the universe.”
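The distinction above can be sketched as a data-structure choice (field names are hypothetical, not the schema of any particular system): the bare tuple is kept, but the attribution context and assertedness become explicit.

```python
# Bare Open IE tuple, as a plain extraction would produce it.
bare = ("the earth", "is the center of", "the universe")

# Sketch: the same proposition enriched with its attribution context.
contextual = {
    "proposition": bare,
    "attribution": ("early scientists", "believed"),  # who asserts it, and how
    "asserted": False,  # the sentence itself does not assert it as fact
}

def is_asserted(extraction):
    """A downstream consumer can now filter out non-asserted propositions."""
    return extraction["asserted"]
```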
  39. 39. Systems Capturing Inter-Proposition Relationships
  40. 40. Case Study: Graphene • Captures contextual relations. • Extends the default Open IE representation in order to capture inter-proposition relationships. • Increases the informativeness and expressiveness of the extracted tuples. Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
  41. 41. Transformation Stage
  42. 42. Rhetorical Relations
  43. 43. Extracting Rhetorical Relations
  44. 44. Extracting Rhetorical Relations
  45. 45. Clausal & Phrasal Disembedding
  46. 46. Input Document
  47. 47. Transformation Stage
  48. 48. Relation Extraction
  49. 49. Output
  50. 50. Example: “Asian stocks fell anew and the yen rose to session highs in the afternoon as worries about North Korea simmered, after a senior Pyongyang official said the U.S. is becoming ‘more vicious and more aggressive’ under President Donald Trump.” [Diagram: simplified propositions: “Asian stocks fell anew”; “The yen rose to session highs” with the temporal context “in the afternoon”; “Worries about North Korea simmered”; “The U.S. is becoming ‘more vicious and more aggressive’ under Donald Trump” with the attribution “A senior Pyongyang official said”; linked by rhetorical relations such as background.]
  51. 51. Results
  52. 52. What to expect (Wikipedia & Newswire) [Precision and recall figures omitted in the transcript.] Creating a Hierarchy of Semantically-Linked Propositions in Open Information Extraction, COLING (2018)
  53. 53. https://github.com/Lambda-3/Graphene Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017) Software: Extracting Knowledge Graphs from Text
  54. 54. Graphene • Resolves co-references. • Transforms complex sentences (for example, those containing subordinations, coordinations, appositive phrases, etc.) into simple independent sentences (one clause per sentence). • Identifies rhetorical relations between those sentences. • Extracts binary relations (subject, predicate and object) from each sentence. • Merges all the extracted relations into a relation graph (knowledge graph).
  55. 55. Installing Graphene $ git clone https://github.com/Lambda-3/Graphene.git $ cd Graphene $ docker-compose up
  56. 56. Using it as a Server curl -X POST -H "Content-Type: application/json" -d '{"text": "My text.", "doCoreference": "true", "isolateSentences": "false", "format": "DEFAULT"}' -H "Accept: text/plain" http://localhost:8080/discourseSimplification/text CLI Tutorial: https://asciinema.org/a/bvhgIP8ZEgDwtmRPFctHyxALu?speed=3
  57. 57. Java API [Code listings omitted in the transcript: co-reference, discourse simplification and relation extraction calls.]
  58. 58. Evaluation • Assertedness. Extracted propositions should be asserted by the original sentence, rather than merely inferred from it. • Minimal Propositions. In order to serve semantic tasks, it is beneficial for Open IE systems to extract compact, self-contained propositions. • Completeness and Open Lexicon. Aims to extract all relations that are asserted in the input text (Banko et al., 2007).
  59. 59. Comparing Evaluation Settings
  60. 60. Comparing Evaluation Settings
  61. 61. Extrinsic Evaluation • Potential: Stanovsky and Dagan (2016) converted the annotations of a QA-SRL dataset to an Open IE corpus, resulting in more than 10,000 extractions over 3,200 sentences from Wikipedia and the Wall Street Journal.
  62. 62. MCTest comprehension data (Richardson et al.) James the Turtle was always getting in trouble. Sometimes he'd reach into the freezer and empty out all the food. Other times he'd sled on the deck and get a splinter. His aunt Jane tried as hard as she could to keep him out of trouble, but he was sneaky and got into lots of trouble behind her back. One day, James thought he would go into town and see what kind of trouble he could get into. He went to the grocery store and pulled all the pudding off the shelves and ate two jars. Then he walked to the fast food restaurant and ordered 15 bags of fries. He didn't pay, and instead headed home. His aunt was waiting for him in his room. She told James that she loved him, but he would have to start acting like a well-behaved turtle. After about a month, and after getting into lots of trouble, James finally made up his mind to be a better turtle. Q: What did James pull off of the shelves in the grocery store? A) pudding B) fries C) food D) splinters … Slide credit: Jason Weston
  63. 63. Qualitative Analysis • Wrong boundaries, where the relational or argument phrase is either too long or too short. • Redundant extraction, where the proposition asserted in an extraction is already expressed in another extraction. • Uninformative extraction, where critical information is omitted. • Missing extraction, i.e. a false negative, where a relation is not detected by the system. • Wrong extraction, where no meaningful interpretation of the proposition is possible. • Out-of-scope extraction, where a system yields a correct extraction that was not recognized by the authors of the gold dataset.
  64. 64. Open Research Questions • Comparing results among different Open IE systems in a large-scale, objective and reproducible fashion. • Applicability and transferability to languages other than English. – The role of Universal Dependencies in this process. • Canonicalizing relational phrases and arguments. Normalizing extractions would be highly beneficial for downstream semantic tasks, such as textual entailment or knowledge base population. • Incorporating coreference resolution into Open IE evaluations. A Survey on Open Information Extraction, COLING (2018)
  65. 65. From Open IE to Semantics
  66. 66. RDF-NL https://github.com/Lambda-3/Graphene/blob/master/wiki/RDFNL-Format.md “Although the Treasury will announce details of the November refunding on Monday, the funding will be delayed if Congress and President Bush fail to increase the Treasury's borrowing capacity.”
  67. 67. RDF-NL https://github.com/Lambda-3/Graphene/blob/master/wiki/RDFNL-Format.md
  68. 68. SRDF SRDF: Korean Open Information Extraction using Singleton Property, Proc. International Semantic Web Conference, ISWC (2015). SRDF: A Novel Lexical Knowledge Graph for Whole Sentence Knowledge Extraction, LDK (2017).
  69. 69. SRDF SRDF: Korean Open Information Extraction using Singleton Property, Proc. International Semantic Web Conference, ISWC (2015). SRDF: A Novel Lexical Knowledge Graph for Whole Sentence Knowledge Extraction, LDK (2017). https://github.com/sanghanam/SRDF
  70. 70. Frame-based Extraction (Ontology Grounded) Background theories: • Combinatory Categorial Grammar [C&C], • Discourse Representation Theory [DRT, Boxer], • Frame Semantics [Fillmore 1976], • Ontology Design Patterns [Ontology Handbook]. Frameworks: • Named Entity Resolution [Stanbol, TagMe], • Coreference Resolution [CoreNLP], • Word Sense Disambiguation [Boxer, IMS]. Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
  71. 71. Frame-based Extraction (Ontology Grounded) N-ary relation and event extraction using frame detection Relation extraction between frames, events, concepts and entities Negation representation Modality representation Adjective semantics Temporal relation extraction from tense expressions Semantic annotation of text fragments Coreference resolution Type and taxonomy induction Incremental role propagation Entity linking to Semantic Web data Word-sense disambiguation Pattern-based subgraph extraction Named-graph generation Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
  72. 72. Frame-based Extraction The New York Times reported that John McCarthy died. He invented the programming language LISP. Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
  73. 73. Frame-based Extraction The New York Times reported that John McCarthy died. He invented the programming language LISP. Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017
  74. 74. http://wit.istc.cnr.it/stlab-tools/fred/ Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017 Software: FRED
  75. 75. Argumentation Structures Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
  76. 76. Argumentation Structures Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
  77. 77. Argumentative Discourse Unit Classification
  78. 78. Argumentation & Rhetorical Relations • Support: background, cause, evidence, justify, list, motivation, reason, restatement, result. • Rebuttal: antithesis, contrast, unless. • Undercut: concession.
  79. 79. Argumentation Schemes “Argumentation Schemes are forms of argument (structures of inference) that represent structures of common types of arguments used in everyday discourse, as well as in special contexts like those of legal argumentation and scientific argumentation.” Douglas Walton
  80. 80. Argumentation Schemes
  81. 81. Unified Scheme
  82. 82. Classification
  83. 83. Argument Mining Approaches What to expect? F1-score: 0.74 Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
  84. 84. https://www.ukp.tu-darmstadt.de/data/argumentation-mining/argument-annotated-essays/ Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016. Software: Argumentation Mining
  85. 85. Event Extraction
  86. 86. EventKG: A Multilingual Event-Centric Temporal Knowledge Graph Simon Gottschalk, Elena Demidova. ESWC 2018 EventKG is an open knowledge graph containing event-centric information: http://eventkg.l3s.uni-hannover.de/ - EventKG V1.1: 690K events and 2.3M temporal relations - 2014 FIFA World Cup - “The Space Shuttle Challenger is launched on its maiden voyage” - <Jennifer Aniston, married to, Brad Pitt, [2000-07-29, 2005-10-02]> - Extracted from Wikidata, YAGO, DBpedia and Wikipedia - Integrated data in five languages: EN, FR, DE, RU, PT - Provides provenance information - High coverage of event times and locations due to integration:

                           EventKG   Wikidata   DBpedia (en)
    Events with Time       50.82%    33.00%     7.00%
    Events with Location   26.13%    11.70%     6.21%
  87. 87. EventKG: A Multilingual Event-Centric Temporal Knowledge Graph Simon Gottschalk, Elena Demidova. ESWC 2018 Which science-related events took place in Lyon? - 1921: “À Lyon, fusion de la Société de médecine et de la Société des sciences médicales” SELECT DISTINCT ?description { ?event rdf:type sem:Event . ?relation rdf:object ?lyon . ?relation rdf:subject ?event . ?event dcterms:description ?description . FILTER regex(?description, "science", "i") . ?lyon owl:sameAs dbr:Lyon . }
  88. 88. EventKG: A Multilingual Event-Centric Temporal Knowledge Graph Simon Gottschalk, Elena Demidova. ESWC 2018 • EventKG builds upon and extends the Simple Event Model (SEM). • Example: The participation of Barack Obama in his second inauguration as US president in 2013 in EventKG.
  89. 89. EventKG+TL: Creating Cross-Lingual Timelines from an Event-Centric Knowledge Graph Simon Gottschalk, Elena Demidova. ESWC 2018 - An overview of events related to a query entity over a time period across languages using EventKG - Example: Brexit-related events. The pie chart size: the overall (i.e. language-independent) event relevance. The colored slices: the ratio of the relevance in a language context.
  90. 90. Definition Extraction
  91. 91. Semantic Roles for Lexical Definitions Aristotle’s classic theory of definition introduced important aspects such as the genus-differentia definition pattern and the essential/non-essential property differentiation.
  92. 92. Building the Definition Graph
  93. 93. Data: WordNetGraph Silva et al., Categorization of Semantic Roles for Dictionary Definitions. Cognitive Aspects of the Lexicon CogALex@COLING, 2017. https://github.com/Lambda-3/WordnetGraph RDF graph generated from WordNet.
  94. 94. Emerging perspectives • The evolution of parsing and classification methods in NLP is inducing a new lightweight semantic representation. • This representation dialogues with elements from logics, linguistics and the Semantic/Linked Data Web (especially RDF). • However, it relaxes the semantic constraints of previous models (which were operating under assumptions for deductive reasoning or databases).
  95. 95. Emerging perspectives • Knowledge graphs as lexical semantic models operating under a semantic best-effort mode (canonical identifiers when possible, otherwise, words). • Possibly closer to the surface form of the text. • Priority is on segmenting, categorizing and when possible, integrating. • A representation (data model) convenient for AI engineering.
  96. 96. Categorization A fact (main clause) can be a taxonomic fact. [Diagram: a triple s p o, where s and o are terms/URIs denoting an instance, a class or a triple, and p is a term/URI denoting a type, property or schema property.]
  97. 97. Categorization A fact with a context: the fact (s0, p0, o0) is reified and linked through p1 to a context o1, e.g. • subordination (modality, temporality, spatiality, RSTs) • fact probability • polarity
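A minimal sketch of this reification, with illustrative identifiers and field names: the fact gets an id of its own, so that contexts (temporal, probabilistic, polarity, ...) can attach to the fact rather than to its arguments.

```python
# Reified fact: the triple itself is addressable via an id ("f0" is made up).
fact = {"id": "f0", "s": "the yen", "p": "rose to", "o": "session highs"}

# Contexts attach to the fact id, not to the entities inside it.
contexts = [
    {"of": "f0", "p": "temporal", "o": "in the afternoon"},
    {"of": "f0", "p": "probability", "o": 1.0},
]

def contexts_of(fact_id):
    """All contextual statements qualifying a given fact."""
    return [c for c in contexts if c["of"] == fact_id]
```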
  98. 98. Categorization Coordinated facts: two facts (s0, p0, o0) and (s1, p1, o1) linked by a relation p2, e.g. • coordination • RSTs • ADUs
  99. 99. Knowledge Graphs & Distributional Semantics (A marriage made in heaven?)
  100. 100. Distributional Semantics
  101. 101. • Computational models that build contextual semantic representations from corpus data. • Semantic context is represented by a vector. • Vectors are obtained through the statistical analysis of the linguistic contexts of a word. • Salience of contexts (cf. context weighting scheme). • Semantic similarity/relatedness as the core operation over the model. Distributional Semantic Models
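A minimal sketch of such a model, using a toy corpus: context vectors are built from windowed co-occurrence counts, and semantic similarity/relatedness is cosine over those vectors.

```python
import math
from collections import Counter

# Toy corpus (illustrative sentences).
corpus = [
    "the dog barks at the cat",
    "the dog runs on a leash",
    "the cat runs from the dog",
    "the car runs on the road",
]

def context_vector(word, window=2):
    """Count words co-occurring with `word` within a +/- window."""
    vec = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        vec[tokens[j]] += 1
    return vec

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

Even on this tiny corpus, "dog" ends up closer to "cat" than to "car", which is the semantic approximation the model trades noise for.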
  102. 102. Distributional Semantic Models • Semantic model with low acquisition effort (automatically built from text). • Simplification of the representation. • Enables the construction of comprehensive commonsense/semantic KBs. • What is the cost? Some level of noise (semantic best-effort) and a limited semantic model.
  103. 103. Distributional Semantics as Commonsense Knowledge [Diagram: word vectors for car, dog, cat, bark, run and leash; the commonsense knowledge lies in the corpus-derived vectors, and the semantic approximation lies in the angle θ between them.] Semantic model with low acquisition effort.
  104. 104. Context Weighting Measures × Similarity Measures (Kiela & Clark, 2014) … and of course, GloVe and word2vec
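One standard context weighting scheme is positive pointwise mutual information (PPMI), which downweights frequent but uninformative contexts. A toy sketch (the counts are made up):

```python
import math

# Hypothetical (word, context) co-occurrence counts.
cooc = {("dog", "barks"): 4, ("dog", "the"): 20,
        ("cat", "barks"): 1, ("cat", "the"): 18}

def ppmi(word, ctx):
    """Positive pointwise mutual information: max(0, log2 P(w,c)/(P(w)P(c)))."""
    total = sum(cooc.values())
    p_wc = cooc.get((word, ctx), 0) / total
    p_w = sum(c for (w, _), c in cooc.items() if w == word) / total
    p_c = sum(c for (_, x), c in cooc.items() if x == ctx) / total
    if p_wc == 0:
        return 0.0
    return max(0.0, math.log2(p_wc / (p_w * p_c)))
```

Here the raw count for ("dog", "the") is five times larger than for ("dog", "barks"), yet PPMI ranks "barks" as the more salient context, because "the" co-occurs with everything.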
  105. 105. Distributional-Relational Networks [Diagram: the triple (Barack Obama, nominated, Sonia Sotomayor), with Sonia Sotomayor :is_a “First Supreme Court Justice of Hispanic descent”; each element s0, p0, o0 is grounded in a distributional space such as LSA, ESA, word2vec or GloVe.] Distributional Relational Networks, AAAI Symposium (2013). A Compositional-Distributional Semantic Model for Searching Complex Entity Categories, ACL *SEM (2016)
  106. 106. The vector space is segmented Dimensional reduction mechanism! A Distributional Structured Semantic Space for Querying RDF Graph Data, IJSC 2012
  107. 107. Compositionality of Complex Nominals [Diagram: (Barack Obama, nominated, Sonia Sotomayor), :is_a “First Supreme Court Justice of Hispanic descent”.]
  108. 108. On Complex Nominals
  109. 109. Building on Word Vector Space Models • But how can we represent the meaning of longer phrases? • By mapping them into the same vector space! the country of my birth the place where I was born
  110. 110. Compositionality • The meaning of a complex expression is a function of the meaning of its constituent parts. carnivorous plants digest slowly
  111. 111. Compositionality Principles Words whose meaning is directly determined by their distributional behaviour (e.g., nouns). Words that act as functions transforming the distributional profile of other words (e.g., verbs, adjectives, …).
  112. 112. Compositionality Principles • Take the syntactic structure to constitute the backbone guiding the assembly of the semantic representations of phrases. • A correspondence between syntactic categories and distributional objects.
  113. 113. Mixture vs Function
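The mixture-vs-function contrast can be sketched with toy vectors: pointwise addition is the classic "mixture" operator, while pointwise multiplication behaves more like a function, emphasizing the dimensions the constituents share (in the spirit of Mitchell & Lapata; the vector values are illustrative).

```python
# Toy 3-d word vectors (illustrative values, chosen to be exact in binary).
v_carnivorous = [0.5, 0.25, 0.5]
v_plant = [0.25, 0.5, 0.5]

def add(u, v):
    """Mixture composition: pointwise addition blends both meanings."""
    return [a + b for a, b in zip(u, v)]

def mult(u, v):
    """Function-like composition: multiplication keeps shared dimensions."""
    return [a * b for a, b in zip(u, v)]
```

Note how multiplication suppresses dimensions where either constituent is weak, while addition preserves the contribution of both.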
  114. 114. Inducing distributional functions from corpus data - Distributional functions are induced from input-to-output transformation examples - Using regression techniques commonly employed in machine learning.
  115. 115. How should we map phrases into a vector space? Recursive Neural Networks
  116. 116. Compositional-distributional model for paraphrases A Compositional-Distributional Semantic Model for Searching Complex Entity Categories, *SEM (2016)
  117. 117. Software: Indra • Semantic approximation server • Multi-lingual (12 languages) • Multi-domain • Different compositional models https://github.com/Lambda-3/indra Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual Semantic Relatedness using Machine Translation, EKAW, (2016).
  118. 118. Data: PPDB Ganitkevitch et al., PPDB: The Paraphrase Database, LREC, 2013 http://paraphrase.org/#/download • 16 languages
  119. 119. Recursive vs recurrent neural networks
  120. 120. Segmented Spaces vs Unified Space • Segmented: assumes the <s,p,o> components are naturally irreconcilable; inherent dimensional reduction mechanism; facilitates the specialization of embedding-based approximations; easier to compute identity. • Unified: requires a complex and high-dimensional tensorial model.
  121. 121. How to access Distributional Knowledge Graphs efficiently? • Depends on the target operations in the Knowledge Graphs (more on this later).
  122. 122. How to access Distributional Knowledge Graphs efficiently? • Structured queries (Database + IR): inverted indexes, sharding, disk access optimization, query planning, cardinality estimation, skyline and bitmap indexes, … • Approximation queries: the Multiple Randomized K-d Tree algorithm, the Priority Search K-Means Tree algorithm, …
  123. 123. How to access Distributional Knowledge Graphs efficiently? Database + IR: Structured Queries + Approximation Queries
  124. 124. Software: StarGraph • Distributional Knowledge Graph Database. • Word embedding Database. https://github.com/Lambda-3/Stargraph Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, 2014.
  125. 125. Emerging perspectives • Graph-based data models + Distributional Semantic Models (Word embeddings) have complementary semantic values. • Graph-based Data Models: – Facilitates querying, integration and rule-based reasoning. • Distributional Semantic Models: – Supports semantic approximation, coping with vocabulary variation.
  126. 126. Emerging perspectives • AI systems require access to comprehensive background knowledge for semantic interpretation tasks. • Inheriting from Information Retrieval and Databases: – General Indexing schemes, – Particular Indexing schemes, • Spatial, temporal, topological, probabilistic, causal, … – Query planning, – Data compression, – Distribution, – … even supporting hardware strategies.
  127. 127. Emerging perspectives • One size of embedding does not fit all: Operate with multiple distributional + compositional models for different data model types (I, C, P), different domains and different languages. • Inheriting from Information Retrieval and Databases: – Indexing schemes, – Query planning, – Data compression, – Query distribution, – even supporting hardware.
  128. 128. Best-effort Semantic Integration
  129. 129. Semantic Integration • Task: Mapping near-synonymic term references to a canonical identifier. • Goal: Reduce the entropy (complexity) of the underlying KG. • Operations: – Co-reference Resolution – (Named) Entity Linking – Predicate Reconciliation • Common aspects: – Highly dependent on the context of the mention. – Highly dependent on target entity background knowledge.
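A minimal sketch of the entity-linking operation listed above, combining an alias table with a context-overlap score; the KB entries, aliases and context words are made up for illustration, and real linkers use far richer context models.

```python
# Hypothetical alias table: canonical ids -> surface aliases + context words.
kb = {
    "dbr:Barack_Obama": {"aliases": {"obama", "barack obama"},
                         "context": {"president", "us", "senator"}},
    "dbr:Michelle_Obama": {"aliases": {"obama", "michelle obama"},
                           "context": {"first", "lady", "author"}},
}

def link(mention, context_words):
    """Map a mention to a canonical id, disambiguating by context overlap."""
    mention = mention.lower()
    candidates = [(eid, e) for eid, e in kb.items() if mention in e["aliases"]]
    if not candidates:
        return None
    # Rank candidates by overlap between the sentence context and the
    # entity's background knowledge (both dependencies named on the slide).
    def score(item):
        return len(item[1]["context"] & set(context_words))
    return max(candidates, key=score)[0]
```

The ambiguous mention "Obama" resolves differently depending on the surrounding words, which is exactly why both context and background knowledge matter.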
  130. 130. Software: Cobalt • KG-based co-reference resolution. https://github.com/SeManchester/cobalt (to appear)
  131. 131. Software: Cobalt
  132. 132. Software: Cobalt
  133. 133. Software: Cobalt
  134. 134. Software: AGDISTIS • Agnostic Disambiguation of Named Entities Using Linked Open Data http://aksw.org/Projects/AGDISTIS.html Usbeck et al. AGDISTIS - Agnostic Disambiguation of Named Entities Using Linked Open Data, ECAI, 2015
  135. 135. Software: StarGraph • Predicate Reconciliation (Distributional-semantics based). https://github.com/Lambda-3/Stargraph Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, 2014.
  136. 136. Effective Semantic Parsing for Large KBs
  137. 137. The Vocabulary Problem [Diagram: (Barack Obama, nominated, Sonia Sotomayor), :is_a “First Supreme Court Justice of Hispanic descent”.]
  138. 138. The Vocabulary Problem [Diagram: (Barack Obama, nominated, Sonia Sotomayor), :is_a “First Supreme Court Justice of Hispanic descent”, queried with lexically divergent terms: “Latino origins”, “selected”, “High Judge”, “Obama”, “Last US president”.]
  139. 139. “On our best behaviour” “It is not enough to build knowledge bases without paying closer attention to the demands arising from their use.” Levesque, 2013 “We should explore more thoroughly the space of computations between fact retrieval and full automated logical reasoning.”
  140. 140. Vocabulary Problem for Databases Schema-agnostic query mechanisms
  141. 141. Minimizing the Semantic Entropy for the Semantic Matching Definition of a semantic pivot: the first query term to be resolved in the database. • Maximizes the reduction of the semantic configuration space. • Less prone to more complex synonymic expressions and abstraction-level differences. • The semantic pivot serves as the interpretation context for the remaining alignments. • proper nouns >> nouns >> complex nominals >> adjectives, verbs.
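The pivot ordering on this slide can be sketched as a simple ranking function; the category labels are illustrative stand-ins for whatever the parser's POS/phrase tags are.

```python
# Lower rank = better pivot, following the slide's ordering:
# proper nouns >> nouns >> complex nominals >> adjectives, verbs.
PIVOT_PRIORITY = {"PROPER_NOUN": 0, "NOUN": 1, "COMPLEX_NOMINAL": 2,
                  "ADJECTIVE": 3, "VERB": 3}

def choose_pivot(query_terms):
    """query_terms: list of (term, category) pairs; returns the best pivot,
    i.e. the term resolved first to anchor the remaining alignments."""
    return min(query_terms, key=lambda t: PIVOT_PRIORITY[t[1]])
```

For "Who nominated Sotomayor?", the proper noun would be picked as pivot before the verb, since proper nouns are the least ambiguous entry points.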
  142. 142. Distributional Semantic Relatedness ?
  143. 143. Semantic Relatedness Measure as a Ranking Function A Distributional Approach for Terminological Semantic Search on the Linked Data Web, ACM SAC, 2012
  144. 144. [Architecture diagram: Question → Semantic Parser → Query Plan → Answers; built on a Distributional Inverted Index, a Distributional-Relational Model, reference commonsense corpora, core semantic approximation & composition operations, scalable semantic parsing and learning to rank.]
  145. 145. The query is modeled as a sequence of typed terms q = (t⁰_Γ, …, tⁿ_Γ), with term types Γ = {I, P, C, V} and per-term features such as lexical specificity, number of senses and lexical category.
  146. 146. Adds vector neighborhood density and the semantic differential.
  147. 147. Adds semantic pivoting over the semantic differential.
  148. 148. Adds distributional compositionality.
  149. 149. Interpreting Law at Scale Knowledge Graphs, Open Information Extraction & Text Classification, Question Answering. “Show me cases which refer to article Regulation (EU) No 575/2013.” “Show me cases with similar supporting arguments to Case 2008/900922.”
  150. 150. Question Answering
  151. 151. Question Answering
  152. 152. What to expect (@ QALD1) F1-Score: 0.72 MRR: 0.5 Freitas & Curry, Natural Language Queries over Heterogeneous Linked Data Graphs, IUI (2014).
  153. 153. Addressing the Vocabulary Problem • Hierarchy of approximation spaces – Probabilistic justification. – Semantic pivoting. How hard is the Query? Measuring the Semantic Complexity of Schema-Agnostic Queries, IWCS (2015). A Distributional Approach for Terminological Semantic Search on the Linked Data Web, ACM SAC (2012). Schema-agnostic queries over large-schema databases: a distributional semantics approach, PhD Thesis (2015). On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study, NLIWoD (2015).
  154. 154. How to cope with the vocabulary problem? • Embed resources in distributional spaces. • Use heuristics to minimize approximation errors. • Interaction pattern (semantic best-effort): Search >> Disambiguate >> Learn. Distributional semantics = domain and language transportability, high search recall and relevance ranking.
  155. 155. Software: StarGraph • Semantic parsing. https://github.com/Lambda-3/Stargraph Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, 2014.
  156. 156. Knowledge Graph Completion
  157. 157. The Problem Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  158. 158. The Problem Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  159. 159. Formulating the Distributional- Relational Representation Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  160. 160. Formulating the Distributional- Relational Representation Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  161. 161. Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
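The translation-based formulation behind this family of models (TransE) can be written down directly: a triple (h, r, t) is modeled as h + r ≈ t, and its plausibility score is the distance ||h + r − t|| (lower is more plausible). The vectors below are toy, illustrative values, not learned embeddings.

```python
import math

def transe_score(h, r, t):
    """TransE dissimilarity: Euclidean norm of (h + r - t)."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

# Toy 3-d embeddings (illustrative values).
h = [1.0, 0.0, 0.5]        # head entity
r = [0.0, 1.0, 0.0]        # relation as a translation vector
t_good = [1.0, 1.0, 0.5]   # tail satisfying h + r = t exactly
t_bad = [0.0, 0.0, 0.0]    # unrelated tail
```

Training pushes observed triples toward low scores and corrupted triples toward high scores; link prediction then ranks candidate tails by this distance.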
  162. 162. Complex Relations Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  163. 163. TransD Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  164. 164. KG2E Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  165. 165. Complex Relations: RL4KG Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  166. 166. RL4KG with Entity Descriptions Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  167. 167. RL4KG with Entity Descriptions Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  168. 168. Relation Paths • Complex Inference patterns for composition. Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
  169. 169. Relation Paths
  170. 170. Path-based TransE
  171. 171. Representation of Relation Paths Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
172. 172. Path-based TransE • Composition operations: addition, multiplication, RNN. Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015
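PTransE composes the relations along a multi-hop path into a single vector (e.g. by addition or element-wise multiplication) and treats the result like a direct relation under the translation assumption. A minimal sketch of the composition step, with toy vectors (relation names and values are illustrative):

```python
import numpy as np

def compose_path(relation_vecs, op="add"):
    """Compose a relation path into one vector (PTransE-style composition)."""
    vecs = list(relation_vecs)
    if op == "add":
        return np.sum(vecs, axis=0)
    if op == "mul":
        out = vecs[0].copy()
        for v in vecs[1:]:
            out = out * v  # element-wise multiplication
        return out
    raise ValueError(f"unknown composition op: {op}")

# Under the translation assumption, if born_in(h, c) and city_of(c, t) hold,
# the additive composition of the path should approximate nationality.
born_in     = np.array([0.5, 0.0])
city_of     = np.array([0.0, 0.5])
nationality = np.array([0.5, 0.5])
path_vec = compose_path([born_in, city_of], op="add")
```

The RNN variant named on the slide replaces these fixed operators with a learned recurrent composition; it is omitted here for brevity.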
  173. 173. What to expect (PTransE@FB15K) Relation Prediction
  174. 174. What to expect (PTransE@FB15K) Relation Prediction
  175. 175. Software: KB2E Relation Extraction Knowledge Graph Embeddings including TransE, TransH, TransR and PTransE. https://github.com/thunlp/KB2E Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, Xuan Zhu. Learning Entity and Relation Embeddings for Knowledge Graph Completion. The 29th AAAI Conference on Artificial Intelligence (AAAI'15). KB2E
  176. 176. Natural Language Inference
  177. 177. Recognizing and Justifying Text Entailments (TE) using Definition KGs
178. 178. Distributional semantic relatedness as a Selectivity Heuristic Distributional heuristics target source answer
179. 179. Distributional semantic relatedness as a Selectivity Heuristic Distributional heuristics target source answer
180. 180. Distributional semantic relatedness as a Selectivity Heuristic Distributional heuristics target source answer
  181. 181. Distributional Navigation Algorithm
  182. 182. Pre-Processing
  183. 183. Inference
  184. 184. Generation
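The navigation idea above can be sketched as a greedy walk over a definition graph: at each step, expand only the neighbour most distributionally related to the target and prune the rest. This is a simplified illustration of the selectivity heuristic, not the exact AAAI 2018 algorithm; the graph, terms, and vectors are toy values:

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def navigate(graph, emb, source, target, max_steps=5):
    """Greedy walk from `source` toward `target`; at each step expand the
    neighbour most distributionally related to the target (prune the rest)."""
    path, current = [source], source
    for _ in range(max_steps):
        if current == target:
            return path
        neighbours = [n for n in graph.get(current, []) if n not in path]
        if not neighbours:
            return None
        current = max(neighbours, key=lambda n: cosine(emb[n], emb[target]))
        path.append(current)
    return path if current == target else None

# Toy definition graph and embeddings (illustrative values).
graph = {"dog": ["mammal", "pet"], "mammal": ["animal"], "pet": ["toy"]}
emb = {"dog": np.array([1.0, 0.2]), "mammal": np.array([0.9, 0.9]),
       "pet": np.array([0.1, 1.0]), "animal": np.array([1.0, 1.0]),
       "toy": np.array([0.0, 1.0])}
# navigate(graph, emb, "dog", "animal") follows dog -> mammal -> animal.
```

The returned path doubles as the justification of the entailment, which is what makes this family of models explainable.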
  185. 185. Explainable AI “The right to explanation” “The data subject should have the right not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing …” “… such processing should be subject to suitable safeguards, … to obtain an explanation of the decision …”
  186. 186. What to expect (TE@Boeing-Princeton-ISI) F1-Score: 0.59 What to expect (TE@Guardian Headline Samples) F1-Score: 0.53 Recognizing and Justifying Text Entailment through Distributional Navigation on Definition Graphs, AAAI, 2018.
187. 187. Emerging perspectives • Distributional-relational models in KB completion explored a large range of representation paradigms. – Opportunity for exporting these representation models to other tasks. • Definition-based models can provide a corpus-viable, low-data and explainable alternative to embedding-based models.
  188. 188. Architecture
189. 189. Entity Linking Open IE Taxonomy Extraction Integration Arg. Classif. Co-reference Resolution KG Completion Natural Language Inference Named Entity Recognition Semantic Parsing KG Construction Inference Distributional Semantics Server Query By Example Query spatial temporal probabilistic causal Indexes NL Generation NL Query Answers Explanations Definition Extraction
  190. 190. Entity Linking Integration Co-reference Resolution KG Completion Natural Language Inference Named Entity Recognition Semantic Parsing KG Construction Inference Distributional Semantics Server Query By Example Query spatial temporal probabilistic causal Indexes NL Generation NL Query Answers Explanations M T M T Open IE Taxonomy Extraction Arg. Classif. Definition Extraction https://goo.gl/qEG5Rk
  191. 191. Knowledge Graphs in Use
192. 192. Asked Financial Analysts “What ruins your day?” Customer-ranked pain points. Analyst workday (source: customer meetings, TR internal analyst survey; chart: Assemble 20%, Synthesize 15%, Interpret 20%, Meetings 10%, Financial Modeling 15%, Communicate 20%). “An analyst used to cover 40 firms, now it is 150 and the tools haven’t changed.” (Director, $4B US L/S equity fund) “They all still do it manually” (Equities, Market data team, $20B+ hedge fund) “The problem has gotten worse with more data and more information” (Director, Research/Tech, $10B Multi-strat) What they told us: 1. Information overload 2. Understanding relationships 3. Unable to track impact events 4. Cannot link internal & external research, especially text. Geoff Horrell & Dan Bennett
  193. 193. What’s in the Graph? • Organizations – including names, address, identifiers, Country of HQ/Incorp • Industry Classification • Hierarchy – Parent, Ultimate Parent, Affiliates • Officers & Directors • Job History, Education • Suppliers & Customers • Comparable Companies • Joint Ventures & Strategic Alliances • Meta-Data Geoff Horrell & Dan Bennett
  194. 194. • 125,000,000 (equity instruments & quotes). • 75,000,000 meta-data objects (countries, regions, cities, currencies, commodities, holidays, industry schemes, scripts, languages, time zones, value domains, units, identifiers). • 200,000,000 strategic relationships (supplier, customer, competitor, joint venture, alliance, industry, ownership, affiliate). What’s in the Graph? Geoff Horrell & Dan Bennett
  195. 195. What’s in the Graph? Geoff Horrell & Dan Bennett
  196. 196. All Connections between Onshore Oil Drilling & Venezuela Geoff Horrell & Dan Bennett
  197. 197. One Week Snapshot • 6,778 news articles with company news where at least one organization has 80% relevance to the article. • 135,267 companies are 2 steps away. • 217,387 strategic relationships. • Typical analyst portfolio is 200 companies. • Each customer creates their own relative weights for each type of relationship. • Requires around 800,000 shortest path calculations to deliver the ranked news feed. Each calculation optimised to take 10ms. Geoff Horrell & Dan Bennett
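The ranked news feed described above rests on many shortest-path computations over the relationship graph: a company is relevant to a portfolio roughly in proportion to how few relationship hops separate it from a news entity. A minimal breadth-first hop-count sketch (the graph and company names are invented for illustration; the production system described would use an optimised graph engine to hit the ~10ms budget):

```python
from collections import deque

def shortest_path_len(graph, src, dst):
    """BFS hop count between two nodes of an undirected relationship graph
    (e.g. supplier/customer/competitor edges); None if unreachable."""
    if src == dst:
        return 0
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        for nxt in graph.get(node, []):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return None

# Toy relationship graph: a portfolio company two hops from a news entity.
graph = {"AcmeCorp": ["SupplierX"],
         "SupplierX": ["AcmeCorp", "PortfolioCo"],
         "PortfolioCo": ["SupplierX"]}
# shortest_path_len(graph, "AcmeCorp", "PortfolioCo") -> 2
```

In the scenario on the slide, per-customer relationship weights would turn this into a weighted shortest-path (e.g. Dijkstra) rather than plain BFS.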
  198. 198. Explainable Findings From Tensor Inferences Back to KGs
  199. 199. Explainable Findings From Tensor Inferences Back to KGs
  200. 200. Take-away Message • The evolution of methods, tools and the availability of data in NLP creates the demand for a knowledge representation method to support complex AI systems. • A relaxed version of RDF (RDF-NL?) can provide this answer. – Establishes a dialogue with a standard (with existing data). – Inherits optimization aspects from Databases. • Word-embeddings (DSMs) + compositional models + RDF. • Moving beyond facts and taxonomies: rhetorical structures, arguments, polarity stories, pragmatics.
201. 201. Take-away Message • Syntactic and lexical features can go a long way for structuring text. – Context-preserving. • Integration (entity reconciliation) as semantic best-effort. – Embrace schema on read. • KGs can support explainable AI: – Meeting point between extraction, reasoning and querying. – Definition-based models. • Inherit infrastructures from DB and IR.
202. 202. Take-away Message Opportunities: • ML-orchestrated pipelines with: – Richer discourse-representation models. – Explicit semantic representations (centered on KGs). – Different compositional/distributional models (beyond W2V & GloVe). • KGs and impact on explainability. • Quantifying domain and language transportability.
