
Smart Data Webinar: Organizing Data and Knowledge - The Role of Taxonomies and Ontologies


No single approach to knowledge classification and access is best for every application.
This webinar will help participants choose the right approach(es) to support their own cognitive computing application.

The science and engineering of data management for computational efficiency is well understood. We have algorithms and heuristics to pre-fetch data and instructions and distribute them based on properties of the algorithms, data sets, applications, and system software and hardware. We have decades of experience fine-tuning hardware, networks, operating systems, compilers and applications based on physics. Now we need to start thinking in terms of biology.

Fortunately, we don’t have to actually model the 100B neurons or 100-500 trillion synapses in the human brain in hardware or software. We do need a well-specified knowledge model to organize refined data based on how we expect to query and further refine it. What we store constrains which questions a cognitive system may be able to answer. How we organize this knowledge may determine whether our system can answer questions or generate hypotheses efficiently or effectively.


Smart Data Webinar: Organizing Data and Knowledge - The Role of Taxonomies and Ontologies

  1. Organizing Data & Knowledge: The Role of Taxonomies & Ontologies. Adrian Bowles, PhD, Founder, STORM Insights, Inc.; Lead Analyst, AI, Aragon Research. info@storminsights.com. August 10, 2017. Copyright (c) 2017 by STORM Insights Inc. All Rights Reserved.
  2. AGENDA - ORGANIZING DATA AND KNOWLEDGE: 4 QUESTIONS. What Are We Trying to Accomplish? Why Is It SO Important? How Do We Do It Today? How Will It Be Done In The Future?
  3. THE GOAL IS TO BUILD SYSTEMS THAT UNDERSTAND AND LEARN (diagram labels: Learn, Plan, Reason, Understand, Model)
     • Plan (v): Identify a goal/desired state and a set of steps/activities to reach that state.
     • Reason (v): An evidence-based process for determining the truth or probability of a conclusion. Deductive: top-down reduction; results are certain. Inductive: bottom-up generalizations, creating hypotheses with confidence levels/probability. Abductive: bottom-up, probabilistic development of theories from observations.
     • Understand: Awareness of the meaning of data.
     • Learn: To acquire understanding of data.
  4. COGNITIVE COMPUTING FUNDAMENTALS
     • Model: The corpus, assumptions, and algorithms used to generate and score hypotheses or calculate the strength of a relationship.
     • Corpus: The complete machine-readable record of a domain.
     • Distance: A metric for the similarity of two items based on their relative locations in n-space according to an algorithm.
     • Assumptions: Implicit or explicit data or relationships held to be valid.
     • Hypothesis: An evidence-based testable assertion that explains a phenomenon or relationship.
  5. COGNITIVE COMPUTING FUNDAMENTALS: SAMPLE ASSUMPTIONS. Model: The corpus, assumptions, and algorithms used to generate and score hypotheses or calculate the strength of a relationship.
     • Principles that control the development and representation of natural intelligence in the neocortex provide a guide to the implementation of machine intelligence. (Numenta Hierarchical Temporal Memory)
     • A function applied to a string representing data or a concept results in a value or vector meaningful for comparison, e.g. using Kolmogorov complexity to measure the strength of relationships in Memory-Based Reasoning.
  6. PAST AND (RE)PRESENT. No taxation without representation. No Machine Intelligence without (knowledge) representation.
  7. FUNDAMENTAL PRINCIPLES
     • “When the map and the terrain disagree, believe the terrain.” Gause and Weinberg (Exploring Requirements)
     • “It is the pervading law of all things organic, and inorganic, of all things physical and metaphysical, of all things human and all things superhuman, of all true manifestations of the head, of the heart, of the soul, that the life is recognizable in its expression, that form ever follows function. That is the law.” Louis Sullivan, The Tall Office Building Artistically Considered, 1896
  8. WHAT IS KNOWLEDGE? (BEYOND DATA) Knowledge may include facts or beliefs and general information with context.
  9. HOW DO PEOPLE COPE WITH IMPRECISION? Under ideal conditions, people are good, but not perfect, when communicating in natural languages. (Diagram labels: X, X’, Y) We…
     • understand in context (environment & our own frame of reference)
     • attempt to resolve ambiguity
     • have to deal with competing signals, noise
     • fill in words and meaning, and may not hear/understand what was said/meant…
  10. THINK VS REPRESENT
  11. CHOICES HAVE CONSEQUENCES. How You Think About a Domain… …influences your choice of maps and models… rules and representations…and required operations.
  12. COMMON SENSE VS COMMON KNOWLEDGE. In the News.
  13. WHAT DO YOU SEE? (Classic AI)
  14. WHAT DO YOU SEE? (Classic AI) 83 Birds. 1 Flock. A Carwash in Your Future.
  15. WHAT DO YOU SEE, AND WHAT DO YOU FEEL? Motorcycles. A Race. Noises. Smells. Pollution. What you capture depends on your context and experiences.
  16. WHAT DO YOU SEE?
  17. WHAT DO YOU SEE? Model. Toy. Airplane. Memory.
  18. ONE SIZE FITS ONE. “To solve really hard problems, we'll have to use several different representations. This is because each particular kind of data structure has its own virtues and deficiencies, and none by itself would seem adequate for all the different functions involved with what we call common sense.” Marvin Minsky
  19. TWO THINGS NOBODY TELLS YOU ABOUT DATA…
     • All data is structured. Google used a neural network with 16,000 processors to search 10,000,000 images from YouTube to identify…cats.
     • Beliefs change, truth doesn’t. Representing belief as fact will eventually trip up any system. “Facts change in regular and mathematically understandable ways.” Samuel Arbesman, The Half-life of Facts, 2012, Penguin Books.
  20. DEEP STRUCTURE REQUIRES STRONGER METHODS FOR ANALYSIS TO FIND CONCEPTS. Perception: obvious structure is easy to process… but most of the interesting stuff isn’t obvious to a computer.
  21. IS THERE A UNIVERSAL SOLUTION?
  22. IF YOU ONLY HAVE A HAMMER, EVERY PROBLEM LOOKS LIKE A NAIL
  23. DIFFERENT DOMAINS, AND DIFFERENT APPLICATIONS, CALL FOR DIFFERENT REPRESENTATIONS
  24. WHY THIS ORGANIZATION?
  25. SOMETIMES YOU HAVE TO IMPROVISE
  26. START WITH A TAXONOMY. A taxonomy represents the formal structure of classes or types of objects within a domain.
     • Taxonomies are generally hierarchical and provide names for each class in the domain.
     • They may also capture the membership properties of each object in relation to the other objects.
     • The rules of a specific taxonomy are used to classify or categorize any object in the domain, so they must be complete, consistent, and unambiguous. This rigor in specification should ensure that any newly discovered object must fit into one, and only one, category or object class.
  27. ONTOLOGIES. An ontology formalizes and specifies the names, definitions, and attributes of entities within a domain. For practical purposes, an accepted ontology defines the domain.
  28. ONTOLOGIES EVOLVE - SYSTEMS MUST BE FLEXIBLE
  29. DESIGN CHOICE: FINDING SYMBOLS VS USING STATISTICS. Diagram labels: Symbolic Logic Representations Reasoning Concepts Statistical Models Mechanical Theorem Proving
  30. BASICS OF THE W3C SEMANTIC WEB ONTOLOGY STACK. THE SEMANTIC WEB - ALL DATA SHOULD BE ASSOCIATED WITH SEMANTIC ATTRIBUTES (MEANING). https://www.w3.org/2013/data/
     • RDF - Resource Description Framework: a directed, labeled graph.
     • RDFS - RDF Schema: a language for representing RDF vocabularies.
     • SPARQL - SPARQL Protocol and RDF Query Language: a protocol and query language for RDF data.
     • OWL - The Web Ontology Language is a Semantic Web language designed to represent knowledge about things and relationships between things on the Web. An OWL document is an ontology.
  31. COMMERCIAL SOLUTIONS. Copyright (c) Digital Reasoning.
  32. COMMERCIAL SOLUTIONS. Copyright (c) Digital Reasoning.
  33. RECOGNIZING CONCEPTS VS UNDERSTANDING. Courtesy of LoopAI Labs.
  34. RECOGNITION IS NOT UNDERSTANDING. https://arxiv.org/abs/1112.6209
  35. PROXIMITY/DISTANCE ALGORITHMS. Memory-Based Reasoning. Mapped with vectors, proximity algorithm based on purpose. Mapping for autocorrect/complete vs mapping for meaning (the same words appear in both mappings): Boy, Bay, Map, Mop, Man, Nay, May, Mope, Buy, Hop, Hope. Similar structure -> similar meaning in vision, not always in language.
  36. GETTING STARTED: CLASSIFICATION NEEDS VALIDATION & VERIFICATION. Diagram labels: Model, Classify, Hypothesize, Analyze or Synthesize, Perceive.
  37. USE PRE-BUILT KNOWLEDGE RESOURCES
  38. OPENCYC
  39. OFF THE SHELF KNOWLEDGE - NEED TO ASSOCIATE/RECOGNIZE/UNDERSTAND TO ORGANIZE/REPRESENT. WordNet(R): Princeton University "About WordNet." Princeton University. 2010. <http://wordnet.princeton.edu>
  40. STUCK? TRANSLATE, TRANSFORM, TRANSMOGRIFY. Analogy. Abstraction. Is-A vs Has-A.
  41. GRAPHS SHOULD BE PART OF YOUR TOOLKIT. A graph is a structure with vertices and edges. May be labeled, edges may be directed, all may be stored/processed by properties represented as key/value pairs.
     • Vertices (diagram): a, b, c, d, e
     • Edge labels (diagram): Old Post Road, Cross Highway, Compo, Shinbone Alley, Elk Road
     • Edge properties: Old Post Road: paved, 11 miles. Elk Road: dirt, 2 miles. Cross Highway: toll road, 250 miles. Main Street: 1 mile. Shinbone Alley: .5 miles.
     • Vertex properties: a: bus stop. b: gas station, Shell. c: elementary school. d: house. e: office building.
  42. GRAPHS 101. SORT OF. You Probably Already Think In Graphs if… You watch detective shows. You know trivia about movies. You remember relationships between people. You took a biology class.
  43. GRAPHS 101. SORT OF. Wikipedia contributors. "Taxonomy (biology)." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 11 May. 2016. Web. 12 May. 2016.
  44. GRAPHS 101. SORT OF
  45. GRAPHS 101. SORT OF. Family Tree. LinkedIn Tree.
  46. GRAPHS 101. SORT OF. Typical crazy wall whiteboard - from Fargo. A screen from IBM I2 Coplink.
  47. BUY OR BUILD IT YOURSELF WITH… Commercial tools; Open Source tools; Prebuilt data; Graphs…
  48. IF THE INTELLIGENCE IS ARTIFICIAL, WHY OBSESS ABOUT UNDERSTANDING? Sentiment/Emotion/Theme/Concept Analysis. Don’t let the search for perfection interfere with the path to progress.
  49. KEEP IN TOUCH. adrian@storminsights.com, adrian@aragonresearch.com. Twitter @ajbowles. Skype ajbowles. If you would like to connect on LinkedIn, please let me know that you registered for the Smart Data webinar series. New Content from Aragon Research: AragonResearch.com
     Upcoming SmartData Webinar Dates & Topics:
     • Sept. 14: Advances in Natural Language Processing II: NL Generation
     • Oct. 12: Choosing the Right Data Management Architecture for Cognitive Computing
     • Nov. 9: See Me, Feel Me, Touch Me, Heal Me: The Rise of the Cognitive Interface
  50. #MODERNAI END OF DECK
  51. NATURAL LANGUAGE PROCESSING (NLP). Natural Language Understanding (NLU). Natural Language Generation (NLG). ?
  52. DEEP NLU - DEEP QA WITH IBM WATSON. Question Analysis: What is being asked? Question classification: any words with double meanings? Puzzle question, factoid…? Detect focus, LAT, relations.
  53. NLU COMPONENTS. Google Cloud NLP. Focus: Extract meaning.
  54. BUILD IT YOURSELF WITH… IBM Watson Conversation
  55. AGI MINIMUM REQUIREMENTS
     • Big Processing + Big Data (Reasoning, KM…): With sufficient processing power, and access to enough clean, validated data, just in time knowledge acquisition.
     • or Big Knowledge + Modest Processing (Reasoning, KM…): Starting with sufficient knowledge (includes the model with assumptions) makes processing requirements relatively modest to accommodate incremental activities.
  56. Content Acquisition: Building the corpus. For Jeopardy! this had to be completed before the game commenced. Ingested encyclopedias, dictionaries, thesauri, newswire articles, literary works, databases, taxonomies, ontologies… IRL, we can identify and use new resources based on the problem at hand.
  57. CASE STUDY: DEEP QA WITH IBM WATSON. Question Analysis: What is being asked? Question classification: any words with double meanings? Puzzle question, factoid…? Detect focus, LAT, relations.
  58. CASE STUDY: DEEP QA WITH IBM WATSON. Relation detection. Category: Head North. “They’re the two states you could be reentering if you’re crossing Florida’s northern border.” borders(Florida, ?x, north)
  59. CASE STUDY: DEEP QA WITH IBM WATSON
  60. CASE STUDY: DEEP QA WITH IBM WATSON
  61. CASE STUDY: DEEP QA WITH IBM WATSON. Hypothesis Generation & Scoring: Use a candidate answer with the question, try to prove correct with a degree of confidence supported by the evidence. Scoring may use a variety of relationships: temporal, spatial, geospatial, taxonomic classification, correlation between candidate and question…
  62. CASE STUDY: DEEP QA WITH IBM WATSON. “Chile shares its longest land border with this country.”
  63. CASE STUDY: DEEP QA WITH IBM WATSON. Evaluating Potential Answers: Watson scores evidence in multiple dimensions. What works for a factoid question may not work for a puzzle question. “Chile shares its longest land border with this country.”
  64. CASE STUDY: DEEP QA WITH IBM WATSON
  65. CASE STUDY: DEEP QA WITH IBM WATSON. Merging & Ranking: Identifying the most likely answer based on confidence scores. Answer scores are merged before ranking and confidence estimation. Uses ML approach to compare with training set data when confidence scores in different categories result in “too close to call” results.
  66. GRAPHS 101. SORT OF. Wikipedia contributors. "Graph database." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 11
  67. LEXICAL ANSWER TYPE DISTRIBUTION. Predicting lexical answer types in open domain question and answering (qa) systems, US 20130035931 A1, 2013, Ferrucci, Gliozzo, Kalyanpur
  68. DEEP QA VS SEARCH
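
Slide 5 mentions using Kolmogorov complexity to measure the strength of relationships. Kolmogorov complexity itself is not computable, so the sketch below swaps in a common approximation that the deck does not name: the normalized compression distance, computed here with zlib. The helper names and sample strings are assumptions made for illustration only.

```python
import hashlib
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a rough stand-in for complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower means the inputs share more structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    return (compressed_size(x + y) - min(cx, cy)) / max(cx, cy)


# Illustrative inputs (assumptions, not from the deck).
text = b"No machine intelligence without knowledge representation. " * 20
near_copy = text.replace(b"machine", b"Machine", 1)
# Hash output serves as deterministic, hard-to-compress unrelated data.
noise = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(32))

print("text vs near copy:", round(ncd(text, near_copy), 3))
print("text vs noise:    ", round(ncd(text, noise), 3))
```

The near copy should score much closer to the original than the unrelated bytes do, which is the sense in which a function over strings can yield "a value meaningful for comparison".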
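
Slide 26 says a taxonomy's rules should place any object into one, and only one, class. A minimal sketch of that check follows, assuming a toy vehicle taxonomy; the class names, attributes, and rules are all hypothetical and chosen only to show the "exactly one match" requirement.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical hierarchy: each class names its parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "vehicle": None,
    "motor vehicle": "vehicle",
    "motorcycle": "motor vehicle",
    "car": "motor vehicle",
    "bicycle": "vehicle",
}

# One membership rule per leaf class; attribute names are made up for this sketch.
RULES: Dict[str, Callable[[dict], bool]] = {
    "motorcycle": lambda o: o["engine"] and o["wheels"] == 2,
    "car": lambda o: o["engine"] and o["wheels"] == 4,
    "bicycle": lambda o: not o["engine"] and o["wheels"] == 2,
}


def classify(obj: dict) -> str:
    """Return the single leaf class whose rule matches, or fail loudly."""
    matches = [name for name, rule in RULES.items() if rule(obj)]
    if len(matches) != 1:
        # Zero matches: rules are incomplete. Several matches: rules are ambiguous.
        raise ValueError(f"expected exactly one class, got {matches}")
    return matches[0]


def lineage(cls: str) -> List[str]:
    """Walk from a class up to the root of the hierarchy."""
    path = [cls]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


leaf = classify({"engine": True, "wheels": 2})
print(leaf, "->", lineage(leaf))
```

An object that no rule covers, such as a three-wheeled vehicle here, raises an error, which is how a gap in the rule set would surface in practice.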
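
Slide 30 describes RDF as a directed, labeled graph, and slide 58 shows a relation pattern with a variable. The sketch below imitates that idea with plain Python tuples; it is not RDF or SPARQL, and the predicate names are invented for illustration. `None` plays the role of a variable such as `?x`.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Illustrative facts; the predicate names are assumptions, not a real vocabulary.
TRIPLES: List[Triple] = [
    ("Florida", "bordersToNorth", "Georgia"),
    ("Florida", "bordersToNorth", "Alabama"),
    ("Florida", "isA", "State"),
    ("Georgia", "isA", "State"),
    ("Alabama", "isA", "State"),
]


def match(s: Optional[str], p: Optional[str], o: Optional[str]) -> Iterator[Triple]:
    """Yield every triple that agrees with the pattern; None matches anything."""
    for triple in TRIPLES:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# Which states lie across Florida's northern border?
north_of_florida = sorted(obj for _, _, obj in match("Florida", "bordersToNorth", None))
print(north_of_florida)
```

A real deployment would more likely use an RDF store and SPARQL; the point here is only that each fact is an edge with a label, and a query is a pattern with holes.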
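
Slide 35 contrasts mapping words for autocorrect with mapping them for meaning. The slide does not name an algorithm, so the sketch below assumes Levenshtein edit distance as the autocorrect-style metric and applies it to the slide's own word list. It shows why spelling proximity says little about meaning: the nearest neighbors of a word are simply those one keystroke away.

```python
from typing import List

WORDS = ["Boy", "Bay", "Map", "Mop", "Man", "Nay", "May", "Mope", "Buy", "Hop", "Hope"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free when characters match
            ))
        prev = curr
    return prev[-1]


def nearest(word: str, vocabulary: List[str]) -> List[str]:
    """Return the vocabulary words (other than `word`) at the smallest edit distance."""
    others = [w for w in vocabulary if w != word]
    best = min(edit_distance(word, w) for w in others)
    return [w for w in others if edit_distance(word, w) == best]


print(nearest("Mop", WORDS))
```

A mapping for meaning would need a different distance, for example one over learned word vectors, which is the slide's point that the proximity algorithm depends on purpose.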
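
Slide 41 describes a graph whose vertices and edges carry properties as key/value pairs. A minimal sketch of such a property graph follows. The vertex and road properties come from the slide, but the transcript does not say which vertices each road connects, so the endpoints below are assumptions, as are the property key names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Edge:
    source: str
    target: str
    props: Dict[str, object] = field(default_factory=dict)


# Vertex properties as listed on the slide.
VERTICES: Dict[str, Dict[str, str]] = {
    "a": {"kind": "bus stop"},
    "b": {"kind": "gas station", "brand": "Shell"},
    "c": {"kind": "elementary school"},
    "d": {"kind": "house"},
    "e": {"kind": "office building"},
}

# Road properties are from the slide; the endpoints are assumed for illustration.
EDGES: List[Edge] = [
    Edge("a", "b", {"name": "Old Post Road", "surface": "paved", "miles": 11}),
    Edge("b", "c", {"name": "Elk Road", "surface": "dirt", "miles": 2}),
    Edge("c", "d", {"name": "Shinbone Alley", "miles": 0.5}),
    Edge("d", "e", {"name": "Cross Highway", "toll": True, "miles": 250}),
]


def edges_where(**wanted: object) -> List[Edge]:
    """Select edges whose properties include every requested key/value pair."""
    return [e for e in EDGES if all(e.props.get(k) == v for k, v in wanted.items())]


def neighbors(vertex: str) -> List[str]:
    """Vertices reachable by following one directed edge out of `vertex`."""
    return [e.target for e in EDGES if e.source == vertex]


print([e.props["name"] for e in edges_where(surface="paved")])
print(neighbors("b"), VERTICES["b"])
```

Storing properties as free-form key/value pairs keeps the structure flexible: a new attribute such as a speed limit can be added to one road without changing any other edge.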
