2. Overview of Cognitive Computing
Cognitive computing is the simulation of human
thought processes in a computerized model.
What makes a system cognitive?
A cognitive computing system consists of
• Contextual insight from the model
• Hypothesis generation
• Continuous self-learning systems
What are the features of a cognitive system?
• Learns from experience with data/evidence and improves its
knowledge and performance without reprogramming
• Generates and/or evaluates conflicting hypotheses based on the
current state of its knowledge
• Reports on findings in a way that justifies conclusions based on
confidence in the evidence
• Discovers patterns in data, with or without explicit guidance from a
user regarding the nature of the pattern
• Uses NLP to extract meaning from textual data and deep
learning tools to extract features from images, video, voice and
sensors
• Uses a variety of predictive analytics algorithms and statistical
techniques
Cognitive Computing High-Level Process Flow
[Figure: process flow diagram. Numbered components: 1 Ingestion Loop;
2 Dialog Loop; 3 Cognitive Processor (categorize, find patterns and
relationships, match and learn); 4 Context Filters; 5 Exploration
Loop. Other labels: Index, Orchestration, Information Hub.]
3. Elements of Cognitive System
[Figure: Elements of Cognitive System Model. Labels shown: External
Data Resources; Internal Data Sources; Unstructured (text);
Unstructured (videos, images, sensors, sound); Structured (databases);
Corpora; Ontologies; Taxonomies; Processing Services (Language, Image,
Sensors, Voice); Feature Extraction; Machine/Deep Learning; NLP;
Analytics Services (Discovery, Diagnostics, Predictive, Prescriptive);
Generate Hypothesis; Score Hypothesis; Continuous Machine Learning;
Applications & Visualization Services; Data Access, Metadata and
Management Services; Workload Automation Services, APIs, Business
Security Services.]
• Iterative Hypothesis Generation and Evaluation (Learning
Process)
• Data Access, Metadata and Management Services
• Corpus, Taxonomies and Ontologies
• Data Analytics Services & Continuous Machine Learning
• Presentation and Visualization Services
Iterative Hypothesis Generation
• A hypothesis is a testable assertion, based on evidence, that
explains some observed phenomenon. A cognitive computing system
can form hypotheses automatically from a given database and from
external sources such as news, then test them with statistical or
machine learning models to check their statistical significance. If a
hypothesis is found to be significant, the result is reported to the
end user
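The test-and-report step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the hypothesis is taken to be "variable x is related to variable y", and a permutation test on the Pearson correlation stands in for whatever statistical or machine learning model a real system would use. The data values, the function names and the 0.05 threshold are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Estimate how often shuffled data gives a correlation at least as strong."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def evaluate_hypothesis(xs, ys, alpha=0.05):
    """Return a small report; 'significant' means p_value < alpha."""
    p_value = permutation_p_value(xs, ys)
    return {"correlation": pearson(xs, ys), "p_value": p_value,
            "significant": p_value < alpha}


# Hypothetical data: y rises with x, plus a little noise.
x_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
y_values = [2.1, 3.9, 6.2, 8.1, 9.8, 12.3, 13.9, 16.2, 18.1, 19.7, 22.2, 24.0]
report = evaluate_hypothesis(x_values, y_values)
if report["significant"]:
    print("Report to user:", report)
```

A permutation test is used here because it needs no distributional assumptions and no third-party library; any other significance test could take its place.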
Data Access, Metadata and Management Services
• Manages the various internal and external data sources, both
structured (databases) and unstructured (text, videos, sensors,
audio, etc.), hosted either on data servers or in the cloud
4. Elements of Cognitive System
[Figure: Elements of Cognitive System Model, repeated from the
previous slide.]
Corpus, Taxonomies & Ontologies
• The corpus is the knowledge base of ingested data and is used to
manage codified knowledge. It includes the data required to
establish the domain for the system. A corpus may include
ontologies, which define specific entities and their relationships.
Ontologies are often developed by industry groups to classify
industry-specific elements such as standard chemical
compounds, machine parts, or medical diseases and treatments.
A taxonomy works hand in hand with an ontology and provides
context within it
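One possible way to picture the difference between a taxonomy and an ontology is as two small data structures. The sketch below is an assumption for illustration only: the medical terms, the relation names and the helper functions are hypothetical and are not taken from any real ontology.

```python
# Taxonomy: each term points to its broader parent category (hypothetical entries).
taxonomy = {
    "aspirin": "analgesic",
    "ibuprofen": "analgesic",
    "analgesic": "drug",
    "drug": "treatment",
}

# Ontology: (subject, relation, object) triples between entities (hypothetical).
ontology = [
    ("aspirin", "treats", "headache"),
    ("ibuprofen", "treats", "inflammation"),
    ("headache", "is_symptom_of", "migraine"),
]


def ancestors(term):
    """Walk up the taxonomy from a term to its broadest category."""
    path = []
    while term in taxonomy:
        term = taxonomy[term]
        path.append(term)
    return path


def related(entity):
    """Return every ontology triple that mentions the entity."""
    return [t for t in ontology if entity in (t[0], t[2])]


print(ancestors("aspirin"))  # ['analgesic', 'drug', 'treatment']
print(related("headache"))
```

The taxonomy supplies the "is a kind of" context for a term, while the ontology triples carry the named relationships between entities.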
Data Analytics Services & Continuous Machine
Learning
• Data analytics services are the techniques used to gain an
understanding of the data ingested and managed within the
corpus. Typically, users can take advantage of structured,
unstructured and semi-structured data that has been ingested.
Continuous machine learning, in turn, creates new hypotheses
whenever new data, analysis or interaction occurs, and the
models are updated on a continual basis
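The idea of a model that is updated continually, rather than retrained from scratch, can be sketched with an online learner. This is a minimal sketch, assuming a one-feature logistic regression trained by stochastic gradient steps; the class name, the learning rate and the data stream are hypothetical, and a real system could use any incremental algorithm.

```python
import math


class OnlineLogisticModel:
    """One-feature logistic regression updated one observation at a time."""

    def __init__(self, learning_rate=0.5):
        self.weight = 0.0
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        """Probability that the label is 1 for input x."""
        return 1.0 / (1.0 + math.exp(-(self.weight * x + self.bias)))

    def update(self, x, y):
        """Single stochastic gradient step on the log loss for one example."""
        error = y - self.predict(x)
        self.weight += self.learning_rate * error * x
        self.bias += self.learning_rate * error


model = OnlineLogisticModel()
# Hypothetical stream: negative readings are labelled 0, positive ones 1.
stream = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)] * 20
for x, y in stream:
    model.update(x, y)  # the model changes as each new observation arrives

print(round(model.predict(2.0), 3), round(model.predict(-2.0), 3))
```

Each call to `update` uses only the newest observation, so the model can keep learning for as long as data keeps arriving.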
Presentation & Visualization Services
• Presents data in a visual format, lets users change the metrics
interactively, and helps them understand the data visually
5. Automated Hypothesis Generation
[Figure: process flow. Word tokenizer, stop word removal, entity
recognizer and TF-IDF; then word-to-word cosine similarity matrix;
then choosing the root word; then select the K nearest words
(typically K = 2 for binary trees) using cosine similarity or
Euclidean distance and create left and right branches/leaves; then
repeat the process iteratively until a preset no. of leaves or a
threshold correlation is reached.]
• The objective is to generate insights and form hypotheses from
unstructured data such as web pages, social media data and
Wikipedia
• In the first stage, all documents are converted into a readable
format, followed by tokenization, stop word removal and
recognition of nouns using NER; TF-IDF weights are then computed
for all documents. The output of this stage is an M x N matrix
(M = no. of documents, N = no. of words)
• In the second stage, a word-to-word cosine similarity matrix
(N x N) is created by comparing each word vector with every other
• The root word is identified manually or based on context from the
N x N matrix and becomes the root node of a tree, which is
constructed from the N x N matrix
• A binary tree is generated starting from the root node, adding left
and right branches iteratively until a preset threshold or no. of
nodes is reached
• The tree structure can reveal previously unknown relationships
that are not obvious
• Relationships can be verified against the database to validate the
significance of the hypothesis; any significant findings are
reported to the user
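The stages above can be sketched end to end in plain Python. This is a simplified illustration, not the method used in the slides: entity recognition is omitted, the stop word list is a small hand-written set, the TF-IDF uses a smoothed IDF variant, the tree is grown breadth-first, and growth stops on a node count. The documents, the function names and the choice of "laundering" as root word are all hypothetical.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "a", "of", "and", "is", "to", "in", "with", "by", "often"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def tfidf_matrix(docs):
    """Return (vocabulary, M x N matrix) of TF-IDF weights."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    doc_freq = Counter(w for doc in tokenized for w in set(doc))
    n_docs = len(docs)
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        length = max(len(doc), 1)
        matrix.append([
            (counts[w] / length) * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1)
            for w in vocab
        ])
    return vocab, matrix


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(matrix):
    """N x N cosine similarity between word vectors (columns of the M x N matrix)."""
    columns = list(zip(*matrix))
    return [[cosine(u, v) for v in columns] for u in columns]


def build_tree(root, vocab, sim, max_nodes=7, k=2):
    """Breadth-first: attach the k most similar unused words to each node."""
    index = {w: i for i, w in enumerate(vocab)}
    tree, used, queue = {}, {root}, [root]
    while queue and len(used) < max_nodes:
        node = queue.pop(0)
        candidates = sorted(
            (w for w in vocab if w not in used),
            key=lambda w: sim[index[node]][index[w]],
            reverse=True,
        )
        children = candidates[: min(k, max_nodes - len(used))]
        tree[node] = children
        used.update(children)
        queue.extend(children)
    return tree


# Hypothetical mini-corpus.
docs = [
    "Money laundering is often linked to drugs trafficking",
    "Corruption and laundering of money in banks",
    "Drugs trafficking funds corruption",
    "Banks report suspicious money transfers",
]
vocab, matrix = tfidf_matrix(docs)
sim = similarity_matrix(matrix)
tree = build_tree("laundering", vocab, sim)  # root word chosen manually
for parent, children in tree.items():
    print(parent, "->", children)
```

Each parent and child pair in the printed tree is a candidate relationship, which would then be checked against the actual data before anything is reported.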
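[Figure: Automated Hypothesis Generation – Cognition from Text Data.
Panels: a document-by-word matrix (rows Doc 1, Doc 2, ..., Doc N;
columns Word 1, Word 2, ..., Word K); a word-by-word matrix (Word 1,
Word 2, ..., Word K on both axes); and an example tree with root
"Laundering" and branches "Drugs" and "Corruption".]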
6. Interactive Self-Service Machine Learning Tool
• The interactive tool provides solutions to
customers for given datasets at the click of a
button
• The tool integrates tightly with analytical
tools and with cognition based on unstructured
external sources such as Wikipedia pages, blogs,
books and journals
• At every stage, the customer can choose or
type the analytical technique to use
• An agent assists the user at every step in
navigating appropriately
• The tool can visualize the relations between
various entities found in unstructured data
• Significant relations between variables, identified
from the corpus and validated with actual datasets,
are displayed in the tool
[Figure: mockup screens of the Interactive Cognitive Computing Machine
Learning Tool. The assistant asks "How can I help you ?" above a
"Type your question here …" input box. The conversation then runs:]
Me: I would like to do Analysis
Bot: Great, Supervised or Unsupervised ?
(Prompt: "Select one option !"; buttons: Supervised, Unsupervised)
Me: Supervised
(Prompt: "Select Algorithm !"; buttons: Logistic Regression, Random
Forest)
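The guided conversation in the mockup could be modelled as a small state machine. This is a minimal sketch under stated assumptions: the state names, the `run_dialog` helper and the closing "done" prompt are hypothetical, and the options for the unsupervised branch are left empty because the slides do not show them.

```python
# Each state holds the bot prompt and the options that lead to the next state.
DIALOG = {
    "start": {"prompt": "How can I help you?",
              "options": {"I would like to do Analysis": "learning_type"}},
    "learning_type": {"prompt": "Great, Supervised or Unsupervised?",
                      "options": {"Supervised": "supervised_algorithm",
                                  "Unsupervised": "unsupervised_algorithm"}},
    "supervised_algorithm": {"prompt": "Select Algorithm!",
                             "options": {"Logistic Regression": "done",
                                         "Random Forest": "done"}},
    # Options for this branch are not shown in the slides.
    "unsupervised_algorithm": {"prompt": "Select Algorithm!", "options": {}},
    # Hypothetical closing prompt.
    "done": {"prompt": "Running the selected analysis.", "options": {}},
}


def run_dialog(user_inputs, state="start"):
    """Replay user inputs through the dialog and return the transcript."""
    transcript = []
    for text in user_inputs:
        node = DIALOG[state]
        transcript.append(("Bot", node["prompt"]))
        transcript.append(("Me", text))
        # Unrecognized input keeps the user on the same step.
        state = node["options"].get(text, state)
    transcript.append(("Bot", DIALOG[state]["prompt"]))
    return state, transcript


final_state, transcript = run_dialog(
    ["I would like to do Analysis", "Supervised", "Random Forest"])
for speaker, line in transcript:
    print(f"{speaker}: {line}")
```

Keeping the prompts and options in one table means a new step or algorithm can be added without touching the replay logic.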