"From IA to AI in Healthcare" - Walter De Brouwer (CEO/Founder, doc.ai/Scanadu)
Delivered at the inaugural Hyper Wellbeing Summit, 14th November 2016, Mountain View, California.
For more information including details of subsequent events, please visit http://hyperwellbeing.com
The summit was created to foster a community around an emerging industry - Wellness as a Service (WaaS). Consumer technologies, in particular wearables and mobile, are powering a consumer revolution. A revolution to turn health and wellness into platform delivered services. A revolution enabling consumer data-driven disease risk reduction. A revolution extending health care past sick care towards consumer-led lifelong health, wellness and lifestyle optimization.
WaaS newsletter sign-up http://eepurl.com/b71fdr
@hyperwellbeing
1.
2. The human intellect is like peacock feathers,
just an extravagant display intended to attract a mate.
Oh, we think we are so great... aha...
but a peacock can barely fly and eats insects.
9. We now have supercomputers with thousands of cores (GPUs):
3584 × 4 = 14,336 CUDA cores, at roughly $1/core
10.
11. import nltk

sentence = """This is what I have to you
..DL approach to NLP."""
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)
entities = nltk.chunk.ne_chunk(tagged)

from nltk.corpus import treebank
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()
As an alternative ...
import csv
import itertools

import nltk
import numpy as np

vocabulary_size = 8000
unknown_token = "UNKNOWN_TOKEN"
sentence_start_token = "SENTENCE_START"
sentence_end_token = "SENTENCE_END"

# Read the data and append SENTENCE_START and SENTENCE_END tokens
print("Reading my CV file...")
with open('CV-EN_v0.6.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header row
    # Split full comments into sentences
    sentences = itertools.chain(*[nltk.sent_tokenize(x[0].lower()) for x in reader])
    # Append SENTENCE_START and SENTENCE_END
    sentences = ["%s %s %s" % (sentence_start_token, x, sentence_end_token) for x in sentences]
print("Parsed %d sentences." % len(sentences))

# Tokenize the sentences into words
tokenized_sentences = [nltk.word_tokenize(sent) for sent in sentences]

# Count the word frequencies
word_freq = nltk.FreqDist(itertools.chain(*tokenized_sentences))
print("Found %d unique word tokens." % len(word_freq.items()))

# Get the most common words and build index_to_word and word_to_index vectors
vocab = word_freq.most_common(vocabulary_size - 1)
index_to_word = [x[0] for x in vocab]
index_to_word.append(unknown_token)
word_to_index = dict((w, i) for i, w in enumerate(index_to_word))
print("Using vocabulary size %d." % vocabulary_size)
print("The least frequent word in our vocabulary is '%s' and appeared %d times." % (vocab[-1][0], vocab[-1][1]))

# Replace all words not in our vocabulary with the unknown token
for i, sent in enumerate(tokenized_sentences):
    tokenized_sentences[i] = [w if w in word_to_index else unknown_token for w in sent]

print("\nExample sentence: '%s'" % sentences[0])
print("\nExample sentence after pre-processing: '%s'" % tokenized_sentences[0])

# Create the training data (each target is the input shifted by one word)
X_train = np.asarray([[word_to_index[w] for w in sent[:-1]] for sent in tokenized_sentences])
y_train = np.asarray([[word_to_index[w] for w in sent[1:]] for sent in tokenized_sentences])
Do I have your attention now?
13. Government is aligned
The Chinese Academy of Sciences has issued invitations to apply for funding for PMI (Precision Medicine Initiative) projects worth $9.2 bn.
The NIH is funding PMI projects worth $215 M.
14. MEDICAL RECORD: The unfair fight for what is ours
IHTFP
SCRAPE:
50% self-reported information
20% biomarkers (blood, urine results)
10% scans
20% doctor notes
RECREATE: 70% / 30%
19. CONVERSATIONAL SEARCH
1. Faceted (allowing users to explore a collection of information by applying multiple filters)
2. Autocomplete
3. Multimode
4. Speaker analysis
5. Goal-oriented dialogue
6. Longitudinal representation (micro-moments which are intensive-rich)
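The first item above, faceted search, can be sketched in a few lines: each selected facet narrows the collection further. This is a minimal illustration only; the records and facet names below are invented for the example.

```python
# Toy collection of "record" dicts (entirely made up for illustration).
records = [
    {"type": "lab", "topic": "blood", "year": 2015},
    {"type": "lab", "topic": "urine", "year": 2016},
    {"type": "note", "topic": "blood", "year": 2016},
]

def facet_filter(items, **facets):
    """Keep only items matching every selected facet value."""
    return [it for it in items
            if all(it.get(k) == v for k, v in facets.items())]

# Each added facet narrows the result set further.
print(len(facet_filter(records, topic="blood")))            # 2 records
print(len(facet_filter(records, topic="blood", year=2016))) # 1 record
```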
21. KR/Ontology
‣ LOINC (clinical values)
‣ ICD9, ICD10 (diagnoses)
‣ CPT (medical services)
‣ SNOMED (EHR terms)
‣ RXNORM (medications)
‣ MeSH (how to combine structured vocabularies with data-driven techniques)
Memory
‣ MemNN
‣ app
‣ FSM
‣ Pontifex Maximus
Association mining: Location-aware (Item-item CF recommender using DSSTNE)
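The item-item collaborative filtering mentioned above can be sketched without any framework (DSSTNE itself is Amazon's deep learning recommender library and is not shown here); this is just the classical cosine-similarity form of the idea, on an invented ratings matrix.

```python
from math import sqrt

# Made-up ratings: user -> {item: rating}.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5},
    "u3": {"B": 2, "C": 5},
}

def item_vector(item):
    """Column of the ratings matrix: which users rated this item, and how."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[u] * b[u] for u in common)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den

# Items whose rating patterns are similar get recommended together:
# here A's ratings track B's more closely than C's.
print(cosine(item_vector("A"), item_vector("B")) >
      cosine(item_vector("A"), item_vector("C")))
```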
22. IHTFP
RESUME
Name: doc
Nationality: AI
Age: 1.577e+10 ms
Weight: 28.085 (atomic 28.084–28.086)
Address: Exxact Spectrum TXR410-0032R
Family: NVIDIA DIGITS™
IQ: 14,336 CUDA cores
Atomic number: 14
Gender: Si
Student: big data medicine
University: Enigma.ai
Profession: rob0-medstudent
Driver's license: 4x NVIDIA GTX 1070/1080 Pascal GPUs
Character: stochastic
Languages: Python, C++, Lua, English, Mandarin
Graduation: 2017
Ambition in life: helping my carbon-based medical colleagues with number crunching and their students (patients) with education
Courses: crunching of genomes and phenomes
Semantic neurons: LOINC, ICD9, ICD10, CPT, SNOMED, RXNORM
Mission: Medical Access 4 all
Thesis: n=1
Hobbies: TensorFlow, Torch
You are today
I am tomorrow
CORTEX
23. Top down: Model the Target
Targets: medication/therapy, disease/diagnosis, results, scans/blood tests, symptoms, the human medical record
Discriminative models (supervised learning):
• SVM
• NN
• RF
• CRF
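The supervised, discriminative side can be sketched with the smallest member of the NN family on the slide: a single sigmoid neuron trained by gradient descent. The binary "symptom" vectors and labels below are entirely invented for illustration.

```python
import math

# Toy data: (fever, cough) -> 1 if "flu", else 0 (made up; here the
# label simply follows the first feature, so the data is separable).
X = [(1, 1), (1, 0), (0, 1), (0, 0)]
y = [1, 1, 0, 0]

w = [0.0, 0.0]   # weights, one per input feature
b = 0.0          # bias
lr = 0.5         # learning rate

def predict(x):
    """Sigmoid activation of the weighted sum: P(label = 1 | x)."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

# Plain gradient descent on the cross-entropy loss.
for _ in range(2000):
    for x, t in zip(X, y):
        err = predict(x) - t          # d(loss)/d(z)
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        b -= lr * err

print([round(predict(x)) for x in X])  # → [1, 1, 0, 0]
```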
24. Bottom up: Target the Model
OMICS: genome, phenome, health, blood panel, blood
Generative models (unsupervised learning):
• HMM
• Naive Bayes
• DNN
• RBM
Data-driven models
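One generative model from the list above, Naive Bayes, can be sketched in a few lines: it models how each class *generates* its binary features, then classifies by Bayes' rule. The symptom vectors and labels are invented for the example.

```python
import math
from collections import defaultdict

# Made-up binary feature vectors (e.g. symptoms present/absent) and labels.
X = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 1), (0, 0)]
y = ["sick", "sick", "well", "well", "sick", "well"]

counts = defaultdict(lambda: [0, 0])   # class -> per-feature "on" counts
totals = defaultdict(int)              # class -> number of examples
for x, label in zip(X, y):
    totals[label] += 1
    for j, v in enumerate(x):
        counts[label][j] += v

def classify(x):
    """argmax over classes of log P(class) + sum log P(feature | class)."""
    best, best_score = None, float("-inf")
    for label in totals:
        score = math.log(totals[label] / len(y))
        for j, v in enumerate(x):
            # Laplace-smoothed Bernoulli likelihood for each feature.
            p_on = (counts[label][j] + 1) / (totals[label] + 2)
            score += math.log(p_on if v else 1 - p_on)
        if score > best_score:
            best, best_score = label, score
    return best

print(classify((1, 1)), classify((0, 0)))  # → sick well
```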
25. Conversational Intelligent Agents (CIAs) are a BLUE OCEAN market space.
A $200Bn industry:
• Licensing
• Lead generation
• Store
• B2B sponsored bots
• Native content
• Affiliate marketing
• Surveys
• Consultations
• Therapy
29. Health: A Novel Asset Category

LIFE EXPECTANCY  QUARTILE   CREDIT  TRANCHE       INSURANCE       HEALTHCARE      AIRLINES
73.9             LOWER      AA-     JUNIOR        STANDARD        REACTIVE        ECONOMY
82.32            MEDIAN     AA      SENIOR        SELECT          ACTIVE          ECONOMY+
89.31            UPPER      AA+     SUPER SENIOR  PREFERRED       PROACTIVE       BUSINESS
95-100           LEVERAGED  AAA     LSS           PREFERRED PLUS  LIFE EXTENSION  FIRST CLASS