"From IA to AI in Healthcare" - Walter De Brouwer (CEO/Founder, doc.ai/Scanadu)
Delivered at the inaugural Hyper Wellbeing Summit, 14th November 2016, Mountain View, California.
For more information including details of subsequent events, please visit http://hyperwellbeing.com
The summit was created to foster a community around an emerging industry - Wellness as a Service (WaaS). Consumer technologies, in particular wearables and mobile, are powering a consumer revolution. A revolution to turn health and wellness into platform delivered services. A revolution enabling consumer data-driven disease risk reduction. A revolution extending health care past sick care towards consumer-led lifelong health, wellness and lifestyle optimization.
WaaS newsletter sign-up http://eepurl.com/b71fdr
@hyperwellbeing
1.
2. The human intellect is like peacock feathers,
just an extravagant display intended to attract a mate.
Oh, we think we are so great... aha...
but a peacock can barely fly and eats insects.
9. We now have supercomputers with thousands of cores (GPUs):
3584 × 4 = 14,336 CUDA cores, at roughly $1/core
10.
11. import nltk

sentence = """This is what I have to you
..DL approach to NLP."""
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)
entities = nltk.chunk.ne_chunk(tagged)

from nltk.corpus import treebank
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()
As an alternative ...
import csv
import itertools

import nltk
import numpy as np

vocabulary_size = 8000
unknown_token = "UNKNOWN_TOKEN"
sentence_start_token = "SENTENCE_START"
sentence_end_token = "SENTENCE_END"

# Read the data and append SENTENCE_START and SENTENCE_END tokens
print("Reading my CV file...")
with open('CV-EN_v0.6.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header row
    # Split full comments into sentences
    sentences = itertools.chain(*[nltk.sent_tokenize(x[0].lower()) for x in reader])
    # Append SENTENCE_START and SENTENCE_END
    sentences = ["%s %s %s" % (sentence_start_token, x, sentence_end_token) for x in sentences]
print("Parsed %d sentences." % len(sentences))

# Tokenize the sentences into words
tokenized_sentences = [nltk.word_tokenize(sent) for sent in sentences]

# Count the word frequencies
word_freq = nltk.FreqDist(itertools.chain(*tokenized_sentences))
print("Found %d unique word tokens." % len(word_freq.items()))

# Get the most common words and build index_to_word and word_to_index vectors
vocab = word_freq.most_common(vocabulary_size - 1)
index_to_word = [x[0] for x in vocab]
index_to_word.append(unknown_token)
word_to_index = dict((w, i) for i, w in enumerate(index_to_word))
print("Using vocabulary size %d." % vocabulary_size)
print("The least frequent word in our vocabulary is '%s' and appeared %d times." % (vocab[-1][0], vocab[-1][1]))

# Replace all words not in our vocabulary with the unknown token
for i, sent in enumerate(tokenized_sentences):
    tokenized_sentences[i] = [w if w in word_to_index else unknown_token for w in sent]

print("\nExample sentence: '%s'" % sentences[0])
print("\nExample sentence after pre-processing: '%s'" % tokenized_sentences[0])

# Create the training data (each target is the input shifted by one word)
X_train = np.asarray([[word_to_index[w] for w in sent[:-1]] for sent in tokenized_sentences])
y_train = np.asarray([[word_to_index[w] for w in sent[1:]] for sent in tokenized_sentences])
Do I have your attention now?
13. Government is aligned
The Chinese Academy of Sciences has issued invitations to apply for funding for PMI (Precision Medicine Initiative) projects worth $9.2 bn.
The NIH is funding PMI projects worth $215 M.
14. MEDICAL RECORD: The unfair fight for what is ours
IHTFP
SCRAPE:
50% self-reported information
20% biomarkers (blood, urine results)
10% scans
20% doctor notes
RECREATE: 70% / 30%
19. CONVERSATIONAL SEARCH
1. Faceted (allowing users to explore a collection of information by applying multiple filters)
2. Autocomplete
3. Multimode
4. Speaker analysis
5. Goal-oriented dialogue
6. Longitudinal representation (micro-moments which are intensive-rich)
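The first item above, faceted search, can be sketched in a few lines: each selected facet narrows the collection further. This is a minimal illustration only; the records and facet names below are invented for the example.

```python
# Toy collection of "record" dicts (entirely made up for illustration).
records = [
    {"type": "lab", "topic": "blood", "year": 2015},
    {"type": "lab", "topic": "urine", "year": 2016},
    {"type": "note", "topic": "blood", "year": 2016},
]

def facet_filter(items, **facets):
    """Keep only items matching every selected facet value."""
    return [it for it in items
            if all(it.get(k) == v for k, v in facets.items())]

# Each added facet narrows the result set further.
print(len(facet_filter(records, topic="blood")))            # 2 records
print(len(facet_filter(records, topic="blood", year=2016))) # 1 record
```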
21. KR/Ontology
‣ LOINC (clinical values)
‣ ICD9, ICD10 (diagnoses)
‣ CPT (medical services)
‣ SNOMED (EHR terms)
‣ RXNORM (medications)
‣ MeSH (how to combine structured vocabularies with data-driven techniques)
Memory
‣ MemNN
‣ app
‣ FSM
‣ Pontifex Maximus
Association mining: Location-aware (Item-item CF recommender using DSSTNE)
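The item-item collaborative filtering mentioned above can be sketched without any framework (DSSTNE itself is Amazon's deep learning recommender library and is not shown here); this is just the classical cosine-similarity form of the idea, on an invented ratings matrix.

```python
from math import sqrt

# Made-up ratings: user -> {item: rating}.
ratings = {
    "u1": {"A": 5, "B": 4, "C": 1},
    "u2": {"A": 4, "B": 5},
    "u3": {"B": 2, "C": 5},
}

def item_vector(item):
    """Column of the ratings matrix: which users rated this item, and how."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[u] * b[u] for u in common)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den

# Items whose rating patterns are similar get recommended together:
# here A's ratings track B's more closely than C's.
print(cosine(item_vector("A"), item_vector("B")) >
      cosine(item_vector("A"), item_vector("C")))
```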
22. IHTFP
RESUME
Name: doc
Nationality: AI
Age: 1.577e+10 ms
Weight: 28.085 (atomic 28.084–28.086)
Address: Exxact Spectrum TXR410-0032R
Family: NVIDIA DIGITS™
IQ: 14,336 CUDA cores
Atomic number: 14
Gender: Si
Student: big data medicine
University: Enigma.ai
Profession: rob0-medstudent
Driver's license: 4x NVIDIA GTX 1070/1080 Pascal GPUs
Character: stochastic
Languages: Python, C++, Lua, English, Mandarin
Graduation: 2017
Ambition in life: helping my carbon-based medical colleagues with number crunching and their students (patients) with education
Courses: crunching of genomes and phenomes
Semantic neurons: LOINC, ICD9, ICD10, CPT, SNOMED, RXNORM
Mission: Medical Access 4 all
Thesis: n=1
Hobbies: TensorFlow, Torch
You are today
I am tomorrow
CORTEX
23. Top down: Model the Target
Targets: medication/therapy, disease/diagnosis, results, scans/blood tests, symptoms, the human medical record
Discriminative models (supervised learning):
• SVM
• NN
• RF
• CRF
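The supervised, discriminative side can be sketched with the smallest member of the NN family on the slide: a single sigmoid neuron trained by gradient descent. The binary "symptom" vectors and labels below are entirely invented for illustration.

```python
import math

# Toy data: (fever, cough) -> 1 if "flu", else 0 (made up; here the
# label simply follows the first feature, so the data is separable).
X = [(1, 1), (1, 0), (0, 1), (0, 0)]
y = [1, 1, 0, 0]

w = [0.0, 0.0]   # weights, one per input feature
b = 0.0          # bias
lr = 0.5         # learning rate

def predict(x):
    """Sigmoid activation of the weighted sum: P(label = 1 | x)."""
    z = w[0] * x[0] + w[1] * x[1] + b
    return 1.0 / (1.0 + math.exp(-z))

# Plain gradient descent on the cross-entropy loss.
for _ in range(2000):
    for x, t in zip(X, y):
        err = predict(x) - t          # d(loss)/d(z)
        w[0] -= lr * err * x[0]
        w[1] -= lr * err * x[1]
        b -= lr * err

print([round(predict(x)) for x in X])  # → [1, 1, 0, 0]
```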
24. Bottom up: Target the Model
OMICS: genome, phenome, health, blood panel, blood
Generative models (unsupervised learning):
• HMM
• Naive Bayes
• DNN
• RBM
Data-driven models
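One generative model from the list above, Naive Bayes, can be sketched in a few lines: it models how each class *generates* its binary features, then classifies by Bayes' rule. The symptom vectors and labels are invented for the example.

```python
import math
from collections import defaultdict

# Made-up binary feature vectors (e.g. symptoms present/absent) and labels.
X = [(1, 1), (1, 0), (0, 1), (0, 0), (1, 1), (0, 0)]
y = ["sick", "sick", "well", "well", "sick", "well"]

counts = defaultdict(lambda: [0, 0])   # class -> per-feature "on" counts
totals = defaultdict(int)              # class -> number of examples
for x, label in zip(X, y):
    totals[label] += 1
    for j, v in enumerate(x):
        counts[label][j] += v

def classify(x):
    """argmax over classes of log P(class) + sum log P(feature | class)."""
    best, best_score = None, float("-inf")
    for label in totals:
        score = math.log(totals[label] / len(y))
        for j, v in enumerate(x):
            # Laplace-smoothed Bernoulli likelihood for each feature.
            p_on = (counts[label][j] + 1) / (totals[label] + 2)
            score += math.log(p_on if v else 1 - p_on)
        if score > best_score:
            best, best_score = label, score
    return best

print(classify((1, 1)), classify((0, 0)))  # → sick well
```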
25. Conversational Intelligent Agents (CIAs) are a BLUE OCEAN market space.
A $200Bn industry:
• Licensing
• Lead generation
• Store
• B2B sponsored bots
• Native content
• Affiliate marketing
• Surveys
• Consultations
• Therapy
29. Health: A Novel Asset Category

LIFE EXPECTANCY  QUARTILE   CREDIT  TRANCHE       INSURANCE       HEALTHCARE      AIRLINES
73.9             LOWER      AA-     JUNIOR        STANDARD        REACTIVE        ECONOMY
82.32            MEDIAN     AA      SENIOR        SELECT          ACTIVE          ECONOMY+
89.31            UPPER      AA+     SUPER SENIOR  PREFERRED       PROACTIVE       BUSINESS
95-100           LEVERAGED  AAA     LSS           PREFERRED PLUS  LIFE EXTENSION  FIRST CLASS