The human intellect is like peacock feathers, 

just an extravagant display intended to attract a mate. 

Oh we think we are so great ..aha..

but a peacock can barely fly and eats insects.
What is Vertical AI?
The Big Nine have open-sourced their toolkits for horizontal AI
Company Product
DLKIT
CNTK
DSSTNE
Citizen doctors
We now have supercomputers with thousands of cores (GPUs)
3584 × 4 = 14,336 CUDA cores, $1/core
import csv
import itertools

import nltk
import numpy as np
from nltk.corpus import treebank

sentence = """This is what I have to you
..DL approach to NLP."""

# Classic pipeline: tokenize, tag parts of speech, chunk named entities
tokens = nltk.word_tokenize(sentence)
tagged = nltk.pos_tag(tokens)
entities = nltk.chunk.ne_chunk(tagged)

# Draw the parse tree of the first sentence in the Penn Treebank sample
t = treebank.parsed_sents('wsj_0001.mrg')[0]
t.draw()

# as alternative ...

vocabulary_size = 8000
unknown_token = "UNKNOWN_TOKEN"
sentence_start_token = "SENTENCE_START"
sentence_end_token = "SENTENCE_END"

# Read the data and append SENTENCE_START and SENTENCE_END tokens
print("Reading my CV file...")
with open('CV-EN_v0.6.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header row
    # Split full comments into sentences (empty rows are ignored)
    sentences = itertools.chain(*[nltk.sent_tokenize(x[0].lower()) for x in reader if x])
    # Append SENTENCE_START and SENTENCE_END
    sentences = ["%s %s %s" % (sentence_start_token, x, sentence_end_token) for x in sentences]
print("Parsed %d sentences." % len(sentences))

# Tokenize the sentences into words
tokenized_sentences = [nltk.word_tokenize(sent) for sent in sentences]

# Count the word frequencies
word_freq = nltk.FreqDist(itertools.chain(*tokenized_sentences))
print("Found %d unique word tokens." % len(word_freq))

# Get the most common words and build index_to_word and word_to_index vectors
vocab = word_freq.most_common(vocabulary_size - 1)
index_to_word = [x[0] for x in vocab]
index_to_word.append(unknown_token)
word_to_index = {w: i for i, w in enumerate(index_to_word)}
print("Using vocabulary size %d." % vocabulary_size)
print("The least frequent word in our vocabulary is '%s' and appeared %d times." % (vocab[-1][0], vocab[-1][1]))

# Replace all words not in our vocabulary with the unknown token
for i, sent in enumerate(tokenized_sentences):
    tokenized_sentences[i] = [w if w in word_to_index else unknown_token for w in sent]
print("\nExample sentence: '%s'" % sentences[0])
print("\nExample sentence after Pre-processing: '%s'" % tokenized_sentences[0])

# Create the training data: inputs drop the last token, targets drop the first.
# Sentences differ in length, so the rows are stored in an object array.
X_train = np.asarray([[word_to_index[w] for w in sent[:-1]] for sent in tokenized_sentences], dtype=object)
y_train = np.asarray([[word_to_index[w] for w in sent[1:]] for sent in tokenized_sentences], dtype=object)
Do I have your attention now?
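The preprocessing on this slide can be tried without NLTK or the CSV file. What follows is a minimal sketch under stated assumptions: the two toy sentences, the whitespace tokenizer and the helper name `build_training_data` are hypothetical stand-ins and not part of the talk.

```python
from collections import Counter

UNKNOWN = "UNKNOWN_TOKEN"
START = "SENTENCE_START"
END = "SENTENCE_END"


def build_training_data(raw_sentences, vocabulary_size):
    """Index words and build next-word prediction pairs (hypothetical helper)."""
    # A whitespace split stands in for nltk.word_tokenize (assumption).
    tokenized = [[START] + s.lower().split() + [END] for s in raw_sentences]
    counts = Counter(w for sent in tokenized for w in sent)
    # Keep the most frequent words; the last slot is reserved for the unknown token.
    index_to_word = [w for w, _ in counts.most_common(vocabulary_size - 1)]
    index_to_word.append(UNKNOWN)
    word_to_index = {w: i for i, w in enumerate(index_to_word)}
    unk = word_to_index[UNKNOWN]
    ids = [[word_to_index.get(w, unk) for w in sent] for sent in tokenized]
    # Inputs are all tokens but the last; targets are the same tokens shifted by one.
    x_train = [sent[:-1] for sent in ids]
    y_train = [sent[1:] for sent in ids]
    return index_to_word, x_train, y_train


toy = ["the doctor reads the scan", "the patient reads the report"]  # made-up data
index_to_word, x_train, y_train = build_training_data(toy, vocabulary_size=6)
```

Each target sequence is the input sequence shifted one position to the left, which is what lets a language model learn to predict the next word.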
We are DATA
WE ARE CODE
Government is aligned
The Chinese Academy of Sciences has issued invitations to apply for funding for PMI projects worth $9.2 bn
NIH is funding PMI projects worth $215 M
MEDICAL RECORD: The unfair fight for what is ours
IHTFP
SCRAPE
50% Self-reported information

20% biomarkers (blood, urine results)


10% scans

20% doctor notes
RECREATE
70%
30%
Everything that does not learn will disappear
DIALOG SYSTEMS in a CLOSED DOMAIN
VA⊃BOT
Supervised Learning
Discriminative Models: CRF, RF, NN, SVM
SA, state tracker
Logistic regression for intentions (analysis)
(CRF) labels + script answers (generation)
• 4.0 Artificial Intelligence (AI) her
Rule-based
Pattern matching
PyAIML
• 3.0 Intelligent Agents (IA) doc
• 2.0 Virtual Assistant (VA) Amy, Clara, Melody
• 1.0 Chatbot (BOT) Mitsuku, Alice, Botlibre, Luka
AI⊇IA Zeno Behavior : 0.9999999..
IA⊇VA
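The rule-based, pattern-matching level of this ladder (the 1.0 chatbot) can be sketched in a few lines. This is an illustrative sketch only: it uses Python's standard `re` module rather than PyAIML, and the rules, responses and the function name `reply` are assumptions made up for the example.

```python
import re

# Hypothetical rules as (pattern, response template); not taken from the talk.
RULES = [
    (re.compile(r"\bmy name is (\w+)", re.IGNORECASE), "Hello {0}, how can I help?"),
    (re.compile(r"\bi have (?:a |an )?(.+?)[.!?]*$", re.IGNORECASE),
     "How long have you had the {0}?"),
    (re.compile(r"\b(hi|hello)\b", re.IGNORECASE), "Hello, what brings you here?"),
]
FALLBACK = "Could you rephrase that?"


def reply(utterance):
    """Return the response of the first rule whose pattern matches."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            # Captured groups fill the placeholders of the scripted answer.
            return template.format(*match.groups())
    return FALLBACK
```

A bot of this kind never learns: every behavior has to be written as a rule, which is the limitation the higher levels of the ladder address with trained models.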
Learning methodologies
1. Unsupervised (clustered patterns, machine generated)
2. Supervised (labels, classification)
3. RL (by experts, pre-release)
4. Memory-based (where concrete cases are used at runtime, so learning is "just" remembering)
5. Semantic embedding: word2vec
6. “One-shot learning”
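Memory-based learning, where learning is "just" remembering, can be illustrated with a nearest-case lookup. The sketch below is an assumption-laden toy: the class name `CaseMemory`, the Jaccard similarity measure and the two stored cases are all invented for illustration.

```python
def jaccard(a, b):
    """Overlap between two token sets, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


class CaseMemory:
    """Memory-based learner: training only stores cases, all work happens at query time."""

    def __init__(self):
        self.cases = []  # list of (tokens, label) pairs

    def learn(self, text, label):
        # "Learning" is just remembering the concrete case.
        self.cases.append((text.lower().split(), label))

    def predict(self, text):
        tokens = text.lower().split()
        # Retrieve the stored case most similar to the query and reuse its label.
        _, best_label = max(self.cases, key=lambda case: jaccard(case[0], tokens))
        return best_label


memory = CaseMemory()
memory.learn("book an appointment with my doctor", "schedule")  # made-up cases
memory.learn("what do my blood test results mean", "explain_results")
```

Calling `memory.predict("explain my blood results")` compares the query with every stored case at runtime, so adding knowledge never requires retraining.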
Algorithms
‣ clustering (k-means)
‣ classification (discrete)
‣ regression (continuous)
‣ structural matching of present circumstances against past cases (CB)
‣ structural elaboration and substitution (use past examples for context-appropriate generation)
CONVERSATIONAL SEARCH
1. faceted (allowing users to explore a collection of information by applying multiple filters)
2. autocomplete
3. multimode
4. Speaker analysis
5. Goal-oriented dialogue
6. Longitudinal representation (Micro-Moments which are intensive-rich)
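Two of the conversational search features, faceted filtering and autocomplete, are simple enough to sketch directly. The records, field names and function names below are assumptions chosen for illustration; the talk does not specify any data model.

```python
from bisect import bisect_left

# Made-up records; the field names are assumptions for illustration only.
RECORDS = [
    {"type": "scan", "year": 2016, "title": "chest x-ray"},
    {"type": "blood", "year": 2016, "title": "cholesterol panel"},
    {"type": "blood", "year": 2017, "title": "glucose test"},
]


def faceted_search(records, **facets):
    """Keep only the records that satisfy every facet filter."""
    return [r for r in records if all(r.get(k) == v for k, v in facets.items())]


def autocomplete(sorted_terms, prefix):
    """Return all terms starting with prefix, using binary search on a sorted list."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


TERMS = sorted(r["title"] for r in RECORDS)
```

For example, `faceted_search(RECORDS, type="blood", year=2016)` narrows the collection with two filters at once, and `autocomplete(TERMS, "ch")` lists the titles that begin with the typed prefix.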
Linguistic Authority
1. NLU (bAbI, SemEval) or Winograd Schema = Google-proof PDP
2. MemNN
3. bidirectional CG (RRG, TAG, FCG, HPSG, CCG, DG, LFG..)
4. OFSM (Optimizing Finite State Machines)
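A finite state machine is one common way to keep a closed-domain dialogue on track. The sketch below shows only a plain, hand-written machine and does not attempt the optimizing part named on the slide; the states, events and the function name `run_dialogue` are hypothetical.

```python
# Hypothetical states and events for a closed-domain dialogue; not from the talk.
TRANSITIONS = {
    ("greeting", "report_symptom"): "collect_symptom",
    ("collect_symptom", "give_duration"): "collect_duration",
    ("collect_duration", "confirm"): "summary",
}


def run_dialogue(events, state="greeting"):
    """Feed events through the machine; unknown events leave the state unchanged."""
    history = [state]
    for event in events:
        state = TRANSITIONS.get((state, event), state)
        history.append(state)
    return history
```

Because every legal move is listed in `TRANSITIONS`, the dialogue can only follow paths the designer has foreseen, which suits a closed domain.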
KR/Ontology
‣ LOINC (clinical values)
‣ ICD9, ICD10 (diagnoses)
‣ CPT (medical services)
‣ SNOMED (EHR terms)
‣ RXNORM (medications)
‣ MeSH (how to combine structured vocabularies with data-driven techniques)
Memory
‣ MemNN
‣ app
‣ FSM
‣ Pontifex Maximus
Association mining: Location-aware (Item-item CF recommender using DSSTNE)
IHTFP
RESUME
Name: doc

Nationality: AI

Age: 1.577e+10 ms

Weight: 28.085 (atomic 28.084–28.086)

Address: Exxact Spectrum TXR410-0032R 

Family: NVIDIA DIGITS™

IQ: 14,336 CUDA cores

Atomic number: 14

Gender: Si

Student: big data medicine

University: Enigma.ai

Profession: rob0-medstudent

Driver's license: 4x NVIDIA GTX 1070/1080 Pascal GPUs

Character: stochastic

Language: Python, C++, Lua, English, Mandarin

Graduation: 2017

Ambition in life: helping my carbon-based medical colleagues with number crunching and their students (patients) with education

Courses: crunching of genomes and phenomes

Semantic neurons: LOINC, ICD9, ICD10, CPT, SNOMED, RXNORM

Mission: Medical Access 4 all

Thesis: n=1

Hobbies: TensorFlow, Torch

You are today

I am tomorrow
CORTEX
Top down: Model the Target
medication/therapy
disease/diagnosis
results
scans/blood test
human
medical record
symptom
Discriminative models
(supervised learning)
• SVM
• NN
• RF
• CRF
Bottom up: Target the Model
OMICS
Genome
Health
Blood panel
Phenome
Blood
Generative Models
(unsupervised learning)
• HMM
• Naive Bayes
• DNN
• RBM
Data-driven models
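Naive Bayes, listed among the generative models, scores a document by modeling how each class would generate its words. The following is a minimal sketch, assuming add-one smoothing and made-up symptom data, and trained on labeled examples for simplicity; the class and method names are illustrative.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocabulary = set()

    def fit(self, documents, labels):
        for tokens, label in zip(documents, labels):
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def log_joint(self, tokens, label):
        """log P(label) plus the sum of log P(word | label)."""
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total)
        denom = sum(self.word_counts[label].values()) + len(self.vocabulary)
        for w in tokens:
            score += math.log((self.word_counts[label][w] + 1) / denom)
        return score

    def predict(self, tokens):
        # Choose the class under which the observed words are most probable.
        return max(self.class_counts, key=lambda label: self.log_joint(tokens, label))


# Made-up training data for illustration only.
docs = [["fever", "cough"], ["cough", "sore", "throat"],
        ["rash", "itch"], ["itch", "dry", "skin"]]
labels = ["respiratory", "respiratory", "skin", "skin"]
model = NaiveBayes()
model.fit(docs, labels)
```

A discriminative model such as an SVM or CRF would instead learn the boundary between the classes directly, without modeling how the words arise.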
Conversational Intelligent Agents (CIAs) are a BLUE OCEAN market space.
Licensing
Lead Generation
Store
B2B
sponsored bots
Native Content
Affiliate Marketing
Surveys
Consultations
Therapy
$200Bn industry
GENOMICS
1 MILLION GENOMES (2016)
1 BILLION GENOMES (2023)
World Shortage of doctors: 5,000,000
CRISPR will boost Personal Genomic
Data becomes a revenue stream
Health: A Novel Asset Category

LIFE EXPECTANCY | QUARTILE | CREDIT | TRANCHE | INSURANCE | HEALTHCARE | AIRLINES
73.9 | LOWER | AA- | JUNIOR | STANDARD | REACTIVE | ECONOMY
82.32 | MEDIAN | AA | SENIOR | SELECT | ACTIVE | ECONOMY+
89.31 | UPPER | AA+ | SUPER SENIOR | PREFERRED | PROACTIVE | BUSINESS
95-100 | LEVERAGED | AAA | LSS | PREFERRED PLUS | LIFE EXTENSION | FIRST CLASS
∆=
21st C / 20th C
n = 1
21st C / 20th C
N x*
Healthcare will become a continuous function
Discrete function
Continuous function
More is More
0.5^6 = 0.015625
0.5^12 = 0.000244140625
walter@doc.ai
@walterdebrouwer
walterdebrouwer
www.doc.ai
walterdb
doc Incorporated

540 Bryant Street 

Palo Alto, CA 94301
"From IA to AI in Healthcare" - Walter De Brouwer (CEO/Founder, doc.ai/Scanadu)
