KNOWLEDGE GRAPH CONSTRUCTION
FOR RESEARCH & MEDICINE
Paul Groth (@pgroth)
pgroth.com
Disruptive Technology Director
Elsevier Labs (@elsevierlabs)
Connected Data London 2017
Contributions: Brad Allen, Pascal Coupet, Sujit Pal, Craig Stanley, Ron Daniel, Alex de Jong
Our customers are facing challenges in science and health
• Global research spend is growing every year.1 Predicted spend on research in 2016: $1.9TN, up 3.4% from 2015.
• Researchers lack the tools they need to be effective.2 Studies: 70-80% of research asks the wrong questions or cannot be reproduced.
• Life-saving drugs are expensive to develop.3 Median pharmaceutical spend per drug: $2.5BN; success rate of drugs: 1/20.
• Health providers cannot save lives without the best information.4 Preventable medical errors are the third largest cause of death in the US: heart disease 611k, cancer 585k, medical error 225k, respiratory illness 149k.
Sources: 1. Industrial Research Institute 2. The Lancet 3. Tufts 4. World Health Organization
Elsevier is in a unique position to make a contribution towards solving these challenges
ELSEVIER’S BUSINESS: PROVIDING ANSWERS FOR
RESEARCHERS, DOCTORS AND NURSES
My work is moving towards a new field; what should I know?
• Journal articles, reference works, profiles of researchers, funders &
institutions
• Recommendations of people to connect with, reading lists, topic pages
How should I treat my patient given her condition & history?
• Journal articles, reference works, medical guidelines, electronic health
records
• Treatment plan with alternatives personalized for the patient
How can I master the subject matter of the course I am taking?
• Course syllabus, reference works, course objectives, student history
• Quiz plan based on the student’s history and course objectives
THE ROLE OF METADATA IN THE SECOND MACHINE AGE – DC-2016 / KØBENHAVN / 13 OCTOBER
ANSWERS ARE ABOUT THINGS, NOT JUST WORKS
Why shouldn’t a search on an author return
information about the author, including the
author’s works? Where was the author born,
when did she live, what is she known for? … All of
this is possible, but only if we can make some
fundamental changes in our approach to
bibliographic description. ... The challenge for us
lies in transforming what we can of our data into
interrelated “things” without overindulging that
metaphor.
Coyle, K. (2016). FRBR, before and after: a look at our
bibliographical models. Chicago: ALA Editions.
KNOWLEDGE GRAPHS DEFINED
• Knowledge graphs are "graph structured knowledge bases (KBs) which store factual
information in form of relationships between entities” (Nickel, M., Murphy, K., Tresp, V. and
Gabrilovich, E. (2015). A review of relational machine learning for knowledge graphs.
arXiv:1503.00759v3)
• Knowledge graphs are metadata evolved beyond the focus on the work, linking people, concepts,
things and events
• Knowledge Graphs are focused on things to provide answers
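As a minimal sketch of the definition above, a knowledge graph can be held as (subject, relation, object) triples and queried by thing rather than by work. The entity identifiers, relation names, and the `build_index` and `describe` helpers are hypothetical placeholders for illustration, not Elsevier's data model.

```python
from collections import defaultdict

# Toy triples; every identifier here is an illustrative placeholder.
TRIPLES = [
    ("author:1", "wrote", "work:A"),
    ("author:1", "wrote", "work:B"),
    ("author:1", "affiliated_with", "institution:X"),
    ("work:A", "has_topic", "concept:knowledge_graph"),
]


def build_index(triples):
    """Index triples by subject so all facts about a thing are one lookup away."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
    return index


def describe(index, entity):
    """Group the objects linked to an entity by relation name."""
    facts = defaultdict(list)
    for relation, obj in index.get(entity, []):
        facts[relation].append(obj)
    return dict(facts)


index = build_index(TRIPLES)
print(describe(index, "author:1"))
```

A search on the author then returns the author's works and affiliation in one answer, which is the shift from works to things described in the quote above.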
ELSEVIER’S KNOWLEDGE PLATFORM
Products: Reaxys, CK, Sherpath, Scopus, SD, ROS
Platforms & Shared Services: Content, Life Sciences, Search, Identity, Research
Knowledge Graphs: Research, Healthcare, Life Sciences
Entity Hubs: Funder Hub, Article Hub, Profile Hub, Journal Hub, Institution Hub
Data & Content Sources: Usage logs, Pathways, EHRs, Articles, Authors, Institutions, Syllabi, Citations, Chemicals, Books, Drugs, Funders
THE GROWTH OF SCIENCE COMPLICATES OUR EFFORTS
MORE DOMAINS & MORE SPECIFICITY
Gregory, K., Groth, P., Cousijn, H., Scharnhorst, A.,
& Wyatt, S. (2017). Searching Data: A Review of
Observational Data Retrieval Practices. arXiv
preprint arXiv:1707.06937.
Some observations from @gregory_km
survey:
1. The needs and behaviours of specific user groups
(e.g. early career researchers, policy makers,
students) are not well documented.
2. Background uses of observational data are better
documented than foreground uses.
3. Reconstructing data tables from journal articles,
using general search engines, and making direct data
requests are common.
MACHINE READING
TOPIC PAGES
[Topic page screenshot showing: definition, related terms, relevant ranked snippets]
HOW - OVERVIEW
Content: Books, Articles, Ontologies ...
Technologies: NLP, ML
• Identification of concepts
• Disambiguation
• Domain/sub-domain identification
• Abbreviations, variants
• Gazetteering
• Identification and classification of text snippets around concepts
• Feature building for concept/snippet pairs
• Lexical, syntactic, semantic, doc structure …
• Ranking concept/snippet pairs
• Machine learning
• Hand-made rules
• Similarities
• Deduplication
• Curation
• White list driven
• Black list
• Corrections/improvements
• Evaluation
• Gold set by domain
• Random set by domain
• By SMEs (Subject Matter Experts)
• Automation
• Content Enrichment Framework
• Taxonomy coverage extension
Knowledge Graph: Concepts, snippets, metadata, …
OmniScience: Neuroscience
Extension vocabularies by domains to provide coverage
Vocabulary | Number of Concepts | Number of Labels
OmniScience 01.16.11 | 45969 | 47421
OmniScience Neuroscience branch 21/11/2016 | 2356 | 2455
OmniScience Extension Neuroscience branch 21/11/2016 | 23932 | 101276
Examples of good and bad snippets
Concept: Inferior Colliculus
Bad: By comparing activation obtained in an equivalent standard (non-cardiac-gated) fMRI experiment, Guimaraes and colleagues found that cardiac-gated activation maps yielded much greater activation in subcortical nuclei, such as the inferior colliculus.
Good: The inferior colliculus (IC) is part of the tectum of the midbrain (mesencephalon) comprising the quadrigeminal plate (Lamina quadrigemina). It is located caudal to the superior colliculus on the dorsal surface of the mesencephalon (Figure 36.7 FIGURE 36.7 Overview of the human brainstem; view from dorsal. The superior and inferior colliculi form the quadrigeminal plate. Parts of the cerebellum are removed.). The ventral border is formed by the lateral lemniscus. The inferior colliculus is the largest nucleus of the human auditory system. …
Concept: Purkinje cells
Bad: It is felt that the aminopyridines are likely to increase the excitability of the potassium channel-rich cerebellar Purkinje cells in the flocculus (Etzion and Grossman, 2001).
Good: Purkinje cells are the most salient cellular elements of the cerebellar cortex. They are arranged in a single row throughout the entire cerebellar cortex between the molecular (outer) layer and the granular (inner) layer. They are among the largest neurons and have a round perikaryon, classically described as shaped “like a chianti bottle,” with a highly branched dendritic tree shaped like a candelabrum and extending into the molecular layer where they are contacted by incoming systems of afferent fibers from granule neurons and the brainstem…
Concept: Olfactory Bulb
Bad: The most common sites used for induction of kindling include the amygdala, perforant path, dorsal hippocampus, olfactory bulb, and perirhinal cortex.
Good: The olfactory bulb is the first relay station of the central olfactory system in the vertebrate brain and contains in its superficial layer a few thousand glomeruli, spherical neuropils with sharp borders (Figure 1 Figure 1 Axonal projection pattern of olfactory sensory neurons to the glomeruli of the rodent olfactory bulb. The olfactory epithelium in rats and mice is divided into four zones (zones 1–4). A given odorant receptor is expressed by sensory neurons located within one zone of the epithelium. Individual olfactory sensory neurons express a single odorant receptor…
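The contrast between good and bad snippets can be sketched with a few hand-made rules. This is only an illustration: the three rules, their equal weights, and the `definition_score` and `rank_snippets` helpers are assumptions made for this example, not the features or ranker used in the actual pipeline.

```python
import re


def definition_score(concept, snippet):
    """Score how definition-like a snippet is for a concept (higher is better)."""
    text = snippet.strip().lower()
    name = re.escape(concept.lower())
    score = 0.0
    # Rule 1: the snippet opens with the concept, optionally after "the".
    if re.match(r"(the\s+)?" + name + r"\b", text):
        score += 1.0
    # Rule 2: the concept (plus an optional parenthesised abbreviation)
    # is directly followed by "is" or "are".
    if re.search(name + r"(\s*\([^)]*\))?\s+(is|are)\b", text):
        score += 1.0
    # Rule 3: an early first mention scores higher than a late one.
    position = text.find(concept.lower())
    if position >= 0:
        score += 1.0 - position / max(len(text), 1)
    return score


def rank_snippets(concept, snippets):
    """Order snippets from most to least definition-like."""
    return sorted(snippets, key=lambda s: definition_score(concept, s), reverse=True)


bad = ("It is felt that the aminopyridines are likely to increase the "
       "excitability of the potassium channel-rich cerebellar Purkinje cells "
       "in the flocculus.")
good = ("Purkinje cells are the most salient cellular elements of the "
        "cerebellar cortex.")
ranked = rank_snippets("Purkinje cells", [bad, good])
print(ranked[0] == good)
```

In this toy setup the definitional snippet wins because the concept is the grammatical subject at the start of the sentence, whereas the bad snippet only mentions it in passing.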
One Weird Trick from Natural Language Processing (NLP)
• Knowledge bases are populated by scanning text and doing Information Extraction
• Most information extraction systems look for very specific things, like drug-drug interactions
• This gives the best accuracy for that one kind of data, but misses all the other concepts and relations in the text
• For a broad knowledge base, use Open Information Extraction, which only uses some knowledge of grammar
• The weird trick for open information extraction … a simple algorithm, known as ReVerb*:
1. Find “relation phrases” starting with a verb and ending with a verb or preposition
2. Find noun phrases before and after the relation phrase
3. Discard relation phrases not used with multiple combinations of arguments.
Example: In addition, brain scans were performed to exclude other causes of dementia.
* Fader et al. Identifying Relations for Open Information Extraction
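The three steps above can be sketched over part-of-speech-tagged tokens. This is a simplified illustration of the slide's description, not the ReVerb implementation: it assumes Penn Treebank tags assigned by hand, treats a relation phrase as an unbroken run of verbs, prepositions, particles and "to", and uses a crude noun-phrase rule. The function names and the `min_pairs` threshold are assumptions.

```python
from collections import defaultdict

VERB = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
LINK = {"IN", "TO", "RP"}  # prepositions, infinitive "to", particles
RELATION = VERB | LINK
NOUNISH = {"DT", "JJ", "NN", "NNS", "NNP", "NNPS", "CD"}


def _text(tagged, start, end):
    return " ".join(word for word, _ in tagged[start:end])


def extract_triples(tagged):
    """Steps 1 and 2: relation phrases plus the noun phrases on either side."""
    triples, i, n = [], 0, len(tagged)
    while i < n:
        if tagged[i][1] not in VERB:
            i += 1
            continue
        j = i  # step 1: grow the relation phrase from its first verb
        while j < n and tagged[j][1] in RELATION:
            j += 1
        a = i  # step 2: noun phrase immediately to the left
        while a > 0 and tagged[a - 1][1] in NOUNISH:
            a -= 1
        b = j  # step 2: noun phrase immediately to the right
        while b < n and tagged[b][1] in NOUNISH:
            b += 1
        if a < i and b > j:
            triples.append((_text(tagged, a, i), _text(tagged, i, j), _text(tagged, j, b)))
        i = j
    return triples


def filter_by_support(triples, min_pairs=2):
    """Step 3: keep relation phrases seen with at least min_pairs distinct argument pairs."""
    pairs = defaultdict(set)
    for arg1, relation, arg2 in triples:
        pairs[relation].add((arg1, arg2))
    return [t for t in triples if len(pairs[t[1]]) >= min_pairs]


# The example sentence from the slide, tagged by hand.
sentence = [
    ("In", "IN"), ("addition", "NN"), (",", ","), ("brain", "NN"),
    ("scans", "NNS"), ("were", "VBD"), ("performed", "VBN"), ("to", "TO"),
    ("exclude", "VB"), ("other", "JJ"), ("causes", "NNS"), ("of", "IN"),
    ("dementia", "NN"), (".", "."),
]
print(extract_triples(sentence))
```

On the example sentence this yields the single triple ("brain scans", "were performed to exclude", "other causes"); the crude noun-phrase rule stops at "of", which a fuller chunker would handle. Step 3 only becomes meaningful over a corpus, where `filter_by_support` drops relation phrases that occur with a single argument pair.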
ReVerb output
Number of SD documents scanned: 14,000,000
Extracted ReVerb triples: 473,350,566
Universal schemas
• … are a specific technique from the Information Extraction and the Automatic Knowledge Base Completion literature
• … are an unsupervised method to ‘learn’ by combining text extracts with existing knowledge base assertions
Applications:
• Extend a medical knowledge base
  • Scan incoming literature to suggest new additions to EMMeT and show the underlying evidence to the taxonomy editor.
  • Scan literature backlog to find evidence for data already in EMMeT
• Literature Surveillance
  • Scan incoming literature to find existing facts even if expressed in very different ways
  • Find new concepts in the literature related to an existing EMMeT concept*. Let the taxonomy editor decide whether to add the new concept and relation to EMMeT
* Using newer Universal Schema algorithms than described in next few slides
Universal schemas – Predict ‘missing’ EMMeT facts
• Make a matrix:
  • columns for the relation phrases from ReVerb or the semantic relations from EMMeT
  • rows are the pairs of concepts linked by a relation
  • a ‘1.0’ in a cell if those concepts were linked by that relation
  • outlined cells in the diagram are the ones initialized to 1
• Factorize the matrix into E×K and K×R, then recombine.
• “Learns” the correlations between text relations and EMMeT relations, in the context of pairs of objects.
• Cells going from 0 to > 0 indicate potential.
• Find new triples to go into EMMeT, e.g. (glaucoma, has_alternativeProcedure, biofeedback)
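The factorize-and-recombine step can be illustrated on a toy matrix. This sketch swaps in a plain rank-1 factorization (K = 1) computed by power iteration, standing in for whatever learned factorization the real system uses. The placeholder concept pairs, the text relation phrase "is an alternative for", and the helper names are invented for the example; only the (glaucoma, has_alternativeProcedure, biofeedback) triple comes from the slide.

```python
from math import sqrt


def rank1_factorize(matrix, iterations=50):
    """Best rank-1 factorization by power iteration.

    Returns the row factors (E x 1) and the unit-length column factors (1 x R).
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    col = [1.0] * n_cols
    for _ in range(iterations):
        row = [sum(matrix[i][j] * col[j] for j in range(n_cols)) for i in range(n_rows)]
        col = [sum(matrix[i][j] * row[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = sqrt(sum(x * x for x in col))
        col = [x / norm for x in col]
    row = [sum(matrix[i][j] * col[j] for j in range(n_cols)) for i in range(n_rows)]
    return row, col


def recombine(row, col):
    """Multiply the factors back into a dense E x R score matrix."""
    return [[r * c for c in col] for r in row]


pairs = [("concept_a", "concept_b"), ("concept_c", "concept_d"), ("glaucoma", "biofeedback")]
relations = ["text: is an alternative for", "EMMeT: has_alternativeProcedure"]
observed = [
    [1.0, 1.0],  # seen in text and asserted in EMMeT
    [1.0, 1.0],  # seen in text and asserted in EMMeT
    [1.0, 0.0],  # seen in text only: the EMMeT cell is empty
]

row, col = rank1_factorize(observed)
scores = recombine(row, col)

# Cells that were 0 before factorization and are > 0 afterwards are candidates.
candidates = [
    (pairs[i][0], relations[j], pairs[i][1])
    for i in range(len(pairs))
    for j in range(len(relations))
    if observed[i][j] == 0.0 and scores[i][j] > 0.0
]
print(candidates)
```

Because the text phrase and the EMMeT relation co-occur for the first two pairs, the recombined matrix gives the empty EMMeT cell of the third pair a score of about 0.49, so it surfaces as a candidate for the taxonomy editor to review.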
BUILDING
KNOWLEDGE GRAPHS
FROM DATA
Medical Graph – Statistical correlations at scale
[Graph figure: ICD-10 diagnosis nodes I65 (Occlusion and stenosis of precerebral arteries), G40 (Epilepsy), I61 (intracerebral hemorrhage) and C71 (Malignant neoplasm of brain), linked by has_successor edges; one edge is labelled "odds ratio: 1.12".]
has_successor criteria1:
• Correlation selected by predictive modelling algorithm
• No. of relations is higher than in mirrored relation
• p-value < 0.05
• Odds ratios balanced over all covariates.
Data: primary care, secondary care, drug prescriptions and other covariates for 5m patients, each with 6 years of longitudinal data
1 Criteria based on: Jensen et al.: Temporal disease trajectories condensed from population-wide registry data covering 6.2 million patients. Nature Communications, 2014 Jun 24; 5:4022. doi: 10.1038/ncomms5022.
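A minimal sketch of how the has_successor criteria could be applied as an edge filter. The `CandidateEdge` fields and the `is_successor` helper are hypothetical, and "number of relations higher than in the mirrored relation" is read here as comparing patient counts for the two temporal orders. Which codes are paired, and all counts, p-values and odds ratios below, are invented for illustration. The fourth criterion (odds ratios balanced over all covariates) is a property of how the model is fitted and is not checked here.

```python
from dataclasses import dataclass


@dataclass
class CandidateEdge:
    source: str              # ICD code diagnosed first
    target: str              # ICD code diagnosed later
    n_forward: int           # patients with source before target
    n_mirrored: int          # patients with target before source
    p_value: float
    odds_ratio: float
    selected_by_model: bool  # picked up by the predictive modelling step


def is_successor(edge, alpha=0.05):
    """Apply the three criteria that can be checked per edge."""
    return (
        edge.selected_by_model
        and edge.n_forward > edge.n_mirrored
        and edge.p_value < alpha
    )


# Invented toy values, not results from the talk.
candidates = [
    CandidateEdge("I61", "G40", 120, 45, 0.01, 1.5, True),
    CandidateEdge("G40", "I61", 45, 120, 0.01, 1.5, True),   # mirrored direction loses
    CandidateEdge("I65", "C71", 30, 10, 0.20, 1.1, True),    # not significant
]

edges = [(e.source, "has_successor", e.target) for e in candidates if is_successor(e)]
print(edges)
```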
Medical Graph in practice, patient 35: risk of depression
• 49-year-old man
• Dx: overweight, diabetes, hypertension, anxiety disorder
→ has an absolute risk of 36% of developing depression within the next 4 years
… and the rationale for why the model thinks this
• Targets for prediction: ICD-coded diagnoses
• Only incident patients per diagnosis are considered, i.e. diagnosis-free in 2009 – 2010
• If these patients remain diagnosis-free in 2011 – 2014 (observation period), then 0, else 1
• Covariates: all ICD/ATC codes, age and sex, measured in 2010
Example: Model to predict "I50 – Heart Failure"
Analysis Design
Predict 4-year long-term effects, balanced for all covariables
[Timeline figure: I50-free patients in 2009 – 2010, with covariates measured in 2010; over 2011 – 2014 the remaining I50-free patients are I50 − (coded as 0) and the newly I50-diagnosed patients are I50 + (coded as 1).]
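The labelling rule in this design can be sketched as follows. The per-patient data structure (a mapping from code to the set of years in which it was recorded), the helper names, and the toy patients are assumptions for illustration; age and sex are omitted from the covariates to keep the example short.

```python
FILTER_YEARS = range(2009, 2011)       # 2009-2010: patient must be diagnosis-free
OBSERVATION_YEARS = range(2011, 2015)  # 2011-2014: outcome window
COVARIATE_YEAR = 2010


def label_patient(codes_by_year, target="I50"):
    """Return None for non-incident patients, else 1 if newly diagnosed, 0 if not."""
    years = codes_by_year.get(target, set())
    if any(year in FILTER_YEARS for year in years):
        return None  # already diagnosed before the observation period: excluded
    return 1 if any(year in OBSERVATION_YEARS for year in years) else 0


def build_training_rows(patients, target="I50"):
    """Build (patient_id, covariate codes recorded in 2010, label) rows."""
    rows = []
    for patient_id, codes_by_year in patients.items():
        label = label_patient(codes_by_year, target)
        if label is None:
            continue
        covariates = sorted(
            code for code, years in codes_by_year.items() if COVARIATE_YEAR in years
        )
        rows.append((patient_id, covariates, label))
    return rows


# Invented toy patients: code -> years in which the code was recorded.
patients = {
    "p1": {"I10": {2009, 2010}, "I50": {2012}},  # incident case, coded as 1
    "p2": {"E11": {2010}},                       # stays I50-free, coded as 0
    "p3": {"I50": {2009}, "I10": {2010}},        # not I50-free in 2009-2010: excluded
}
print(build_training_rows(patients))
```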
(A) integrate & clean
Research on anonymized claims data
Billing data flow from 60+ sickness funds:
• Primary care: visits & diagnoses
• Secondary care: visits, diagnoses & procedures
• Drug prescriptions
• Other data: further cooperations just started; will enable analysis of vital and laboratory parameters
Data integration & cleaning
• Data cleaning
• Longitudinally linked & integrated for analytics
• Anonymized
6 million patients, 6 years, > 1.5b events
(B) mine & learn
Calculate statistics & build prediction models for ~1600 targets
Technology stack:
Feature extraction
For 3.8m patients:
• age, gender
• all diagnoses: ICD10-coded, 3 digits, i.e. 2054 codes
• all medications: ATC-coded, 5 digits, i.e. 906 codes
• death, hospitalization
Results in: 6277 features
• 1623 targets, 2011-2014
• 2320 covariates, 2010
• 2334 filter-columns, 2009-2010
Data mining
Calculate prevalence, incidence, mean age for all covariates (i.e. diseases and medications)
Machine learning
Predictive modelling for ~1600 targets
• Linear classification model, resulting in odds ratios
• Calculation of p-values
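To show how a linear classification model yields odds ratios and p-values, here is a small sketch that assumes the model is a logistic regression; the slide does not name the model. It fits a single binary covariate by Newton's method on grouped counts, whereas the real models use thousands of covariates. The counts are invented, and the Wald test used for the p-value is one common choice rather than a confirmed detail of the pipeline.

```python
from math import erfc, exp, sqrt


def fit_logistic(groups, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton's method.

    groups: list of (x, n_patients, n_events) with x in {0, 1}.
    Returns b0, b1 and the standard error of b1 from the last Hessian.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, events in groups:
            p = 1.0 / (1.0 + exp(-(b0 + b1 * x)))
            residual = events - n * p      # gradient contribution
            weight = n * p * (1.0 - p)     # Hessian contribution
            g0 += residual
            g1 += residual * x
            h00 += weight
            h01 += weight * x
            h11 += weight * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = sqrt(h00 / det)
    return b0, b1, se_b1


# Invented counts: 100 patients without and 100 with the covariate in 2010,
# of whom 20 and 30 respectively receive the target diagnosis in 2011-2014.
groups = [(0, 100, 20), (1, 100, 30)]
b0, b1, se_b1 = fit_logistic(groups)

odds_ratio = exp(b1)                           # coefficient -> odds ratio
p_value = erfc(abs(b1 / se_b1) / sqrt(2.0))    # two-sided Wald test
print(round(odds_ratio, 3), round(p_value, 3))
```

With a single binary covariate the fitted odds ratio equals the crude odds ratio of the 2×2 table, (30/70)/(20/80) ≈ 1.71; with many covariates each coefficient instead gives an odds ratio adjusted for the others.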
KNOWLEDGE GRAPHS
AS A BACKBONE
EMMeT
• Total concepts = 540,632
• 100+ person-years of clinical expert knowledge
EXPANDING TO INCLUDE IMAGE SNIPPETS
Enrichment through integration using linked data
CONCLUSION
• Knowledge graphs are critical components for delivering customer value
• AI techniques such as machine learning and predictive modelling from data are key parts of
knowledge graph construction
• This is particularly the case as the amount, speed and specificity of data and requirements accelerate
• Leveraging existing assets such as ontologies, data, and controlled vocabularies (i.e. connected data) has been key for Elsevier in the build-out of knowledge graphs
• How all this enables intelligence-based solutions is a topic for another talk
• Oh, and we are hiring
• Paul Groth p.groth@elsevier.com
PANEL: DOES CONNECTED DATA NEED AI
OR AI NEED CONNECTED DATA?
Tweet your questions via Direct Message to @Connected_Data or #ConnectedData
MODERATOR: Szymon Klarman, Knowledge Architect
PANELIST: Spyros Kotoulas, Research Scientist, IBM Research Ireland (@SpyrosKotoulas)
PANELIST: Tara Raafat, Chief Ontologist, NextAngles
PANELIST: Freddy Lecue, Principal Scientist, Accenture Technology Labs (@freddylecue)
PANELIST: Paul Groth, Disruptive Technology Director, Elsevier Labs (@pgroth)

Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...Connected Data World
 
Semantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleSemantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleConnected Data World
 
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...Connected Data World
 
Schema, Google & The Future of the Web
Schema, Google & The Future of the WebSchema, Google & The Future of the Web
Schema, Google & The Future of the WebConnected Data World
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsConnected Data World
 
Elegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property GraphsElegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property GraphsConnected Data World
 
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...Connected Data World
 
Graph for Good: Empowering your NGO
Graph for Good: Empowering your NGOGraph for Good: Empowering your NGO
Graph for Good: Empowering your NGOConnected Data World
 

Plus de Connected Data World (20)

Systems that learn and reason | Frank Van Harmelen
Systems that learn and reason | Frank Van HarmelenSystems that learn and reason | Frank Van Harmelen
Systems that learn and reason | Frank Van Harmelen
 
Graph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaGraph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora Lassila
 
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...
Κnowledge Architecture: Combining Strategy, Data Science and Information Arch...
 
How to get started with Graph Machine Learning
How to get started with Graph Machine LearningHow to get started with Graph Machine Learning
How to get started with Graph Machine Learning
 
Graphs in sustainable finance
Graphs in sustainable financeGraphs in sustainable finance
Graphs in sustainable finance
 
The years of the graph: The future of the future is here
The years of the graph: The future of the future is hereThe years of the graph: The future of the future is here
The years of the graph: The future of the future is here
 
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2
From Taxonomies and Schemas to Knowledge Graphs: Parts 1 & 2
 
From Taxonomies and Schemas to Knowledge Graphs: Part 3
From Taxonomies and Schemas to Knowledge Graphs: Part 3From Taxonomies and Schemas to Knowledge Graphs: Part 3
From Taxonomies and Schemas to Knowledge Graphs: Part 3
 
In Search of the Universal Data Model
In Search of the Universal Data ModelIn Search of the Universal Data Model
In Search of the Universal Data Model
 
Graph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph DatabaseGraph in Apache Cassandra. The World’s Most Scalable Graph Database
Graph in Apache Cassandra. The World’s Most Scalable Graph Database
 
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...
Enterprise Data Governance: Leveraging Knowledge Graph & AI in support of a d...
 
Graph Realities
Graph RealitiesGraph Realities
Graph Realities
 
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
 
Semantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleSemantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scale
 
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
 
Schema, Google & The Future of the Web
Schema, Google & The Future of the WebSchema, Google & The Future of the Web
Schema, Google & The Future of the Web
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 
Elegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property GraphsElegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property Graphs
 
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
 
Graph for Good: Empowering your NGO
Graph for Good: Empowering your NGOGraph for Good: Empowering your NGO
Graph for Good: Empowering your NGO
 

Dernier

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Dernier (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Paul Groth

  • 1. KNOWLEDGE GRAPH CONSTRUCTION FOR RESEARCH & MEDICINE Paul Groth (@pgroth) pgroth.com Disruptive Technology Director Elsevier Labs (@elsevierlabs) Connected Data London 2017 Contributions: Brad Allen, Pascal Coupet, Sujit Pal, Craig Stanley, Ron Daniel, Alex de Jong
  • 2. Our customers are facing challenges in science and health. Global research spend is growing every year (1): predicted spend of $1.9TN on research in 2016, up 3.4% from 2015. Researchers lack the tools they need to be effective (2): studies: 70-80% of research asks the wrong questions or cannot be reproduced. Life-saving drugs are expensive to develop (3): $2.5BN median pharmaceutical spend per drug; 1/20 success rate of drugs. Health providers cannot save lives without the best information (4): preventable medical errors are the third largest cause of death in the US (heart disease 611k, cancer 585k, medical error 225k, respiratory illness 149k). Elsevier is in a unique position to make a contribution towards solving these challenges. Sources: 1. Industrial Research Institute 2. The Lancet 3. Tufts 4. World Health Organization
  • 3. ELSEVIER’S BUSINESS: PROVIDING ANSWERS FOR RESEARCHERS, DOCTORS AND NURSES My work is moving towards a new field; what should I know? • Journal articles, reference works, profiles of researchers, funders & institutions • Recommendations of people to connect with, reading lists, topic pages How should I treat my patient given her condition & history? • Journal articles, reference works, medical guidelines, electronic health records • Treatment plan with alternatives personalized for the patient How can I master the subject matter of the course I am taking? • Course syllabus, reference works, course objectives, student history • Quiz plan based on the student’s history and course objectives
  • 4. THE ROLE OF METADATA IN THE SECOND MACHINE AGE – DC-2016 / KØBENHAVN / 13 OCTOBER ANSWERS ARE ABOUT THINGS, NOT JUST WORKS Why shouldn’t a search on an author return information about the author, including the author’s works? Where was the author born, when did she live, what is she known for? … All of this is possible, but only if we can make some fundamental changes in our approach to bibliographic description. ... The challenge for us lies in transforming what we can of our data into interrelated “things” without overindulging that metaphor. Coyle, K. (2016). FRBR, before and after: a look at our bibliographical models. Chicago: ALA Editions.
  • 5. KNOWLEDGE GRAPHS DEFINED • Knowledge graphs are “graph structured knowledge bases (KBs) which store factual information in form of relationships between entities” (Nickel, M., Murphy, K., Tresp, V. and Gabrilovich, E. (2015). A review of relational machine learning for knowledge graphs. arXiv:1503.00759v3) • Knowledge graphs are metadata evolved beyond the focus on the work, linking people, concepts, things and events • Knowledge graphs are focused on things to provide answers
  • 9. ELSEVIER’S KNOWLEDGE PLATFORM [Diagram. Products: Reaxys, CK, Sherpath, Scopus, SD, ROS. Knowledge Graphs and Platforms & Shared Services (labels: Research, Healthcare, Life Sciences, Content, Life Sciences, Search, Identity, Research). Entity Hubs: Funder Hub, Article Hub, Profile Hub, Journal Hub, Institution Hub. Data & Content Sources: Usage logs, Pathways, EHRs, Articles, Authors, Institutions, Syllabi, Citations, Chemicals, Books, Drugs, Funders.]
  • 10. THE GROWTH OF SCIENCE COMPLICATES OUR EFFORTS
  • 11. MORE DOMAINS & MORE SPECIFICITY Gregory, K., Groth, P., Cousijn, H., Scharnhorst, A., & Wyatt, S. (2017). Searching Data: A Review of Observational Data Retrieval Practices. arXiv preprint arXiv:1707.06937. Some observations from @gregory_km survey: 1. The needs and behaviours of specific user groups (e.g. early career researchers, policy makers, students) are not well documented. 2. Background uses of observational data are better documented than foreground uses. 3. Reconstructing data tables from journal articles, using general search engines, and making direct data requests are common.
  • 15. HOW - OVERVIEW Content: Books, Articles, Ontologies ... Technologies: NLP, ML. Knowledge Graph: Concepts, snippets, meta data, … • Identification of concepts • Disambiguation • Domain/sub-domain identification • Abbreviations, variants • Gazetteering • Identification and classification of text snippets around concepts • Features building for concept/snippet pairs • Lexical, syntactic, semantic, doc structure … • Ranking concept snippet pairs • Machine learning • Hand made rules • Similarities • Deduplication • Curation • White list driven • Black list • Corrections/improvements • Evaluation • Gold set by domain • Random set by domain • By SMEs (Subject Matter Experts) • Automation • Content Enrichment Framework • Taxonomy coverage extension
  • 16. OmniScience Neuroscience: extension vocabularies by domains to provide coverage. Number of concepts / number of labels: OmniScience (01.16.11): 45969 / 47421; OmniScience Neuroscience branch (21/11/2016): 2356 / 2455; OmniScience Extension Neuroscience branch (21/11/2016): 23932 / 101276
  • 17. Examples of good and bad snippets
      Concept: Inferior Colliculus
      Bad: By comparing activation obtained in an equivalent standard (non-cardiac-gated) fMRI experiment, Guimaraes and colleagues found that cardiac-gated activation maps yielded much greater activation in subcortical nuclei, such as the inferior colliculus.
      Good: The inferior colliculus (IC) is part of the tectum of the midbrain (mesencephalon) comprising the quadrigeminal plate (Lamina quadrigemina). It is located caudal to the superior colliculus on the dorsal surface of the mesencephalon (Figure 36.7 FIGURE 36.7 Overview of the human brainstem; view from dorsal. The superior and inferior colliculi form the quadrigeminal plate. Parts of the cerebellum are removed.). The ventral border is formed by the lateral lemniscus. The inferior colliculus is the largest nucleus of the human auditory system. …
      Concept: Purkinje cells
      Bad: It is felt that the aminopyridines are likely to increase the excitability of the potassium channel-rich cerebellar Purkinje cells in the flocculus (Etzion and Grossman, 2001).
      Good: Purkinje cells are the most salient cellular elements of the cerebellar cortex. They are arranged in a single row throughout the entire cerebellar cortex between the molecular (outer) layer and the granular (inner) layer. They are among the largest neurons and have a round perikaryon, classically described as shaped “like a chianti bottle,” with a highly branched dendritic tree shaped like a candelabrum and extending into the molecular layer where they are contacted by incoming systems of afferent fibers from granule neurons and the brainstem…
      Concept: Olfactory Bulb
      Bad: The most common sites used for induction of kindling include the amygdala, perforant path, dorsal hippocampus, olfactory bulb, and perirhinal cortex.
      Good: The olfactory bulb is the first relay station of the central olfactory system in the vertebrate brain and contains in its superficial layer a few thousand glomeruli, spherical neuropils with sharp borders (Figure 1 Figure 1 Axonal projection pattern of olfactory sensory neurons to the glomeruli of the rodent olfactory bulb. The olfactory epithelium in rats and mice is divided into four zones (zones 1–4). A given odorant receptor is expressed by sensory neurons located within one zone of the epithelium. Individual olfactory sensory neurons express a single odorant receptor…
  • 19. One Weird Trick from Natural Language Processing (NLP) • Knowledge bases are populated by scanning text and doing Information Extraction • Most information extraction systems look for very specific things, like drug-drug interactions • This gives the best accuracy for that one kind of data, but misses all the other concepts and relations in the text • For a broad knowledge base, use Open Information Extraction, which uses only some knowledge of grammar • The weird trick for open information extraction … a simple algorithm, known as ReVerb*: 1. Find “relation phrases” starting with a verb and ending with a verb or preposition 2. Find noun phrases before and after the relation phrase 3. Discard relation phrases not used with multiple combinations of arguments. Example sentence: In addition, brain scans were performed to exclude other causes of dementia. * Fader et al. Identifying Relations for Open Information Extraction
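The three steps on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the ReVerb implementation: the tokens are tagged by hand with a made-up coarse tag set (N, V, P, D, J, R, O), the two "exercise" sentences are invented to show the filter, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical coarse POS tags for this sketch:
# N noun, V verb, P preposition/particle, D determiner, J adjective, R adverb, O other.
NP_TAGS = {"N", "J", "D"}
INSIDE_TAGS = {"V", "P", "N", "J", "D", "R"}

def relation_spans(tags):
    """Step 1: yield [start, end) spans that start with a verb and end with a verb or preposition."""
    i = 0
    while i < len(tags):
        if tags[i] != "V":
            i += 1
            continue
        end, j = i + 1, i + 1
        while j < len(tags) and tags[j] in INSIDE_TAGS:
            j += 1
            if tags[j - 1] in ("V", "P"):
                end = j  # longest prefix so far that ends in a verb or preposition
        yield i, end
        i = end

def noun_phrase(tokens, start, step):
    """Step 2: collect the run of noun-phrase tokens adjacent to the relation phrase."""
    words, k = [], start
    while 0 <= k < len(tokens) and tokens[k][1] in NP_TAGS:
        words.append(tokens[k][0])
        k += step
    return " ".join(words[::step])  # step == -1 restores left-to-right order

def extract(tokens):
    """Return (left argument, relation phrase, right argument) triples for one sentence."""
    tags = [tag for _, tag in tokens]
    triples = []
    for start, end in relation_spans(tags):
        left = noun_phrase(tokens, start - 1, -1)
        right = noun_phrase(tokens, end, +1)
        if left and right:
            rel = " ".join(word for word, _ in tokens[start:end])
            triples.append((left, rel, right))
    return triples

def lexical_filter(triples, min_pairs=2):
    """Step 3: keep relation phrases seen with at least min_pairs distinct argument pairs."""
    pairs = defaultdict(set)
    for left, rel, right in triples:
        pairs[rel].add((left, right))
    return [t for t in triples if len(pairs[t[1]]) >= min_pairs]

# The slide's example sentence, tagged by hand.
s1 = [("In", "P"), ("addition", "N"), (",", "O"), ("brain", "N"), ("scans", "N"),
      ("were", "V"), ("performed", "V"), ("to", "P"), ("exclude", "V"), ("other", "J"),
      ("causes", "N"), ("of", "P"), ("dementia", "N"), (".", "O")]
# Two made-up sentences, for illustration only.
s2 = [("exercise", "N"), ("lowers", "V"), ("the", "D"), ("risk", "N"), ("of", "P"), ("diabetes", "N")]
s3 = [("exercise", "N"), ("lowers", "V"), ("the", "D"), ("risk", "N"), ("of", "P"), ("stroke", "N")]

triples = [t for sentence in (s1, s2, s3) for t in extract(sentence)]
kept = lexical_filter(triples)
for triple in kept:
    print(triple)
```

In this toy run the greedy match on the slide's sentence yields one long, very specific relation phrase, which step 3 then discards because it occurs with only one argument pair. The published system uses a real part-of-speech tagger and a corpus-scale frequency threshold rather than these toy settings.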
  • 20. ReVerb output: SD documents scanned: 14,000,000; extracted ReVerb triples: 473,350,566
  • 21. Universal schemas • … are a specific technique from the Information Extraction and Automatic Knowledge Base Completion literature • … are an unsupervised method to ‘learn’ by combining text extracts with existing knowledge base assertions Applications: • Extend a medical knowledge base • scan incoming literature to suggest new additions to EMMeT and show the underlying evidence to the taxonomy editor • scan literature backlog to find evidence for data already in EMMeT • Literature surveillance • scan incoming literature to find existing facts even if expressed in very different ways • find new concepts in the literature related to an existing EMMeT concept*. Let the taxonomy editor decide whether to add the new concept and relation to EMMeT * Using newer Universal Schema algorithms than described in next few slides
  • 22. Universal schemas – Predict ‘missing’ EMMeT facts • Make a matrix: • columns for the relation phrases from ReVerb or the semantic relations from EMMeT • rows are the pairs of concepts linked by a relation • a ‘1.0’ in a cell if those concepts were linked by that relation • outlined cells in the diagram are the ones initialized to 1 • Factorize the matrix to ExK and KxR, then recombine • “Learns” the correlations between text relations and EMMeT relations, in the context of pairs of objects • Cells going from 0 to > 0 indicate potential new facts • Find new triples to go into EMMeT, e.g., (glaucoma, has_alternativeProcedure, biofeedback)
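A toy version of the "factorize, then recombine" step is sketched below. It does not use the Universal Schema training procedure; as a stand-in it computes a rank-K approximation by power iteration with deflation, in plain Python. The pair names, relation names and matrix values are all hypothetical.

```python
def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def norm(v):
    return sum(x * x for x in v) ** 0.5

def top_singular(M, iters=500):
    """Power iteration on M^T M: returns (sigma, u, v) with M ~ sigma * u * v^T."""
    Mt = [list(col) for col in zip(*M)]
    v = [1.0] * len(M[0])
    for _ in range(iters):
        w = matvec(Mt, matvec(M, v))
        n = norm(w)
        v = [x / n for x in w]
    Mv = matvec(M, v)
    sigma = norm(Mv)
    return sigma, [x / sigma for x in Mv], v

def factorize(X, K):
    """Return row factors (E x K) and column factors (K x R) of a rank-K approximation."""
    residual = [row[:] for row in X]
    row_factors = [[] for _ in X]
    col_factors = []
    for _ in range(K):
        sigma, u, v = top_singular(residual)
        for i, ui in enumerate(u):
            row_factors[i].append(sigma * ui)
        col_factors.append(v)
        # Deflate: remove the component just found before looking for the next one.
        residual = [[residual[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
                    for i in range(len(u))]
    return row_factors, col_factors

# Hypothetical toy data: rows are concept pairs, columns are relations.
pairs = ["pair_0", "pair_1", "pair_2", "pair_3", "pair_4"]
relations = ["text:r1", "text:r2", "text:r3", "kb:A", "kb:B"]
X = [
    [1.0, 1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0, 0.0],  # same text relations as pair_0 and pair_1, but no kb:A fact yet
]

row_factors, col_factors = factorize(X, K=2)
scores = [[sum(rf[k] * col_factors[k][j] for k in range(len(col_factors)))
           for j in range(len(relations))] for rf in row_factors]

# Cells that were 0 in X and are clearly above 0 after recombining are candidate facts.
candidates = sorted(
    ((pairs[i], relations[j], scores[i][j])
     for i in range(len(pairs)) for j in range(len(relations))
     if X[i][j] == 0.0 and scores[i][j] > 0.1),
    key=lambda c: -c[2],
)
print(candidates)
```

The only cell that rises noticeably above zero is (pair_4, kb:A), because pair_4 shares its text relations with pairs that already carry that knowledge base relation. The 0.1 cut-off is arbitrary; a real system would hand ranked candidates to an editor rather than apply a fixed threshold.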
  • 24. Medical Graph – Statistical correlations at scale [Diagram: ICD nodes I65 Occlusion and stenosis of precerebral arteries, G40 Epilepsy, I61 Intracerebral hemorrhage and C71 Malignant neoplasm of brain, linked by has_successor edges; odds ratio: 1.12] Criteria (1): • Correlation selected by predictive modelling algorithm • No. of relations is higher than in mirrored relation • p-value < 0.05 • Odds ratios balanced over all covariates. Data: primary care, secondary care, drug prescriptions and other covariates; 5m patients each, 6 years longitudinality. (1) Criteria based on: Jensen et al.: Temporal disease trajectories condensed from population-wide registry data covering 6.2 million patients. Nature Communications, 2014 Jun 24; 5:4022. doi: 10.1038/ncomms5022.
  • 25. Medical Graph in practice, patient 35: risk of depression • 49 year old man • Dx: overweight, diabetes, hypertension, anxiety disorder → has an absolute risk of 36% of developing depression within the next 4 years
  • 26. … and the rationale for why the model thinks this
  • 27. Analysis design: predict 4-year long-term effects, balanced for all covariables • Targets for prediction: ICD-coded diagnoses • Only incident patients per diagnosis considered, i.e. diagnosis-free 2009–2010 • If these patients remain diagnosis-free 2011–2014 (observation period), then 0, else 1 • Covariates: all ICD/ATC codes, age and sex measured in 2010. Example: model to predict “I50 – Heart Failure” [Timeline: I50-free patients 2009–2010, covariates measured in 2010; 2011–2014: remaining I50-free patients (coded as 0) / newly I50-diagnosed patients (coded as 1)]
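The labelling rule in this analysis design can be written down directly. The sketch below assumes a hypothetical record layout (a dict from ICD code to the set of years in which it was coded) and made-up patients; only the year windows come from the slide.

```python
def label_patient(diagnosis_years, target="I50"):
    """Return 0/1 for an incident-cohort patient, or None if the patient is excluded.

    diagnosis_years: dict mapping an ICD code to the set of years in which it was coded
    (a hypothetical layout chosen for this sketch).
    """
    years = diagnosis_years.get(target, set())
    if any(2009 <= y <= 2010 for y in years):
        return None  # not diagnosis-free at baseline, so not an incident patient
    # Observation period 2011-2014: newly diagnosed -> 1, still diagnosis-free -> 0.
    return 1 if any(2011 <= y <= 2014 for y in years) else 0

# Made-up patients for illustration.
patients = {
    "p1": {"I10": {2009, 2010}},               # never coded I50
    "p2": {"I10": {2010}, "I50": {2013}},      # first coded I50 in the observation period
    "p3": {"I50": {2009, 2012}},               # already coded I50 at baseline
}
labels = {pid: label_patient(dx) for pid, dx in patients.items()}
print(labels)
```

Excluding prevalent cases before labelling is what makes the target "newly diagnosed within four years" rather than "has the diagnosis".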
  • 28. (A) Integrate & clean: research on anonymized claims data. Billing data flow from 60+ sickness funds. Sources: primary care (visits & diagnoses); secondary care (visits, diagnoses & procedures); drug prescriptions; other data (further cooperations just started; will enable analysis of vital and laboratory parameters). Data integration & cleaning: • Data cleaning • Longitudinally linked & integrated for analytics • Anonymized. 6 million patients, 6 years, > 1.5b events
  • 29. (B) Mine & learn: calculate statistics & build prediction models for ~1600 targets. Technology stack: Feature extraction: for 3.8m patients: • age, gender • all diagnoses: ICD10-coded, 3 digits, i.e. 2054 codes • all medications: ATC-coded, 5 digits, i.e. 906 codes • death, hospitalization. Results in 6277 features: • 1623 targets, 2011-2014 • 2320 covariates, 2010 • 2334 filter-columns, 2009-2010. Data mining: calculate prevalence, incidence, mean age for all covariates (i.e. diseases and medications). Machine learning: predictive modelling for ~1600 targets • Linear classification model, resulting in odds ratios • Calculation of p-values
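As a reminder of what an odds ratio expresses, here is a minimal sketch using a 2x2 table with made-up counts. It is the unadjusted case only: for a logistic regression with a single binary covariate, the exponential of the fitted coefficient equals this ratio, whereas the models described on the slide are balanced over many covariates.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Unadjusted odds ratio from a 2x2 table of counts."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed

# Hypothetical counts: 30 of 100 patients with a covariate develop the target diagnosis,
# against 20 of 100 patients without it.
example = odds_ratio(30, 70, 20, 80)
print(round(example, 2))
```

An odds ratio above 1 means the target diagnosis is more likely when the covariate is present; a value such as the 1.12 shown earlier in the talk indicates a modest increase.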
  • 30. KNOWLEDGE GRAPHS AS A BACKBONE
  • 32. EMMeT • Total concepts = 540,632 • 100+ person years of clinical expert knowledge
  • 33. EXPANDING TO INCLUDE IMAGE SNIPPETS
  • 34. Enrichment through integration using linked data
  • 35. CONCLUSION • Knowledge graphs are critical components for delivering customer value • AI techniques such as machine learning and predictive modelling from data are key parts of knowledge graph construction • This is particularly the case as the amount, speed and specificity of data and requirements accelerate • Leveraging existing assets such as ontologies, data, and controlled vocabularies (i.e. connected data) has been key for Elsevier in the build out of knowledge graphs • How all this enables intelligence-based solutions is a topic for another talk • Oh and we are hiring :) • Paul Groth p.groth@elsevier.com
  • 36. PANEL: DOES CONNECTED DATA NEED AI OR AI NEED CONNECTED DATA? Tweet your questions via Direct Message to @Connected_Data or #ConnectedData MODERATOR Szymon Klarman Knowledge Architect PANELIST Spyros Kotoulas Research Scientist IBM Research Ireland @SpyrosKotoulas PANELIST Tara Raafat Chief Ontologist NextAngles PANELIST Freddy Lecue Principal Scientist Accenture Technology Labs @freddylecue PANELIST Paul Groth Disruptive Technology Director Elsevier Labs @pgroth

Editor's notes

  1. Work with DANS. Reviewed 400 papers; deep dive into 114.
  2. Labs researcher, Paul Groth, came across a technique called “Universal Schemas” in 2015. The technique allows one to build a knowledge base that combines structured info like EMMeT with unstructured info like ReVerb triples. It can learn from the correlations that already exist and suggest things that are missing. Unsupervised is very important because it means the construction of the rough underlying knowledge base is scalable and not limited by the availability of experts.
  3. So now we have a method that can point out facts to investigate. Not shown here is a separate program, the Unwinder, that looks at the numbers in the matrix and unwinds the operations to get back to the original text that suggested the new fact. The person editing EMMeT can look at those pieces of text and mark whether they are relevant and, if they are, whether the new fact should be added. The links to the literature come along as the evidence. This doesn’t make the decision on what strength value to give, but it does put the evidence for it in one place. Raw predictions are not good enough for fully automatic operation, but they are plenty good enough to help taxonomy editors and other people do their jobs much faster. This kind of thing can also work the other way: given an EMMeT fact, scan the literature for evidence for it, even if it is expressed in some rare way.