The inescapable rise of machine learning (ML) and artificial intelligence (AI) challenges the role of existing text analytics techniques such as Named Entity Recognition and Natural Language Processing in extracting information from scientific text. These techniques often rely on underlying ontologies to provide the semantic foundation for more complex linguistic and statistical analysis. This paper investigates how ontologies and ontology-led text analysis fit with emerging ML/AI algorithms, and the synergies gained by combining the two approaches. We highlight real-world use cases from across the Pharmaceutical and Life Science sector where SciBite’s text analytics systems have been employed to create next-generation enterprise data infrastructure for many of the world’s leading companies.
Are Ontologies Relevant In A Machine Learning World
1. Are Ontologies Relevant In A Machine Learning World?
Dr. Lee Harland
Founder & Chief Scientific Officer, SciBite Limited
April 2019
2. The good news is I have discovered inefficiencies…
…The bad news is that you are one of them.
https://timoelliott.com/blog/cartoons/artificial-intelligence-cartoons
3. Ontologies Enable Us To Communicate
Knowledge Graph
Ontology
Taxonomy
Categorisation
Thesaurus
Controlled Vocabulary
List
Human Validated (Consensus/Authority)
Machine Understandable
Vital to ensure we’re all talking about the same thing!
NCBI:txid9365
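The idea of everyone "talking about the same thing" can be sketched as a lookup from free-text names to one shared identifier. This is a minimal illustration, not SciBite's implementation: the vocabulary, labels, synonyms, and function names below are hypothetical, and NCBI:txid9365 is assumed here to be the NCBI Taxonomy entry for the European hedgehog.

```python
# Hypothetical mini-vocabulary: identifier -> preferred label and synonyms.
# Only the identifier NCBI:txid9365 comes from the slide; the label and
# synonyms are assumptions made for illustration.
VOCAB = {
    "NCBI:txid9365": {
        "label": "Erinaceus europaeus",
        "synonyms": ["European hedgehog", "western European hedgehog", "common hedgehog"],
    },
}


def build_index(vocab):
    """Map every case-folded label and synonym to its identifier."""
    index = {}
    for identifier, entry in vocab.items():
        for name in [entry["label"], *entry["synonyms"]]:
            index[name.casefold()] = identifier
    return index


def normalise(term, index):
    """Return the identifier for a free-text term, or None if it is unknown."""
    return index.get(term.strip().casefold())


INDEX = build_index(VOCAB)
print(normalise("European Hedgehog", INDEX))
print(normalise("Erinaceus europaeus", INDEX))
```

Both calls resolve to the same identifier, so two people using different names end up referring to the same entry.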
4. What Do We Know About “Viagra”?
https://bioportal.bioontology.org
https://www.ebi.ac.uk/ols
7. Ontologies To Interpret ML Output Well Established
+100s more use ontologies to interpret output from ML genomics analysis
8. Ontologies Enhancing Training & Execution
… Clinically-driven taxonomy of disease… useful in generating training classes that are both well-suited for machine learning classifiers and medically relevant.
… Taxonomy provides a 2-level validation strategy…
https://www.nature.com/articles/nature21056
doi:10.1038/nature21056
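One way to read the "2-level validation strategy" is to score a classifier both on its fine-grained classes and on the coarser parent classes that the taxonomy assigns to them. The sketch below illustrates that reading only. It does not reproduce the evaluation in the cited paper, and the taxonomy, class names, and function names are hypothetical.

```python
# Hypothetical two-level taxonomy: fine-grained class -> coarse parent class.
PARENT = {
    "melanoma": "malignant",
    "basal cell carcinoma": "malignant",
    "nevus": "benign",
    "seborrheic keratosis": "benign",
}


def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the prediction equals the true label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def two_level_accuracy(true_fine, predicted_fine):
    """Score predictions at the fine level and again after rolling up to parents."""
    fine = accuracy(true_fine, predicted_fine)
    coarse = accuracy(
        [PARENT[t] for t in true_fine],
        [PARENT[p] for p in predicted_fine],
    )
    return fine, coarse


true_fine = ["melanoma", "nevus", "basal cell carcinoma", "nevus"]
pred_fine = ["basal cell carcinoma", "nevus", "basal cell carcinoma", "seborrheic keratosis"]
fine_acc, coarse_acc = two_level_accuracy(true_fine, pred_fine)
print(fine_acc, coarse_acc)
```

In this toy data, the two fine-level mistakes stay within the correct parent class, so the coarse score is higher than the fine score. The taxonomy makes that distinction between "wrong but medically close" and "wrong" visible.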
9. Enhancing Training & Execution (2)
doi:10.1038/s41591-018-0335-9
https://www.nature.com/articles/s41591-018-0335-9
https://www.newscientist.com/article/2193361-ai-can-diagnose-childhood-illnesses-better-than-some-doctors/
10. What’s Good For One….
…The overarching principles in DeepQA are massive parallelism, many experts, pervasive confidence estimation, and integration of shallow and deep knowledge…
…In addition to the content for the answer and evidence sources, DeepQA leverages other kinds of semistructured and structured content. Another step in the content-acquisition process is to identify and collect these resources, which include databases, taxonomies, and ontologies, such as dbPedia, WordNet, and the Yago ontology…
https://www.aaai.org/Magazine/Watson/watson.php
13. Virtuous Circle Of Machine & Human Learning
Training
Execution
Interpretation
Enrichment
Validation
14. Pfizer Acquisition Challenge
• Large numbers of disorganised documents (i.e. CRO documents)
• Need to align these to internal taxonomy of categories (e.g. M4 hierarchy from FDA)
• Also need to identify key pieces of metadata (e.g. what is the study compound? Title? Assay… etc.)
• Manual process, incredibly time consuming
CREDIT: Pfizer Computational Sciences
http://www.bio-itworld.com/2018/08/08/a-new-machine-learning-approach-to-document-classification-a-pfizer/scibite-collaboration.aspx
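To make the task of aligning documents to a taxonomy of categories concrete, here is a deliberately simple sketch that assigns each document to the category whose vocabulary terms it mentions most often. It is plain dictionary term counting, not the machine learning approach described in the linked article, and the category names, terms, and function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical categories and indicative terms. A real system would draw
# these from curated ontologies rather than a hand-written dictionary.
CATEGORY_TERMS = {
    "Quality": ["stability", "impurity", "batch analysis"],
    "Nonclinical": ["toxicology", "in vitro", "repeat-dose"],
    "Clinical": ["patients", "placebo", "adverse event"],
}


def score_document(text, category_terms):
    """Count whole-phrase term matches per category (case-insensitive)."""
    lowered = text.casefold()
    scores = Counter()
    for category, terms in category_terms.items():
        for term in terms:
            pattern = r"\b" + re.escape(term.casefold()) + r"\b"
            scores[category] += len(re.findall(pattern, lowered))
    return scores


def classify(text, category_terms):
    """Return the best-scoring category, or None when no term matches."""
    scores = score_document(text, category_terms)
    category, hits = scores.most_common(1)[0]
    return category if hits > 0 else None


doc = "A 28-day toxicology study with in vitro follow-up assays."
print(classify(doc, CATEGORY_TERMS))
```

A document with no matching terms returns None, which marks it for manual review instead of forcing it into a category.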
18. Ontologies Are The Key To Unlocking FAIR
… Therefore, we are confident that the true cost of not having FAIR research data is much higher than the estimated €10.2bn per year…
https://publications.europa.eu/en/publication-detail/-/publication/d375368c-1a0a-11e9-8d04-01aa75ed71a1
https://www.go-fair.org
21. F.A.I.R @ BMS
• Public Bioassay Ontology
• Augmented with BMS-specific terms
• Users can suggest new assays etc.
• Reactive, semantic form fields
CREDIT: BMS AIMS Team
22. SciBite Semantics Empower Scientific Infrastructure
F.A.I.R Data Catalogues
Knowledge Graphs
Analytics & ML/A.I
Semantic Q&A
Improve Integrity
Smart Data Entry
Semantic Search
Electronic Lab Notebooks
L.I.M.S.
Assay Registration
Asset Management
Semantic MDM
Ontology Management
Departmental Search
Enterprise Search
Pharmacovigilance
Drug Repurposing
Horizon Scanning
Phenotype Triangulation
Outcomes Prediction
Portfolio Analytics
Clinical Data Search
Genomic Data Processing
F.A.I.R & Search
Data Stewardship
Data Mining & AI
23. Data’s Dynamic Duo
Acknowledgements
BMS: AIMS Team
AZ: R&D Search Team, Integrative Informatics Team, Sinequa
Pfizer: Computational Sciences CoE
LifeArc
All my colleagues at SciBite
ML