We have created structure-based chemical ontologies that are used to classify chemical compounds automatically. These classifications can be used successfully in semantic search engines to find all representatives of a chemical class. In this paper we demonstrate use cases in which these chemical classes serve as features in typical machine learning approaches.
To this end, we used the co-occurrence of chemical compounds with biological and physico-chemical properties in scientific articles to train models that predict properties of novel compounds that did not occur in the training sets. Examples include the prediction of hepatotoxicity and bioavailability. In principle, any property found in the textual vicinity of compounds can be used to build such predictive models. Criteria will be presented for judging the quality and predictive power of such models.
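The idea of using ontology classes as features can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name a model, so a small Bernoulli naive Bayes classifier is swapped in here, and the class names (`class_a`, ...), the labels, and the function names are all invented placeholders rather than real chemical classes or measured properties.

```python
import math
from collections import defaultdict

def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model.

    samples: list of sets of ontology class names (one set per compound).
    labels:  list of booleans, e.g. "property co-occurs with the compound".
    alpha:   Laplace smoothing constant.
    """
    vocab = sorted(set().union(*samples))
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    for classes, label in zip(samples, labels):
        class_counts[label] += 1
        for name in classes:
            feature_counts[label][name] += 1
    model = {"vocab": vocab, "priors": {}, "probs": {}}
    for label, n in class_counts.items():
        model["priors"][label] = n / len(labels)
        # Smoothed probability that a compound with this label is in each class.
        model["probs"][label] = {
            name: (feature_counts[label][name] + alpha) / (n + 2 * alpha)
            for name in vocab
        }
    return model

def predict(model, classes):
    """Return the most probable label for a compound given its class set."""
    best_label, best_score = None, -math.inf
    for label, prior in model["priors"].items():
        score = math.log(prior)
        for name in model["vocab"]:
            p = model["probs"][label][name]
            score += math.log(p if name in classes else 1.0 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data (hypothetical): class memberships and a boolean property.
samples = [{"class_a", "class_b"}, {"class_a", "class_c"}, {"class_b"}, {"class_c"}]
labels = [True, True, False, False]
model = train_bernoulli_nb(samples, labels)

# A "novel" compound that was not part of the training set.
prediction = predict(model, {"class_a", "class_b", "class_c"})
```

In practice the class sets would come from the automatic ontology classification and the labels from text co-occurrence, and the model would be evaluated on held-out compounds.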
AI-SDV 2020: AI-augmented Question Answering and Semantic Search for Life Science Enterprises Angela Bauch (Biomax Informatics, Germany)
1. www.biomax.com
AILANI
Artificial Intelligence LANguage Interface
AI-augmented Question Answering and Semantic Search
for Life Science Enterprises
Angela Bauch (Assistant Director Product Management)
Biomax Informatics AG
6.10.2020 AI-SDV
2. Ask AILANI...
What increases the temperature leading to superconducting?
How much CO2 is released into the atmosphere?
What is a Bcr Abl inhibitor?
What is the Mooney viscosity of butadiene styrene rubber?
How does SARS compare to MERS?
Who won the Nobel Prize in economics?
How to treat schizophrenia?
What gene is associated with speech?
How to prevent cytokine storms in COVID-19 infections?
What are efficient organocatalysts?
4. Benefit from AI and Natural Language Processing
AI answers
Keyword results
5. Keyword results
Deeper insight with AILANI
Results panel
Drill-down relevant results based
on public/proprietary data
Infobox with context information based
on public and proprietary data
AI suggestions
AI answers
Search journey with smart
breadcrumbs
6. Semantic Search Results
combining:
- Results from natural language queries
Machine learning system
- Results matching with "beliefs"
NLP query parser
- Results on scientific concepts and
subsequent manual filtering
Artificial Intelligence (AI) with AILANI
Semantic Objects
(Growing Meta Ontology /
Semantic Network)
tagging
indexing
NLP Algorithms
Beliefs
(Triples + Evidence)
Global Search
Additional
specialized Searches
Specialized
data detectors
OCR, OSR,
pattern detectors
Inhouse Documents
News Archive
Unstructured data
Chemical Search
Public / Proprietary
Data Resources
Structured data
Semantic Core
Medline abstracts
PMC full texts
Newsfeeds
COVID-19 patents
7. Content sources (https://www.ailani.ai)
> Literature
> Medline abstracts
> PubMedCentral articles
> ClinicalTrials.gov
> COVID-19 lit (Elsevier, medRxiv, bioRxiv)
> Newsfeeds (WHO, FDA, FoodNav...)
> COVID-19 patents
> extendable by proprietary documents (Word,
Powerpoint, ELN)
> Databases
> 70 public databases (UniProtKB, PubChem,
ChEMBL, DrugBank, PDB, ICD-10, dbSNP…)
> extendable by proprietary structured data (from
antibody inventory to chemical library and IP
portfolio)
> Ontologies
> 120 life science ontologies (Disease, GO,
NCBITax, ChEBI, FMA, HP…)
> extendable by proprietary ontologies
8. The AILANI Knowledge Graph
The AILANI starter knowledge base has ~44 million triples. Customer implementations of AILANI typically handle > 400 million triples.
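The slides describe beliefs as "Triples + Evidence". One way to picture such a record is the sketch below. It makes no claim about AILANI's internal data model: the names `Triple`, `Belief`, and `BeliefStore`, and the document identifiers, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object statement; hashable so it can be a dict key."""
    subject: str
    predicate: str
    obj: str

@dataclass
class Belief:
    """A triple plus the sources that support it."""
    triple: Triple
    evidence: list = field(default_factory=list)

class BeliefStore:
    """Tiny in-memory store that merges evidence for identical triples."""

    def __init__(self):
        self._beliefs = {}
        self._by_subject = defaultdict(set)

    def add(self, subject, predicate, obj, source):
        triple = Triple(subject, predicate, obj)
        if triple not in self._beliefs:
            self._beliefs[triple] = Belief(triple)
        self._beliefs[triple].evidence.append(source)
        self._by_subject[subject].add(triple)
        return self._beliefs[triple]

    def about(self, subject):
        """Return all beliefs whose subject matches, in a stable order."""
        triples = sorted(
            self._by_subject.get(subject, ()),
            key=lambda t: (t.predicate, t.obj),
        )
        return [self._beliefs[t] for t in triples]

store = BeliefStore()
# Document ids are placeholders, not real sources.
store.add("DRD2", "encodes", "dopamine receptor D2", "doc-001")
store.add("DRD2", "encodes", "dopamine receptor D2", "doc-002")
beliefs = store.about("DRD2")
```

Keeping the evidence list next to the triple lets a search result point back to the passages that support a statement.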
12. Semantic Search Algorithms — AI Question Answering
> From all documents, a candidate set of 4-sentence
chunks is determined, which likely contain possible
answers.
> The candidate set is generated using the semantic
tagging index, using ontologies for contextualization.
Therefore, the quality of the ontologies has a certain
influence on the final answer generated by the AI
models.
> From the candidate set of 4-sentence chunks, a neural
network extracts direct answers to the entered
question.
> Based on the ML score, the answers are reported in
ranked order
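The four steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the AILANI implementation: chunks are taken as non-overlapping, tagging is plain substring matching against a hypothetical synonym dictionary, and a simple word-overlap score stands in for the neural answer extractor. All function names are invented for the example.

```python
import re

def four_sentence_chunks(text, size=4):
    """Split text into consecutive, non-overlapping chunks of `size` sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]

def tag(text, ontology):
    """Return ids of ontology concepts whose synonyms occur in the text.

    `ontology` maps a concept id to a list of lowercase synonyms (hypothetical).
    """
    lowered = text.lower()
    return {cid for cid, synonyms in ontology.items()
            if any(s in lowered for s in synonyms)}

def candidate_chunks(question, chunks, ontology):
    """Keep chunks that share at least one ontology tag with the question."""
    question_tags = tag(question, ontology)
    return [c for c in chunks if question_tags & tag(c, ontology)]

def overlap_score(question, chunk):
    """Stand-in for the neural extractor: fraction of question words in the chunk."""
    q_words = set(re.findall(r"\w+", question.lower()))
    c_words = set(re.findall(r"\w+", chunk.lower()))
    return len(q_words & c_words) / len(q_words) if q_words else 0.0

def rank_answers(question, candidates, score_fn):
    """Score each candidate and return (score, chunk) pairs, best first."""
    scored = [(score_fn(question, c), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Toy inputs for illustration only.
ontology = {
    "dopamine_receptor": ["dopamine receptor", "drd2"],
    "rubber": ["rubber"],
}
text = (
    "The gene DRD2 encodes a dopamine receptor. "
    "It is expressed in the basal ganglia. "
    "This sentence is filler. "
    "This sentence is also filler. "
    "Rubber viscosity is reported in Mooney units. "
    "This chunk is unrelated to the question. "
    "More filler follows here. "
    "This is the end of the text."
)
question = "Which gene encodes the dopamine receptor?"

chunks = four_sentence_chunks(text)
candidates = candidate_chunks(question, chunks, ontology)
ranked = rank_answers(question, candidates, overlap_score)
```

Because candidates are selected through the tag index, a missing synonym in the ontology removes a chunk before the scoring model ever sees it, which is why ontology quality affects the final answers.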
[Figure: question answering pipeline. Recoverable labels: Query, Ontology Tagging, Ontology Tags, Candidate Sentences, Text Query, Answer Extractor Model, Ranked List of Answers, Question Answering]
21. Gene expression of DRD2 in humans
Data source: Human Allen Brain Atlas. Hawrylycz et al., 2012, Nature
In: NeuroXM™ Brain Science Suite.
22. Full brain fiber tracking in humans
Data source: Human Connectome Project. Van Essen et al. (2013), NeuroImage
In: NeuroXM™ Brain Science Suite.
23. DRD2 expression and brain connectivity in basal ganglia
NeuroXM™ Brain Science Suite.
24. Conclusions
• Find inhibitors
• Explore the knowledge graph
• Get the chemical and patent perspective from a potential drug: ruxolitinib
Search
Which gene encodes for dopamine receptor in human?
• Get detailed background information on DRD2
• Get gene expression data in brain regions (mouse; human)
• Get connectivity data in basal ganglia
Search
How to mitigate COVID-19 associated cytokine storm?