SlideShare une entreprise Scribd logo
1  sur  34
Télécharger pour lire hors ligne
Cloud hosted APIs for
cheminformatics
designed for real time
user interfaces
Alex M. Clark, Ph.D.
March 2014
© 2014 Molecular Materials Informatics, Inc.!
http://molmatinf.com
MOLECULAR MATERIALS INFORMATICS
Data Regimes
• Differences in kind based on size:
- small: <1000 molecules; document-sized
- medium: <100K; filesystem, heavy duty
- large: database servers; limited operations
• Nimble client (mobile apps, web) either:
- operate on small collections
- limited window onto large collections
• Workflows using medium data are tricky
!2
MOLECULAR MATERIALS INFORMATICS
Overview
• Describing a workflow for tuberculosis; doing
scaffold analysis, model building, open data
• Split into:
- mobile apps as the user interface
- cloud-hosted algorithms for hard work and
access to large data
- desktop-based sections for medium data
• Mobile+cloud very convenient for small data,
and for well established tasks
• Desktop still primary for method development
!3
MOLECULAR MATERIALS INFORMATICS
TB Mobile
• Begins with a mobile app:
- ~90 curated targets
- ~ 800 molecules
• TB inhibition data abundant,
but mostly no target info
• Want all the actives against
the inhA target (157)
• Generate leads using
scaffold analysis
!4
MOLECULAR MATERIALS INFORMATICS
TB Mobile
• Begins with a mobile app:
- ~90 curated targets
- ~ 800 molecules
• TB inhibition data abundant,
but mostly no target info
• Want all the actives against
the inhA target (157)
• Generate leads using
scaffold analysis
!4
MOLECULAR MATERIALS INFORMATICS
TB Mobile
• Begins with a mobile app:
- ~90 curated targets
- ~ 800 molecules
• TB inhibition data abundant,
but mostly no target info
• Want all the actives against
the inhA target (157)
• Generate leads using
scaffold analysis
!4
MOLECULAR MATERIALS INFORMATICS
TB Mobile
• Begins with a mobile app:
- ~90 curated targets
- ~ 800 molecules
• TB inhibition data abundant,
but mostly no target info
• Want all the actives against
the inhA target (157)
• Generate leads using
scaffold analysis
!4
MOLECULAR MATERIALS INFORMATICS
TB Mobile
• Begins with a mobile app:
- ~90 curated targets
- ~ 800 molecules
• TB inhibition data abundant,
but mostly no target info
• Want all the actives against
the inhA target (157)
• Generate leads using
scaffold analysis
!4
MOLECULAR MATERIALS INFORMATICS
Scaffold Fragments
• What medicinally relevant scaffolds to use?
!5
157 related
compounds
scaffoldy!
fragments
TB activity
structures
templatey!
scaffolds
…
MOLECULAR MATERIALS INFORMATICS
Filtering Scaffold Candidates
• Candidates analysed & trimmed
• Overall architecture is a stream
!6
Read InhA Fragment Merge
Sort
PropertiesFilterWrite
HeavyAtoms
Isomorphisms
Macrocycles
Frequency
157 molecules
124 fragments
MOLECULAR MATERIALS INFORMATICS
Pipelining
• Not quite cloud (yet)
• Infrastructure for streaming
nodes together: build
workflows using a script
• Roadmap: build selected
workflows, out of
prepackaged nodes
• Expose as webservices: for
use by mobile apps
!7
{
"op":"com.mmi.core.op.CollapseUnique",
"id":102,
"name":"Collapse",
"parameters":
{
"keyColumn":"Molecule",
"countColumn":"Degeneracy",
"collapseColumn":["Target"],
"collapseOperator":[","]
},
"inputs":[[101,1]],
"outputs":1
},
{
"op":"com.mmi.core.op.Sort",
"id":103,
"name":"Collapse",
"parameters":
{
"columns":["Degeneracy"],
"directions":[-1]
},
"inputs":[[102,1]],
"outputs":1
},
{
"op":"com.mmi.core.op.MoleculeProperties",
"id":104,
"name":"Properties",
"parameters":
{
"heavyAtoms":"HeavyAtoms",
"isomorphisms":"Isomorphisms",
"macrocycles":"Macrocycles"
},
"inputs":[[103,1]],
"outputs":1
},
{
"op":"com.mmi.core.op.FilterProperties",
"id":105,
"name":"Filter",
"parameters":
{
"name":["HeavyAtoms","Isomorphisms","Macrocycles"],
"operator":[">=","<=","="],
"value":[10,4,0]
},
"inputs":[[104,1]],
"outputs":1
},
MOLECULAR MATERIALS INFORMATICS
Fragmentation
• Consider each structure: break it into pieces,
enumerate scaffold-like fragments
!8
MOLECULAR MATERIALS INFORMATICS
Decorating
• Have scaffoldy fragments, 5425 measurements
!9
• Do a trial matching:
templates & stats
MOLECULAR MATERIALS INFORMATICS
Scaffold Selection
!10
Assays
Filter
5425 molecules
Templates
Precursor
• Keep molecules based on at
least one template
• Output is suitable for the next
stage in the workflow
87 actives
138 inactives
MOLECULAR MATERIALS INFORMATICS
SAR Table App
• Back to mobile apps: want to deliver the 225
compounds to iPad/iPhone…
- email
- dropbox
- web
• SAR Table app designed for small documents:
content creation, focused analysis, and cloud-
assisted functions
!11
MOLECULAR MATERIALS INFORMATICS
Import
• Launch datasheet, draw first scaffold…
!12
MOLECULAR MATERIALS INFORMATICS
Import
• Launch datasheet, draw first scaffold…
!12
MOLECULAR MATERIALS INFORMATICS
Scaffold Assignment
• Ask the webservice to assist: complex, fast
!13
MOLECULAR MATERIALS INFORMATICS
Scaffold Assignment
• Ask the webservice to assist: complex, fast
!13
MOLECULAR MATERIALS INFORMATICS
Scaffold Assignment
• Ask the webservice to assist: complex, fast
!13
MOLECULAR MATERIALS INFORMATICS
Multi-Scaffold Assignment
!14
• Assign scaffolds in bulk: complex, quite fast
MOLECULAR MATERIALS INFORMATICS
Multi-Scaffold Assignment
!14
• Assign scaffolds in bulk: complex, quite fast
MOLECULAR MATERIALS INFORMATICS
Multi-Scaffold Assignment
!14
• Assign scaffolds in bulk: complex, quite fast
MOLECULAR MATERIALS INFORMATICS
More Data
• Have scaffolds and
substituents assigned
• Can gain valuable
insight just from that
• What about public
databases: what else do
our 3 scaffolds match?
!15
MOLECULAR MATERIALS INFORMATICS
ChemSpider
Searching
• Search for a template; optionally narrow
substituent values; want only new compounds
!16
initiate
MetaSearch
poll
• Substructure searches farmed out
to well known large data services
• Middleware post-processes with
scaffold analysis & assignment
PubChem
MOLECULAR MATERIALS INFORMATICS
Results
• Results are marked up
• Uses existing
fragments for context
• No duplicate structures
• All compounds are
known…
• … can be made or
purchased.
!17
MOLECULAR MATERIALS INFORMATICS
Results
• Results are marked up
• Uses existing
fragments for context
• No duplicate structures
• All compounds are
known…
• … can be made or
purchased.
!17
MOLECULAR MATERIALS INFORMATICS
Model Building
• Use structures with known
activities to create a
structure-activity model
!18
WebService
data
partial
model
final
model
• Slow calculation, small data
MOLECULAR MATERIALS INFORMATICS
Model Application
• Predicted activities for looked-up compounds…
!19
MOLECULAR MATERIALS INFORMATICS
Matrix View
• Plot R1 vs R4: examine second order SAR
!20
MOLECULAR MATERIALS INFORMATICS
Filling in Blanks
• Each blank cell: create &
score chimeric structures
• Gather distribution of
activities
• Total calculation: slow
• Performance: overhead
amortised in blocks (e.g.
10 cells per request)
!21
MOLECULAR MATERIALS INFORMATICS
Matrix Predictions
• Shows measured, available & hypothetical…
!22
MOLECULAR MATERIALS INFORMATICS
Conclusion
• Mobile+cloud can accomplish many
sophisticated tasks
• Stateless webservices very easy to deploy
• Work on small datasets, use large databases
• Medium sized data is problematic
• Can fallback to desktop: facile communication
• Apps & webservices very well suited to
mature workflow tasks
!23
Acknowledgments
http://molmatinf.com
http://molsync.com
http://cheminf20.org
!
@aclarkxyz
• Sean Ekins, Barry Bunin
& CDD
• RSC & ChemSpider,
PubChem, ChEBI
• Inquiries to
info@molmatinf.com

Contenu connexe

Similaire à Cloud hosted APIs for cheminformatics on mobile devices (ACS Dallas 2014)

Alex Clark : NETTAB 2013
Alex Clark : NETTAB 2013Alex Clark : NETTAB 2013
Alex Clark : NETTAB 2013Alex Clark
 
CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchTom Connor
 
Practical cheminformatics workflows with mobile apps
Practical cheminformatics workflows with mobile appsPractical cheminformatics workflows with mobile apps
Practical cheminformatics workflows with mobile appsAlex Clark
 
Hadoop ecosystem for health/life sciences
Hadoop ecosystem for health/life sciencesHadoop ecosystem for health/life sciences
Hadoop ecosystem for health/life sciencesUri Laserson
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Lucas Jellema
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintrothomasrconnor
 
How and why you need to build a big data lab
How and why you need to build a big data labHow and why you need to build a big data lab
How and why you need to build a big data labChris Kernaghan
 
Deploying Big Data Platforms
Deploying Big Data PlatformsDeploying Big Data Platforms
Deploying Big Data PlatformsChris Kernaghan
 
Data Café — A Platform For Creating Biomedical Data Lakes
Data Café — A Platform For Creating Biomedical Data LakesData Café — A Platform For Creating Biomedical Data Lakes
Data Café — A Platform For Creating Biomedical Data LakesPradeeban Kathiravelu, Ph.D.
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksMapR Technologies
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017AWS Chicago
 
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013Alex Clark
 
(ATS4-DEV02) Accelrys Query Service: Technology and Tools
(ATS4-DEV02) Accelrys Query Service: Technology and Tools(ATS4-DEV02) Accelrys Query Service: Technology and Tools
(ATS4-DEV02) Accelrys Query Service: Technology and ToolsBIOVIA
 
CloudLab Overview
CloudLab OverviewCloudLab Overview
CloudLab OverviewEd Dodds
 
Reproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformaticsReproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformaticsSimon Cockell
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksDataWorks Summit
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game ChangerCaserta
 
20160331 sa introduction to big data pipelining berlin meetup 0.3
20160331 sa introduction to big data pipelining berlin meetup   0.320160331 sa introduction to big data pipelining berlin meetup   0.3
20160331 sa introduction to big data pipelining berlin meetup 0.3Simon Ambridge
 

Similaire à Cloud hosted APIs for cheminformatics on mobile devices (ACS Dallas 2014) (20)

Alex Clark : NETTAB 2013
Alex Clark : NETTAB 2013Alex Clark : NETTAB 2013
Alex Clark : NETTAB 2013
 
CLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB LaunchCLIMB System Introduction Talk - CLIMB Launch
CLIMB System Introduction Talk - CLIMB Launch
 
Practical cheminformatics workflows with mobile apps
Practical cheminformatics workflows with mobile appsPractical cheminformatics workflows with mobile apps
Practical cheminformatics workflows with mobile apps
 
Hadoop ecosystem for health/life sciences
Hadoop ecosystem for health/life sciencesHadoop ecosystem for health/life sciences
Hadoop ecosystem for health/life sciences
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
 
Unit ii sem-v-hadoop
Unit ii  sem-v-hadoopUnit ii  sem-v-hadoop
Unit ii sem-v-hadoop
 
Climb stateoftheartintro
Climb stateoftheartintroClimb stateoftheartintro
Climb stateoftheartintro
 
How and why you need to build a big data lab
How and why you need to build a big data labHow and why you need to build a big data lab
How and why you need to build a big data lab
 
Deploying Big Data Platforms
Deploying Big Data PlatformsDeploying Big Data Platforms
Deploying Big Data Platforms
 
Data Café — A Platform For Creating Biomedical Data Lakes
Data Café — A Platform For Creating Biomedical Data LakesData Café — A Platform For Creating Biomedical Data Lakes
Data Café — A Platform For Creating Biomedical Data Lakes
 
No sq lv1_0
No sq lv1_0No sq lv1_0
No sq lv1_0
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
 
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
Jeremy Engle's slides from Redshift / Big Data meetup on July 13, 2017
 
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013
 
(ATS4-DEV02) Accelrys Query Service: Technology and Tools
(ATS4-DEV02) Accelrys Query Service: Technology and Tools(ATS4-DEV02) Accelrys Query Service: Technology and Tools
(ATS4-DEV02) Accelrys Query Service: Technology and Tools
 
CloudLab Overview
CloudLab OverviewCloudLab Overview
CloudLab Overview
 
Reproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformaticsReproducibility - The myths and truths of pipeline bioinformatics
Reproducibility - The myths and truths of pipeline bioinformatics
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
 
5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer5 Things that Make Hadoop a Game Changer
5 Things that Make Hadoop a Game Changer
 
20160331 sa introduction to big data pipelining berlin meetup 0.3
20160331 sa introduction to big data pipelining berlin meetup   0.320160331 sa introduction to big data pipelining berlin meetup   0.3
20160331 sa introduction to big data pipelining berlin meetup 0.3
 

Plus de Alex Clark

Mixtures QSAR: modelling collections of chemicals
Mixtures QSAR: modelling collections of chemicalsMixtures QSAR: modelling collections of chemicals
Mixtures QSAR: modelling collections of chemicalsAlex Clark
 
Mixtures InChI: a story of how standards drive upstream products
Mixtures InChI: a story of how standards drive upstream productsMixtures InChI: a story of how standards drive upstream products
Mixtures InChI: a story of how standards drive upstream productsAlex Clark
 
Mixtures as first class citizens in the realm of informatics
Mixtures as first class citizens in the realm of informaticsMixtures as first class citizens in the realm of informatics
Mixtures as first class citizens in the realm of informaticsAlex Clark
 
Mixtures: informatics for formulations and consumer products
Mixtures: informatics for formulations and consumer productsMixtures: informatics for formulations and consumer products
Mixtures: informatics for formulations and consumer productsAlex Clark
 
Coordination InChI (2019)
Coordination InChI (2019)Coordination InChI (2019)
Coordination InChI (2019)Alex Clark
 
Chemical mixtures: File format, open source tools, example data, and mixtures...
Chemical mixtures: File format, open source tools, example data, and mixtures...Chemical mixtures: File format, open source tools, example data, and mixtures...
Chemical mixtures: File format, open source tools, example data, and mixtures...Alex Clark
 
Bringing bioassay protocols to the world of informatics, using semantic annot...
Bringing bioassay protocols to the world of informatics, using semantic annot...Bringing bioassay protocols to the world of informatics, using semantic annot...
Bringing bioassay protocols to the world of informatics, using semantic annot...Alex Clark
 
ACS CINF Luncheon talk (Boston 2018)
ACS CINF Luncheon talk (Boston 2018)ACS CINF Luncheon talk (Boston 2018)
ACS CINF Luncheon talk (Boston 2018)Alex Clark
 
Autonomous model building with a preponderance of well annotated assay protocols
Autonomous model building with a preponderance of well annotated assay protocolsAutonomous model building with a preponderance of well annotated assay protocols
Autonomous model building with a preponderance of well annotated assay protocolsAlex Clark
 
Representing molecules with minimalism: A solution to the entropy of informatics
Representing molecules with minimalism: A solution to the entropy of informaticsRepresenting molecules with minimalism: A solution to the entropy of informatics
Representing molecules with minimalism: A solution to the entropy of informaticsAlex Clark
 
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...Alex Clark
 
BioAssay Express
BioAssay ExpressBioAssay Express
BioAssay ExpressAlex Clark
 
SLAS2016: Why have one model when you could have thousands?
SLAS2016: Why have one model when you could have thousands?SLAS2016: Why have one model when you could have thousands?
SLAS2016: Why have one model when you could have thousands?Alex Clark
 
The anatomy of a chemical reaction: Dissection by machine learning algorithms
The anatomy of a chemical reaction: Dissection by machine learning algorithmsThe anatomy of a chemical reaction: Dissection by machine learning algorithms
The anatomy of a chemical reaction: Dissection by machine learning algorithmsAlex Clark
 
Compact models for compact devices: Visualisation of SAR using mobile apps
Compact models for compact devices: Visualisation of SAR using mobile appsCompact models for compact devices: Visualisation of SAR using mobile apps
Compact models for compact devices: Visualisation of SAR using mobile appsAlex Clark
 
Green chemistry in chemical reactions: informatics by design
Green chemistry in chemical reactions: informatics by designGreen chemistry in chemical reactions: informatics by design
Green chemistry in chemical reactions: informatics by designAlex Clark
 
ICCE 2014: The Green Lab Notebook
ICCE 2014: The Green Lab NotebookICCE 2014: The Green Lab Notebook
ICCE 2014: The Green Lab NotebookAlex Clark
 
Building a mobile reaction lab notebook (ACS Dallas 2014)
Building a mobile reaction lab notebook (ACS Dallas 2014)Building a mobile reaction lab notebook (ACS Dallas 2014)
Building a mobile reaction lab notebook (ACS Dallas 2014)Alex Clark
 
Open Drug Discovery Teams @ Hacking Health Montreal
Open Drug Discovery Teams @ Hacking Health MontrealOpen Drug Discovery Teams @ Hacking Health Montreal
Open Drug Discovery Teams @ Hacking Health MontrealAlex Clark
 
Pistoia Alliance App Strategy
Pistoia Alliance App StrategyPistoia Alliance App Strategy
Pistoia Alliance App StrategyAlex Clark
 

Plus de Alex Clark (20)

Mixtures QSAR: modelling collections of chemicals
Mixtures QSAR: modelling collections of chemicalsMixtures QSAR: modelling collections of chemicals
Mixtures QSAR: modelling collections of chemicals
 
Mixtures InChI: a story of how standards drive upstream products
Mixtures InChI: a story of how standards drive upstream productsMixtures InChI: a story of how standards drive upstream products
Mixtures InChI: a story of how standards drive upstream products
 
Mixtures as first class citizens in the realm of informatics
Mixtures as first class citizens in the realm of informaticsMixtures as first class citizens in the realm of informatics
Mixtures as first class citizens in the realm of informatics
 
Mixtures: informatics for formulations and consumer products
Mixtures: informatics for formulations and consumer productsMixtures: informatics for formulations and consumer products
Mixtures: informatics for formulations and consumer products
 
Coordination InChI (2019)
Coordination InChI (2019)Coordination InChI (2019)
Coordination InChI (2019)
 
Chemical mixtures: File format, open source tools, example data, and mixtures...
Chemical mixtures: File format, open source tools, example data, and mixtures...Chemical mixtures: File format, open source tools, example data, and mixtures...
Chemical mixtures: File format, open source tools, example data, and mixtures...
 
Bringing bioassay protocols to the world of informatics, using semantic annot...
Bringing bioassay protocols to the world of informatics, using semantic annot...Bringing bioassay protocols to the world of informatics, using semantic annot...
Bringing bioassay protocols to the world of informatics, using semantic annot...
 
ACS CINF Luncheon talk (Boston 2018)
ACS CINF Luncheon talk (Boston 2018)ACS CINF Luncheon talk (Boston 2018)
ACS CINF Luncheon talk (Boston 2018)
 
Autonomous model building with a preponderance of well annotated assay protocols
Autonomous model building with a preponderance of well annotated assay protocolsAutonomous model building with a preponderance of well annotated assay protocols
Autonomous model building with a preponderance of well annotated assay protocols
 
Representing molecules with minimalism: A solution to the entropy of informatics
Representing molecules with minimalism: A solution to the entropy of informaticsRepresenting molecules with minimalism: A solution to the entropy of informatics
Representing molecules with minimalism: A solution to the entropy of informatics
 
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...
 
BioAssay Express
BioAssay ExpressBioAssay Express
BioAssay Express
 
SLAS2016: Why have one model when you could have thousands?
SLAS2016: Why have one model when you could have thousands?SLAS2016: Why have one model when you could have thousands?
SLAS2016: Why have one model when you could have thousands?
 
The anatomy of a chemical reaction: Dissection by machine learning algorithms
The anatomy of a chemical reaction: Dissection by machine learning algorithmsThe anatomy of a chemical reaction: Dissection by machine learning algorithms
The anatomy of a chemical reaction: Dissection by machine learning algorithms
 
Compact models for compact devices: Visualisation of SAR using mobile apps
Compact models for compact devices: Visualisation of SAR using mobile appsCompact models for compact devices: Visualisation of SAR using mobile apps
Compact models for compact devices: Visualisation of SAR using mobile apps
 
Green chemistry in chemical reactions: informatics by design
Green chemistry in chemical reactions: informatics by designGreen chemistry in chemical reactions: informatics by design
Green chemistry in chemical reactions: informatics by design
 
ICCE 2014: The Green Lab Notebook
ICCE 2014: The Green Lab NotebookICCE 2014: The Green Lab Notebook
ICCE 2014: The Green Lab Notebook
 
Building a mobile reaction lab notebook (ACS Dallas 2014)
Building a mobile reaction lab notebook (ACS Dallas 2014)Building a mobile reaction lab notebook (ACS Dallas 2014)
Building a mobile reaction lab notebook (ACS Dallas 2014)
 
Open Drug Discovery Teams @ Hacking Health Montreal
Open Drug Discovery Teams @ Hacking Health MontrealOpen Drug Discovery Teams @ Hacking Health Montreal
Open Drug Discovery Teams @ Hacking Health Montreal
 
Pistoia Alliance App Strategy
Pistoia Alliance App StrategyPistoia Alliance App Strategy
Pistoia Alliance App Strategy
 

Dernier

Secrets of Divine Love - A Spiritual Journey into the Heart of Islam - A. Helwa
Secrets of Divine Love - A Spiritual Journey into the Heart of Islam - A. HelwaSecrets of Divine Love - A Spiritual Journey into the Heart of Islam - A. Helwa
Secrets of Divine Love - A Spiritual Journey into the Heart of Islam - A. HelwaNodd Nittong
 
Ayodhya Temple saw its first Big Navratri Festival!
Ayodhya Temple saw its first Big Navratri Festival!Ayodhya Temple saw its first Big Navratri Festival!
Ayodhya Temple saw its first Big Navratri Festival!All in One Trendz
 
PROPHECY-- The End Of My People Forever!
PROPHECY-- The End Of My People Forever!PROPHECY-- The End Of My People Forever!
PROPHECY-- The End Of My People Forever!spy7777777guy
 
Gangaur Celebrations 2024 - Rajasthani Sewa Samaj Karimnagar, Telangana State...
Gangaur Celebrations 2024 - Rajasthani Sewa Samaj Karimnagar, Telangana State...Gangaur Celebrations 2024 - Rajasthani Sewa Samaj Karimnagar, Telangana State...
Gangaur Celebrations 2024 - Rajasthani Sewa Samaj Karimnagar, Telangana State...INDIAN YOUTH SECURED ORGANISATION
 
PROPHECY-- The End Of My People Forever!
PROPHECY-- The End Of My People Forever!PROPHECY-- The End Of My People Forever!
PROPHECY-- The End Of My People Forever!spy7777777guy
 
A Tsunami Tragedy ~ Wise Reflections for Troubled Times (Eng. & Chi.).pptx
A Tsunami Tragedy ~ Wise Reflections for Troubled Times (Eng. & Chi.).pptxA Tsunami Tragedy ~ Wise Reflections for Troubled Times (Eng. & Chi.).pptx
A Tsunami Tragedy ~ Wise Reflections for Troubled Times (Eng. & Chi.).pptxOH TEIK BIN
 
"There are probably more Nobel Laureates who are people of faith than is gen...
 "There are probably more Nobel Laureates who are people of faith than is gen... "There are probably more Nobel Laureates who are people of faith than is gen...
"There are probably more Nobel Laureates who are people of faith than is gen...Steven Camilleri
 
Deerfoot Church of Christ Bulletin 3 31 24
Deerfoot Church of Christ Bulletin 3 31 24Deerfoot Church of Christ Bulletin 3 31 24
Deerfoot Church of Christ Bulletin 3 31 24deerfootcoc
 
Meaningful Pursuits: Pursuing Obedience_Ecclesiastes.pptx
Meaningful Pursuits: Pursuing Obedience_Ecclesiastes.pptxMeaningful Pursuits: Pursuing Obedience_Ecclesiastes.pptx
Meaningful Pursuits: Pursuing Obedience_Ecclesiastes.pptxStephen Palm
 
The King 'Great Goodness' Part 1 Mahasilava Jataka (Eng. & Chi.).pptx
The King 'Great Goodness' Part 1 Mahasilava Jataka (Eng. & Chi.).pptxThe King 'Great Goodness' Part 1 Mahasilava Jataka (Eng. & Chi.).pptx
The King 'Great Goodness' Part 1 Mahasilava Jataka (Eng. & Chi.).pptxOH TEIK BIN
 
Deerfoot Church of Christ Bulletin 4 14 24
Deerfoot Church of Christ Bulletin 4 14 24Deerfoot Church of Christ Bulletin 4 14 24
Deerfoot Church of Christ Bulletin 4 14 24deerfootcoc
 
The-Clear-Quran,-A-Thematic-English-Translation-by-Dr-Mustafa-Khattab.pdf
The-Clear-Quran,-A-Thematic-English-Translation-by-Dr-Mustafa-Khattab.pdfThe-Clear-Quran,-A-Thematic-English-Translation-by-Dr-Mustafa-Khattab.pdf
The-Clear-Quran,-A-Thematic-English-Translation-by-Dr-Mustafa-Khattab.pdfSana Khan
 
Prach Autism AI - Artificial Intelligence
Prach Autism AI - Artificial IntelligencePrach Autism AI - Artificial Intelligence
Prach Autism AI - Artificial Intelligenceprachaibot
 
A357 Hate can stir up strife, but love can cover up all mistakes. hate, love...
A357 Hate can stir up strife, but love can cover up all mistakes.  hate, love...A357 Hate can stir up strife, but love can cover up all mistakes.  hate, love...
A357 Hate can stir up strife, but love can cover up all mistakes. hate, love...franktsao4
 
Interesting facts about Hindu Mythology.pptx
Interesting facts about Hindu Mythology.pptxInteresting facts about Hindu Mythology.pptx
Interesting facts about Hindu Mythology.pptxaskganesha.com
 
empathy map for students very useful.pptx
empathy map for students very useful.pptxempathy map for students very useful.pptx
empathy map for students very useful.pptxGeorgePhilips7
 

Dernier (20)

Secrets of Divine Love - A Spiritual Journey into the Heart of Islam - A. Helwa
Secrets of Divine Love - A Spiritual Journey into the Heart of Islam - A. HelwaSecrets of Divine Love - A Spiritual Journey into the Heart of Islam - A. Helwa
Secrets of Divine Love - A Spiritual Journey into the Heart of Islam - A. Helwa
 
English - The Dangers of Wine Alcohol.pptx
English - The Dangers of Wine Alcohol.pptxEnglish - The Dangers of Wine Alcohol.pptx
English - The Dangers of Wine Alcohol.pptx
 
Ayodhya Temple saw its first Big Navratri Festival!
Ayodhya Temple saw its first Big Navratri Festival!Ayodhya Temple saw its first Big Navratri Festival!
Ayodhya Temple saw its first Big Navratri Festival!
 
PROPHECY-- The End Of My People Forever!
PROPHECY-- The End Of My People Forever!PROPHECY-- The End Of My People Forever!
PROPHECY-- The End Of My People Forever!
 
Gangaur Celebrations 2024 - Rajasthani Sewa Samaj Karimnagar, Telangana State...
Gangaur Celebrations 2024 - Rajasthani Sewa Samaj Karimnagar, Telangana State...Gangaur Celebrations 2024 - Rajasthani Sewa Samaj Karimnagar, Telangana State...
Gangaur Celebrations 2024 - Rajasthani Sewa Samaj Karimnagar, Telangana State...
 
English - The Psalms of King Solomon.pdf
English - The Psalms of King Solomon.pdfEnglish - The Psalms of King Solomon.pdf
English - The Psalms of King Solomon.pdf
 
PROPHECY-- The End Of My People Forever!
PROPHECY-- The End Of My People Forever!PROPHECY-- The End Of My People Forever!
PROPHECY-- The End Of My People Forever!
 
A Tsunami Tragedy ~ Wise Reflections for Troubled Times (Eng. & Chi.).pptx
A Tsunami Tragedy ~ Wise Reflections for Troubled Times (Eng. & Chi.).pptxA Tsunami Tragedy ~ Wise Reflections for Troubled Times (Eng. & Chi.).pptx
A Tsunami Tragedy ~ Wise Reflections for Troubled Times (Eng. & Chi.).pptx
 
"There are probably more Nobel Laureates who are people of faith than is gen...
 "There are probably more Nobel Laureates who are people of faith than is gen... "There are probably more Nobel Laureates who are people of faith than is gen...
"There are probably more Nobel Laureates who are people of faith than is gen...
 
The spiritual moderator of vincentian groups
The spiritual moderator of vincentian groupsThe spiritual moderator of vincentian groups
The spiritual moderator of vincentian groups
 
Deerfoot Church of Christ Bulletin 3 31 24
Deerfoot Church of Christ Bulletin 3 31 24Deerfoot Church of Christ Bulletin 3 31 24
Deerfoot Church of Christ Bulletin 3 31 24
 
Meaningful Pursuits: Pursuing Obedience_Ecclesiastes.pptx
Meaningful Pursuits: Pursuing Obedience_Ecclesiastes.pptxMeaningful Pursuits: Pursuing Obedience_Ecclesiastes.pptx
Meaningful Pursuits: Pursuing Obedience_Ecclesiastes.pptx
 
The King 'Great Goodness' Part 1 Mahasilava Jataka (Eng. & Chi.).pptx
The King 'Great Goodness' Part 1 Mahasilava Jataka (Eng. & Chi.).pptxThe King 'Great Goodness' Part 1 Mahasilava Jataka (Eng. & Chi.).pptx
The King 'Great Goodness' Part 1 Mahasilava Jataka (Eng. & Chi.).pptx
 
Top 8 Krishna Bhajan Lyrics in English.pdf
Top 8 Krishna Bhajan Lyrics in English.pdfTop 8 Krishna Bhajan Lyrics in English.pdf
Top 8 Krishna Bhajan Lyrics in English.pdf
 
Deerfoot Church of Christ Bulletin 4 14 24
Deerfoot Church of Christ Bulletin 4 14 24Deerfoot Church of Christ Bulletin 4 14 24
Deerfoot Church of Christ Bulletin 4 14 24
 
The-Clear-Quran,-A-Thematic-English-Translation-by-Dr-Mustafa-Khattab.pdf
The-Clear-Quran,-A-Thematic-English-Translation-by-Dr-Mustafa-Khattab.pdfThe-Clear-Quran,-A-Thematic-English-Translation-by-Dr-Mustafa-Khattab.pdf
The-Clear-Quran,-A-Thematic-English-Translation-by-Dr-Mustafa-Khattab.pdf
 
Prach Autism AI - Artificial Intelligence
Prach Autism AI - Artificial IntelligencePrach Autism AI - Artificial Intelligence
Prach Autism AI - Artificial Intelligence
 
A357 Hate can stir up strife, but love can cover up all mistakes. hate, love...
A357 Hate can stir up strife, but love can cover up all mistakes.  hate, love...A357 Hate can stir up strife, but love can cover up all mistakes.  hate, love...
A357 Hate can stir up strife, but love can cover up all mistakes. hate, love...
 
Interesting facts about Hindu Mythology.pptx
Interesting facts about Hindu Mythology.pptxInteresting facts about Hindu Mythology.pptx
Interesting facts about Hindu Mythology.pptx
 
empathy map for students very useful.pptx
empathy map for students very useful.pptxempathy map for students very useful.pptx
empathy map for students very useful.pptx
 

Cloud hosted APIs for cheminformatics on mobile devices (ACS Dallas 2014)

  • 1. Cloud hosted APIs for cheminformatics designed for real time user interfaces Alex M. Clark, Ph.D. March 2014 © 2014 Molecular Materials Informatics, Inc.! http://molmatinf.com
  • 2. MOLECULAR MATERIALS INFORMATICS Data Regimes • Differences in kind based on size: - small: <1000 molecules; document-sized - medium: <100K; filesystem, heavy duty - large: database servers; limited operations • Nimble client (mobile apps, web) either: - operate on small collections - limited window onto large collections • Workflows using medium data are tricky !2
  • 3. MOLECULAR MATERIALS INFORMATICS Overview • Describing a workflow for tuberculosis; doing scaffold analysis, model building, open data • Split into: - mobile apps as the user interface - cloud-hosted algorithms for hard work and access to large data - desktop-based sections for medium data • Mobile+cloud very convenient for small data, and for well established tasks • Desktop still primary for method development !3
  • 4. MOLECULAR MATERIALS INFORMATICS TB Mobile • Begins with a mobile app: - ~90 curated targets - ~ 800 molecules • TB inhibition data abundant, but mostly no target info • Want all the actives against the inhA target (157) • Generate leads using scaffold analysis !4
  • 5. MOLECULAR MATERIALS INFORMATICS TB Mobile • Begins with a mobile app: - ~90 curated targets - ~ 800 molecules • TB inhibition data abundant, but mostly no target info • Want all the actives against the inhA target (157) • Generate leads using scaffold analysis !4
  • 6. MOLECULAR MATERIALS INFORMATICS TB Mobile • Begins with a mobile app: - ~90 curated targets - ~ 800 molecules • TB inhibition data abundant, but mostly no target info • Want all the actives against the inhA target (157) • Generate leads using scaffold analysis !4
  • 7. MOLECULAR MATERIALS INFORMATICS TB Mobile • Begins with a mobile app: - ~90 curated targets - ~ 800 molecules • TB inhibition data abundant, but mostly no target info • Want all the actives against the inhA target (157) • Generate leads using scaffold analysis !4
  • 8. MOLECULAR MATERIALS INFORMATICS TB Mobile • Begins with a mobile app: - ~90 curated targets - ~ 800 molecules • TB inhibition data abundant, but mostly no target info • Want all the actives against the inhA target (157) • Generate leads using scaffold analysis !4
  • 9. MOLECULAR MATERIALS INFORMATICS Scaffold Fragments • What medicinally relevant scaffolds to use? !5 157 related compounds scaffoldy! fragments TB activity structures templatey! scaffolds …
  • 10. MOLECULAR MATERIALS INFORMATICS Filtering Scaffold Candidates • Candidates analysed & trimmed • Overall architecture is a stream !6 Read InhA Fragment Merge Sort PropertiesFilterWrite HeavyAtoms Isomorphisms Macrocycles Frequency 157 molecules 124 fragments
  • 11. MOLECULAR MATERIALS INFORMATICS Pipelining • Not quite cloud (yet) • Infrastructure for streaming nodes together: build workflows using a script • Roadmap: build selected workflows, out of prepackaged nodes • Expose as webservices: for use by mobile apps !7 { "op":"com.mmi.core.op.CollapseUnique", "id":102, "name":"Collapse", "parameters": { "keyColumn":"Molecule", "countColumn":"Degeneracy", "collapseColumn":["Target"], "collapseOperator":[","] }, "inputs":[[101,1]], "outputs":1 }, { "op":"com.mmi.core.op.Sort", "id":103, "name":"Collapse", "parameters": { "columns":["Degeneracy"], "directions":[-1] }, "inputs":[[102,1]], "outputs":1 }, { "op":"com.mmi.core.op.MoleculeProperties", "id":104, "name":"Properties", "parameters": { "heavyAtoms":"HeavyAtoms", "isomorphisms":"Isomorphisms", "macrocycles":"Macrocycles" }, "inputs":[[103,1]], "outputs":1 }, { "op":"com.mmi.core.op.FilterProperties", "id":105, "name":"Filter", "parameters": { "name":["HeavyAtoms","Isomorphisms","Macrocycles"], "operator":[">=","<=","="], "value":[10,4,0] }, "inputs":[[104,1]], "outputs":1 },
  • 12. MOLECULAR MATERIALS INFORMATICS Fragmentation • Consider each structure: break it into pieces, enumerate scaffold-like fragments !8
  • 13. MOLECULAR MATERIALS INFORMATICS Decorating • Have scaffoldy fragments, 5425 measurements !9 • Do a trial matching: templates & stats
  • 14. MOLECULAR MATERIALS INFORMATICS Scaffold Selection !10 Assays Filter 5425 molecules Templates Precursor • Keep molecules based on at least one template • Output is suitable for the next stage in the workflow 87 actives 138 inactives
  • 15. MOLECULAR MATERIALS INFORMATICS SAR Table App • Back to mobile apps: want to deliver the 225 compounds to iPad/iPhone… - email - dropbox - web • SAR Table app designed for small documents: content creation, focused analysis, and cloud- assisted functions !11
  • 16. MOLECULAR MATERIALS INFORMATICS Import • Launch datasheet, draw first scaffold… !12
  • 17. MOLECULAR MATERIALS INFORMATICS Import • Launch datasheet, draw first scaffold… !12
  • 18. MOLECULAR MATERIALS INFORMATICS Scaffold Assignment • Ask the webservice to assist: complex, fast !13
  • 19. MOLECULAR MATERIALS INFORMATICS Scaffold Assignment • Ask the webservice to assist: complex, fast !13
  • 20. MOLECULAR MATERIALS INFORMATICS Scaffold Assignment • Ask the webservice to assist: complex, fast !13
  • 21. MOLECULAR MATERIALS INFORMATICS Multi-Scaffold Assignment !14 • Assign scaffolds in bulk: complex, quite fast
  • 22. MOLECULAR MATERIALS INFORMATICS Multi-Scaffold Assignment !14 • Assign scaffolds in bulk: complex, quite fast
  • 23. MOLECULAR MATERIALS INFORMATICS Multi-Scaffold Assignment !14 • Assign scaffolds in bulk: complex, quite fast
  • 24. MOLECULAR MATERIALS INFORMATICS More Data • Have scaffolds and substituents assigned • Can gain valuable insight just from that • What about public databases: what else do our 3 scaffolds match? !15
  • 25. MOLECULAR MATERIALS INFORMATICS ChemSpider Searching • Search for a template; optionally narrow substituent values; want only new compounds !16 initiate MetaSearch poll • Substructure searches farmed out to well known large data services • Middleware post-processes with scaffold analysis & assignment PubChem
  • 26. MOLECULAR MATERIALS INFORMATICS Results • Results are marked up • Uses existing fragments for context • No duplicate structures • All compounds are known… • … can be made or purchased. !17
  • 27. MOLECULAR MATERIALS INFORMATICS Results • Results are marked up • Uses existing fragments for context • No duplicate structures • All compounds are known… • … can be made or purchased. !17
  • 28. MOLECULAR MATERIALS INFORMATICS Model Building • Use structures with known activities to create a structure-activity model !18 WebService data partial model final model • Slow calculation, small data
  • 29. MOLECULAR MATERIALS INFORMATICS Model Application • Predicted activities for looked-up compounds… !19
  • 30. MOLECULAR MATERIALS INFORMATICS Matrix View • Plot R1 vs R4: examine second order SAR !20
  • 31. MOLECULAR MATERIALS INFORMATICS Filling in Blanks • Each blank cell: create & score chimeric structures • Gather distribution of activities • Total calculation: slow • Performance: overhead amortised in blocks (e.g. 10 cells per request) !21
  • 32. MOLECULAR MATERIALS INFORMATICS Matrix Predictions • Shows measured, available & hypothetical… !22
  • 33. MOLECULAR MATERIALS INFORMATICS Conclusion • Mobile+cloud can accomplish many sophisticated tasks • Stateless webservices very easy to deploy • Work on small datasets, use large databases • Medium sized data is problematic • Can fallback to desktop: facile communication • Apps & webservices very well suited to mature workflow tasks !23
  • 34. Acknowledgments http://molmatinf.com http://molsync.com http://cheminf20.org ! @aclarkxyz • Sean Ekins, Barry Bunin & CDD • RSC & ChemSpider, PubChem, ChEBI • Inquiries to info@molmatinf.com