SlideShare une entreprise Scribd logo
1  sur  24
Télécharger pour lire hors ligne
Mobile + cloud: a viable
replacement for desktop
cheminformatics?

                                                 Dr. Alex M. Clark
                                                          April 2013



  © 2013 Molecular Materials Informatics, Inc.
    http://molmatinf.com
2


Two platform stacks
 desktop/             phone/
  laptop               tablet


    file                 web
  server              services



 compute                cloud
  cluster             resources
3


What's the point?
•   Mobile computing

    -   a phone or tablet can follow you just about anywhere

    -   almost all other software is migrating to mobile...

    -   ... should chemistry be the last reason to use a PC?

    -   it's time for an overhaul anyway (the 1980s are over)

•   Cloud computing

    -   economies of scale: pay as you go, as much as you need

    -   no air conditioned rooms full of servers

    -   no junk appliances under the desk

    -   no administrators
4


Mobile interface
   •   Less compute power: just enough for interactivity

   •   Limited storage: think of it as a cache for network

   •   Code portability: zero legacy code reuse

   •   Clumsy touchscreen: very tight screen real estate
       planning

   •   Different style: UI paradigms must be rethought
       from scratch

       -   e.g. conventional molecule sketchers don't work
           well...

       -   ... different set of concepts, learning new
           interface
5


         Cloud computation
•   Offloading heavy calculations to a webservice can bring mobile
    interfaces up to parity with traditional stack

•   Webservices either:

    -   stateless: short tasks, finished before HTTP timeout

    -   asynchronous: client maintains & resubmits state

    -   task state: temporary state for duration of task

    -   persistent state: login as a user to access content

•   Workflow based on short exchanges: split evenly for efficiency and
    user experience

•   Not easy to do, but worthwhile
6


                           Workflow
•   Ingredients:

    -   selected compounds + activity from the GSK malaria leads (248 compounds)

    -   three known scaffolds:




                             •       The SAR Table app

                                 -    running on an iPad

                                 -    calling upon cloud-based resources
7


                    Data importing


•   Download chemical structures from:

    -   email attachments

    -   arbitrary URL

    -   network filesystem (e.g. Dropbox)

•   Directly into SAR Table app

•   Passing small documents to/from apps
    is easy and effective
8


             Specify scaffold




•   Draw scaffold for first structure, select scaffold matching action
9


                  Scaffold matching
•   Produces all useful combinations of
    scaffold + R-groups

•   User selects from the degenerate
    possibilities

•   Defers to a webservice: molsync.com

•   Service properties:

    -   complex calculation

    -   short (<< 3 seconds)

    -   stateless

    -   small input, small output
10


              Multiple matching




•   Can try to match many scaffolds to many molecules

•   Similar webservice request (complex, fast, stateless, small data)

•   Iterative dance between app interface + web service:

    -   algorithm applies chemical structure logic, user resolves degeneracy
11


              Searching for more




•   Search for templates:

    -   ChEBI

    -   PubChem                     •       Service properties:
                                        -     complex labour-intensive
•   Intermediate webservice
                                        -     medium speed (< 3 minutes)
    layered on top of publicly
    available search capabilities       -     stateful reference data
                                        -     small input, small output
12


Template results

       •   Scaffold & R-group decomposition
           handled automatically by the meta-
           layer

       •   Now have numerous additional
           compounds in the table

       •   Duplicates excluded

       •   Additional compounds are all known:

            -    lookup recipe

            -    purchase from vendor

       •   Only thing missing is activity...
13


                Activity visualisation
•   Setup a scheme...

    -   Units

    -   Transform

    -   Colours

    -   Range




•   Customise for the data

•   All these leads quite active

•   Define 5.9 to 6.5 as the low activity range (red to yellow)

•   Define 6.5 to 8.1 as the high activity range (yellow to green)
14


Model building
      •   Have some data, but not all?

          -   No problem, build a model...

      •   Based on a generic method:
          descriptors based on substructures

      •   Webservice calls iterate over
          refinement steps

      •   Service properties:

          -   complex labour-intensive
              calculation

          -   medium speed (each cycle <
              10 seconds)

          -   state managed by client

          -   small input, small output
15


Model application

        •   The model is then applied to
            compounds with unknown activity

        •   Compounds with green wedges can
            be

            -   prepared/purchased

            -   measured

            -   real data included

        •   A measurements are added,
            predictivity improves
16


Matrix view

              •   Select axes, e.g.

                  -   R4

                  -   R5

              •   Distinct fragments
                  plotted on axes

              •   Activity summarised
                  by colour coding
17


Empty cells
       •   Can use the previously built model
           to estimate activity for empty cells

       •   The webservice creates
           hypothetical compounds for each
           cell:

           -   R4 and R5 are implied

           -   scaffold, R1, R2 and R3 are
               selected from currently known
               cases

       •   Service properties:

           -   complex labour-intensive
               calculation
           -   fast speed (< 30 seconds)
           -   batch state
           -   small input, small output
18


Proposing compounds
      •   Any of the cells that look good: tap on one,
          create a new partial specification...

      •   ... define the missing variables from the row
          view.
19


                     New compounds




•   Fill in missing substituents from a list
    of existing fragments

•   Heat-map colours denote activity
    distribution
20


                  Presenting data
•   Can output data, bitmaps for structures, and multi-page PDFs for the
    whole matrix view (arts'n'crafts)

•   More specialised output is possible:

    -   vector graphics (SVG, EPS)

    -   Microsoft Word & Excel documents with tables of vector graphics,
        designed for manuscript preparation

•   Tricky formatting done by webservice

    -   complex calculation

    -   fast speed (< 1 second)

    -   stateless

    -   small input, small output
21


                     Sharing data
•   Can pass the SAR Table document, or individual structures, to other
    apps via:

    -   open in

    -   clipboard




•   Send by email with attachments

•   Store/retrieve on remote filesystems (e.g. Dropbox via MolSync app)

•   Share on the web, by uploading to an open repository

•   Tweet the web link, directly from the app                 Molecular Informatics
                                                                vol. 31, p. 569
                                                                    (2012)
22


                    Conclusion

•   Major workflow chunks are already possible without using a PC

•   Mobile devices are powerful enough to drive a great user
    experience with complex data

•   Cloud-based infrastructure can boost functionality, that is hard or
    impossible to provide on the device

•   BUT notice the trend:

    -   small inputs, small outputs, restricted state

•   Apps are great for working on small documents that are easily
    passed around (app-to-app or via network)
23


                   Future work
•   The largest remaining problem is big data

    -   requires complex serverside infrastructure

    -   with security

    -   and interoperability

•   The Pistoia Alliance app strategy intends to address this



                Prognistication
•   Apps will become increasingly prevalent in chemistry

•   So will cloud computing
Acknowledgments
• Sean Ekins

• Antony Williams

• Evan Bolton

• The Pistoia Alliance   http://molmatinf.com
                         http://molsync.com
• Inquiries to           http://cheminf20.org
  info@molmatinf.com
                         @aclarkxyz

Contenu connexe

Tendances

Microsoft Openness Mongo DB
Microsoft Openness Mongo DBMicrosoft Openness Mongo DB
Microsoft Openness Mongo DB
Heriyadi Janwar
 
Hafslund SESAM - Semantic integration in practice
Hafslund SESAM - Semantic integration in practiceHafslund SESAM - Semantic integration in practice
Hafslund SESAM - Semantic integration in practice
Lars Marius Garshol
 
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld
 

Tendances (20)

Cloud computing
Cloud computingCloud computing
Cloud computing
 
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...
Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoo...
 
Microsoft Openness Mongo DB
Microsoft Openness Mongo DBMicrosoft Openness Mongo DB
Microsoft Openness Mongo DB
 
North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911North Bay Ruby Meetup 101911
North Bay Ruby Meetup 101911
 
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB  present...
MongoDB San Francisco 2013: Storing eBay's Media Metadata on MongoDB present...
 
Apache Tez - Accelerating Hadoop Data Processing
Apache Tez - Accelerating Hadoop Data ProcessingApache Tez - Accelerating Hadoop Data Processing
Apache Tez - Accelerating Hadoop Data Processing
 
Design, Scale and Performance of MapR's Distribution for Hadoop
Design, Scale and Performance of MapR's Distribution for HadoopDesign, Scale and Performance of MapR's Distribution for Hadoop
Design, Scale and Performance of MapR's Distribution for Hadoop
 
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
Hadoop World 2011: Next Generation Apache Hadoop MapReduce - Mohadev Konar, H...
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
 
Writing Scalable Software in Java
Writing Scalable Software in JavaWriting Scalable Software in Java
Writing Scalable Software in Java
 
Hafslund SESAM - Semantic integration in practice
Hafslund SESAM - Semantic integration in practiceHafslund SESAM - Semantic integration in practice
Hafslund SESAM - Semantic integration in practice
 
Hadoop World 2011: Apache HBase Road Map - Jonathan Gray - Facebook
Hadoop World 2011: Apache HBase Road Map - Jonathan Gray - FacebookHadoop World 2011: Apache HBase Road Map - Jonathan Gray - Facebook
Hadoop World 2011: Apache HBase Road Map - Jonathan Gray - Facebook
 
HBaseCon 2012 | Low Latency OLAP with HBase - Cosmin Lehene, Adobe
HBaseCon 2012 | Low Latency OLAP with HBase - Cosmin Lehene, AdobeHBaseCon 2012 | Low Latency OLAP with HBase - Cosmin Lehene, Adobe
HBaseCon 2012 | Low Latency OLAP with HBase - Cosmin Lehene, Adobe
 
Introduction to map reduce
Introduction to map reduceIntroduction to map reduce
Introduction to map reduce
 
NoSQL
NoSQLNoSQL
NoSQL
 
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
Hadoop World 2011: Hadoop Network and Compute Architecture Considerations - J...
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
 
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
VMworld 2013: Strategic Reasons for Classifying Workloads for Tier 1 Virtuali...
 
HBaseCon 2013: Compaction Improvements in Apache HBase
HBaseCon 2013: Compaction Improvements in Apache HBaseHBaseCon 2013: Compaction Improvements in Apache HBase
HBaseCon 2013: Compaction Improvements in Apache HBase
 
10c introduction
10c introduction10c introduction
10c introduction
 

En vedette (6)

Best apps of 2011
Best apps of 2011Best apps of 2011
Best apps of 2011
 
Rolling stone myxer announcement
Rolling stone myxer announcementRolling stone myxer announcement
Rolling stone myxer announcement
 
USERSTORY — сбор требований при помощи прототипов
USERSTORY — сбор требований при помощи прототиповUSERSTORY — сбор требований при помощи прототипов
USERSTORY — сбор требований при помощи прототипов
 
BioAssay Express
BioAssay ExpressBioAssay Express
BioAssay Express
 
My slideshare
My slideshareMy slideshare
My slideshare
 
SLAS2016: Why have one model when you could have thousands?
SLAS2016: Why have one model when you could have thousands?SLAS2016: Why have one model when you could have thousands?
SLAS2016: Why have one model when you could have thousands?
 

Similaire à Mobile+Cloud: a viable replacement for desktop cheminformatics?

Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Precisely
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
Kognitio
 

Similaire à Mobile+Cloud: a viable replacement for desktop cheminformatics? (20)

2014 09-12 lambda-architecture-at-indix
2014 09-12 lambda-architecture-at-indix2014 09-12 lambda-architecture-at-indix
2014 09-12 lambda-architecture-at-indix
 
Building FoundationDB
Building FoundationDBBuilding FoundationDB
Building FoundationDB
 
eHarmony in the Cloud
eHarmony in the CloudeHarmony in the Cloud
eHarmony in the Cloud
 
Introduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network IssuesIntroduction to Cloud Data Center and Network Issues
Introduction to Cloud Data Center and Network Issues
 
Distributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology OverviewDistributed Computing with Apache Hadoop: Technology Overview
Distributed Computing with Apache Hadoop: Technology Overview
 
Deploying Grid Services Using Apache Hadoop
Deploying Grid Services Using Apache HadoopDeploying Grid Services Using Apache Hadoop
Deploying Grid Services Using Apache Hadoop
 
Cloud infrastructure. Google File System and MapReduce - Andrii Vozniuk
Cloud infrastructure. Google File System and MapReduce - Andrii VozniukCloud infrastructure. Google File System and MapReduce - Andrii Vozniuk
Cloud infrastructure. Google File System and MapReduce - Andrii Vozniuk
 
Apache Tez : Accelerating Hadoop Query Processing
Apache Tez : Accelerating Hadoop Query ProcessingApache Tez : Accelerating Hadoop Query Processing
Apache Tez : Accelerating Hadoop Query Processing
 
ROMA User-Customizable NoSQL Database in Ruby
ROMA User-Customizable NoSQL Database in RubyROMA User-Customizable NoSQL Database in Ruby
ROMA User-Customizable NoSQL Database in Ruby
 
NOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the CloudNOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the Cloud
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
 
Kognitio overview jan 2013
Kognitio overview jan 2013Kognitio overview jan 2013
Kognitio overview jan 2013
 
Building real time data-driven products
Building real time data-driven productsBuilding real time data-driven products
Building real time data-driven products
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 
Using a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming AggregationsUsing a Fast Operational Database to Build Real-time Streaming Aggregations
Using a Fast Operational Database to Build Real-time Streaming Aggregations
 
Building a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data CloudBuilding a Front End for a Sensor Data Cloud
Building a Front End for a Sensor Data Cloud
 
Tez big datacamp-la-bikas_saha
Tez big datacamp-la-bikas_sahaTez big datacamp-la-bikas_saha
Tez big datacamp-la-bikas_saha
 
Scalable machine learning
Scalable machine learningScalable machine learning
Scalable machine learning
 

Plus de Alex Clark

Representing molecules with minimalism: A solution to the entropy of informatics
Representing molecules with minimalism: A solution to the entropy of informaticsRepresenting molecules with minimalism: A solution to the entropy of informatics
Representing molecules with minimalism: A solution to the entropy of informatics
Alex Clark
 

Plus de Alex Clark (20)

Mixtures QSAR: modelling collections of chemicals
Mixtures QSAR: modelling collections of chemicalsMixtures QSAR: modelling collections of chemicals
Mixtures QSAR: modelling collections of chemicals
 
Mixtures InChI: a story of how standards drive upstream products
Mixtures InChI: a story of how standards drive upstream productsMixtures InChI: a story of how standards drive upstream products
Mixtures InChI: a story of how standards drive upstream products
 
Mixtures as first class citizens in the realm of informatics
Mixtures as first class citizens in the realm of informaticsMixtures as first class citizens in the realm of informatics
Mixtures as first class citizens in the realm of informatics
 
Mixtures: informatics for formulations and consumer products
Mixtures: informatics for formulations and consumer productsMixtures: informatics for formulations and consumer products
Mixtures: informatics for formulations and consumer products
 
Coordination InChI (2019)
Coordination InChI (2019)Coordination InChI (2019)
Coordination InChI (2019)
 
Chemical mixtures: File format, open source tools, example data, and mixtures...
Chemical mixtures: File format, open source tools, example data, and mixtures...Chemical mixtures: File format, open source tools, example data, and mixtures...
Chemical mixtures: File format, open source tools, example data, and mixtures...
 
Bringing bioassay protocols to the world of informatics, using semantic annot...
Bringing bioassay protocols to the world of informatics, using semantic annot...Bringing bioassay protocols to the world of informatics, using semantic annot...
Bringing bioassay protocols to the world of informatics, using semantic annot...
 
ACS CINF Luncheon talk (Boston 2018)
ACS CINF Luncheon talk (Boston 2018)ACS CINF Luncheon talk (Boston 2018)
ACS CINF Luncheon talk (Boston 2018)
 
Autonomous model building with a preponderance of well annotated assay protocols
Autonomous model building with a preponderance of well annotated assay protocolsAutonomous model building with a preponderance of well annotated assay protocols
Autonomous model building with a preponderance of well annotated assay protocols
 
Representing molecules with minimalism: A solution to the entropy of informatics
Representing molecules with minimalism: A solution to the entropy of informaticsRepresenting molecules with minimalism: A solution to the entropy of informatics
Representing molecules with minimalism: A solution to the entropy of informatics
 
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...
CDD BioAssay Express: Expanding the target dimension: How to visualize a lot ...
 
The anatomy of a chemical reaction: Dissection by machine learning algorithms
The anatomy of a chemical reaction: Dissection by machine learning algorithmsThe anatomy of a chemical reaction: Dissection by machine learning algorithms
The anatomy of a chemical reaction: Dissection by machine learning algorithms
 
Compact models for compact devices: Visualisation of SAR using mobile apps
Compact models for compact devices: Visualisation of SAR using mobile appsCompact models for compact devices: Visualisation of SAR using mobile apps
Compact models for compact devices: Visualisation of SAR using mobile apps
 
Green chemistry in chemical reactions: informatics by design
Green chemistry in chemical reactions: informatics by designGreen chemistry in chemical reactions: informatics by design
Green chemistry in chemical reactions: informatics by design
 
ICCE 2014: The Green Lab Notebook
ICCE 2014: The Green Lab NotebookICCE 2014: The Green Lab Notebook
ICCE 2014: The Green Lab Notebook
 
Cloud hosted APIs for cheminformatics on mobile devices (ACS Dallas 2014)
Cloud hosted APIs for cheminformatics on mobile devices (ACS Dallas 2014)Cloud hosted APIs for cheminformatics on mobile devices (ACS Dallas 2014)
Cloud hosted APIs for cheminformatics on mobile devices (ACS Dallas 2014)
 
Building a mobile reaction lab notebook (ACS Dallas 2014)
Building a mobile reaction lab notebook (ACS Dallas 2014)Building a mobile reaction lab notebook (ACS Dallas 2014)
Building a mobile reaction lab notebook (ACS Dallas 2014)
 
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013
Reaction Lab Notebooks for Mobile Devices - Alex M. Clark - GDCh 2013
 
Alex Clark : NETTAB 2013
Alex Clark : NETTAB 2013Alex Clark : NETTAB 2013
Alex Clark : NETTAB 2013
 
Open Drug Discovery Teams @ Hacking Health Montreal
Open Drug Discovery Teams @ Hacking Health MontrealOpen Drug Discovery Teams @ Hacking Health Montreal
Open Drug Discovery Teams @ Hacking Health Montreal
 

Dernier

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Mobile+Cloud: a viable replacement for desktop cheminformatics?

  • 1. Mobile + cloud: a viable replacement for desktop cheminformatics? Dr. Alex M. Clark April 2013 © 2013 Molecular Materials Informatics, Inc. http://molmatinf.com
  • 2. 2 Two platform stacks desktop/ phone/ laptop tablet file web server services compute cloud cluster resources
  • 3. 3 What's the point? • Mobile computing - a phone or tablet can follow you just about anywhere - almost all other software is migrating to mobile... - ... should chemistry be the last reason to use a PC? - it's time for an overhaul anyway (the 1980s are over) • Cloud computing - economies of scale: pay as you go, as much as you need - no air conditioned rooms full of servers - no junk appliances under the desk - no administrators
  • 4. 4 Mobile interface • Less compute power: just enough for interactivity • Limited storage: think of it as a cache for network • Code portability: zero legacy code reuse • Clumsy touchscreen: very tight screen real estate planning • Different style: UI paradigms must be rethought from scratch - e.g. conventional molecule sketchers don't work well... - ... different set of concepts, learning new interface
  • 5. 5 Cloud computation • Offloading heavy calculations to a webservice can bring mobile interfaces up to parity with traditional stack • Webservices either: - stateless: short tasks, finished before HTTP timeout - asynchronous: client maintains & resubmits state - task state: temporary state for duration of task - persistent state: login as a user to access content • Workflow based on short exchanges: split evenly for efficiency and user experience • Not easy to do, but worthwhile
  • 6. 6 Workflow • Ingredients: - selected compounds + activity from the GSK malaria leads (248 compounds) - three known scaffolds: • The SAR Table app - running on an iPad - calling upon cloud-based resources
  • 7. 7 Data importing • Download chemical structures from: - email attachments - arbitrary URL - network filesystem (e.g. Dropbox) • Directly into SAR Table app • Passing small documents to/from apps is easy and effective
  • 8. 8 Specify scaffold • Draw scaffold for first structure, select scaffold matching action
  • 9. 9 Scaffold matching • Produces all useful combinations of scaffold + R-groups • User selects from the degenerate possibilities • Defers to a webservice: molsync.com • Service properties: - complex calculation - short (<< 3 seconds) - stateless - small input, small output
  • 10. 10 Multiple matching • Can try to match many scaffolds to many molecules • Similar webservice request (complex, fast, stateless, small data) • Iterative dance between app interface + web service: - algorithm applies chemical structure logic, user resolves degeneracy
  • 11. 11 Searching for more • Search for templates: - ChEBI - PubChem • Service properties: - complex labour-intensive • Intermediate webservice - medium speed (< 3 minutes) layered on top of publicly available search capabilities - stateful reference data - small input, small output
  • 12. 12 Template results • Scaffold & R-group decomposition handled automatically by the meta- layer • Now have numerous additional compounds in the table • Duplicates excluded • Additional compounds are all known: - lookup recipe - purchase from vendor • Only thing missing is activity...
  • 13. 13 Activity visualisation • Setup a scheme... - Units - Transform - Colours - Range • Customise for the data • All these leads quite active • Define 5.9 to 6.5 as the low activity range (red to yellow) • Define 6.5 to 8.1 as the high activity range (yellow to green)
  • 14. 14 Model building • Have some data, but not all? - No problem, build a model... • Based on a generic method: descriptors based on substructures • Webservice calls iterate over refinement steps • Service properties: - complex labour-intensive calculation - medium speed (each cycle < 10 seconds) - state managed by client - small input, small output
  • 15. 15 Model application • The model is then applied to compounds with unknown activity • Compounds with green wedges can be - prepared/purchased - measured - real data included • A measurements are added, predictivity improves
  • 16. 16 Matrix view • Select axes, e.g. - R4 - R5 • Distinct fragments plotted on axes • Activity summarised by colour coding
  • 17. 17 Empty cells • Can use the previously built model to estimate activity for empty cells • The webservice creates hypothetical compounds for each cell: - R4 and R5 are implied - scaffold, R1, R2 and R3 are selected from currently known cases • Service properties: - complex labour-intensive calculation - fast speed (< 30 seconds) - batch state - small input, small output
  • 18. 18 Proposing compounds • Any of the cells that look good: tap on one, create a new partial specification... • ... define the missing variables from the row view.
  • 19. 19 New compounds • Fill in missing substituents from a list of existing fragments • Heat-map colours denote activity distribution
  • 20. 20 Presenting data • Can output data, bitmaps for structures, and multi-page PDFs for the whole matrix view (arts'n'crafts) • More specialised output is possible: - vector graphics (SVG, EPS) - Microsoft Word & Excel documents with tables of vector graphics, designed for manuscript preparation • Tricky formatting done by webservice - complex calculation - fast speed (< 1 second) - stateless - small input, small output
  • 21. 21 Sharing data • Can pass the SAR Table document, or individual structures, to other apps via: - open in - clipboard • Send by email with attachments • Store/retrieve on remote filesystems (e.g. Dropbox via MolSync app) • Share on the web, by uploading to an open repository • Tweet the web link, directly from the app Molecular Informatics vol. 31, p. 569 (2012)
  • 22. 22 Conclusion • Major workflow chunks are already possible without using a PC • Mobile devices are powerful enough to drive a great user experience with complex data • Cloud-based infrastructure can boost functionality, that is hard or impossible to provide on the device • BUT notice the trend: - small inputs, small outputs, restricted state • Apps are great for working on small documents that are easily passed around (app-to-app or via network)
  • 23. 23 Future work • The largest remaining problem is big data - requires complex serverside infrastructure - with security - and interoperability • The Pistoia Alliance app strategy intends to address this Prognistication • Apps will become increasingly prevalent in chemistry • So will cloud computing
  • 24. Acknowledgments • Sean Ekins • Antony Williams • Evan Bolton • The Pistoia Alliance http://molmatinf.com http://molsync.com • Inquiries to http://cheminf20.org info@molmatinf.com @aclarkxyz