A Web-Service Approach for Distributed Access to Methods, Data and Models
(I Don’t Care Where My Data and Methods Are)

Rajarshi Guha, Geoffrey Fox, Kevin E. Gilbert, Marlon Pierce, David Wild

School of Informatics
Indiana University

235th ACS National Meeting
6th March, 2008

Outline

    Overview
    Pub3D
    Model Exchange

Local versus Distributed Resources

    Access arbitrary resources (methods, data, applications)
    All resources look like function calls

Combining Data & Functionality

    But what do we mean by “combining” all these resources?
    Different levels of complexity
        Keep track of new additions to a database via an RSS feed
        Use Yahoo Pipes to combine and manipulate output easily
        Write full-fledged programs in your language of choice
    Web services allow us to support all these activities in a uniform manner
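
As a rough sketch of the simplest level listed above (tracking new additions to a database through an RSS feed), the following uses only the Python standard library. The feed content, the guid values and the `new_items` helper are all hypothetical; the slides do not give a feed URL or format.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for a database's "new additions" feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>New compounds (illustrative)</title>
    <item><title>CID 2244</title><guid>cid-2244</guid></item>
    <item><title>CID 3016</title><guid>cid-3016</guid></item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return titles of feed items whose guid has not been seen before."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid not in seen_guids:
            fresh.append(item.findtext("title"))
            seen_guids.add(guid)  # remember it so the next poll skips it
    return fresh


seen = {"cid-2244"}  # entries already processed on an earlier poll
print(new_items(SAMPLE_FEED, seen))
```

In practice the feed text would come from an HTTP GET against the real feed, and `seen` would be persisted between polls.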

Web Services - What’s Available

Cheminformatics functionality
    Molecular descriptors
    Similarity (2D, 3D)
    Format conversion
    Depictions
    3D coordinate generation
    Summarized on ChemBioGrid

Dong, X. et al., J. Chem. Inf. Model., 2007, 47, 1303–1307
http://www.chembiogrid.org/projects/proj_core.html

Web Services for Modeling

    Computational web services can be viewed as wrappers around the actual program
    Since predictive models are a common feature in cheminformatics we’d like to support them as well
    This leads to a number of requirements
        Ability to develop models
        Store (deploy) models
        Use the models via the web service infrastructure
    We provide a computational infrastructure based on R which supports
        Feature selection
        Model development
        Model deployment
        Arbitrary R code

Guha, R., J. Chem. Inf. Model., 2008, 48, 456–464

What’s Available?

    Regression (OLS, CNN, RF)
    Classification (LDA)
    Clustering (k-means)
    Feature selection (stepwise and exhaustive)
    Automated model generation
        Load X, Y data, build linear and non-linear models with optional LOO CV
    Deployed models
        Random forests for 60 NCI DTP cell lines
        Cytotoxicity
        Ames mutagenicity

Wang, H. et al., J. Chem. Inf. Model., 2007, 47, 2063–2076
Guha, R. and Schürer, S., J. Comp. Aid. Molec. Des., 2008, ASAP
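
To make the "linear model with optional LOO CV" step concrete, here is a minimal sketch in plain Python. It is not the R-based service described in the slides: it fits a single-descriptor OLS model and reports a leave-one-out q², and the X and Y values are made up for illustration.

```python
def fit_ols(xs, ys):
    """Least-squares fit of y = a + b*x for one descriptor; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q^2 = 1 - PRESS / total sum of squares."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model with observation i held out, then predict it.
        a, b = fit_ols(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    my = sum(ys) / len(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Made-up descriptor values (X) and property values (Y), for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

intercept, slope = fit_ols(X, Y)
print("intercept=%.3f slope=%.3f q2=%.3f" % (intercept, slope, loo_q2(X, Y)))
```

A q² close to 1 means the held-out predictions track the observed values; a value far below the fitted R² would suggest overfitting.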

Web Services - What’s Available

Data sources
    We maintain a number of databases
        PubChem mirror (mainly for local research)
        Pub3D
    Directly accessible via queries in SQL
    We also encapsulate specific queries as web service calls

http://www.chembiogrid.org/projects/proj_db.html

REST Frontends to SOAP Services

    REST is a network architecture that avoids complex message formats
        No extra libraries required
        Access services using URLs
        Much simpler interface compared to SOAP
    We’ve been putting REST interfaces onto a number of our SOAP services
    Current REST services include
        Database (3D structure, PubChem mirror)
        Molecular descriptors
        Depictions

REST 3D Structure Service

    To get a 3D structure for a PubChem CID

    http://www.chembiogrid.org/cheminfo/rest/db/pub3d/2244

REST Depiction Service

    To get a 2D depiction for arbitrary SMILES

    http://www.chembiogrid.org/cheminfo/rest/depict/C(=O)OCC

REST Descriptor Service

    To get descriptors for an arbitrary SMILES string

    http://www.chembiogrid.org/cheminfo/rest/desc/descriptors/CC(=O)OCCN
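
The three REST services above need nothing beyond an HTTP client, which a short Python sketch can illustrate. The URL patterns are copied from the slides, assuming the descriptor URL is a single URL wrapped across two lines. The helper names, the `encode` option and the `fetch` function are illustrative, and the service may no longer be reachable, so the example only builds and prints the URLs.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL taken from the slides; the service may no longer be reachable.
BASE = "http://www.chembiogrid.org/cheminfo/rest"


def _path_segment(smiles, encode):
    # Assumption: percent-encoding is optional. A SMILES containing '#' or '/'
    # would otherwise be read as URL syntax, but the slides do not say whether
    # the server decodes encoded path segments.
    return quote(smiles, safe="") if encode else smiles


def pub3d_url(cid):
    """URL for the 3D structure of a PubChem CID."""
    return "%s/db/pub3d/%d" % (BASE, cid)


def depict_url(smiles, encode=False):
    """URL for a 2D depiction of a SMILES string."""
    return "%s/depict/%s" % (BASE, _path_segment(smiles, encode))


def descriptors_url(smiles, encode=False):
    """URL for the descriptors of a SMILES string."""
    return "%s/desc/descriptors/%s" % (BASE, _path_segment(smiles, encode))


def fetch(url, timeout=30):
    """Perform the HTTP GET and return the raw response body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


print(pub3d_url(2244))
print(depict_url("C(=O)OCC"))
print(descriptors_url("CC(=O)OCCN"))
print(depict_url("C#N", encode=True))  # '#' is sent as %23
```

Calling `fetch(pub3d_url(2244))` would perform the actual request; the format of each response is not described in the slides beyond SD format for Pub3D structures.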

Pub3D

    A 3D structure database derived from PubChem
    Current version contains a single 3D structure for 17M compounds
        Structures obtained using MMFF94
        Not the lowest energy conformer
    Structures can be retrieved in SD format by CID using a web page or web service interfaces
    We also store distance moment shape descriptors allowing us to perform shape similarity searches
    http://www.chembiogrid.org/cheminfo/p3d/

Ballester, P.J. and Graham Richards, W., J. Comp. Chem., 2007, 28, 1711–1723
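
The distance moment shape descriptors can be sketched as follows, in the spirit of the Ballester and Richards reference: moments of the atomic distance distributions from four reference points, giving 12 numbers per structure. This is an illustration, not the Pub3D code. The choice of moments (mean, variance, third central moment) is an assumption, since normalisation differs between implementations, and the coordinates are made up.

```python
import math


def _dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _moments(values):
    """Mean, variance and third central moment of a list of distances."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return [mean, var, third]


def shape_descriptors(coords):
    """12 distance-moment descriptors from four reference points."""
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    d_ctd = [_dist(centroid, c) for c in coords]
    closest = coords[d_ctd.index(min(d_ctd))]    # atom closest to the centroid
    farthest = coords[d_ctd.index(max(d_ctd))]   # atom farthest from the centroid
    d_fct = [_dist(farthest, c) for c in coords]
    far_far = coords[d_fct.index(max(d_fct))]    # atom farthest from 'farthest'
    descriptors = []
    for ref in (centroid, closest, farthest, far_far):
        descriptors.extend(_moments([_dist(ref, c) for c in coords]))
    return descriptors


def shape_distance(a, b):
    """Mean absolute difference between two descriptor vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Made-up 3D coordinates (one tuple per atom), for illustration only.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.1, 1.2, 0.0),
          (3.4, 1.0, 0.8), (0.2, -1.1, 0.5)]
shifted = [(x + 10.0, y - 5.0, z + 3.0) for x, y, z in coords]

d1 = shape_descriptors(coords)
d2 = shape_descriptors(shifted)
print(len(d1), shape_distance(d1, d2))
```

Because the descriptors depend only on interatomic distances, translating or rotating a structure leaves them unchanged, so no alignment step is needed and a shape search becomes a nearest-neighbour query in a 12-dimensional space.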

Pub3D Performance

    Shape searches can be as fast as 5 s, for reasonably large result sets
    Fast enough for us to explore the “density of space” around a given query compound

    [Figure, left panel: Query Times for 10000 Random CIDs (R = 0.4); Number of Queries versus Time (sec)]
    [Figure, right panel: Nearest Neighbor Count versus Radius (0.0 to 0.6), with a Similarity axis (1, 0.91, 0.83, 0.77, 0.71, 0.67, 0.62); compounds shown: Levitra (110634), Diazepam (3016), Taxol (36314), Didanosine (50599), Calcascorbin (6247)]

Guha, R. et al., J. Chem. Inf. Model., 2006, 46, 1713–1722
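
The Similarity values on the right-hand plot (1, 0.91, 0.83, 0.77, 0.71, 0.67, 0.62) line up with the Radius ticks 0.0 to 0.6 under the mapping similarity = 1 / (1 + R). The slides do not state this formula, so the sketch below is an inference from the tick labels rather than the documented definition.

```python
def similarity_from_radius(radius):
    """Map a descriptor-space radius to a similarity in (0, 1]."""
    return 1.0 / (1.0 + radius)


for r in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6):
    print("R = %.1f  similarity = %.3f" % (r, similarity_from_radius(r)))
```

Under this reading, the R = 0.4 cutoff used for the query timings corresponds to a similarity of roughly 0.71.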

Pub3D & Clustering

    Indexing gives good performance
    As index size increases, performance degrades
    Could add more RAM
    Clustering the database allows us to scale to significantly larger collections
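
One way clustering can help a radius search scale is by skipping whole clusters that cannot contain a hit. The slides do not say how clusters are used in Pub3D, so the sketch below shows a generic triangle-inequality pruning scheme with made-up 2D points and pre-chosen cluster centers (for example, from a k-means run).

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_clusters(points, centers):
    """Assign each point to its nearest center and record each cluster's radius."""
    clusters = [{"center": c, "radius": 0.0, "members": []} for c in centers]
    for p in points:
        best = min(clusters, key=lambda cl: dist(p, cl["center"]))
        best["members"].append(p)
        best["radius"] = max(best["radius"], dist(p, best["center"]))
    return clusters


def radius_search(query, r, clusters):
    """Return (hits within r of the query, number of clusters scanned)."""
    hits, scanned = [], 0
    for cl in clusters:
        # Triangle inequality: no member can lie within r of the query if the
        # center is farther away than r plus the cluster radius.
        if dist(query, cl["center"]) > r + cl["radius"]:
            continue
        scanned += 1
        hits.extend(p for p in cl["members"] if dist(query, p) <= r)
    return hits, scanned


# Made-up descriptor vectors (2D for readability) and cluster centers.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0),
          (5.1, 4.9), (4.8, 5.2), (9.0, 0.0), (9.2, 0.1)]
centers = [(0.1, 0.1), (5.0, 5.0), (9.1, 0.0)]

clusters = build_clusters(points, centers)
hits, scanned = radius_search((0.15, 0.15), 0.3, clusters)
print(sorted(hits), "clusters scanned:", scanned, "of", len(clusters))
```

The search returns the same hits as a brute-force scan while only touching clusters near the query; with many conformers per compound, the fraction of data skipped is what keeps query times manageable.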

Pub3D and Conformers

    Single, arbitrary conformers aren’t too useful
    We have currently generated conformers for a subset of PubChem (4 to 10 heavy atoms)
        3 kcal energy window
        243,892 compounds
        ≈ 2M conformers
    Clustering will be vital when conformers are considered
        Allows us to handle arbitrary numbers of conformers
        May need to consider some sort of compression
        Could use cluster information to optimize queries

Model Exchange

    The literature contains many published models
    No way to utilize them unless we manually rebuild them
        In many cases we will not have access to descriptors
    Difficult to search for models specifically
    We should be able to do the following
        Search for models
        Exchange them
        Execute them
    Is this a “format” issue?
        To some extent yes

PMML for Model Exchange

    Predictive Model Markup Language
    A standard supported by the Data Mining Group
    Allows you to serialize predictive models to XML
        Linear, logistic regression
        Tree models
        Neural networks, Naïve Bayes models
        Association models
        Ensemble models (random forests, arbitrary ensembles)
    Supported by a number of platforms
        IBM, Salford Systems
        SAS, SPSS
        R
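
To give a feel for what serializing a model to XML looks like, the sketch below writes a linear regression as a PMML-style `RegressionModel` and then evaluates it from the XML alone. The element and attribute names follow the PMML regression schema as far as the author of this sketch is aware; the version number, the omission of the namespace declaration, and the descriptor names `x1`, `x2` and `activity` are simplifications chosen for illustration.

```python
import xml.etree.ElementTree as ET


def regression_to_pmml(intercept, coefficients, target):
    """Serialize y = intercept + sum(coef * x) as a PMML-style RegressionModel."""
    # Version chosen for illustration; the xmlns declaration is omitted for brevity.
    pmml = ET.Element("PMML", version="3.2")
    ET.SubElement(pmml, "Header", description="Illustrative linear model")
    dictionary = ET.SubElement(pmml, "DataDictionary")
    model = ET.SubElement(pmml, "RegressionModel", functionName="regression")
    schema = ET.SubElement(model, "MiningSchema")
    table = ET.SubElement(model, "RegressionTable", intercept=str(intercept))
    for name, coef in coefficients.items():
        ET.SubElement(dictionary, "DataField", name=name,
                      optype="continuous", dataType="double")
        ET.SubElement(schema, "MiningField", name=name, usageType="active")
        ET.SubElement(table, "NumericPredictor", name=name, coefficient=str(coef))
    ET.SubElement(dictionary, "DataField", name=target,
                  optype="continuous", dataType="double")
    ET.SubElement(schema, "MiningField", name=target, usageType="predicted")
    return pmml


def predict_from_pmml(pmml_text, inputs):
    """Evaluate the regression using only the information stored in the XML."""
    table = ET.fromstring(pmml_text).find("RegressionModel/RegressionTable")
    value = float(table.get("intercept"))
    for term in table.findall("NumericPredictor"):
        value += float(term.get("coefficient")) * inputs[term.get("name")]
    return value


doc = regression_to_pmml(1.0, {"x1": 2.0, "x2": -0.5}, "activity")
text = ET.tostring(doc, encoding="unicode")
print(text)
print(predict_from_pmml(text, {"x1": 3.0, "x2": 4.0}))
```

The second function is the point of the exercise: a recipient who has only the XML, and a way to compute the named inputs, can execute the model without rebuilding it.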

PMML for Model Exchange

    Since it’s XML, it’s extensible
    PMML allows us to specify the model itself
    But we need to add extra information to make a model truly searchable
        Provenance (who made it, when was it made, ...)
        Property (what is being modeled)
        Requirements (what are the inputs, how to get them)

PMML for Model Exchange

Provenance
    Easily solved using Dublin Core

Property
    Could be addressed using keywords
    Could reuse pre-existing ontologies

Requirements
    This is tricky
    Need to identify what descriptors were used
    What software, version
    How to evaluate the descriptors (if possible)
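
The three kinds of metadata above could be attached to a model roughly as follows. This is a sketch under stated assumptions, not a proposed standard: it places Dublin Core elements (`creator`, `date`, `subject`) inside a PMML `Extension` element, while the `Requirements` and `Descriptor` element names, the extension name and all metadata values are hypothetical.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
ET.register_namespace("dc", DC)


def annotate(model, creator, date, subject, descriptors, software):
    """Attach provenance, property and requirements metadata to a model element."""
    ext = ET.Element("Extension", name="model-metadata")  # hypothetical name
    ET.SubElement(ext, "{%s}creator" % DC).text = creator  # provenance
    ET.SubElement(ext, "{%s}date" % DC).text = date        # provenance
    ET.SubElement(ext, "{%s}subject" % DC).text = subject  # property keyword
    # Hypothetical elements for the requirements; not part of any standard.
    reqs = ET.SubElement(ext, "Requirements", software=software)
    for name in descriptors:
        ET.SubElement(reqs, "Descriptor", name=name)
    model.insert(0, ext)  # keep the extension as the first child of the model
    return ext


def matches_subject(model, keyword):
    """Keyword search over the modeled property."""
    subject = model.findtext("Extension/{%s}subject" % DC, default="")
    return keyword.lower() in subject.lower()


# All values below are placeholders for illustration.
model = ET.Element("RegressionModel", functionName="regression")
annotate(model, creator="A. Modeler", date="2008-03-06",
         subject="aqueous solubility", descriptors=["desc_a", "desc_b"],
         software="ExampleDescriptorTool 1.0")

print(ET.tostring(model, encoding="unicode"))
print(matches_subject(model, "solubility"))
```

Provenance and property map onto existing vocabularies fairly directly. The requirements block is where the difficulty noted on the slide shows up: a descriptor name and software version identify what was used, but do not by themselves tell a recipient how to compute it.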
A Web-Service
PMML for Model Exchange                               Approach for
                                                       Distributed
                                                       Access to
                                                     Methods, Data
Provenance                                            and Models

                                                     Rajarshi Guha
                                                     Geoffrey Fox
    Easily solved using Dublin Core                 Kevin E. Gilbert
                                                     Marlon Pierce
                                                      David Wild

Property                                            Overview

                                                    Pub3D
    Could be addressed using keywords
                                                    Model Exchange
    Could reuse pre-existing ontologies

Requirements
    This is tricky
    Need to identify what descriptors were used
    What software, version
    How to evaluate the descriptors (if possible)
Summary

   Pub3D is a shape-searchable version of PubChem
   Conformers will have to be considered for it to be useful
   Are searches meaningful? Benchmarks required

   Web services provide one approach to model deployment
   We should be able to search for models explicitly
   PMML is one extensible solution that addresses model
   exchange
   Automated model execution is more challenging
Acknowledgments

   Kangseok Kim
   NIH
