Tim Cheeseright, Mark Mackey, Rob Scoffin, Martin Slater

Assessing the similarity of compound collections
using molecular fields: Does it add value?



                                                                          1
Conclusions

> It works brilliantly
> All synthetic steps gave yields of 100%
> All enrichments were perfect
> All new molecules were sub nM
> All QSARs were totally predictive, q2 = 1.0


> We expect the call from Sweden any day now


                                                2
Conclusions

> Work in progress
> 3D similarity can add value to compound
  selection
> Full matrix of similarities possibly unnecessary
> Using probes looks like a possible solution
> Not a panacea




                                                     3
Agenda & Background

> Fields & similarity
> Generating screening compounds using Fields
> Selecting a 10K “diverse” library for screening
  from commercial compounds
   > Initial thoughts
   > Problems
   > More Initial thoughts
   > A solution but not a complete one
> Conclusions
                                                    4
Field Points

Condensed representation of electrostatic, hydrophobic
and shape properties (“protein's view”)
   > Molecular Field Extrema (“Field Points”)




   [Figure: one molecule shown as a 2D structure, as a 3D Molecular Electrostatic Potential (MEP), and as Field Points. Field Point types: Positive, Negative, Shape, Hydrophobic]
                                                                           5
Improved MM Electrostatics

> Field patterns from XED force field reproduce
  experimental results
   [Figure: interaction of Acetone and Any-OH from small molecule crystal structures, in three panels: Experimental, Using XEDs, Not using XEDs]

> XED adds ‘p-orbitals’ to get a better representation of atoms
                                                                   6
Non-Classical Comparisons




                            7
Molecular Alignment



   [Figure: three molecular alignments labelled 0.82, 0.66 and 0.98]

Cheeseright et al, J. Chem. Inf. Model., 2006, 665
                                                              8
Using Fields

>   Bioisosteric groups
>   Virtual Screening
>   Pharmacophore hypothesis
>   Qualitative SAR interpretation
>   3D QSAR
>   Library Design




                                     9
Field based library design success




                                     10
Libraries from Fields

> Small, custom synthesised libraries (~100s to
  1000s of compounds)
> Low scaffold diversity
> Highly targeted
> Lots of manual design




                                                 11
An Opportunity & a Challenge

> Provide a small, diverse 10K screening library for
  a small biotech company

  > Diversity in potential biological targets to be hit

  > Minimum redundancy in the set


  > Maximum chance of success in finding a lead within
    available budget and screening resources


                                                          12
Initial thoughts

> Customised design not an option - commercial
  compounds only
> Fields have been used successfully many times to
  select compounds for screening
   > Virtual screening
   > Always in a specific biological context
> What about using Fields to choose a ‘diverse’ set?
> Possible problem with numbers
   > A 10,000 compound library is small
   > 9,000,000 commercially available molecules is very large
     for a 3D diversity calculation

                                                              13
Initial thoughts

> Compare 3D and 2D similarities for compound
  collections - are we wasting our time?
> Take a small compound collection
> Full NxN calculation
> 3D method = Fields & Shape
> 2D method = atom pairs


> Compare and Contrast

                                                14
Conformations

> 3D Method requires conformations - which
  one(s) to use?
> What is the similarity of 2 compounds in 3D?
  > Context is important!
  > Highest across all conformations?
  > Average?
  > Lowest?
> For 3D, similarity calculation is Nconfs x Nconfs
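
The three candidate definitions above can be compared side by side. The sketch below is illustrative only: `toy_sim` and the numeric "conformers" are hypothetical stand-ins for a real Field and shape similarity, which the slides do not specify in code form. It scores every conformer pair (Nconfs x Nconfs) and reports the highest, average and lowest value.

```python
from itertools import product
from statistics import mean


def pair_scores(confs_a, confs_b, sim):
    """Score every conformer of A against every conformer of B."""
    return [sim(a, b) for a, b in product(confs_a, confs_b)]


def summarise(scores):
    """Collapse the conformer-pair scores into the three candidate values."""
    return {
        "highest": max(scores),
        "average": mean(scores),
        "lowest": min(scores),
    }


def toy_sim(a, b):
    """Hypothetical similarity: 1.0 for identical values, falling towards 0."""
    return 1.0 / (1.0 + abs(a - b))


# Each "conformer" is a single number here, purely for illustration.
mol_a = [0.0, 1.0, 3.0]
mol_b = [1.0, 4.0]

scores = pair_scores(mol_a, mol_b, toy_sim)
summary = summarise(scores)
print(len(scores), summary)
```

Which value to keep depends on context: the highest score answers "can these two molecules look alike?", while the average and lowest penalise flexible molecules.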


                                                      15
Compound Collection

> BIONET 'Rule of Three' ('Ro3') Fragment
  Library: “7,907 'Ro3'-compliant fragments”
> Conformation hunt on every fragment
   > Maximum of 5 conformations (!)
> Full N x N similarity matrix, 3D & 2D (~60 million
  data points)


> ~30 compounds failed conformation hunting
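
A quick check of the numbers on this slide, as a sketch. It assumes all 7,907 fragments are included (the ~30 failures reduce the totals slightly) and that every fragment has the maximum of 5 conformations, so the last figure is an upper bound. The function name `matrix_sizes` is illustrative.

```python
def matrix_sizes(n_compounds, max_confs):
    """Return (full matrix cells, unique pairs, max conformer comparisons)."""
    full_matrix = n_compounds ** 2
    # A symmetric matrix only needs the cells above the diagonal.
    unique_pairs = n_compounds * (n_compounds - 1) // 2
    # Each compound pair needs up to max_confs x max_confs comparisons in 3D.
    max_conf_comparisons = unique_pairs * max_confs ** 2
    return full_matrix, unique_pairs, max_conf_comparisons


full_matrix, unique_pairs, max_conf_comparisons = matrix_sizes(7907, 5)
print(f"{full_matrix:,} cells, {unique_pairs:,} unique pairs, "
      f"up to {max_conf_comparisons:,} conformer comparisons")
```

The full matrix has about 62.5 million cells, consistent with the "60 million data points" quoted above.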

                                                      17
Problems

> 400 MB of data
> Tedious to use and examine
> Pilot study just using the first 500 compounds
   > Some chemical families in this area
   > Still a large dataset to deal with (250,000 data points)
> 2D similarities and fragments
   > Small structural changes cause disproportionately large changes in similarity
   > Atom pairs particularly bad
   > Switch to KNIME fingerprints
   > All 2D values lower than ‘normal’

                                                                18
Comparing 2D and 3D metrics


                              Agreement




                                          19
Example - Similar Scores



                2D sim = 0.9
    101                              104


               3D field sim = 0.87




                                           22
Example - Higher 3D Sim



                 2D sim = 0.1
             (other methods=0.3)


               3D field sim = 0.82




                                     23
Example - Higher 3D Sim



               2D sim = 0.2

        141                   454




               3D sim = 0.7




                                    24
Example - Higher 3D Sim



                  2D sim = 0.3

              (other methods 0.55)
        437                           440




                 3D field sim = 0.8




                                            25
So…

> Pilot study suggests some added value
> Full matrix painful even if we could calculate it

> What about a reduced matrix?
   > Use ‘Probe’ compounds to tease out molecules that are
     different in Field space
   > How many probes?
   > Across how many molecules?


> We were running out of time…
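
The reduced-matrix idea can be sketched as follows. Each molecule is described by its similarities to a fixed set of probes, so N molecules need N x P comparisons rather than N x N. Everything named here is an assumption for illustration: `toy_sim` stands in for a Field similarity, molecules are plain numbers, and Euclidean distance between fingerprints is one possible way to compare them, not necessarily the one used in this work.

```python
import math


def toy_sim(a, b):
    """Hypothetical stand-in for a Field similarity in the range (0, 1]."""
    return 1.0 / (1.0 + abs(a - b))


def probe_fingerprint(mol, probes, sim):
    """Describe one molecule by its similarity to each probe."""
    return [sim(mol, probe) for probe in probes]


def fingerprint_distance(fp_a, fp_b):
    """Euclidean distance between two probe fingerprints."""
    return math.dist(fp_a, fp_b)


probes = [0.0, 5.0, 10.0]   # P = 3 probes
mols = [1.0, 1.5, 9.0]      # N = 3 molecules

fingerprints = [probe_fingerprint(m, probes, toy_sim) for m in mols]
d_close = fingerprint_distance(fingerprints[0], fingerprints[1])
d_far = fingerprint_distance(fingerprints[0], fingerprints[2])
print(round(d_close, 3), round(d_far, 3))
```

Molecules that see the probes in the same way end up close together in this space, which is what lets a small probe set stand in for the full matrix.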

                                                             26
Compound selection by Field Diversity

> Proposed workflow for generation of a field diverse library:
   > 9M commercial compounds
   > Property filters: 1.2M compounds
   > Calc. shape diversity by PMI
   > Probes: pick 200 sub-set, calc. 200 x 200 2D similarity matrix,
     pick 100 diverse Field probes
   > Library: pick 20K sub-set, calc. 20K x 100 Field similarity matrix
   > 3D PCA on Field matrix; pick 12K Field diverse set
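
The slides do not say how the 100 diverse probes were picked from the 200 x 200 2D similarity matrix. One common choice is MaxMin picking, sketched below as an assumption rather than the method actually used: start from a seed compound, then repeatedly add the candidate whose highest similarity to anything already picked is lowest. The function name and the 4 x 4 toy matrix are illustrative.

```python
def maxmin_pick(sim_matrix, n_pick, seed_index=0):
    """Greedy MaxMin selection on a square similarity matrix."""
    n = len(sim_matrix)
    if not 0 < n_pick <= n:
        raise ValueError("n_pick must be between 1 and the matrix size")

    picked = [seed_index]
    # nearest[i] is the highest similarity of compound i to any picked compound.
    nearest = list(sim_matrix[seed_index])

    while len(picked) < n_pick:
        candidates = [i for i in range(n) if i not in picked]
        # The most "different" candidate has the lowest nearest similarity.
        best = min(candidates, key=lambda i: nearest[i])
        picked.append(best)
        nearest = [max(nearest[i], sim_matrix[best][i]) for i in range(n)]

    return picked


# Toy matrix: compounds 0 and 1 are alike, as are compounds 2 and 3.
sim = [
    [1.00, 0.90, 0.20, 0.30],
    [0.90, 1.00, 0.25, 0.35],
    [0.20, 0.25, 1.00, 0.80],
    [0.30, 0.35, 0.80, 1.00],
]

two_probes = maxmin_pick(sim, 2)
three_probes = maxmin_pick(sim, 3)
print(two_probes, three_probes)
```

With two picks the selection takes one compound from each family, which is the behaviour wanted from a probe set.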
                                                                                            27
Field Diverse library: Outcome

12K ‘Field Diverse’ library mapped by 3D PCA on the
100 x 20,000 ‘Field Similarity Fingerprint’

> Distinct separation of charged species within this space
   > Ammoniums, piperidines
   > Benzoic and aliphatic acids

….so what!!



                                                                30
Field Diverse library: Outcome

12K ‘Field Diverse’ library mapped by 3D PCA

> Distinct separation of molecules by size within this space
   > Size decreases across the plot

….so what!!



                                                          31
Deeper - Moderate ‘Field Similarity’

                           Alignment to ‘template1’




                                                  32
Deeper - Moderate ‘Field Similarity’

Random selection of mols   Alignment to ‘template1’




                                                  33
Deeper - Moderate ‘Field Similarity’

                           Alignment to ‘template’




                                                     35
Is the chemical space sensible?

                                  Small sulphonamides




                                  Large esters




    Two example clusters                         36
Conclusions

> Work in progress
> Full similarity matrix shows potential of 3D sim to
  add value
> Full matrix difficult to handle and possibly
  unnecessary
> Using probes looks like a possible solution
> Not a panacea - still need to play the numbers
  game

                                                    37
Acknowledgements

> Cresset
  > Martin Slater
  > Rob Scoffin
  > Mark Mackey
  > James Melville
> Mission Therapeutics
  > Keith Menear




                         38

Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 

Tim Cheeseright, Assessing the Similarities of Compound collections using molecular fields: Does it add value?

  • 1. Tim Cheeseright, Mark Mackey, Rob Scoffin, Martin Slater: Assessing the similarity of compound collections using molecular fields: Does it add value?
  • 2. Conclusions > It works brilliantly > All synthetic steps gave yields of 100% > All enrichments were perfect > All new molecules were sub-nM > All QSARs were totally predictive, q² = 1.0 > We expect the call from Sweden any day now
  • 3. Conclusions > Work in progress > 3D similarity can add value to compound selection > Full matrix of similarities possibly unnecessary > Using probes looks like a possible solution > Not a panacea
  • 4. Agenda & Background > Fields & similarity > Generating screening compounds using Fields > Selecting a 10K "diverse" library for screening from commercial compounds > Initial thoughts > Problems > More initial thoughts > A solution but not a complete one > Conclusions
  • 5. Field Points > Condensed representation of electrostatic, hydrophobic and shape properties ("protein's view") > Molecular Field Extrema ("Field Points") > [Figure panels: 2D structure; 3D Molecular Electrostatic Potential (MEP); Field Points. Legend: positive, negative, shape, hydrophobic]
  • 6. Improved MM Electrostatics > Field patterns from XED force field reproduce experimental results > [Figure panels: Experimental; Using XEDs; Not using XEDs] > Interaction of acetone and any -OH from small-molecule crystal structures > XED adds 'p-orbitals' to get better representation of atoms
  • 8. Molecular Alignment > [Figure: example alignments with similarity scores 0.82, 0.66, 0.98] > Cheeseright et al., J. Chem. Inf. Model., 2006, 665
  • 9. Using Fields > Bioisosteric groups > Virtual Screening > Pharmacophore hypothesis > Qualitative SAR interpretation > 3D QSAR > Library Design
  • 10. Field based library design success
  • 11. Libraries from Fields > Small, custom synthesised libraries (~100s - 1000s of compounds) > Low scaffold diversity > Highly targeted > Lots of manual design
  • 12. An Opportunity & a Challenge > Provide a small, diverse 10K screening library for a small biotech company > Diversity in potential biological targets to be hit > Minimum redundancy in the set > Maximum chance of success in finding a lead within available budget and screening resources
  • 13. Initial thoughts > Customised design not an option: commercial compounds only > Using Fields to select compounds for screening has been done successfully many times > Virtual screening > Always in a specific biological context > What about using Fields to choose a 'diverse' set? > Possible problem with numbers > A 10,000-compound library is small > 9,000,000 commercially available molecules is very large for 3D diversity
  • 14. Initial thoughts > Compare 3D and 2D similarities for compound collections: are we wasting our time? > Take a small compound collection > Full N x N calculation > 3D method = Fields & Shape > 2D method = atom pairs > Compare and contrast
  • 15. Conformations > 3D method requires conformations: which one(s) to use? > What is the similarity of 2 compounds in 3D? > Context is important! > Highest across all conformations? > Average? > Lowest? > For 3D, similarity calculation is Nconfs x Nconfs
  • 16. Compound Collection > BIONET 'Rule of Three' ('Ro3') Fragment Library: "7,907 'Ro3'-compliant fragments" > Conformation hunt on every fragment > Maximum of 5 conformations (!) > Full N x N similarity matrix, 3D & 2D (60 million data points) > ~30 compounds failed conformation hunting
  • 17. Problems > 400 MB of data > Tedious to use and examine > Pilot study just using the first 500 compounds > Some chemical families in this area > Still a large dataset to deal with (250,000 data points) > 2D similarities and fragments > Small changes cause disproportionately high changes > Atom pairs particularly bad > Switch to KNIME fingerprints > All 2D values lower than 'normal'
  • 18. Comparing 2D and 3D metrics > [Plot annotation: Agreement]
  • 19. Example - Similar Scores > Compounds 101 and 104 > 2D sim = 0.9 > 3D field sim = 0.87
  • 20. Example - Higher 3D Sim > 2D sim = 0.1 (other methods = 0.3) > 3D field sim = 0.82
  • 21. Example - Higher 3D Sim > Compounds 141 and 454 > 2D sim = 0.2 > 3D sim = 0.7
  • 22. Example - Higher 3D Sim > Compounds 437 and 440 > 2D sim = 0.3 (other methods = 0.55) > 3D field sim = 0.8
  • 23. So… > Pilot study suggests some added value > Full matrix painful even if we could calculate it > What about a reduced matrix? > Use 'Probe' compounds to tease out molecules that are different in Field space > How many probes? > Across how many molecules? > We were running out of time…
  • 24. Compound selection by Field Diversity > Proposed workflow for generation of a field diverse library: 9M commercial compounds > Property filters > 1.2M > Calc. shape diversity by PMI > Pick 20K sub-set > Pick 200 sub-set > Calc. 200 x 200 2D similarity matrix > Pick 100 diverse Field probes > Calc. 20K x 100 Field similarity matrix > 3D PCA on Field matrix > Pick 12K Field diverse set
  • 25. Field Diverse library: Outcome > 12K 'Field Diverse' library mapped by 3D PCA on the 100 x 20,000 'Field Similarity Fingerprint' > [Plot labels: ammoniums; piperidines; benzoic and aliphatic acids] > Distinct separation of charged species within this space… so what!!
  • 26. Field Diverse library: Outcome > 12K 'Field Diverse' library mapped by 3D PCA > [Plot label: decreasing size] > Distinct separation of molecules by size within this space… so what!!
  • 27. Deeper - Moderate 'Field Similarity' > Alignment to 'template1'
  • 28. Deeper - Moderate 'Field Similarity' > Random selection of mols > Alignment to 'template1'
  • 29. Deeper - Moderate 'Field Similarity' > Alignment to 'template'
  • 30. Is the chemical space sensible? > Two example clusters: small sulphonamides; large esters
  • 31. Conclusions > Work in progress > Full similarity matrix shows potential of 3D sim to add value > Full matrix difficult to handle and possibly unnecessary > Using probes looks like a possible solution > Not a panacea: still need to play the numbers game
  • 32. Acknowledgements > Cresset > Martin Slater > Rob Scoffin > Mark Mackey > James Melville > Mission Therapeutics > Keith Menear
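
Slide 15 asks which number should stand for the 3D similarity of two compounds when each has several conformations. The sketch below is only an illustration of that question, not the method used in the talk: it scores every conformer pair (the Nconfs x Nconfs grid) and reports the highest, average and lowest values. The names `conformer_similarity_grid`, `summarise` and `toy_pair_similarity` are hypothetical, and the toy scoring function stands in for a real field or shape alignment score.

```python
from itertools import product
from statistics import mean


def toy_pair_similarity(conf_a: float, conf_b: float) -> float:
    """Hypothetical stand-in for a 3D alignment score between two conformers.

    Each conformer is reduced to a single number purely for illustration.
    """
    return 1.0 - abs(conf_a - conf_b)


def conformer_similarity_grid(confs_a, confs_b, pair_sim):
    """Score every conformer of A against every conformer of B (Nconfs x Nconfs)."""
    return [pair_sim(a, b) for a, b in product(confs_a, confs_b)]


def summarise(scores):
    """Collapse the grid into the three candidate answers raised on the slide."""
    return {
        "highest": max(scores),
        "average": mean(scores),
        "lowest": min(scores),
        "n_pairs": len(scores),
    }
```

With at most 5 conformations per fragment this is up to 25 pairwise scores per compound pair, and a full matrix over 7,907 fragments has 7,907 x 7,907 = 62,520,649 compound pairs, which is consistent with the "60 million data points" quoted on slide 16.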

Editor's notes

  1. Notes: The 2D drawing of a molecule gives limited information about its nature; in real life, molecules take on a 3D geometry that can't be truly represented by a flat cartoon. Consider the electrostatic potential surrounding a molecule and map that potential out to a surface, as shown in the second figure. Field Points are placed at the extrema of the MEP, with the point size governed by the size of the electrostatic contribution. Spatial points are also included at the van der Waals radii extrema.
  2. (1) Commercial databases (9 million compounds) filtered for heavy atom count > 11 and < 30, corresponding roughly to MW > 140 and < 500 (4,655,051 cpds).
     (2) Further filtered for rotatable bond count < 5; reactive group filters applied (removes nasties like aldehydes, ketones, hydrazones, alkyl halides, isocyanates, nitrosyl etc… see below for full list); charge filter of < 3 formal charges, negative or positive (1,282,042 mols passed these filters).
     (3) For this list of compounds we intend to calculate logP, HBA, HBD, PMI and shadow indices and select 20K on shape diversity. I believe this is going to be a reasonable approximation of field similarity, since fields are also heavily dependent on 3D conformation.
     (4) From this data we also intend to pick 100 probe molecules and use these to calculate similarity against the 20K set. This gives a 20K set, each compound with a 100 bit field fingerprint. This is the equivalent of completing a 2M virtual screen (20K x 100 comparisons).
     (5) This fingerprint can be subjected to a PCA analysis to reduce the data effectively to a 3-dimensional 'field space' from which a diverse 12K set can be chosen. From a practical point of view it will be difficult to expand this process to a bigger data set, although if 3D shape similarity correlates well with Field similarity then the PMI selection may be enough; we simply don't know until we do the experiment.
     (6) We will provide the 12K SD file set for you to purchase, with 2,000 cpd redundancy for those which are not available or too expensive etc.
     (a) Filtered on properties and nasty functionality to obtain a 1.2 million compound data set.
     (b) On this set we ran a PMI shape descriptor calculation on a single 'lowest energy' conformation for each molecule in the set.
     (c) From this we picked a 20K shape-diverse set using the PMI-defined shape space.
     (d) From the 20K set I picked a diverse 200 cpd set in the same way.
     (e) We calculated an all-by-all (200 x 200) 2D similarity matrix for these 200, so that we could ensure 2D dissimilarity in the choice of a set of 100 probe molecules.
     (f) These 100 probe molecules were used as templates to measure Field similarity against each of the 20K cpds and thus produce a 100 bit number for each of the 20K cpds.
     (g) We collapsed the ~20,000 x 100 Field similarity matrix to ~20,000 x 3 dimensions using PCA to define the 3D fieldspace.
     (h) 12K Field diverse compounds were selected from this 3D fieldspace.
  3. Theoretically, field-based metrics should be a good way to assess the similarity/diversity of fragment collections?? Diversity of fragment databases?? In Fieldstere
  4. Should probably have done a 200 x 200 field similarity at this stage to ensure picking field-diverse probes? But 2D dissimilarity also ensured we were avoiding picking too-similar chemotypes for the probes; probably doesn't matter. Theoretically, field-based metrics should be a good way to assess the similarity/diversity of fragment collections?? Diversity of fragment databases?? In Fieldstere. Never tried using a smaller number of probes; could increase/decrease discrimination?
  5. Picked a cluster set from the 3D PCA space; selected an arbitrary conformer, then flexibly aligned (Falign) the rest; plot surface. Bottom 8: Fsim less than 6.
  6. Picked a cluster set from the 3D PCA space; selected an arbitrary conformer, then flexibly aligned (Falign) the rest; plot surface. Bottom 8: Fsim less than 6.
  7. Again selected an arbitrary conformer (a different one this time), then flexibly aligned (Falign) the rest; plot surface. Bottom 5: Fsim less than 6.
  8. Picked a second cluster and repeated with another arbitrary template. Fsims all > 6; discarded 4 which were below 6. Cluster still OK. Conclude: even in this space, clusters of close field similarity are still fairly diverse!!
  9. Separation of chemically intuitive groupings: DHP-like esters/lactones … compact sulphonamides. Clusters on the periphery are truly Field dissimilar.
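
The probe idea in note 2, steps (f) to (h), can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' workflow: the similarity function is a toy stand-in for a real Field similarity score, the PCA step is omitted so selection happens directly in the probe-fingerprint space, and the greedy MaxMin picker is an assumed choice because the notes do not say which diversity selection algorithm was used. All function names are hypothetical.

```python
import math


def toy_similarity(mol_a, mol_b):
    """Hypothetical stand-in for a Field similarity score in (0, 1].

    Molecules are represented as coordinate tuples purely for illustration.
    """
    return 1.0 / (1.0 + math.dist(mol_a, mol_b))


def probe_fingerprint(mol, probes, sim):
    """Describe one molecule by its similarity to each probe (one value per probe)."""
    return [sim(mol, probe) for probe in probes]


def maxmin_pick(points, k, start=0):
    """Greedy MaxMin selection: keep adding the point farthest from those already picked.

    Assumed selection strategy; returns indices into `points`.
    """
    picked = [start]
    # Distance from every point to its nearest already-picked point.
    nearest = [math.dist(p, points[start]) for p in points]
    while len(picked) < min(k, len(points)):
        candidate = max(range(len(points)), key=nearest.__getitem__)
        if nearest[candidate] == 0.0:
            break  # every remaining point coincides with a picked one
        picked.append(candidate)
        nearest = [
            min(d, math.dist(p, points[candidate]))
            for d, p in zip(nearest, points)
        ]
    return picked
```

In use, each of the ~20K compounds would get a fingerprint from `probe_fingerprint` against the 100 probes, and `maxmin_pick` would then be run on the list of fingerprints (or on their first three principal components, as in the notes) to choose the diverse subset.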