Computational Chemistry Robots
ACS Sep 2005
J. A. Townsend, P. Murray-Rust, S. M. Tyrrell, Y. Zhang
Aspects of complete automation
Approaches to conformance
The overall view: molecules, computation, dissemination; check results
Components of System
Computing the NCI database: MOPAC PM5 (a)
(a) MOPAC PM5: collaboration with J.J.P. Stewart
Protocol Log Files Parse System Crashes Science Errors Analysis Pathological Behaviour Statistics Other Science Disseminate Results Unsuitable Data Program Crashes Inform Developer
Taverna
An Example Taverna Workflow
Parsing Log Files to CML
Items extracted from computational chemistry log files: coordinates, molecular formula, calculation type, point group, dipole, total energy
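The log-to-CML step can be illustrated with a minimal sketch. It assumes a hypothetical log format in which each atom is reported on a line such as "ATOM C 0.000 0.000 0.000"; real MOPAC and GAMESS logs look different, and this is not the parser used in the talk. The CML element and attribute names (molecule, atomArray, atom, elementType, x3, y3, z3) follow common CML usage, and namespaces are omitted for brevity.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical log line: "ATOM <symbol> <x> <y> <z>". Real program output differs.
ATOM_LINE = re.compile(
    r"^ATOM\s+([A-Z][a-z]?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$"
)

def log_to_cml(log_text):
    """Collect atom lines from the log text and return a CML-style XML string."""
    molecule = ET.Element("molecule")
    atom_array = ET.SubElement(molecule, "atomArray")
    for line in log_text.splitlines():
        match = ATOM_LINE.match(line.strip())
        if match is None:
            continue  # ignore every line that is not an atom record
        element, x, y, z = match.groups()
        serial = len(atom_array) + 1  # number atoms in order of appearance
        ET.SubElement(atom_array, "atom", id=f"a{serial}",
                      elementType=element, x3=x, y3=y, z3=z)
    return ET.tostring(molecule, encoding="unicode")

example_log = "job header\nATOM C 0.000 0.000 0.000\nATOM O 1.200 0.000 0.000\n"
print(log_to_cml(example_log))
```

Keeping the coordinates as the strings found in the log avoids changing their precision during conversion.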
CompChem Output Coordinates Energy Levels Vibrations Coordinates Energy Level Vibration CML File CMLCore CMLCore CMLComp CMLSpect Input/jobControl General Parsers
Dissemination of results
LOG FILE, CML FILE, HUMAN DISPLAY, WWMM* Server and DSpace, Outside world
JUMBOMarker, NLP-based log file parser
* World Wide Molecular Matrix
InChI: IUPAC International Chemical Identifier
Proteus molecules*: Calculation, JUNK, Cured by MOPAC
* Proteus was a shape-changing ocean deity
Proteus molecules Calculation Input     JUNK
How do we know our results are valid?
Computational Method 1, Computational Method 2, Experiment
J.J.P. Stewart’s example: Calculated ΔHf – Expt ΔHf
GAMESS
MOPAC results, GAMESS (a), 6-31G* B3LYP, log files
(a) Project with Kim Baldridge and Wibke Sudholt
Protocol Log Files Parse System Crashes Science Errors Analysis Pathological Behaviour Statistics Other Science Disseminate Results Unsuitable Data Program Crashes Inform Developer
Repeat runs, different methods
Multiple runs give the same final structure from the same input.
Changing the memory allocation makes no difference.
Pathological behaviour - Early detection
Run times shown: 100 min, 200 min, 15 min, 10080 min (6-31G*, B3LYP)
Molecules shown: divinyl ether, trans-crotonaldehyde (Z matrix)
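One simple way to flag pathological jobs is to compare each run time with the typical run time. The sketch below is an assumption rather than the method used in the talk: it flags any job that takes more than a chosen multiple of the median. The job names are hypothetical; the four times are the ones quoted on the slide.

```python
from statistics import median

def flag_slow_jobs(minutes_by_job, factor=10.0):
    """Return the names of jobs whose run time exceeds factor x the median."""
    typical = median(minutes_by_job.values())
    return sorted(name for name, minutes in minutes_by_job.items()
                  if minutes > factor * typical)

# Hypothetical job names paired with the run times (in minutes) from the slide.
run_times = {"job_a": 100, "job_b": 200, "job_c": 15, "job_d": 10080}
print(flag_slow_jobs(run_times))
```

The median is used instead of the mean so that one very long job does not inflate the baseline it is compared against.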
Times to run jobs
Analysis of different computational methods
Mean - Overall difference
Normality - Distribution of values
Outliers - Unusual molecules?
Variance - Spread of the data, depends on both distributions (standard deviation)
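These summary quantities can be sketched for a list of bond-length differences as follows. The function name, the example data and the 3 S.D. cut-off are illustrative assumptions and are not taken from the talk.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(differences, k=3.0):
    """Mean, sample S.D., standard error of the mean, and values beyond k S.D."""
    m = mean(differences)
    s = stdev(differences)             # sample standard deviation (n - 1)
    sem = s / sqrt(len(differences))   # standard error of the mean
    outliers = [d for d in differences if abs(d - m) > k * s]
    return {"mean": m, "sd": s, "sem": sem, "outliers": outliers}

# Hypothetical differences in Å: nineteen in exact agreement and one far off.
data = [0.0] * 19 + [0.5]
print(summarize(data))
```

Because the mean and S.D. are themselves pulled by outliers, a fixed cut-off of this kind is only a first screen; a probability plot gives a clearer picture of where the distribution stops being normal.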
Probability Plot (Normal QQ plot)
Annotations: mean of distribution (Approx - 0.03 Å); S.D. 0.020 Å; range over which the sample distribution is approximately normal; outliers
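The points of a normal QQ plot can be computed by pairing the sorted sample with the quantiles of a standard normal distribution. The sketch below uses the plotting positions (i + 0.5) / n, which is one common choice and an assumption here; statistical packages may use slightly different positions.

```python
from statistics import NormalDist

def normal_qq_points(sample):
    """Pair each sorted sample value with its theoretical normal quantile."""
    n = len(sample)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(sample)))

# Hypothetical differences in Å.
for quantile, value in normal_qq_points([0.01, -0.02, 0.00, 0.03, -0.01]):
    print(round(quantile, 3), value)
```

For a normally distributed sample the points fall close to a straight line whose intercept estimates the mean and whose slope estimates the standard deviation; points that leave the line at either end are the candidate outliers.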
All bonds*: Δr (MOPAC – GAMESS) / Å
Good agreement; nearly normal; outliers; S.D. 0.005 Å
* Excludes bonds to hydrogen
Bad molecules and data usually cause outliers
[Structure drawing; labels: Na, P, O, O, H, H, charge 2-]
Mean Δr (M - G) / Å, with the standard error of the mean / Å in parentheses. All values given to 3 decimal places.

      C               N               O               F               S               Cl
C     -0.006 (0.000)  0.020 (0.000)   -0.010 (0.000)  -0.014 (0.001)  -0.040 (0.001)  -0.037 (0.001)
N                     0.006 (0.001)   -0.037 (0.001)                  -0.055 (0.009)
O                                     -0.087 (0.004)                  -0.070 (0.014)

Δr CC bonds (M - G) / Å
Good agreement; nearly normal; outliers; S.D. 0.013 Å; JUNK
Selection of molecules with CC Δr (M - G) > 0.05 Å
Y = 0.0277 X – 0.0061
Non-aromatic CC bonds adjacent to CFn
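The fitted line can be evaluated directly. The sketch below assumes, as the slide title suggests but does not state, that X counts the fluorine atoms n on the adjacent carbon and that Y is the CC Δr (M - G) in Å; both readings are assumptions, and only the slope and intercept come from the slide.

```python
SLOPE = 0.0277       # from the slide
INTERCEPT = -0.0061  # from the slide

def fitted_value(x):
    """Evaluate Y = 0.0277 X - 0.0061 at the given X."""
    return SLOPE * x + INTERCEPT

# Assumed interpretation: x is the number of fluorine atoms, n = 1, 2, 3.
for n in (1, 2, 3):
    print(n, round(fitted_value(n), 4))
```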
Δr NN bonds (M - G) / Å
Good agreement; nearly normal; kink; S.D. 0.022 Å
Density plot of Δr NN bonds (M - G) / Å, divided into LEFT and RIGHT sets
Most common fragments found in Left set but not Right set
[Fragment drawings; atom labels: C(sp3), C(sp3), (sp3), S(sp2), N(ar), N(ar), C(sp2), S(sp2), N(ar), N(ar), C(sp2); alternatives joined by "or"]
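A comparison of this kind can be sketched as a counting exercise: list the fragments of every molecule in each set, discard any fragment that occurs in the Right set, and rank the rest by how many Left molecules contain them. The fragment labels below are hypothetical strings, and the talk does not say how fragments were generated or compared.

```python
from collections import Counter

def fragments_only_in_left(left, right):
    """Rank fragments seen in the left set and never in the right set.

    Each set is a list of molecules; each molecule is a list of fragment labels.
    """
    seen_in_right = {fragment for molecule in right for fragment in molecule}
    counts = Counter(
        fragment
        for molecule in left
        for fragment in set(molecule)  # count each fragment once per molecule
        if fragment not in seen_in_right
    )
    return counts.most_common()

# Hypothetical fragment labels for illustration only.
left_set = [["N(ar)-N(ar)", "C(sp3)-C(sp3)"],
            ["N(ar)-N(ar)", "C(sp2)-S(sp2)"],
            ["N(ar)-N(ar)"]]
right_set = [["C(sp3)-C(sp3)"]]
print(fragments_only_in_left(left_set, right_set))
```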
Comparison of theory and experiment
GAMESS log files; CIF* files; CIF2CML
* CIF: Crystallographic Information File
Reading Acta Crystallographica Section E
All bonds*: Δr (Cryst. – GAMESS) / Å. Single molecules, no disorder
Mean Δr - 0.011 Å; nearly normal; outliers; S.D. 0.014 Å
* Excludes bonds to hydrogen
Δr CC bonds (C – G) / Å
Mean Δr - 0.01 Å; nearly normal; S.D. 0.009 Å
Δr CO bonds (C – G) / Å
Good agreement; nearly normal; outliers?; S.D. 0.011 Å
Δr = +0.08 Å
Chemistry can cause outliers
H movement
Conclusions
Thanks: J.J.P. Stewart, Kim Baldridge, Wibke Sudholt, Simon Tyrrell, Yong Zhang, Peter Murray-Rust, Unilever
Questions
Homepage: http://wwmm.ch.cam.ac.uk
InChI FAQ: http://wwmm.ch.cam.ac.uk/inchifaq
R: http://www.r-project.org
Taverna: http://taverna.sourceforge.net/
MOPAC 2002: http://www.cachesoftware.com/mopac/
GAMESS: http://www.msg.ameslab.gov/GAMESS/GAMESS.html
