Matteo Floris

Chemoinformatics
a computational tool for pharmaceutical chemistry

CRS4 - seminar series 2012
3 May 2012
About me


Matteo Floris

Degree in Pharmaceutical Chemistry and Technology (C.T.F.), University of Padua
Master's in Bioinformatics, University of Cologne
PhD in Biochemistry, Sapienza University of Rome

Chemoinformatics: development of methods for ligand-based
drug design

Bioinformatics at CRS4 for 6 years (computational
genomics)

matteo.floris@gmail.com
Chemoinformatics or cheminformatics?


Chemoinformatics is a vast discipline, standing on the
interface of chemistry, biology and computer science

D. Agrafiotis, J&J
Background


Drug design


Rational drug design, or rational design

The search for new (potential!) drugs based on
knowledge of a biological target

Drug design often relies on computational modelling
techniques (computer-aided drug design, CADD)

If the three-dimensional structure of the molecular target is known,
this is called structure-based drug design.
Background


Ligand-based CADD

Relies on knowledge of other molecules that are able to bind the
biological target of interest.

These molecules can be used to build a pharmacophore hypothesis that
defines the minimal features required for the interaction.

Alternatively, quantitative structure-activity relationship (QSAR)
techniques look for a correlation between the physico-chemical
properties of a molecule and its biological activity.


Structure-based CADD

Relies on knowledge of the structure of the biological target of
interest, obtained by X-ray crystallography or NMR spectroscopy.

If the structure of the target is not available, three-dimensional
models can be built by homology instead.

Computational tools make it possible to estimate the affinity and
selectivity of one or more compounds for the target.
A virtual space odyssey


One of the main goals in drug discovery is to identify and
develop new ligands with high binding affinity towards
a protein target. Today, there is increased reliance on
computer-based tools […]. These help select molecules
from the vast expanse of chemical space and aid
optimization of compounds of interest into drugs.



 
 
 
 
 
 
 
 
 
 
 Cath O'Driscoll, Nature, 2004
A real world odyssey
The chemical universe


Chemical space is the space spanned by all possible (i.e.
energetically stable) molecules and chemical compounds –
that is, all stoichiometric combinations of electrons and
atomic nuclei, in all possible topology isomers. 

Chemical reactions allow us to move in chemical space. 

The mapping between chemical space and molecular
properties is often not unique, meaning that there can be
multiple molecules which exhibit the same properties
The chemical universe


CAS REGISTRY 

is the most authoritative collection of disclosed chemical substance
information, containing more than 65 million organic and inorganic
substances and 63 million sequences

67,370,815 
Commercially available chemicals in CAS



PubChem

PCSubstance contains about 85 million records.

PCCompound contains nearly 30 million unique structures.

PCBioAssay contains more than 585,000 BioAssays. Each BioAssay
contains a varying number of data points.
The chemical universe


GDB-13 enumerates small organic molecules up to 13
atoms of C, N, O, S and Cl following simple chemical
stability and synthetic feasibility rules.

With 977,468,314 structures, GDB-13 is the largest publicly
available small organic molecule database to date
The chemical universe


150 possible substituents
from mono- to 14-fold substitution
10^29 theoretical derivatives
The chemical universe


Navigating chemical space for biology and medicine

Christopher Lipinski & Andrew Hopkins
Nature 432, 855–861 (16 December 2004) doi:10.1038/nature03193
Despite over a century of applying organic synthesis to the search for drugs, we are still far
from even a cursory examination of the vast number of possible small molecules that could
be created. Indeed, a thorough examination of all ‘chemical space’ is practically
impossible. Given this, what are the best strategies for identifying small molecules
that modulate biological targets?





    Salvarsan (also known as arsphenamine or compound 606) is a
    drug used in the treatment of syphilis and African
    trypanosomiasis. It was the first known chemotherapeutic
    agent.
Trust, but verify


Many scientists TRUST chemistry and biology databases that are so
often reused, reanalyzed and integrated with new cheminformatics or
bioinformatics tools. 

The authors of such articles do not appear to analyze for problems
caused by poor DATA QUALITY or hypotheses that are incorrect due
to poor underlying data.


   
   
   
   
   
   
   
   
   
                                        Antony Williams, ChemSpider
Representing molecules


MOLECULES (real objects)

MOLECULE REPRESENTATIONS (models)

MOLECULAR DESCRIPTORS (information)
Representing molecules

Chemical table file

benzene


  6  6  0  0  0  0  0  0  0  0  1 V2000
    1.9050   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.9050   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.7531   -0.1282    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.7531   -2.7882    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3987   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3987   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  0  0  0  0
  3  1  2  0  0  0  0
  4  2  2  0  0  0  0
  5  3  1  0  0  0  0
  6  4  1  0  0  0  0
  6  5  2  0  0  0  0
M  END
$$$$
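As a rough illustration of how the benzene record is organized (counts line, atom block, bond block), the Python sketch below reads atoms and bonds from a V2000 connection table. It is a simplification under stated assumptions: the function name parse_v2000 is hypothetical, fields are split on whitespace although the format is defined by fixed-width columns, and the counts line is located by its V2000 tag rather than by header position.

```python
# Minimal sketch: read atoms and bonds from a V2000 connection table.
# Assumption: whitespace splitting is good enough for this small record;
# real files use fixed-width columns and are better read with a toolkit.

BENZENE = """benzene


  6  6  0  0  0  0  0  0  0  0  1 V2000
    1.9050   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.9050   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.7531   -0.1282    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.7531   -2.7882    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3987   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.3987   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  2  1  1  0  0  0  0
  3  1  2  0  0  0  0
  4  2  2  0  0  0  0
  5  3  1  0  0  0  0
  6  4  1  0  0  0  0
  6  5  2  0  0  0  0
M  END
"""


def parse_v2000(molblock):
    """Return (atoms, bonds) from a V2000 record (hypothetical helper)."""
    lines = molblock.splitlines()
    # Locate the counts line by its version tag instead of assuming
    # that the header is exactly three lines long.
    start = next(i for i, line in enumerate(lines)
                 if line.rstrip().endswith("V2000"))
    counts = lines[start].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for line in lines[start + 1:start + 1 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    first_bond = start + 1 + n_atoms
    for line in lines[first_bond:first_bond + n_bonds]:
        begin, end, order = (int(value) for value in line.split()[:3])
        bonds.append((begin, end, order))
    return atoms, bonds


atoms, bonds = parse_v2000(BENZENE)
print(len(atoms), "atoms,", len(bonds), "bonds")
```

The bond block stores benzene in Kekulé form, as alternating single (1) and double (2) bonds, which is why the bond orders are not all equal.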
Representing molecules


SMILES ®

Benzene: c1ccccc1
Methane: C
Ethyne: C#C
Sildenafil citrate (Viagra):
OC(=O)CC(O)(CC(O)=O)C(O)=O.CCCc1nn(C)c2c1nc([nH]c2=O)-c1cc(ccc1OCC)S(=O)(=O)N1CCN(C)CC1
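The sildenafil citrate string contains a dot, which in SMILES separates parts that are written without a bond between them (here the citric acid and the sildenafil molecule). A minimal Python sketch of splitting such a string is shown below. The helper name components is hypothetical, and the sketch assumes no ring-closure digits are shared across the dot, which is the one case where dot-separated parts can still be bonded.

```python
# Sketch: split a SMILES string into its dot-separated components.
# Assumption: ring-closure digits are not shared across a ".".

SILDENAFIL_CITRATE = (
    "OC(=O)CC(O)(CC(O)=O)C(O)=O."
    "CCCc1nn(C)c2c1nc([nH]c2=O)-c1cc(ccc1OCC)S(=O)(=O)N1CCN(C)CC1"
)


def components(smiles):
    """Return the dot-separated parts of a SMILES string."""
    return smiles.split(".")


parts = components(SILDENAFIL_CITRATE)
for part in parts:
    print(part)
```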
Representing molecules


InChI

The IUPAC International Chemical Identifier (InChI) is a
non-proprietary identifier for chemical substances that can
be used in printed and electronic data sources thus
enabling easier linking of diverse data compilations

http://www.inchi-trust.org/
Representing molecules


InChI is short for International Chemical Identifier.

InChIs are text strings comprising different layers and
sublayers of information separated by slashes (/).

Each InChI string starts with the InChI version number
followed by the main layer. This main layer contains
sublayers for chemical formula, atom connections and
hydrogen atoms.

Depending on the structure of the molecule the main layer
may be followed by additional layers, e.g. for charge,
stereochemical and/or isotopic information.

InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H
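To make the layered structure concrete, the Python sketch below splits the benzene InChI into its version, formula, connection (c) and hydrogen (h) sublayers. The function name split_inchi is hypothetical, and the approach is only an assumption-laden simplification: it expects a formula right after the version and keeps one entry per layer prefix, which does not hold for every valid InChI.

```python
# Sketch: split a simple InChI string into its slash-separated layers.
# Assumptions: a formula layer follows the version, and each later layer
# is identified by its single-letter prefix (c, h, q, p, ...).


def split_inchi(inchi):
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI":
        raise ValueError("not an InChI string")
    version, formula, *rest = body.split("/")
    layers = {"version": version, "formula": formula}
    for layer in rest:
        layers[layer[0]] = layer[1:]
    return layers


layers = split_inchi("InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H")
for name, value in layers.items():
    print(name, value)
```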
Representing molecules


Abstractions:

graphs
abstract graphs
Markush structures
descriptors (numerical representations)
fingerprints (binary representations)
Molecular descriptors


"The molecular descriptor is the final result of a logic and mathematical
procedure which transforms chemical information encoded within a symbolic
representation of a molecule into a useful number or the result of some
standardized experiment. 

The field of molecular descriptors is strongly interdisciplinary and involves a mass of
different theories. For the definition of molecular descriptors, a knowledge of
algebra, graph theory, information theory, computational chemistry, theories of
organic reactivity and physical chemistry is usually required, although at
different levels. 

For the use of the molecular descriptors, a knowledge of statistics, chemometrics,
and the principles of the QSAR/QSPR approaches is necessary in addition to
the specific knowledge of the problem. Moreover, programming, sophisticated
software and hardware are often inseparable fellow-travelers of the researcher in
this field."

From the introduction to the "Handbook of Molecular Descriptors"
by Roberto Todeschini and Viviana Consonni, Wiley-VCH, 2000.
Molecular descriptors

The main classes of theoretical molecular descriptors are: 


• 0D-descriptors (i.e. constitutional descriptors, count descriptors), 
• 1D-descriptors (i.e. list of structural fragments, fingerprints), 
• 2D-descriptors (i.e. graph invariants), 
• 3D-descriptors (such as, for example, 3D-MoRSE descriptors,
    WHIM descriptors, GETAWAY descriptors, quantum-chemical
    descriptors, size, steric, surface and volume descriptors), 
• 4D-descriptors (such as those derived from GRID or CoMFA
    methods, Volsurf).
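As an illustration of the simplest class in this list, the Python sketch below computes a few 0D (constitutional) descriptors from a molecular formula alone: atom count, heavy-atom count and molecular weight. The function names and the small table of rounded atomic weights are assumptions made for this example; it handles only flat formulas such as C6H6, without brackets or charges.

```python
import re

# Rounded standard atomic weights for the few elements used here.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula):
    """Turn a flat formula such as 'C6H6' into {'C': 6, 'H': 6}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


def constitutional_descriptors(formula):
    """0D descriptors: they depend only on composition, not on structure."""
    counts = parse_formula(formula)
    return {
        "n_atoms": sum(counts.values()),
        "n_heavy_atoms": sum(n for s, n in counts.items() if s != "H"),
        "molecular_weight": sum(ATOMIC_MASS[s] * n for s, n in counts.items()),
    }


benzene = constitutional_descriptors("C6H6")
print(benzene)
```

Because these values ignore connectivity, isomers share the same 0D descriptors; the higher-dimensional classes in the list are what distinguish them.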
QSAR


More than a century ago, Crum-Brown and Fraser
expressed the idea that the physiological action of a
substance in a certain biological system (A) was a
function (f) of its chemical constitution C:


                            A = f(C)
                                
To explain the complex relationships between molecules and
observed quantities, two main streams were developed, the first
related to the search for relationships between molecular
structures and physico-chemical properties (QSPR,
Quantitative Structure-Property Relationships) and the second
between molecular structures and biological activities (QSAR,
Quantitative Structure-Activity Relationships).
QSAR


There is a consensus among current predictive toxicologists that Corwin
Hansch is the founder of modern QSAR. In the classic article it was
illustrated that, in general, biological activity for a group of ‘congeneric’
chemicals can be described by a comprehensive model:

                      Log 1/C50 = aπ + bε + cS + d

in which C, the toxicant concentration at which an endpoint is manifested
(e.g. 50% mortality or effect), is related to a hydrophobicity term, π, an
electronic term, ε, and a steric term, S (typically Taft's substituent
constant, ES).
Computational libraries


CDK

Open Babel

CACTVS

RDKit

SVL/MOE
Computational libraries


CDK

Web: cdk.sf.net
Language: Java (Jython, Groovy)
GUI: n/a
Pros: LGPL license, Jmol
Cons: for programmers only

1 AMBIT
2 Bioclipse
3 CDK Taverna
4 CDKDescUI
5 Evince
6 HyperDossier
7 JChemPaint
8 JOELib
9 Jumbo
10 KNIME CDK feature
11 LICSS
12 NMRShiftDB
13 Nomen
14 PaDEL
15 QueryConstructor
16 rcdk
17 SafeBase(TM)
18 Scaffold Hunter
19 SENECA
20 SmileMS
21 Obsolete projects
21.1 XB Edit (Working title)
22 Jmol
Computational libraries


Open Babel

Web: openbabel.org
Language: C++, with Python/Java/Perl bindings
GUI: yes!
Pros: flexibility
Cons: none listed
Computational libraries


CACTVS

Web: xemistry.com
Language: Tcl
GUI: paid
Pros: free for academics, team
Cons: Tcl
Computational libraries


RDKit

Web: www.rdkit.org/
Language: Python, C++
GUI: n/a
Pros: SMIRKS, team
Cons: installation, age
Algorithms: similarity search


Similarity measures, calculations that quantify the similarity
of two molecules, and screening, a way of rapidly
eliminating molecules as candidates in a substructure
search, are both processes that use fingerprints. 

Fingerprints are a very abstract representation of certain
structural features of a molecule
Algorithms: similarity search


Structural keys

   • The presence/absence of each element, or if an element is common
     (nitrogen, for example), several bits might represent "at least 1 N", "at
     least 2 N", "at least 4 N", and so forth.
   • Unusual or important electronic configurations, such as "sp3 carbon" or
     "triple-bonded nitrogen."
   • Rings and ring systems, such as cyclohexane, pyridine, or naphthalene.
   • Common functional groups, such as alcohols, amines, hydrocarbons, and
     so forth.
   • Functional groups of special importance in a particular database. For
     example, a database of organo-metallic molecules might have bits
     assigned for metal-containing functional groups; in a drug database one
     might have bits for specific skeletal features such as steroids and
     barbiturates.
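The element-count keys in the first bullet can be sketched in a few lines of Python. The key list below is a made-up, five-bit illustration rather than any published key set, and it only covers keys that can be decided from element counts. Keys for rings or functional groups would need substructure matching from a chemistry toolkit.

```python
# Sketch: a tiny structural key built from element counts only.
# The KEYS list is hypothetical and chosen purely for illustration.

KEYS = [
    ("at least 1 N", lambda counts: counts.get("N", 0) >= 1),
    ("at least 2 N", lambda counts: counts.get("N", 0) >= 2),
    ("at least 4 N", lambda counts: counts.get("N", 0) >= 4),
    ("contains O", lambda counts: counts.get("O", 0) >= 1),
    ("contains S", lambda counts: counts.get("S", 0) >= 1),
]


def structural_key(counts):
    """Return one bit per key, in the order given by KEYS."""
    return [1 if test(counts) else 0 for _, test in KEYS]


# Element counts written out by hand from the molecular formulas.
urea = {"C": 1, "H": 4, "N": 2, "O": 1}         # CH4N2O
benzene = {"C": 6, "H": 6}                       # C6H6

urea_key = structural_key(urea)
benzene_key = structural_key(benzene)
print(urea_key)
print(benzene_key)
```

Each bit has a fixed meaning, which is what distinguishes structural keys from hashed fingerprints.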
Algorithms: similarity search


For example, the molecule OC=CN would generate the
following patterns:

0-bond paths: C, O, N
1-bond paths: OC, C=C, CN
2-bond paths: OC=C, C=CN
3-bond paths: OC=CN
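The path enumeration for OC=CN can be reproduced with a short Python sketch. The molecule is entered by hand as a small graph, every linear path of up to three bonds is collected, and each path is hashed to a bit position. The function names, the 64-bit length and the use of CRC32 as the hash are arbitrary assumptions for this example, not the scheme of any particular toolkit. A path and its reverse are treated as the same pattern, so OC is stored as CO.

```python
import zlib

# OC=CN entered by hand: atom symbols and bonds ("" single, "=" double).
ATOMS = ["O", "C", "C", "N"]
BONDS = {(0, 1): "", (1, 2): "=", (2, 3): ""}


def neighbours(index):
    """Yield (neighbour index, bond symbol) for one atom."""
    for (begin, end), symbol in BONDS.items():
        if begin == index:
            yield end, symbol
        elif end == index:
            yield begin, symbol


def linear_paths(max_bonds):
    """Collect (number of bonds, pattern) for all linear paths."""
    found = set()

    def walk(path, tokens):
        forward = "".join(tokens)
        backward = "".join(reversed(tokens))
        # Store one direction-independent form of each path.
        found.add((len(path) - 1, min(forward, backward)))
        if len(path) - 1 == max_bonds:
            return
        for nxt, bond in neighbours(path[-1]):
            if nxt not in path:
                walk(path + [nxt], tokens + [bond, ATOMS[nxt]])

    for start in range(len(ATOMS)):
        walk([start], [ATOMS[start]])
    return found


def fingerprint(patterns, n_bits=64):
    """Hash each pattern to a bit position (hash choice is an assumption)."""
    return {zlib.crc32(text.encode()) % n_bits for _, text in patterns}


paths = linear_paths(3)
bits = fingerprint(paths)
for n_bonds, text in sorted(paths):
    print(n_bonds, text)
```

Unlike structural keys, a bit in such a fingerprint has no fixed meaning, and two different paths may collide on the same bit.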
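Algorithms: similarity search


10001001010001001010001001110100100010101
001001011100111010010010000100101011010010

Tanimoto index = c/(a + b + c)

where c is the number of bits set in both fingerprints, and a and b are the
numbers of bits set only in the first and only in the second fingerprint.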
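A minimal Python sketch of the Tanimoto index follows, reading a and b as the bits set in only one of the two fingerprints and c as the bits set in both. The two 8-bit strings are invented toy inputs, and returning 0.0 for two empty fingerprints is a convention chosen here, not something the slide specifies.

```python
# Sketch: Tanimoto index on fingerprints stored as sets of "on" bit positions.


def on_bits(bitstring):
    """Positions of the '1' characters in a bit string."""
    return {i for i, char in enumerate(bitstring) if char == "1"}


def tanimoto(bits_a, bits_b):
    a = len(bits_a - bits_b)   # bits set only in the first fingerprint
    b = len(bits_b - bits_a)   # bits set only in the second fingerprint
    c = len(bits_a & bits_b)   # bits set in both
    if a + b + c == 0:
        return 0.0             # convention for two empty fingerprints
    return c / (a + b + c)


# Toy 8-bit fingerprints, made up for this example.
fp1 = on_bits("11010010")
fp2 = on_bits("10010110")
similarity = tanimoto(fp1, fp2)
print(similarity)  # 3 shared bits, 1 + 1 unshared bits: 3/5 = 0.6
```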
Algorithms: substructure search
Public databases


PubChem

ChEMBL

ZINC

Various vendors

BindingDB
Chemical libraries


Issues

Registration

Uniqueness

Tools

     1. filtering
     2. normalization
     3. tautomer generation
     4. ionization states
     5. deduplication
     6. conformer generation
MMsINC 1.0

3,967,056 total compounds
3,297,001 parent compounds
449,482 ionic states
220,573 tautomers

283,464,647 conformers (about 30 confs/mol);
ordered by empirical E-pot;
max 5 confs/mol (= about 4.6 conformers per compound)

Final number of conformers: 18,461,878 (for which we have ph4-FP and USR
descriptors)
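The MMsINC 1.0 figures can be cross-checked with a few lines of Python: the three compound categories add up to the stated total, and dividing the final conformer count by that total gives the quoted average of about 4.6 conformers per compound. The variable names are chosen for this example only.

```python
# Arithmetic check of the MMsINC 1.0 figures quoted on the slide.

parents = 3_297_001
ionic_states = 449_482
tautomers = 220_573
total_compounds = parents + ionic_states + tautomers

final_conformers = 18_461_878
conformers_per_compound = final_conformers / total_compounds

print(total_compounds)  # 3967056, matching the stated total
print(round(conformers_per_compound, 2))
```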






Fanton et al, IEEE, 2008; Masciocchi et al, Nucleic Acids Research, 2009
MMsINC 2.0

92,355,744 compounds from 65 public data sources and commercial catalogs
71,206,303 after single-vendor-based cleaning
42,073,344 unique compounds after redundancy washing
40 M alternative tautomers
5 M ionic states
Expected number of conformers: about 220 M

Average intra-vendor redundancy: 14%
10 vendors with redundancy above 40%!
4 vendors with redundancy = 0% (small sets, 100-2,000 compounds)
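The two cleaning stages behind these numbers can be sketched in Python as set operations, assuming that every compound has already been reduced to a canonical identifier and that "single-vendor-based cleaning" means removing duplicates within each vendor catalog. The vendor names and identifiers below are invented toy data, and the function name deduplicate is hypothetical. Only the last two lines use real figures from the slide.

```python
# Sketch: intra-vendor cleaning followed by cross-vendor merging.
# Assumption: each compound is represented by a canonical identifier.


def deduplicate(catalogs):
    """catalogs maps a vendor name to a list of canonical identifiers."""
    total = sum(len(ids) for ids in catalogs.values())
    # Stage 1: drop duplicates inside each vendor catalog.
    per_vendor = {vendor: set(ids) for vendor, ids in catalogs.items()}
    after_vendor_cleaning = sum(len(ids) for ids in per_vendor.values())
    # Stage 2: merge all vendors and keep each compound once.
    unique = set().union(*per_vendor.values())
    # Fraction of redundant records inside each non-empty catalog.
    redundancy = {
        vendor: 1 - len(per_vendor[vendor]) / len(ids)
        for vendor, ids in catalogs.items() if ids
    }
    return total, after_vendor_cleaning, len(unique), redundancy


# Toy catalogs with made-up identifiers.
toy = {
    "vendor_a": ["mol-1", "mol-2", "mol-2", "mol-3"],
    "vendor_b": ["mol-3", "mol-4"],
}
total, cleaned, unique, redundancy = deduplicate(toy)
print(total, cleaned, unique, redundancy)

# Overall reduction implied by the slide's own totals.
overall_reduction = 1 - 42_073_344 / 92_355_744
print(round(overall_reduction, 3))
```

Note that the overall reduction computed from the totals is a pooled figure, so it is not expected to equal the 14% average intra-vendor redundancy.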
Chemical redundancy
The impact of tautomerism

[Bar chart, y axis from 0 to 250,000. Series: total pairs taut/neu; different pred; different AD; diff pred & diff AD. Categories: Skin, DevTox, LC50DM, LC50FM, Carcinogenicity, Mutagenicity, BCF]
Mimicking peptides... in silico


Floris et al, Nucleic Acids Research, 2011; Floris M and Moro S, Molecular Informatics, 2011
Large-scale pharmacophore screening


•   2 minutes to screen 1 ph4 (pharmacophore) model against 17 M conformers
    (4 M commercial compounds) on the CRS4 cluster
•   Output: SDF file with the commercial compounds that best overlap the original
    pharmacophore hypothesis
•   Multiple simultaneous screenings and parameter tuning are possible in a
    reasonable amount of time
The chemoinformatician's toolbox



•   Python, Java
•   R, Weka
•   Open Babel, CDK
•   Marvin Beans
•   a personal database
•   the Blue Obelisk
The importance of a healthy work environment
Acknowledgements

    •   Alessandro Bulfone
    •   Prof Stefano Moro
    •   Silvana Urru, Andrea Cristiani, Ricardo Medda, Stefania Olla
    •   colleagues from CRS4 Outreach
    •   colleagues from CNR (IRGB-CNR, Prof F. Cucca)
    •   Marco Fanton, Mattia Sturlese, Fabian Cedrati, Davide Sabbadin
    •   all other collaborators: Alberto Manganaro, Emilio Benfenati,
        colleagues from the ministerial QSAR-Reach group, colleagues from
        the Blue Obelisk, the TNBC group
    •   my family (Lolli, Ric, Vera, assorted grandparents, various sisters)



matteo.floris@gmail.com

Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Dernier (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Chemoinformatics: a computational tool for medicinal chemistry

  • 1. Matteo Floris. Chemoinformatics: a computational tool for medicinal chemistry. CRS4 seminar series 2012, 3 May 2012
  • 2. About me. Matteo Floris. Degree in Pharmaceutical Chemistry and Technology (C.T.F.), University of Padova. Master's in Bioinformatics, University of Cologne. PhD in Biochemistry, University of Rome “La Sapienza”. Chemoinformatics: development of methods for ligand-based drug design. Bioinformatics at CRS4 for 6 years (computational genomics). matteo.floris@gmail.com
  • 3. Chemoinformatics or cheminformatics? Chemoinformatics is a vast discipline, standing on the interface of chemistry, biology and computer science D. Agrafiotis, J&J
  • 5-7. Premise: drug design. Rational drug design (or rational design) is the search for new (potential!) drugs based on knowledge of a biological target. Drug design often relies on computational modeling techniques (computer-aided drug design, CADD). When the three-dimensional structure of the molecular target is known, it is called structure-based drug design.
  • 8-10. Premise: ligand-based CADD vs. structure-based CADD. Ligand-based CADD relies on knowledge of other molecules that are able to bind the biological target of interest. These molecules can be used to build a pharmacophore hypothesis that defines the minimum features required for the interaction. Alternatively, quantitative structure-activity relationship (QSAR) techniques look for a correlation between the physico-chemical properties of a molecule and its biological activity. Structure-based CADD relies on knowledge of the structure of the biological target of interest, obtained by X-ray crystallography or NMR spectroscopy. When the target structure is not available, three-dimensional homology models can be built instead. Computational tools then make it possible to estimate the affinity and selectivity of one or more compounds for the target.
  • 11. A virtual space odyssey One of the main goals in drug discovery is to identify and develop new ligands with high binding affinity towards a protein target. Today, there is increased reliance on computer-based tools […]. These help select molecules from the vast expanse of chemical space and aid optimization of compounds of interest into drugs. Cath O'Driscoll, Nature, 2004
  • 12. A real world odyssey
  • 14-15. The chemical universe. Chemical space is the space spanned by all possible (i.e. energetically stable) molecules and chemical compounds – that is, all stoichiometric combinations of electrons and atomic nuclei, in all possible topology isomers. Chemical reactions allow us to move in chemical space. The mapping between chemical space and molecular properties is often not unique, meaning that there can be multiple molecules which exhibit the same properties.
  • 16. The chemical universe. CAS REGISTRY is the most authoritative collection of disclosed chemical substance information, containing more than 65 million organic and inorganic substances and 63 million sequences. 67,370,815 commercially available chemicals in CAS. PubChem: PCSubstance contains about 85 million records; PCCompound contains nearly 30 million unique structures; PCBioAssay contains more than 585,000 BioAssays, each with a varying number of data points.
  • 17. The chemical universe. GDB-13 enumerates small organic molecules up to 13 atoms of C, N, O, S and Cl following simple chemical stability and synthetic feasibility rules. With 977,468,314 structures, GDB-13 is the largest publicly available small organic molecule database to date.
  • 19. The chemical universe. 150 possible substituents, from mono- to 14-substituted: 10^29 theoretical derivatives.
  • 20-21. The chemical universe. Navigating chemical space for biology and medicine. Christopher Lipinski & Andrew Hopkins, Nature 432, 855–861 (16 December 2004), doi:10.1038/nature03193. Despite over a century of applying organic synthesis to the search for drugs, we are still far from even a cursory examination of the vast number of possible small molecules that could be created. Indeed, a thorough examination of all ‘chemical space’ is practically impossible. Given this, what are the best strategies for identifying small molecules that modulate biological targets? Salvarsan (also known as arsphenamine or 606) is a drug used in the treatment of syphilis and African trypanosomiasis. It was the first known chemotherapeutic agent.
  • 24. Trust, but verify Many scientists TRUST chemistry and biology databases that are so often reused, reanalyzed and integrated with new cheminformatics or bioinformatics tools. The authors of such articles do not appear to analyze for problems caused by poor DATA QUALITY or hypotheses that are incorrect due to poor underlying data. Antony Williams, ChemSpider
  • 25. Representing molecules. Molecules: real objects. Molecule representations: models. Molecular descriptors: information.
  • 26. Representing molecules. Chemical table file (benzene):
      benzene
      6 6 0 0 0 0 0 0 0 0 1 V2000
      1.9050 -0.7932 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
      1.9050 -2.1232 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
      0.7531 -0.1282 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
      0.7531 -2.7882 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
      -0.3987 -0.7932 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
      -0.3987 -2.1232 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
      2 1 1 0 0 0 0
      3 1 2 0 0 0 0
      4 2 2 0 0 0 0
      5 3 1 0 0 0 0
      6 4 1 0 0 0 0
      6 5 2 0 0 0 0
      M END
      $$$$
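To make the layout of the benzene record on slide 26 concrete, here is a minimal sketch of a reader for the counts line, atom block, and bond block. It is not a full CTfile parser: it assumes a well-formed V2000 record with three header lines, and it splits on whitespace, whereas the format is defined with fixed-width columns. The function name `parse_molfile` and the blank second and third header lines are assumptions made for illustration.

```python
# Sketch of a V2000 molfile reader (assumption: whitespace splitting is
# sufficient, which holds for simple records like benzene but not in general).
BENZENE_MOL = "\n".join([
    "benzene",                                   # header line 1: title
    "",                                          # header line 2: program/timestamp (left blank here)
    "",                                          # header line 3: comment (left blank here)
    "  6  6  0  0  0  0  0  0  0  0  1 V2000",   # counts line: 6 atoms, 6 bonds
    "    1.9050   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.9050   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.7531   -0.1282    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.7531   -2.7882    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.3987   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.3987   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "  2  1  1  0  0  0  0",                     # bond: atom 2 - atom 1, order 1
    "  3  1  2  0  0  0  0",
    "  4  2  2  0  0  0  0",
    "  5  3  1  0  0  0  0",
    "  6  4  1  0  0  0  0",
    "  6  5  2  0  0  0  0",
    "M  END",
])


def parse_molfile(text):
    """Return (title, atoms, bonds) from a simple V2000 molfile string."""
    lines = text.splitlines()
    title = lines[0].strip()
    counts = lines[3].split()
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    first_bond = 4 + n_atoms
    for line in lines[first_bond:first_bond + n_bonds]:
        begin, end, order = (int(tok) for tok in line.split()[:3])
        bonds.append((begin, end, order))  # atom indices are 1-based

    return title, atoms, bonds


title, atoms, bonds = parse_molfile(BENZENE_MOL)
print(title, len(atoms), "atoms,", len(bonds), "bonds")
```

The three alternating double bonds and three single bonds of the Kekulé form show up as bond orders 2 and 1 in the bond block.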
  • 27. Representing molecules. SMILES®. Benzene: c1ccccc1. Methane: C. Ethyne: C#C. Sildenafil citrate (Viagra): OC(=O)CC(O)(CC(O)=O)C(O)=O.CCCc1nn(C)c2c1nc([nH]c2=O)-c1cc(ccc1OCC)S(=O)(=O)N1CCN(C)CC1
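The SMILES strings on slide 27 can be split into atoms, bonds, ring-closure digits, and branches with a small tokenizer. The sketch below is a simplification, not a SMILES parser: it covers only the organic subset and bracket atoms, does no valence or ring-closure checking, and the names `tokenize` and `count_atom_tokens` are hypothetical. A toolkit such as RDKit or Openbabel would be used in practice.

```python
import re

# One regex alternative per token kind; order matters ("Cl" before "C").
_SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"      # bracket atom, e.g. [nH]
    r"|Br|Cl"          # two-letter organic-subset atoms
    r"|[BCNOPSFI]"     # aliphatic organic-subset atoms
    r"|[bcnops]"       # aromatic atoms
    r"|%\d{2}|\d"      # ring-closure labels
    r"|[-=#:/\\.()]"   # bonds, disconnection dot, branch parentheses
)


def tokenize(smiles):
    """Split a SMILES string into tokens; raise on unsupported characters."""
    tokens, pos = [], 0
    while pos < len(smiles):
        match = _SMILES_TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError(f"unexpected character {smiles[pos]!r} at position {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


def count_atom_tokens(smiles):
    """Count atoms written explicitly in the string (a bracketed [H] would count too)."""
    return sum(1 for tok in tokenize(smiles) if tok[0] == "[" or tok[0].isalpha())


SILDENAFIL_CITRATE = (
    "OC(=O)CC(O)(CC(O)=O)C(O)=O."
    "CCCc1nn(C)c2c1nc([nH]c2=O)-c1cc(ccc1OCC)S(=O)(=O)N1CCN(C)CC1"
)

for name, smi in [("benzene", "c1ccccc1"), ("methane", "C"), ("ethyne", "C#C"),
                  ("sildenafil citrate", SILDENAFIL_CITRATE)]:
    print(name, count_atom_tokens(smi))
```

The dot in the sildenafil citrate string separates the two disconnected components (citric acid and sildenafil), which is how SMILES writes salts and mixtures.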
  • 28. Representing molecules. InChI. The IUPAC International Chemical Identifier (InChI) is a non-proprietary identifier for chemical substances that can be used in printed and electronic data sources thus enabling easier linking of diverse data compilations. http://www.inchi-trust.org/
  • 29. Representing molecules. InChI is short for International Chemical Identifier. InChIs are text strings comprising different layers and sublayers of information separated by slashes (/). Each InChI string starts with the InChI version number followed by the main layer. This main layer contains sublayers for chemical formula, atom connections and hydrogen atoms. Depending on the structure of the molecule the main layer may be followed by additional layers, e.g. for charge, stereochemical and/or isotopic information. InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H
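The layer structure described on slide 29 can be seen by splitting the benzene InChI on its slashes. This is a minimal sketch under simplifying assumptions: the string has a formula layer, and each later layer is identified by its one-letter prefix (`c` for connections, `h` for hydrogens). The function name `split_inchi` is hypothetical, and layers whose prefix repeats would overwrite each other here.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula, and prefixed layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, formula, *rest = inchi[len(prefix):].split("/")
    layers = {"version": version, "formula": formula}
    for layer in rest:
        # The first character names the layer, the remainder is its content.
        layers[layer[0]] = layer[1:]
    return layers


benzene = split_inchi("InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H")
print(benzene)
```

For benzene this yields the version `1S`, the formula `C6H6`, the connection layer `1-2-4-6-5-3-1` (a six-membered ring), and the hydrogen layer `1-6H` (one hydrogen on each of atoms 1 to 6).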
  • 31. Representing molecules. Abstractions: graphs; abstract graphs; Markush structures; descriptors (numerical representations); fingerprints (binary representations).
  • 32. Molecular descriptors. "The molecular descriptor is the final result of a logic and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment. The field of molecular descriptors is strongly interdisciplinary and involves a mass of different theories. For the definition of molecular descriptors, a knowledge of algebra, graph theory, information theory, computational chemistry, theories of organic reactivity and physical chemistry is usually required, although at different levels. For the use of the molecular descriptors, a knowledge of statistics, chemometrics, and the principles of the QSAR/QSPR approaches is necessary in addition to the specific knowledge of the problem. Moreover, programming, sophisticated software and hardware are often inseparable fellow-travelers of the researcher in this field." From the introduction to the "Handbook of Molecular Descriptors" by Roberto Todeschini and Viviana Consonni, Wiley-VCH, 2000.
  • 33. Molecular descriptors. The main classes of theoretical molecular descriptors are: • 0D-descriptors (i.e. constitutional descriptors, count descriptors), • 1D-descriptors (i.e. list of structural fragments, fingerprints), • 2D-descriptors (i.e. graph invariants), • 3D-descriptors (such as, for example, 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors, quantum-chemical descriptors, size, steric, surface and volume descriptors), • 4D-descriptors (such as those derived from GRID or CoMFA methods, Volsurf).
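As an illustration of the simplest class in the list on slide 33, 0D (constitutional, count) descriptors can be computed from the molecular formula alone. The sketch below is an assumption-laden example, not a descriptor package: `parse_formula` handles only flat formulas without parentheses or charges, the mass table is truncated to a few elements with rounded standard atomic weights, and all function names are hypothetical.

```python
import re

# Rounded standard atomic weights for a handful of elements (illustrative subset).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula):
    """Turn a flat formula such as 'C6H6' into {'C': 6, 'H': 6}."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(counts):
    """Sum of atomic masses: a constitutional descriptor."""
    return sum(ATOMIC_MASS[symbol] * n for symbol, n in counts.items())


def heavy_atom_count(counts):
    """Number of non-hydrogen atoms: a count descriptor."""
    return sum(n for symbol, n in counts.items() if symbol != "H")


benzene = parse_formula("C6H6")
print(benzene, round(molecular_weight(benzene), 3), heavy_atom_count(benzene))
```

Descriptors of this kind ignore connectivity entirely, which is why isomers share the same 0D values and higher-dimensional descriptors are needed to tell them apart.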
  • 34-35. QSAR. More than a century ago, Crum-Brown and Fraser expressed the idea that the physiological action of a substance in a certain biological system (A) was a function (f) of its chemical constitution C: A = f(C). To explain the complex relationships between molecules and observed quantities, two main streams were developed, the first related to the search for relationships between molecular structures and physico-chemical properties (QSPR, Quantitative Structure-Property Relationships) and the second between molecular structures and biological activities (QSAR, Quantitative Structure-Activity Relationships).
  • 36. QSAR. There is a consensus among current predictive toxicologists that Corwin Hansch is the founder of modern QSAR. In the classic article it was illustrated that, in general, biological activity for a group of ‘congeneric’ chemicals can be described by a comprehensive model: log(1/C50) = aπ + bε + cS + d, in which C, the toxicant concentration at which an endpoint is manifested (e.g. 50% mortality or effect), is related to a hydrophobicity term, π, an electronic term, ε, and a steric term, S (typically Taft’s substituent constant, ES).
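The Hansch model on slide 36 is a linear combination of three substituent terms, so once the coefficients have been fitted, a prediction is a single evaluation. The sketch below only evaluates the equation. The coefficient values are hypothetical placeholders, not fitted values from any study, and the function names are assumptions.

```python
def hansch_log_inv_c50(pi, eps, s, a, b, c, d):
    """Evaluate log(1/C50) = a*pi + b*eps + c*S + d for one compound."""
    return a * pi + b * eps + c * s + d


def c50_from_log(log_inv_c50):
    """Invert log10(1/C50) to recover the concentration C50."""
    return 10.0 ** (-log_inv_c50)


# Hypothetical coefficients and substituent constants, for illustration only.
coefficients = {"a": 0.5, "b": -1.0, "c": 0.2, "d": 2.0}
value = hansch_log_inv_c50(pi=1.0, eps=0.25, s=-0.5, **coefficients)
print(value, c50_from_log(value))
```

In a real QSAR study the coefficients a, b, c and d would be obtained by regression over a congeneric series with measured activities, which is the step where the statistics and chemometrics mentioned earlier come in.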
  • 38. Computational libraries (CDK, Openbabel, CACTVS, RDKit, SVL/MOE). CDK. Web: cdk.sf.net. Language: Java (Jython, Groovy). GUI: n.a. Pros: LGPL license, Jmol. Cons: for programmers only. Related projects listed on the slide: AMBIT, Bioclipse, CDK Taverna, CDKDescUI, Evince, HyperDossier, JChemPaint, JOELib, Jumbo, KNIME CDK feature, LICSS, NMRShiftDB, Nomen, PaDEL, QueryConstructor, rcdk, SafeBase(TM), Scaffold Hunter, SENECA, SmileMS, XB Edit (working title, listed under obsolete projects), Jmol.
  • 39. Computational libraries. Openbabel. Web: openbabel.org. Language: C++, with Python/Java/Perl bindings. GUI: yes! Pros: flexibility. Cons: none listed.
  • 40. Computational libraries. CACTVS. Web: xemistry.com. Language: Tcl. GUI: paid. Pros: free for academics, team. Cons: Tcl.
  • 41. Computational libraries. RDKit. Web: www.rdkit.org/. Language: Python, C++. GUI: n.a. Pros: SMIRKS, team. Cons: installation, age.
  • 44. Algorithms: similarity search. Similarity measures, calculations that quantify the similarity of two molecules, and screening, a way of rapidly eliminating molecules as candidates in a substructure search, are both processes that use fingerprints. Fingerprints are a very abstract representation of certain structural features of a molecule.
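Both uses of fingerprints named on slide 44 reduce to set operations on the "on" bits. The sketch below represents a fingerprint as a Python set of bit positions. The Tanimoto coefficient is used as the similarity measure because it is a common choice, but the slide does not name a specific measure, so that is an assumption, as are the function names and the toy bit sets.

```python
def tanimoto(bits_a, bits_b):
    """Shared on-bits divided by on-bits set in either fingerprint."""
    a, b = set(bits_a), set(bits_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def passes_screen(query_bits, target_bits):
    """Substructure screening: every query bit must be set in the target.

    A False result rules the target out; a True result only means the
    (slower) exact substructure match still has to be run.
    """
    return set(query_bits) <= set(target_bits)


# Toy fingerprints (hypothetical bit positions).
fp_query = {1, 2, 3}
fp_target = {1, 2, 3, 7, 9}
fp_other = {2, 3, 4}

print(tanimoto(fp_query, fp_other))          # similarity of two molecules
print(passes_screen(fp_query, fp_target))    # candidate survives the screen
print(passes_screen(fp_query, fp_other))     # candidate eliminated
```

Screening is fast because it never produces false negatives: a molecule that truly contains the query substructure always has all of the query's bits set.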
  • 45. Algorithms: similarity search. Structural keys • The presence/absence of each element, or if an element is common (nitrogen, for example), several bits might represent "at least 1 N", "at least 2 N", "at least 4 N", and so forth. • Unusual or important electronic configurations, such as "sp3 carbon" or "triple-bonded nitrogen." • Rings and ring systems, such as cyclohexane, pyridine, or naphthalene. • Common functional groups, such as alcohols, amines, hydrocarbons, and so forth. • Functional groups of special importance in a particular database. For example, a database of organo-metallic molecules might have bits assigned for metal-containing functional groups; in a drug database one might have bits for specific skeletal features such as steroids and barbiturates.
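The element-count keys in the first bullet of slide 45 can be sketched directly from atom counts. This example covers only that one kind of key: rings, functional groups, and electronic configurations would need a real substructure matcher. The key list itself is hypothetical. The nitrogen thresholds follow the slide, while the oxygen and sulfur keys are added purely for illustration.

```python
# Each key is (element, minimum count); bit i is set when the molecule
# has at least that many atoms of the element.
STRUCTURAL_KEYS = [("N", 1), ("N", 2), ("N", 4), ("O", 1), ("S", 1)]


def structural_key_bits(element_counts):
    """Return one 0/1 value per key in STRUCTURAL_KEYS."""
    return [1 if element_counts.get(element, 0) >= minimum else 0
            for element, minimum in STRUCTURAL_KEYS]


sildenafil = {"C": 22, "H": 30, "N": 6, "O": 4, "S": 1}
ethanolamine = {"C": 2, "H": 7, "N": 1, "O": 1}
benzene = {"C": 6, "H": 6}

for name, counts in [("sildenafil", sildenafil),
                     ("ethanolamine", ethanolamine),
                     ("benzene", benzene)]:
    print(name, structural_key_bits(counts))
```

Unlike hashed path fingerprints, every bit here has a fixed, human-readable meaning, at the cost of having to decide the key dictionary in advance.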
  • 46. Algorithms: similarity search. For example, the molecule OC=CN would generate the following patterns: 0-bond paths: C, O, N; 1-bond paths: OC, C=C, CN; 2-bond paths: OC=C, C=CN; 3-bond paths: OC=CN.
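The path patterns listed on slide 46 for OC=CN can be reproduced by walking the molecular graph from every atom and recording each linear path up to three bonds. The sketch below hard-codes the four-atom graph and then folds the patterns into a small bit set. The CRC32 hash and the 64-bit fingerprint length are assumptions chosen for illustration, not the hashing scheme of any particular toolkit.

```python
import zlib

# Graph for OC=CN: atom symbols by index, and bond symbols by atom pair
# ("" for a single bond, "=" for a double bond).
ATOMS = ["O", "C", "C", "N"]
BONDS = {(0, 1): "", (1, 2): "=", (2, 3): ""}


def neighbors(index):
    """Yield (neighbor index, bond symbol) for every bond touching `index`."""
    for (a, b), symbol in BONDS.items():
        if a == index:
            yield b, symbol
        elif b == index:
            yield a, symbol


def enumerate_paths(max_bonds):
    """Return {number of bonds: set of path strings} for all linear paths."""
    by_length = {n: set() for n in range(max_bonds + 1)}

    def extend(path, text):
        n_bonds = len(path) - 1
        # A path and its reverse are the same pattern: keep the one that
        # starts at the lower atom index.
        if path[0] <= path[-1]:
            by_length[n_bonds].add(text)
        if n_bonds == max_bonds:
            return
        for nxt, symbol in neighbors(path[-1]):
            if nxt not in path:
                extend(path + [nxt], text + symbol + ATOMS[nxt])

    for start in range(len(ATOMS)):
        extend([start], ATOMS[start])
    return by_length


def fold_to_bits(patterns, n_bits=64):
    """Hash each pattern to a bit position (collisions are possible)."""
    return {zlib.crc32(p.encode("ascii")) % n_bits for p in patterns}


paths = enumerate_paths(3)
for n_bonds in sorted(paths):
    print(f"{n_bonds}-bond paths:", sorted(paths[n_bonds]))

all_patterns = set().union(*paths.values())
print("on bits:", sorted(fold_to_bits(all_patterns)))
```

Because the patterns are hashed rather than looked up in a dictionary, no predefined key list is needed, but a given bit no longer corresponds to a single recognizable feature.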
  • 50. Chemical libraries. Issues: registration, uniqueness. Tools: 1. filtering 2. normalization 3. tautomer generation 4. ionic states 5. uniquification 6. conformer generation
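The uniquification step in the list on slide 50 amounts to grouping catalog entries by a canonical structure key and keeping one entry per key. The sketch below assumes that such a key (here an InChI string) has already been computed by a toolkit after normalization. The vendor names, catalog ids, and function names are hypothetical.

```python
def uniquify(records):
    """Keep the first (vendor, catalog id) seen for each canonical key."""
    unique = {}
    for vendor, catalog_id, key in records:
        unique.setdefault(key, (vendor, catalog_id))
    return unique


def redundancy(n_total, n_unique):
    """Fraction of entries that duplicate a structure already registered."""
    return 1.0 - n_unique / n_total


# Hypothetical catalog entries: (vendor, catalog id, canonical InChI).
records = [
    ("vendor_a", "A-001", "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"),  # benzene
    ("vendor_a", "A-002", "InChI=1S/CH4/h1H4"),                   # methane
    ("vendor_b", "B-117", "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H"),  # benzene again
    ("vendor_b", "B-118", "InChI=1S/C2H2/c1-2/h1-2H"),            # ethyne
]

unique = uniquify(records)
print(len(unique), "unique structures out of", len(records))
print("redundancy:", redundancy(len(records), len(unique)))
```

The quality of the result depends entirely on the key: without the normalization, tautomer, and ionic-state steps that precede it in the list, the same compound drawn two ways would produce two different keys and survive as a duplicate.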
  • 51. MMsINC 1.0. 3,967,056 total compounds: 3,297,001 parent compounds, 449,482 ionic states, 220,573 tautomers. 283,464,647 conformers (about 30 confs/mol), ordered by empirical E-pot; max 5 confs/mol (about 4.6 conformers per compound). Final number of conformers: 18,461,878 (for which we have ph4-FP and USR descriptors). Fanton et al., IEEE, 2008; Masciocchi et al., Nucleic Acids Research, 2009
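The MMsINC 1.0 figures on slide 51 can be cross-checked with a few lines of arithmetic: the three compound classes should add up to the total, and the final conformer count divided by the total should land near the quoted average. The variable names are hypothetical and the numbers are copied from the slide.

```python
parent_compounds = 3_297_001
ionic_states = 449_482
tautomers = 220_573
total_compounds = 3_967_056
final_conformers = 18_461_878

# The three classes partition the database.
class_sum = parent_compounds + ionic_states + tautomers
print(class_sum == total_compounds)

# Average number of retained conformers per compound (capped at 5 each).
conformers_per_compound = final_conformers / total_compounds
print(round(conformers_per_compound, 2))
```

The ratio comes out at about 4.65, in line with the roughly 4.6 conformers per compound quoted on the slide and below the cap of 5 because some compounds have fewer than five conformers.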
  • 52. MMsINC 2.0. 92,355,744 compounds from 65 public data sources and commercial catalogs. 71,206,303 after single-vendor-based cleaning. 42,073,344 unique compounds after redundancy washing. 40 M alternative tautomers. 5 M ionic states. Expected number of conformers: about 220 M. Average intra-vendor redundancy: 14%. 10 vendors with redundancy of more than 40%! 4 vendors with redundancy = 0% (small sets, 100 - 2000 comp.)
  • 54. The impact of tautomerism. [Bar chart, y-axis from 0 to 250,000. Series: total pairs taut/neu; different pred; different AD; diff pred & diff AD. Endpoints: Skin, DevTox, LC50DM, LC50FM, Carcinogenicity, Mutagenicity, BCF.]
  • 56. Mimicking peptides... in silico. Floris et al., Nucleic Acids Research, 2011; Floris M and Moro S, Molecular Informatics, 2011
  • 59. Large-scale pharmacophore screening • 2 minutes for the screening of 1 ph4 model on the CRS4 cluster resources over 17 M conformers (4 M commercial compounds) • Output: SDF with the top commercial compounds, those with the highest overlap with the original pharmacophore hypothesis • Possibility of multiple simultaneous screenings and parameter tuning in a reasonable time
  • 60. The chemoinformatician's toolbox • Python, Java • R, Weka • Openbabel, CDK • Marvin Beans • a personal database • the BlueObelisk
  • 61. The importance of a healthy working environment
  • 62. Acknowledgments • Alessandro Bulfone • Prof. Stefano Moro • Silvana Urru, Andrea Cristiani, Ricardo Medda, Stefania Olla • colleagues from CRS4 Outreach • colleagues at CNR (IRGB-CNR, Prof. F. Cucca) • Marco Fanton, Mattia Sturlese, Fabian Cedrati, Davide Sabbadin • all the other collaborators: Alberto Manganaro, Emilio Benfenati, colleagues from the ministerial QSAR-Reach group, colleagues from the BlueObelisk, the TNBC group • my family (Lolli, Ric, Vera, assorted grandparents, various sisters) matteo.floris@gmail.com