1. ORganic VIrtual Library
A program to create a Virtual Compound Library using Organic Substituents
Developed by S. Prasanth Kumar
Bioinformatician
2. What is it all about?
Generation of a large number of compounds = Library
Focused on Organic Substituents (200 in number)
Only the most frequent organic substituents
2 Types: Non-Aromatic and Aromatic
(Figure: example structures with the substitution site marked R)
3. Basis
SMILES (Simplified Molecular Input Line Entry System) is a line notation (a
typographical method using printable characters) for entering and
representing molecules and reactions.
Why was SMILES chosen?
Easy storage and retrieval
Requires minimal hard disk space
Reproducibility
String manipulations can be easily reflected in the 3D structure
4. Focus and Scope
No diverse compounds = molecular diversity kept to a minimum
Maintain the scaffold
Play with side chains / terminals
Creating analogues
Requires prior knowledge of the substitution site in the SMILES string
Library generation is based on a well-defined substitution site, not on the
scaffold
Deals with the organic subset: B, C, N, O, P, S
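The organic-subset restriction can be checked on a substituent before it goes into a library. The following is a minimal Python sketch, not ORVIL's own code (ORVIL is a Perl program); the names `atoms_in`, `uses_only` and `SLIDE_SUBSET` are hypothetical. It assumes plain SMILES input without the REGION marker, and it uses the six atoms listed on the slide as the default set. Note that the standard SMILES organic subset also includes F, Cl, Br and I, so the allowed set is left as a parameter.

```python
import re

# Atoms listed on the slide (assumption: this is the set ORVIL accepts).
SLIDE_SUBSET = {"B", "C", "N", "O", "P", "S"}

# Bracket atoms first, then two-letter halogens, then single-letter atoms.
ATOM_TOKEN = re.compile(r"\[[^\]]*\]|Cl|Br|[A-Z]|[bcnops]")
# Element symbol inside a bracket atom, after an optional isotope number.
BRACKET_SYMBOL = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")


def atoms_in(smiles):
    """Return the element symbols found in a plain SMILES string, in order."""
    symbols = []
    for token in ATOM_TOKEN.findall(smiles):
        if token.startswith("["):
            match = BRACKET_SYMBOL.match(token)
            # Wildcards such as [*] have no element symbol.
            symbols.append(match.group(1).capitalize() if match else "*")
        else:
            # Aromatic atoms are lowercase in SMILES; report them capitalized.
            symbols.append(token.capitalize())
    return symbols


def uses_only(smiles, allowed=SLIDE_SUBSET):
    """True if every atom in the SMILES string belongs to the allowed set."""
    return all(symbol in allowed for symbol in atoms_in(smiles))


print(uses_only("c1ccccc1C"))      # toluene
print(uses_only("C[Si](C)(C)C"))   # contains silicon
```

Passing `SLIDE_SUBSET | {"F", "Cl", "Br", "I"}` as `allowed` would admit halogen substituents as well.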
5. Advantages
Freely available
Command-line interface
Runs on all platforms
Customizable Perl program for creating your own library
Output files in .smi format (SMILES), easily recognized by molecular editors
Generates a library of 1200 unique molecules
Useful for HTS and fragment-based approaches
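As an illustration of the .smi output mentioned above, here is a minimal Python sketch (not ORVIL's Perl code). It assumes the common .smi layout of one SMILES string per line followed by a tab and a name; the helper names `unique_in_order` and `write_smi` and the `mol_` name prefix are hypothetical. Uniqueness is checked on the raw strings only, so two different SMILES spellings of the same molecule would both be kept.

```python
import tempfile
from pathlib import Path


def unique_in_order(smiles_list):
    """Drop repeated SMILES strings while keeping the first-seen order."""
    seen = set()
    unique = []
    for smiles in smiles_list:
        if smiles not in seen:
            seen.add(smiles)
            unique.append(smiles)
    return unique


def write_smi(path, smiles_list, prefix="mol"):
    """Write unique SMILES to a .smi file and return how many were written."""
    unique = unique_in_order(smiles_list)
    lines = [f"{smiles}\t{prefix}_{i}" for i, smiles in enumerate(unique, start=1)]
    Path(path).write_text("\n".join(lines) + "\n", encoding="ascii")
    return len(lines)


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as folder:
    target = Path(folder) / "library.smi"
    demo_count = write_smi(target, ["CCO", "CCN", "CCO", "CCCl"])
    demo_text = target.read_text(encoding="ascii")

print(demo_count)
print(demo_text, end="")
```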
6. Limitations
No GUI
No atoms outside the organic subset (e.g. silicon), since they are known to
occur very rarely
Users who want to customize must know the structure's SMILES
(Molecular editors can easily provide it)
7. Minimum Requirements
ActivePerl 5.10 (freeware, http://www.ActiveState.com/ActivePerl/)
Disk space required:
The 1200 library files take up 60.8 KB (when compressed with generic
Windows compression)
8. Region of Substitution
Tamoxifen (Breast Cancer Drug)
Tamoxifen's SMILES:
CN(C)CCOc1ccc(cc1)/C(c2ccccc2)=C(/CC)c3ccccc3
Input Format:
CN(C)CCOc1ccc(cc1)/C(c2ccccc2)=C(region)c3ccccc3
Use the word REGION or region to mark the place where the organic
substituents are to be substituted
(Figure: structure of Tamoxifen with its scaffold labelled)
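The marker substitution described on this slide can be sketched in a few lines. This is a Python illustration under stated assumptions, not ORVIL's Perl code: the function name `enumerate_library` is hypothetical, the four substituents are a small illustrative sample rather than ORVIL's 200-substituent set, and the sketch assumes exactly one marker per template. The Tamoxifen template is written here with the dimethylamino group as N(C).

```python
import re

# The input format accepts the word REGION or region as the marker.
MARKER = re.compile(r"REGION|region")


def enumerate_library(template, substituents):
    """Return one SMILES string per substituent, placed at the marker."""
    if len(MARKER.findall(template)) != 1:
        raise ValueError("template must contain exactly one REGION/region marker")
    # A function replacement keeps re.sub from treating '\' in SMILES
    # (used for bond direction) as an escape sequence.
    return [MARKER.sub(lambda _m, s=sub: s, template) for sub in substituents]


# Tamoxifen with the substitution site marked (scaffold kept unchanged).
TEMPLATE = "CN(C)CCOc1ccc(cc1)/C(c2ccccc2)=C(region)c3ccccc3"
# Illustrative substituents: methyl, ethyl, hydroxy, amino.
SUBSTITUENTS = ["C", "CC", "O", "N"]

library = enumerate_library(TEMPLATE, SUBSTITUENTS)
for smiles in library:
    print(smiles)
```

Each resulting string keeps the scaffold intact and differs only at the marked site, which is the analogue-generation idea of the earlier slides.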
9. Let us generate an Organic Virtual Library of Tamoxifen
(Screenshot: command-line interface of ORVIL)
13. Applications
(Figure: structures of Fareston and C-Compound #3)