BioDec, based near Bologna, Italy, provides top-notch services, solutions, and consulting in the field of lab data management and in postgenomics "in silico" research. The presentation summarizes our main achievements and describes our commercial offer.
BioDec Srl Company Profile
1. Introducing BioDec ...
● Bioinformatic turnkey solutions
● Bioinformatics consulting
● Biosequence DB annotation pipelines
● Web applications
● System integration
BioDec S.r.l.
Via Calzavecchio 20/2
I-40033 Casalecchio di Reno (BO)
info@biodec.com
www.biodec.com
Close ties to UniBo Biocomputing Unit (www.biocomp.unibo.it)
2. Business Activities
Bioinformatic turnkey solutions
• Integrated solutions:
• Bioinformation / Lab Data Management System - CMS-based lab data, diary and workflow management;
• Molecular Anthropology - Haplogroup tree browsers;
• Plone4Bio - CMS-based biosequence management and annotation software.
• Decoders (predictors).
Bioinformatics Consulting
• Development, engineering and integration of custom software;
• Annotated databases of biosequences (e.g. genomes).
Our Forte
• Bioinformation management;
• Machine-learning methods and annotation pipelines;
• Web applications.
3. Bioinformation Management System
A bundle including Plone4Bio, the annotated databases, and
BioDec's own lab data management software, to collect, manage and
analyse laboratory data from specific molecular biology techniques.
• Lab diary, recording the aims and conditions of each experiment, including
structured pages for recording experimental data specific to each
technique. Data may include digital images, and may be shared within
a working group or an organization according to flexible, entirely
user-defined security levels. Presently supports:
● Immunogenic assays (due 1st quarter 2010);
● Polymerase Chain Reaction techniques;
● Electrophoresis blot techniques.
• Customizable workflows are supported, and may differ for specific
techniques or Customer specifications, thus ensuring a full match
between the rules applied in the lab and the data lifecycle.
7. Haplone
● An application for the management of molecular anthropology data
● Lets users store, inspect, search and retrieve data through the familiar interface of a standard web browser
8. Haplone screenshots
● Population subsets can be easily selected (by location, haplogroup, sex, MRCA) using simple query forms, whose reports also provide basic statistics and charts on the selected sets.
● By leveraging Plone's access-control features, the application can handle selective access to stored data, allowing fine-grained control over what can be accessed by an anonymous vs. an authenticated user, so it can be used for internal information sharing and data dissemination simultaneously.
9. Haplone screenshots
● Data are stored in the "Subject" data structure, containing both personal and molecular data.
● For each subject, the application calculates the haplogroup on the fly, based on the subject's tested UEPs, as well as the most recent common ancestor for each sexual lineage, based on the stored population.
● The system also checks data consistency and alerts the user to potential errors, such as inconsistent or conflicting UEPs or out-of-range STRs within a given
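The haplogroup calculation and the UEP consistency check described on this slide can be pictured as a walk down a haplogroup tree. The sketch below is only an illustration of that idea, not Haplone's actual code: the toy tree, the marker names and the function names are all hypothetical.

```python
# Toy haplogroup tree (hypothetical): each haplogroup is defined by one
# UEP marker and has a parent haplogroup (None for the root).
TREE = {
    "A": {"marker": "M1", "parent": None},
    "B": {"marker": "M2", "parent": "A"},
    "C": {"marker": "M3", "parent": "B"},
    "D": {"marker": "M4", "parent": "A"},
}


def lineage(group):
    """Return the path from the root down to `group`."""
    path = []
    while group is not None:
        path.append(group)
        group = TREE[group]["parent"]
    return path[::-1]


def assign_haplogroup(tested):
    """Assign a haplogroup from tested UEPs and flag inconsistencies.

    `tested` maps marker name -> True (derived) or False (ancestral);
    untested markers are simply absent.
    Returns (deepest haplogroup with a derived marker, list of warnings).
    """
    derived = [g for g, node in TREE.items() if tested.get(node["marker"]) is True]
    if not derived:
        return None, []
    best = max(derived, key=lambda g: len(lineage(g)))
    path = lineage(best)
    warnings = []
    # A derived marker outside the chosen lineage conflicts with it.
    for g in derived:
        if g not in path:
            warnings.append(f"{TREE[g]['marker']} is derived but {g} is off the path to {best}")
    # An ancestral result on the chosen lineage is inconsistent.
    for g in path:
        if tested.get(TREE[g]["marker"]) is False:
            warnings.append(f"{TREE[g]['marker']} is ancestral but lies on the path to {best}")
    return best, warnings
```

For example, `assign_haplogroup({"M1": True, "M2": False, "M3": True})` returns haplogroup `"C"` together with one warning about `M2`, which is the kind of conflict the slide says is flagged to the user.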
10. Plone4Bio
● A Plone-based (http://plone.net), feature-rich graphic
BioSQL browser, to search and explore data and
metadata (annotations) from biosequence databases.
May integrate custom-made predictors (“Decoders”);
● We publicly released the base version, including an
example predictor and documentation, as open-source
software, available from http://plone4bio.org;
● Plone4Bio is reliable and modular, and is the basis of
BioDec's commercial software bundles.
14. Plone4Bio commercial release
User-managed pipeline for biosequence analysis and comparison,
providing:
● Full CMS features: several standard and user-defined content
types (including files, pages, RSS feeds...);
● Full set of Decoders: to annotate biosequences from any
database;
● User-defined biosequences: customer's own biosequences
may be instantiated and populated;
● Annotated databases: our decoders have been used to
annotate and cross-link sections of public databases from UniProt,
NCBI and Ensembl, thus providing a reliable and meaningful
metadata set. Custom annotated databases are available on
request.
15. Tools from machine learning
[Diagram: known sequences (DB subsets), linked to known structures by a known mapping, are used to derive general rules with ANN, HMM or SVM; the same methods then give a prediction for a new sequence, e.g. TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN]
• Artificial Neural Networks (ANNs)
• Hidden Markov Models (HMMs)
• Support Vector Machines (SVMs)
16. BioDec Decoders
Protein sequence tools
● Transmembrane all-alpha sequence
● Transmembrane all-beta sequence
● Signal peptide
● GPI-anchoring prediction
● Coiled-coil segment prediction
● Disordered region prediction
● Subcellular localization prediction
● Sequence classification
Protein structure tools
● Interaction patch prediction
● Fold recognition and modelling
RNA tools
● siRNA design
Our toolbox is built to be MODULAR, so we can assemble analysis systems
according to your specifications.
Papers about our decoders have been published in journals such as
Bioinformatics, Journal of Molecular Biology, Nucleic Acids Research.
17. A BioDec Decoder: ZenDock
● Interaction patch decoder
● Analyzes the protein's solvent-exposed surface for putative “interactor” residues, returning a “fuzzy” (probabilistic) answer.
● Interactors are correlated and grouped into patches.
● Results are mapped on the protein 3D structure and made available through a web interface.
[Figure: contact-shell profile, Int vs. non-Int]
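One way to read "interactors are correlated and grouped into patches" is as a clustering of high-probability surface residues that lie close together in space. The sketch below illustrates that reading only; the probability threshold, the distance cutoff, the residue names and the use of connected-component grouping are assumptions, not ZenDock's documented method.

```python
from math import dist


def group_patches(residues, p_min=0.5, cutoff=8.0):
    """Group likely interactor residues into spatial patches.

    residues: dict mapping residue name -> (probability, (x, y, z)).
    Residues with probability >= p_min are kept, and any two kept
    residues closer than `cutoff` (same units as the coordinates)
    end up in the same patch (connected components).
    """
    keep = [name for name, (p, _) in residues.items() if p >= p_min]
    unseen = set(keep)
    patches = []
    for start in keep:
        if start not in unseen:
            continue
        unseen.remove(start)
        patch, stack = [], [start]
        while stack:
            cur = stack.pop()
            patch.append(cur)
            near = [r for r in unseen
                    if dist(residues[cur][1], residues[r][1]) <= cutoff]
            for r in near:
                unseen.remove(r)
                stack.append(r)
        patches.append(sorted(patch))
    return patches


# Invented example: three neighbouring residues, one isolated residue,
# and one low-probability residue that is filtered out.
example = {
    "A10": (0.90, (0.0, 0.0, 0.0)),
    "A11": (0.80, (5.0, 0.0, 0.0)),
    "A12": (0.70, (10.0, 0.0, 0.0)),
    "B40": (0.95, (50.0, 0.0, 0.0)),
    "B41": (0.20, (52.0, 0.0, 0.0)),
}
patches = group_patches(example)
```

Note that `A10` and `A12` are farther apart than the cutoff but still share a patch because `A11` links them; a stricter grouping rule would be an equally valid choice.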
18. Case study - Searching for the “membrane fraction” of a bacterial proteome
• Angler, a pipeline for the annotation of proteome biosequences
such as prokaryotic proteomes, is used to select candidates for
epitope selection.
• The annotation results (“Proteome Atlases”) are published in
annotated databases.
Angler classifies gram-negative proteome sequences into nine
different classes, including the all-beta membrane proteins and the
soluble secreted proteins, most relevant to vaccine development.
[Diagram: Proteome → Generate profiles → Predictions → Classify: 9 classes → Proteome Atlas]
Predictions:
✔ Signal peptides
✔ Beta barrels
✔ Alpha-helical TMP
✔ Fold recognition
✔ Coiled coils
✔ Disordered regions
19. Angler performance
EcoGene Knowledge Base (E. coli K12)
                 Accuracy   Coverage
All-alpha TMP    92.3%      92.6%
All-beta TMP     86.7%      75.0%
Soluble          96.9%      95.9%
All-alpha TMPs represent less than 20% of a proteome.
All-beta TMPs represent less than 5% of a proteome.
[Figure: outer membrane, β-barrel: porin (Rhodobacter capsulatus); inner membrane, α-helices in the bilayer: bacteriorhodopsin (Halobacterium salinarum)]
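The slide does not define its two metrics. Assuming "accuracy" means the fraction of predictions for a class that are correct, and "coverage" means the fraction of true class members that are recovered (both definitions are assumptions here), they can be computed per class as in this sketch. The example labels are invented and are not Angler results.

```python
def accuracy_and_coverage(true_labels, predicted_labels, cls):
    """Per-class accuracy and coverage under the assumed definitions.

    accuracy = correct predictions of `cls` / all predictions of `cls`
    coverage = correct predictions of `cls` / all true members of `cls`
    """
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(1 for t, p in pairs if t == cls and p == cls)
    predicted = sum(1 for _, p in pairs if p == cls)
    actual = sum(1 for t, _ in pairs if t == cls)
    accuracy = correct / predicted if predicted else 0.0
    coverage = correct / actual if actual else 0.0
    return accuracy, coverage


# Invented toy labels: one all-beta protein is missed and called soluble.
truth = ["beta", "beta", "beta", "beta", "alpha", "soluble", "soluble", "soluble"]
guess = ["beta", "beta", "beta", "soluble", "alpha", "soluble", "soluble", "soluble"]

beta_scores = accuracy_and_coverage(truth, guess, "beta")
soluble_scores = accuracy_and_coverage(truth, guess, "soluble")
```

Under these definitions a rare class such as the all-beta TMPs can show high accuracy and lower coverage at the same time, which matches the pattern in the table above.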
20. Case Study - siRNA Design
• Sirena, a siRNA design engine, used two very
large, consistent and independent data sets from
the literature: one for fitting, the other for testing.
• Based on machine learning: a neural network.
• Prediction is performed on a 19nt input window.
• Input is the sequence and the sequence composition for each
window in the mRNA sequence.
• Output is an estimated knock-down efficiency (Q).
• Around 70% of the reported candidates are expected to have
experimental knock-down efficiency greater than 80%.
• Successfully field-tested by TargetHerpes virologists.
TargetHerpes.org
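The input described on this slide (every 19-nt window of the mRNA, plus its sequence composition) can be enumerated as in the sketch below. Only the feature extraction is shown: Sirena's network and its exact input encoding are not given here, so the function name and the choice of per-nucleotide fractions as the "composition" are assumptions.

```python
WINDOW = 19  # window length stated on the slide


def windows_with_composition(mrna, size=WINDOW):
    """Yield (start, window, composition) for every `size`-nt window.

    The sequence is upper-cased and T is read as U, so DNA-style input
    is accepted. `composition` maps each nucleotide to its fraction
    within the window.
    """
    mrna = mrna.upper().replace("T", "U")
    for start in range(len(mrna) - size + 1):
        win = mrna[start:start + size]
        comp = {nt: win.count(nt) / size for nt in "ACGU"}
        yield start, win, comp


# Invented 21-nt sequence: it contains three overlapping 19-nt windows.
candidates = list(windows_with_composition("AUGGCUACGUAGCUAGCUAGC"))
```

Each tuple would then be encoded and passed to the predictor, which returns the estimated knock-down efficiency Q for that window.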
21. Case study - Discovering new
receptors of pharmaceutical interest
I have a compound library active on some receptor subclasses.
Q.: Which sequences in the human genome may be targeted by my library?
22. BioDec's approach
Train
● Find all known, classified receptors.
● Build an MSA using both sequence and 7TM-topology information.
● From the MSA, derive and partition a classification tree.
● Train class-specific HMMs.
Predict
● Scan the human genome using our “Septimus” tool to find putative new 7TM receptors.
● Classify using the class HMMs.
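The final "classify using the class HMMs" step amounts to scoring a sequence against every class model and keeping the best-scoring class. The sketch below shows that step only, assuming discrete HMMs with hand-set toy parameters over a two-letter alphabet (H = hydrophobic, P = polar). The models, class names and alphabet are invented for illustration; training and the Septimus scan are not shown.

```python
from math import log


def log_likelihood(seq, start, trans, emit):
    """Scaled forward algorithm: log P(seq | HMM) for a non-empty seq."""
    states = list(start)
    alpha = {s: start[s] * emit[s][seq[0]] for s in states}
    total = 0.0
    for i in range(len(seq)):
        if i > 0:
            alpha = {
                s: emit[s][seq[i]] * sum(alpha[r] * trans[r][s] for r in states)
                for s in states
            }
        norm = sum(alpha.values())  # scaling factor for this position
        total += log(norm)
        alpha = {s: a / norm for s, a in alpha.items()}
    return total


EMIT = {"in": {"H": 0.8, "P": 0.2}, "out": {"H": 0.2, "P": 0.8}}
START = {"in": 0.5, "out": 0.5}

# Hypothetical class models: class_1 favours long runs of one state,
# class_2 switches state freely.
CLASS_HMMS = {
    "class_1": dict(start=START, emit=EMIT,
                    trans={"in": {"in": 0.9, "out": 0.1},
                           "out": {"in": 0.1, "out": 0.9}}),
    "class_2": dict(start=START, emit=EMIT,
                    trans={"in": {"in": 0.5, "out": 0.5},
                           "out": {"in": 0.5, "out": 0.5}}),
}


def classify(seq):
    """Return (best class name, dict of per-class log-likelihoods)."""
    scores = {name: log_likelihood(seq, **hmm) for name, hmm in CLASS_HMMS.items()}
    return max(scores, key=scores.get), scores


blocky_class, _ = classify("HHHHHPPPPP")
mixed_class, mixed_scores = classify("HPHPHPHPHP")
```

Per-position scaling keeps the forward values from underflowing on long sequences, which matters for protein-length inputs.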
23. Success
● Consistent 7TM prediction for all the targets, useful for receptor modelling.
● Some newly classified targets have been successfully validated.
– New targets
– New lead molecules
24. Case study - blocking protein-protein interaction
● ZenDock: multiple analyses on each available structure, then a consensus.
● Predicted a new potential interaction site.
● Very good agreement between prediction and experiments.
● The peptide found is now a new drug lead.
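The "multiple analyses, then a consensus" step can be read as combining per-residue scores from the runs on each available structure. The sketch below assumes each run yields a probability per residue and that the consensus is a plain mean with a cutoff. Both the averaging rule and the 0.5 threshold are assumptions, and the residue names are invented.

```python
def consensus(runs, threshold=0.5):
    """Combine per-structure interactor probabilities.

    runs: list of dicts mapping residue name -> probability, one dict
    per analysed structure. A residue missing from a structure is
    ignored for that run. Returns the residues whose mean probability
    is at least `threshold`, sorted by name.
    """
    totals, counts = {}, {}
    for run in runs:
        for res, p in run.items():
            totals[res] = totals.get(res, 0.0) + p
            counts[res] = counts.get(res, 0) + 1
    return sorted(r for r in totals if totals[r] / counts[r] >= threshold)


# Invented example: two structures of the same protein.
runs = [
    {"K12": 0.9, "D45": 0.4},
    {"K12": 0.7, "D45": 0.5, "R80": 0.6},
]
site = consensus(runs)
```

A majority vote over thresholded calls would be another reasonable consensus rule; the mean is used here only because it keeps the probabilistic character of the per-residue answer.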
25. Thank you
BioDec S.r.l.
Via Calzavecchio 20/2
I-40033 Casalecchio di Reno (BO)
info@biodec.com
www.biodec.com