3. CATH database
o The CATH database is a free, publicly available online resource
that provides information on the evolutionary relationships of
protein domains.
o It was created in the mid-1990s by Professor Christine Orengo
and colleagues, and continues to be developed by the Orengo
group at University College London.
4. Protein domains are the basic units of proteins that can
fold, function, and evolve independently.
Knowledge of protein domains is critical for classifying
proteins, understanding their biological functions,
annotating their evolutionary mechanisms, and designing
proteins.
Domains are obtained from protein structures deposited
in the Protein Data Bank.
Both domain identification and subsequent
classification use manual as well as automated
procedures.
Protein Domains
5. The data in CATH are obtained from PDB files
deposited in the Protein Data Bank.
Only structures determined at a resolution of 4 Å
or better are included.
Furthermore, CATH requires domains to be at least
40 residues long, with 70% or more of their side
chains resolved.
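
The inclusion criteria above can be sketched as a simple filter. This is a minimal illustration, not CATH's actual pipeline: the `DomainCandidate` record, its field names, and the `passes_cath_criteria` function are hypothetical, and the side-chain fraction is assumed to have been computed beforehand.

```python
from dataclasses import dataclass

# Thresholds taken from the criteria stated on this slide.
MAX_RESOLUTION_ANGSTROM = 4.0
MIN_DOMAIN_LENGTH = 40
MIN_SIDE_CHAIN_FRACTION = 0.70


@dataclass
class DomainCandidate:
    """Hypothetical record for a candidate domain (names are illustrative)."""
    resolution: float           # resolution of the parent structure, in angstroms
    n_residues: int             # domain length in residues
    side_chain_fraction: float  # fraction of residues with side chains, 0.0 to 1.0


def passes_cath_criteria(domain: DomainCandidate) -> bool:
    """Return True if the candidate meets all three thresholds."""
    return (
        domain.resolution <= MAX_RESOLUTION_ANGSTROM
        and domain.n_residues >= MIN_DOMAIN_LENGTH
        and domain.side_chain_fraction >= MIN_SIDE_CHAIN_FRACTION
    )


if __name__ == "__main__":
    candidate = DomainCandidate(resolution=2.1, n_residues=120, side_chain_fraction=0.95)
    print(passes_cath_criteria(candidate))
```

A lower resolution value means a better-resolved structure, which is why the check uses "less than or equal to" for resolution and "greater than or equal to" for the other two thresholds.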
6. Submitted protein chains are chopped to obtain
the domains.
Classifications are then assigned to the resulting
domains.
7.
8.
9. Class: the highest level; places the selected protein
into one of four categories of secondary structure.
Architecture: a description of the gross
arrangement of secondary structure, independent
of topology.
Topology: an indication of the overall shape and
connectivity of the protein’s secondary structures.
Homologous superfamily: proteins of known
structure that are homologous (share a common
ancestor) to a selected protein.
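
CATH identifiers are written as four dot-separated numbers, one for each of the four levels (Class, Architecture, Topology, Homologous superfamily). The sketch below shows how such a code could be split into its levels. The `CathCode` type and `parse_cath_code` function are hypothetical helpers written for illustration, and the class names in the lookup table are assumed to cover only the four classes mentioned on this slide.

```python
from typing import NamedTuple

# The four secondary-structure classes at the top level of the hierarchy.
CLASS_NAMES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


class CathCode(NamedTuple):
    """Hypothetical container for the four levels of a CATH code."""
    class_: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a code such as '3.40.50.300' into its C, A, T and H numbers."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(part) for part in parts))


if __name__ == "__main__":
    parsed = parse_cath_code("3.40.50.300")
    print(parsed)
    print(CLASS_NAMES[parsed.class_])
```

Because each level is nested inside the one before it, two domains that share the first three numbers have the same topology, and sharing all four places them in the same homologous superfamily.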
10.
11. 308,999 structural protein domain entries
53,479,436 non-structural protein domain entries
2,737 homologous superfamily entries
92,882 functional family entries