CATH provides information on protein structures that are related by evolution. Protein structures (from the PDB) are chopped into structural domains and then grouped into evolutionary superfamilies.

The CATH database is a hierarchical domain classification of protein structures in the Protein Data Bank (PDB). Only high-quality structures, as determined by the SIFT criteria (e.g. crystal structures with a resolution of 4.0 angstroms or better), are considered. Domains are first identified within the PDB chains and then classified into the four major levels of the CATH hierarchy: Class, Architecture, Topology (fold family) and Homologous superfamily. Both steps use a combination of automated and manual procedures. Structures with very high similarity to structures already present in CATH are automatically chopped into domains and/or assigned to the correct superfamily; all other structures are processed manually, using evidence generated by automated scans.
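The four levels can be pictured as a path from Class down to Homologous superfamily. The sketch below is illustrative only: it assumes a superfamily is written as four dot-separated integers in C.A.T.H order, and the names `CathNode`, `parse_cath_code` and `lineage` are hypothetical helpers, not part of any CATH software. The code `3.40.50.300` is used purely as an example value.

```python
from typing import List, NamedTuple


class CathNode(NamedTuple):
    """One position in the four-level hierarchy (hypothetical helper type)."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_code(code: str) -> CathNode:
    """Split a dotted code such as '3.40.50.300' into its four levels.

    Assumes the dotted C.A.T.H notation; raises ValueError otherwise.
    """
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {code!r}")
    return CathNode(*(int(p) for p in parts))


def lineage(node: CathNode) -> List[str]:
    """Return the code of each level, from Class down to superfamily."""
    values = [str(v) for v in node]
    return [".".join(values[:depth]) for depth in range(1, 5)]


if __name__ == "__main__":
    node = parse_cath_code("3.40.50.300")
    print(node.class_, node.architecture, node.topology,
          node.homologous_superfamily)
    print(lineage(node))
```

Representing each level as a prefix of the full code makes it straightforward to group domains at any depth, for example collecting all superfamilies that share the same Topology prefix.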

Group Leader: Christine Orengo