Structural Classification of Proteins Database - Hierarchical Structure

Hierarchical Structure

The source of protein structures is the Protein Data Bank. The unit of classification of structure in SCOP is the protein domain. What the SCOP authors mean by "domain" is suggested by their statement that small proteins and most medium sized ones have just one domain, and by the observation that human hemoglobin, which has an α2β2 structure, is assigned two SCOP domains, one for the α and one for the β subunit.

The shapes of domains are called "folds" in SCOP. Domains belonging to the same fold have the same major secondary structures in the same arrangement with the same topological connections. 1195 folds are given in SCOP version 1.75. Short descriptions of each fold are given. For example, the "Globin-like" fold is described as core: 6 helices; folded leaf, partly opened. The fold to which a domain belongs is determined by inspection, rather than by software.

The levels of SCOP are as follows.

  1. Class: Types of folds, e.g., beta sheets.
  2. Fold: The different shapes of domains within a class.
  3. Superfamily: The domains in a fold are grouped into superfamilies, which have at least a distant common ancestor.
  4. Family: The domains in a superfamily are grouped into families, which have a more recent common ancestor.
  5. Protein domain: The domains in families are grouped into protein domains, which are essentially the same protein.
  6. Species: The domains in "protein domains" are grouped according to species.
  7. Domain: part of a protein. For simple proteins, it can be the entire protein.

The folds are grouped into "classes". The classes are the top level, or "root" of the SCOP hierarchical classification. The classes are displayed something like this:

Classes:
a. All alpha proteins (284)
domains consisting of α-helices
b. All beta proteins (174)
domains consisting of ß-sheets
c. Alpha and beta proteins (a/b) (147)
Mainly parallel beta sheets (beta-alpha-beta units)
d. Alpha and beta proteins (a+b) (376)
Mainly antiparallel beta sheets (segregated alpha and beta regions)
e. Multi-domain proteins (alpha and beta) (66)
Folds consisting of two or more domains belonging to different classes
f. membrane and cell surface proteins and peptides (58)
Does not include proteins in the immune system
g. Small proteins (90)
Usually dominated by metal ligand, heme, and/or disulfide bridges
h. coiled-coil proteins (7)
Not a true class
i. Low resolution protein structures (26)
Peptides and fragments. Not a true class
j. Peptides (121)
peptides and fragments. Not a true class.
k. Designed proteins (44)
Experimental structures of proteins with essentially non-natural sequences. Not a true class

The number in brackets, called a "sunid", is a SCOP unique integer identifier for each node in the SCOP hierarchy. The number in parentheses indicates how many elements are in each category. For example, there are 284 folds in the "All alpha proteins" class. Each member of the hierarchy is a link to the next level of the hierarchy.

The first few folds of the 284 folds in the "All-α proteins" class are displayed something like the following.

Folds:
1. Globin-like (2)
core: 6 helices; folded leaf, partly opened
2. Long alpha-hairpin (20)
2 helices; antiparallel hairpin, left-handed twist
3. Type I dockerin domain (1)
tandem repeat of two calcium-binding loop-helix motifs, distinct from the EF-hand

Each fold is followed by a description of that fold.

The domains within a fold are further classified into superfamilies, which, in turn, are classified into families. Within a fold, domains belonging to the same superfamily are assumed to have a common ancestor. However, this ancestor is presumed to be distant, because the different members of a superfamily have low sequence identities. The two superfamilies of the "Globin-like" fold are displayed something like the following:

Superfamilies:
  1. Globin-like (4)
  2. alpha-helical ferredoxin (2) contains two Fe4-S4 clusters

No description is given for the "Globin-like" superfamily, presumably because its description is very like that of its fold, which has the same name.

Families are more closely related than superfamilies. Domains within a fold are placed in the same family if

  1. they have at least a 30% similarity in sequences, or, failing that,
  2. if they have some similarity in sequences, e.g., 15%, and perform the same function.

The similarity in sequence and structure is evidence that these proteins have a closer evolutionary relationship than do proteins in the same superfamily. Sequence tools, such as BLAST, are used to assist in placing domains into superfamilies and families. The four families in the "Globin-like" superfamily of the "Globin-like" fold are displayed something like the following.

Families:
  1. Truncated hemoglobin (6) lack the first helix (A)
  2. Nerve tissue mini-hemoglobin (neural globin) (1) lack the first helix but otherwise is more similar to conventional globins than the truncated ones
  3. Globins (81) Heme-binding protein
  4. Phycocyanin-like phycobilisome proteins (26) oligomers of two different types of globin-like subunits containing two extra helices at the N-terminus binds a bilin chromophore

The families in SCOP may also be referred to using a SCOP concise classification string, sccs, which looks like, e.g., a.1.1.2 for the "Globin" family. The letter identifies the class to which the domain belongs; the following integers identify the fold, superfamily, and family, respectively.

Within a family are protein domains. Proteins are placed in the same protein domain if they are isoforms of each other, or if they are essentially the same protein, but from different species. This is apparently done manually. The "protein domains" are further subdivided into species. ("Protein domains" are not on separate pages in the current release of SCOP; in pre-SCOP, they are on separate pages.) Here is how some of the 81 protein domains of the "Globins" family are displayed.

Protein Domains:
7. Leghemoglobin
1. Yellow lupin (Lupinus luteus) (17)
2. Soybean (Glycine max), isoform A (2)
8. Non-symbiotic plant hemoglobin
1. Rice (Oryza sativa) (1)
9. Hemoglobin, alpha-chain
1. Human (Homo sapiens) (192)
2. Human (Homo sapiens), zeta isoform (1)
3. Horse (Equus caballus) (19)
4. Deer (Odocoileus virginianus) (1)

The "TaxId" is the taxonomy ID number; it is also a link to the NCBI taxonomy browser, which provides more information about the species to which the protein belongs.

Clicking on a species or isoform brings up a list of domains. Here is how some of the 192 domains of the "Hemoglobin, alpha-chain from Human (Homo sapiens)" protein are displayed.

PDB Entry Domains:
1. 2dn3
automatically matched to d1abwa1
complexed with cmo, hem
1. region a:2-141
2. 1ird
complexed with cmo, hem
1. chain a
3. 2dn1
automatically matched to d1abwa1
complexed with hem, mbn, oxy
1. region a:2-141

Clicking on the PDB numbers is supposed to display the structure of the molecule, but the links are currently broken. (The links do work in pre-SCOP.)

Read more about this topic:  Structural Classification Of Proteins Database

Other articles related to "hierarchical structure, hierarchical":

Horse Behavior - Horses As Herd Animals - Hierarchical Structure
... in large groups, establishment of a stable hierarchical system or "pecking order" is important to smooth group functioning ...
Functional Decomposition - Applications - Machine Learning
... When a natural or artificial system is intrinsically hierarchical, the joint distribution on system variables should provide evidence of this ... The task of an observer who seeks to understand the system is then to infer the hierarchical structure from observations of these variables ... This is the notion behind the hierarchical decomposition of a joint distribution, the attempt to recover something of the intrinsic hierarchical structure which generated that joint distribution ...
Aperiodic Tiling - Constructions - Aperiodic Hierarchical Tilings
... there is not a formal definition describing when a tiling has a hierarchical structure nonetheless, it is clear that substitution tilings have them, as do the tilings of Berger, Knuth, Läuchli and Robinson ... As with the term "aperiodic tiling" itself, the term "aperiodic hierarchical tiling" is a convenient shorthand, meaning something along the lines of "a set of tiles admitting only non-periodic tilings with a ... Each of these sets of tiles, in any tiling they admit, forces a particular hierarchical structure ...
Dynamic Antisymmetry
... The crux of this theory is that hierarchical structure in natural language maps universally onto a particular surface linearization, namely specifier-head-complement branching ... To understand what is meant by hierarchical structure, consider the sentence, The King of England likes apples ... by a pronoun, we say that it constitutes a hierarchical unit (called a constituent) ...

Famous quotes containing the words structure and/or hierarchical:

    I’m a Sunday School teacher, and I’ve always known that the structure of law is founded on the Christian ethic that you shall love the Lord your God and your neighbor as yourself—a very high and perfect standard. We all know the fallibility of man, and the contentions in society, as described by Reinhold Niebuhr and many others, don’t permit us to achieve perfection.
    Jimmy Carter (James Earl Carter, Jr.)

    Authority is the spiritual dimension of power because it depends upon faith in a system of meaning that decrees the necessity of the hierarchical order and so provides for the unity of imperative control.
    Shoshana Zuboff (b. 1951)