MSDmotif: exploring protein sites and motifs

Background: Protein structures have conserved features, known as motifs, which have a significant influence on protein function. These motifs can be found in sequence as well as in 3D space. Understanding these fragments is essential for 3D structure prediction, modelling and drug design. The Protein Data Bank (PDB) is the source of this information; however, present search tools offer limited options for integrating protein sequence with its 3D structure.

Results: We describe here a web application, and its underlying database, for querying the PDB for ligands, binding sites, and small 3D structural and sequence motifs. Novel algorithms are incorporated for searching chemical fragments, 3D motifs, ϕ/ψ sequences, super-secondary structure motifs and associations between small 3D structural motifs. The interface provides functionality for visualization, search criteria creation, and sequence and 3D multiple alignment. MSDmotif is an integrated system in which a results page is also a search form. A set of motif statistics is available for analysis. This set includes molecule and motif binding statistics, the distribution of motif sequences, the occurrence of each amino acid within a motif, the correlation of amino-acid side-chain charges within a motif, and Ramachandran plots for each residue. The binding statistics are presented in association with properties that include a ligand fragment library. Access is also provided through the Distributed Annotation System (DAS) protocol. An additional entry point accepts XML requests and returns XML responses.

Conclusion: MSDmotif is unique in combining chemical, sequence and 3D data in a single search engine with a range of search and visualisation options. It provides multiple views of the data found in the PDB archive for exploring protein structures.


Asx-motif
A motif of five consecutive residues and two H-bonds in which:
- residue(i) is Aspartate or Asparagine (Asx)
- the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3)
- the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4)
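The three conditions above can be sketched as a simple scan over a chain. This is a minimal illustration, not the search algorithm used by MSDmotif: the H-bond encoding `(acceptor_residue, acceptor_group, donor_residue)` and the function name `find_asx_motifs` are assumptions made for the example, and H-bonds are assumed to have been computed beforehand by some other tool.

```python
from typing import List, Set, Tuple

# Hypothetical H-bond encoding (an assumption, not an MSDmotif format):
# (acceptor_residue, acceptor_group, donor_residue), where acceptor_group is
# "SC" for a side-chain O or "MC" for the main-chain CO, and the donor is
# always a main-chain NH.
HBond = Tuple[int, str, int]


def find_asx_motifs(sequence: str, hbonds: Set[HBond]) -> List[int]:
    """Return the start indices i of five-residue windows matching the definition."""
    hits = []
    # A five-residue window starting at i needs residues i .. i+4 to exist.
    for i in range(len(sequence) - 4):
        if sequence[i] not in "DN":  # residue(i) must be Asp (D) or Asn (N)
            continue
        # Side-chain O of residue(i) bonded to main-chain NH of i+2 or i+3.
        side_chain = any((i, "SC", i + k) in hbonds for k in (2, 3))
        # Main-chain CO of residue(i) bonded to main-chain NH of i+3 or i+4.
        main_chain = any((i, "MC", i + k) in hbonds for k in (3, 4))
        if side_chain and main_chain:
            hits.append(i)
    return hits
```

Because the ST-motif has the same H-bond pattern with Serine or Threonine at residue(i), the same sketch would apply to it by testing for "ST" instead of "DN".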

Asx-turn
A motif of three consecutive residues and one H-bond in which:
- residue(i) is Aspartate or Asparagine (Asx)
- the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2)

Types I' and II': left-handed forms of Types I and II, respectively.

Beta-bulge
A motif of three residues within a β-sheet in which the main chains of two consecutive residues are H-bonded to that of the third, and in which the dihedral angles are as follows:

Beta-bulge loop
A motif of three residues within a β-sheet consisting of two H-bonds in which:
- the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+4) (Type 1) or residue(i+5) (Type 2)
- the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+3) (Type 1) or residue(i+4) (Type 2)

Beta-turn
A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles of the second and third residues, which are the basis for sub-categorization:

Nest
A motif of two consecutive residues with dihedral angles as follows (for the RL form):

Sub-categories
Type RL residue(i):
In LR nests the φ and ψ values for residues (i) and (i+1) are interchanged.
A nest must not contain Proline at either position.
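The relationship between the RL and LR forms, together with the proline exclusion, can be sketched as follows. The dihedral ranges themselves are not reproduced here, so the sketch takes them as caller-supplied parameters; the names `lr_from_rl`, `in_range` and `is_nest` are hypothetical, and reading "interchanged" as swapping the allowed (φ, ψ) ranges between the two residues is an assumption.

```python
from typing import Sequence, Tuple

Range = Tuple[float, float]          # (min, max) in degrees
ResidueRanges = Tuple[Range, Range]  # (phi range, psi range) for one residue
NestRanges = Tuple[ResidueRanges, ResidueRanges]  # residue(i), residue(i+1)


def lr_from_rl(rl: NestRanges) -> NestRanges:
    """Derive LR ranges by interchanging those of residue(i) and residue(i+1)."""
    first, second = rl
    return (second, first)


def in_range(value: float, bounds: Range) -> bool:
    low, high = bounds
    return low <= value <= high


def is_nest(
    residues: str,
    angles: Sequence[Tuple[float, float]],
    ranges: NestRanges,
) -> bool:
    """Check two consecutive residues against caller-supplied dihedral ranges.

    residues: one-letter codes of residue(i) and residue(i+1)
    angles:   (phi, psi) for each of the two residues, in degrees
    """
    if "P" in residues:  # proline is not allowed at either position
        return False
    return all(
        in_range(phi, phi_range) and in_range(psi, psi_range)
        for (phi, psi), (phi_range, psi_range) in zip(angles, ranges)
    )
```

Passing the ranges in, rather than hard-coding them, keeps the sketch independent of any particular set of threshold values.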

Schellmann loop
A motif of six consecutive residues (common type) or seven consecutive residues (wide type) that contains two H-bonds in which:
- the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+5) (common type) or residue(i+6) (wide type)
- the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+4) (common type) or residue(i+5) (wide type)
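As an illustration, the common and wide types can be told apart from the two H-bond offsets alone. The sketch below assumes main-chain H-bonds are available as `(co_residue, nh_residue)` index pairs; that encoding and the function name `schellmann_type` are hypothetical and not taken from MSDmotif.

```python
from typing import Optional, Set, Tuple

# Hypothetical encoding: (co_residue, nh_residue) means the main-chain CO of
# the first residue is H-bonded to the main-chain NH of the second.
MainChainHBond = Tuple[int, int]


def schellmann_type(i: int, hbonds: Set[MainChainHBond]) -> Optional[str]:
    """Classify the loop starting at residue i as 'common', 'wide' or None."""
    # Common type: CO(i) -> NH(i+5) and CO(i+1) -> NH(i+4), six residues.
    if (i, i + 5) in hbonds and (i + 1, i + 4) in hbonds:
        return "common"
    # Wide type: CO(i) -> NH(i+6) and CO(i+1) -> NH(i+5), seven residues.
    if (i, i + 6) in hbonds and (i + 1, i + 5) in hbonds:
        return "wide"
    return None
```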

ST-motif
A motif of five consecutive residues and two H-bonds in which:
- residue(i) is Serine (S) or Threonine (T)
- the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3)
- the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4)

ST-staple
A motif of four or five consecutive residues and one H-bond in which:
- residue(i) is Serine (S) or Threonine (T)
- the side-chain OH of residue(i) is H-bonded to the main-chain CO of residue(i-3) or (i-4)
- the φ angles of residues (i-1), (i-2) and (i-3) are negative
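Unlike the other S/T motifs, the ST-staple combines an H-bond test with a geometric one. A minimal sketch, assuming φ angles are supplied as a dictionary keyed by residue index and side-chain H-bonds as `(donor_residue, acceptor_residue)` pairs (both encodings, and the name `is_st_staple`, are hypothetical):

```python
from typing import Dict, Set, Tuple


def is_st_staple(
    i: int,
    sequence: str,
    phi: Dict[int, float],
    side_chain_hbonds: Set[Tuple[int, int]],
) -> bool:
    """Check whether residue i closes an ST-staple.

    side_chain_hbonds holds (donor_residue, acceptor_residue) pairs in which
    the side-chain OH of the donor is H-bonded to the main-chain CO of the
    acceptor. phi maps residue index to its phi angle in degrees; residues
    with an undefined phi are simply absent from the mapping.
    """
    if i < 3 or sequence[i] not in "ST":
        return False
    # Side-chain OH of residue(i) bonded to main-chain CO of i-3 or i-4.
    bonded = any(
        (i, i - k) in side_chain_hbonds for k in (3, 4) if i - k >= 0
    )
    # The phi angles of residues i-1, i-2 and i-3 must all be negative.
    negative_phi = all(
        (i - k) in phi and phi[i - k] < 0 for k in (1, 2, 3)
    )
    return bonded and negative_phi
```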

ST-turn
A motif of three consecutive residues and one H-bond in which:
- residue(i) is Serine (S) or Threonine (T)
- the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2)

Gamma-turn
A motif of three consecutive residues i, i+1, i+2 and one H-bond in which:
- the main-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2)
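Since the gamma-turn is defined by a single i to i+2 main-chain H-bond, locating candidates reduces to a one-pass scan. The sketch below assumes the same hypothetical `(co_residue, nh_residue)` pair encoding for precomputed main-chain H-bonds; `find_gamma_turns` is an illustrative name only.

```python
from typing import List, Set, Tuple


def find_gamma_turns(n_residues: int, hbonds: Set[Tuple[int, int]]) -> List[int]:
    """Return every start index i with a CO(i) -> NH(i+2) main-chain H-bond."""
    # hbonds holds (co_residue, nh_residue) pairs; this encoding is an assumption.
    return [i for i in range(n_residues - 2) if (i, i + 2) in hbonds]
```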