SuperMimic – Fitting peptide mimetics into protein structures
© Goede et al; licensee BioMed Central Ltd. 2006
Received: 06 June 2005
Accepted: 10 January 2006
Published: 10 January 2006
Various experimental techniques yield peptides that are biologically active but have unfavourable pharmacological properties. The design of structurally similar organic compounds, i.e. peptide mimetics, is a challenging field in medicinal chemistry.
SuperMimic identifies compounds that mimic parts of a protein, or positions in proteins that are suitable for inserting mimetics. The application provides libraries that contain peptidomimetic building blocks on the one hand and protein structures on the other. The search for promising peptidomimetic linkers for a given peptide is based on the superposition of the peptide with several conformers of the mimetic. New synthetic elements or proteins can be imported and used for searching.
We present a graphical user interface for finding peptide mimetics that can be inserted into a protein or for fitting small molecules into a protein. Using SuperMimic, promising locations in proteins for the insertion of mimetics can be found quickly and conveniently.
Many protein interactions are known, mostly involving other proteins, peptides or different organic molecules, and more and more are being deciphered. The main goal of drug design is to interfere specifically with these interactions. As peptides are often poor drug candidates, the need arises for bioequivalent compounds with better pharmacological properties. Starting from a known spatial structure, the aim is to find compounds that mimic the function of a peptide but have improved cellular transport properties, low toxicity, few side effects and more rigid structures as well as protease resistance [1, 2].
Various methods exist for developing peptide mimetics. These include computational as well as experimental screening methods. One method is to identify small peptides that are essential for the interactions of the protein, e.g. using SPOT synthesis. Subsequently, mimetics for these peptides are designed that can be used as drugs. On the basis of a known protein structure, scaffolding templates for binders can also be constructed and then optimised using different methods (see [3–5] for reviews).
The approach presented in this paper is to detect peptide mimetics directly using a known protein structure and a mimetic structure. Specific atomic positions are defined in both structures and then compared with respect to their spatial conformations. In this way, organic compounds that fit into the backbone of a protein can be identified. Conversely, it is possible to find protein positions where a specific mimetic could be inserted.
A practical application of SuperMimic could be the design of an artificial protein in which peptidomimetic building blocks replace parts of the backbone and that can subsequently be synthesized. Moreover, it is possible to find organic compounds or design artificial peptides that imitate the binding site and hence the functionality of a protein.
A library of peptidomimetic building blocks, collected from the literature and each represented by several conformations, is provided together with several protein structure libraries. Both kinds of library can be scanned exhaustively. The searches can also be performed with structures provided by the user.
Protein and mimetic libraries
Using the program SuperMimic, collections of short chains of PDB structures as well as peptide mimetics can be scanned. In order to guarantee rapid access to 3D data, all libraries are stored in binary form. In addition, the address of each protein chain within the binary file is stored and imported together with a list of the chains at the start of the program. Thus, samples of proteins from the library can be scanned at low computational cost.
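The idea of storing an address per chain can be illustrated with a minimal sketch. The record layout (packed little-endian float32 coordinate triples), the function names and the chain identifiers below are assumptions made for illustration only; the text does not describe SuperMimic's actual file format.

```python
import io
import struct

def write_library(stream, chains):
    """Write each chain as packed float32 xyz triples.

    Returns an index {chain_id: (byte offset, number of atoms)}.
    The layout is hypothetical, not SuperMimic's real format.
    """
    index = {}
    for chain_id, coords in chains.items():
        index[chain_id] = (stream.tell(), len(coords))
        for x, y, z in coords:
            stream.write(struct.pack("<3f", x, y, z))
    return index

def read_chain(stream, index, chain_id):
    """Jump straight to one chain using its stored offset."""
    offset, n_atoms = index[chain_id]
    stream.seek(offset)
    raw = stream.read(12 * n_atoms)  # 3 floats of 4 bytes per atom
    return [struct.unpack_from("<3f", raw, 12 * i) for i in range(n_atoms)]

# Made-up chains; an in-memory buffer stands in for the binary file.
library = io.BytesIO()
index = write_library(library, {
    "chainA": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "chainB": [(0.0, 1.0, 2.0)],
})
coords = read_chain(library, index, "chainB")
```

Because only the index has to be read at start-up, an arbitrary sample of chains can be fetched without parsing the rest of the file.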
Peptide mimetic structures are arranged in sub-libraries saved in separate files and automatically loaded after the program is started. This facilitates regular fast updates of the libraries by creating new files.
The 'goodness' of a pair of stem positions is then evaluated on the basis of these parameters by the formula
\[ \text{goodness} = \Delta x^{2} + \Delta y^{2} + 2\,(\Delta\beta^{2} + \Delta\gamma^{2}), \]
where e.g. $\Delta x^{2}$ denotes the squared deviation of the x values. The square root of the goodness is an upper estimate of the Root Mean Square Deviation (RMSD) of the stem atoms. A detailed description of the procedure can be found in .
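As a worked instance of the formula, the following minimal sketch evaluates the goodness for one set of deviations. The function name and the numerical values are made up, and the deviations are assumed to be already computed in the units used by the program.

```python
import math

def goodness(dx, dy, dbeta, dgamma):
    """Goodness of a pair of stem positions from four parameter deviations."""
    return dx ** 2 + dy ** 2 + 2.0 * (dbeta ** 2 + dgamma ** 2)

# Illustrative deviations only.
g = goodness(0.3, 0.4, 0.1, 0.2)
# According to the text, sqrt(goodness) is an upper estimate of the stem-atom RMSD.
rmsd_upper_estimate = math.sqrt(g)
```

Here the goodness is 0.09 + 0.16 + 2(0.01 + 0.04) = 0.35, so the upper estimate of the RMSD is about 0.59.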
In this way, a pre-selection of suitable candidates is obtained. This primary search is fast because evaluating the goodness is significantly less expensive than calculating the RMSD. Pairs of stem atoms yielding a goodness below a given limit are retained and their RMSD is calculated according to the algorithm described by Kabsch. These calculations can also be performed very rapidly, as the required spatial coordinates are held in main memory.
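The RMSD after optimal superposition can be sketched in pure Python. Note the substitution: instead of Kabsch's original formulation, this sketch uses the equivalent quaternion formulation (Horn's method), which yields the same minimal RMSD over proper rotations, and finds the largest eigenvalue with Jacobi rotations. All function names are hypothetical and nothing here is taken from the SuperMimic source.

```python
import math

def _centered(points):
    """Shift a point set so that its centroid lies at the origin."""
    n = len(points)
    c = [sum(p[i] for p in points) / n for i in range(3)]
    return [(p[0] - c[0], p[1] - c[1], p[2] - c[2]) for p in points]

def _largest_eigenvalue(sym, sweeps=30):
    """Largest eigenvalue of a small symmetric matrix (cyclic Jacobi)."""
    a = [row[:] for row in sym]
    n = len(a)
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                # Keep the rotation angle within [-pi/4, pi/4].
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return max(a[i][i] for i in range(n))

def superposed_rmsd(points_a, points_b):
    """Minimal RMSD between two equally sized, equally ordered point sets."""
    a, b = _centered(points_a), _centered(points_b)
    s = [[sum(pa[i] * pb[j] for pa, pb in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    n_matrix = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    lam = _largest_eigenvalue(n_matrix)
    norm = sum(x * x + y * y + z * z for x, y, z in a + b)
    return math.sqrt(max(0.0, norm - 2.0 * lam) / len(a))

# Four made-up stem atoms, and the same atoms rotated by 90 degrees about z
# and translated: the superposed RMSD should be numerically zero.
stem = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
moved = [(-y + 5.0, x - 1.0, z + 2.0) for x, y, z in stem]
rmsd_moved = superposed_rmsd(stem, moved)
```

With only four stem atoms per comparison, each evaluation touches a handful of numbers, which is consistent with the remark that these calculations are fast once the coordinates are in memory.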
The procedure described so far is carried out for each chosen protein or protein chain, and the hits are collected. Finally, they are reordered according to the RMSD of the stem atoms. Different goodness limits in the primary search are set depending on the kind of the search, so that the set of hits is restricted to a reasonable size.
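The overall flow described above (cheap pre-selection, expensive scoring of the survivors, final ordering by RMSD) can be sketched as follows. The function and field names are hypothetical, and the precomputed numbers stand in for the real goodness and RMSD calculations.

```python
def two_stage_search(candidates, goodness_of, rmsd_of, goodness_limit):
    """Keep candidates whose goodness is below the limit, then rank by RMSD."""
    hits = []
    for cand in candidates:
        if goodness_of(cand) < goodness_limit:      # cheap primary filter
            hits.append((rmsd_of(cand), cand))      # expensive step, survivors only
    hits.sort(key=lambda hit: hit[0])               # best superposition first
    return hits

# Made-up candidates with precomputed scores, for illustration only.
candidates = [
    {"name": "a", "goodness": 0.2, "rmsd": 0.4},
    {"name": "b", "goodness": 5.0, "rmsd": 0.1},
    {"name": "c", "goodness": 0.5, "rmsd": 0.3},
]
expensive_calls = []

def rmsd_of(cand):
    expensive_calls.append(cand["name"])  # record how often the costly step runs
    return cand["rmsd"]

hits = two_stage_search(candidates, lambda c: c["goodness"], rmsd_of, 1.0)
ranked_names = [cand["name"] for _, cand in hits]
```

Candidate "b" is rejected by the primary filter, so its RMSD is never computed; raising or lowering `goodness_limit` is what keeps the hit list at a manageable size.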
Results and Discussion
Peptide mimetics libraries
SuperMimic provides a library of 126 peptidomimetic structures. It contains 88 synthetic elements described in the literature, which have been arranged in sublibraries such as beta-turn or gamma-turn mimetics. Some of them are known to be drug-like compounds. Appropriate references can be found on the website. Moreover, the library contains a collection of 18 peptides, each comprising a sequence of one D-amino acid flanked by two L-amino acids, which can be used as beta- or gamma-turn mimetics, and 20 peptidomimetic ligands extracted from PDB structures. In order to account for the flexibility of the peptide mimetics, each structure contained in the library is represented by 5–13 low-energy conformers. These were generated by the Accelrys software MedChem Explorer, using the algorithm of Smellie et al.
Peptide mimetics are inserted into proteins by chemical synthesis. Such syntheses are mostly practised with small proteins, so it is useful to restrict a search to small protein chains. To allow candidates for synthesis to be identified easily and rapidly, the program is linked to a library of such proteins. This library contains 10403 chains of PDB structures, each up to 100 amino acids long. Alternatively, this large library can be replaced by a set of 2206 chains with less than 90% sequence identity, represented by the structures with the best resolution, or by a set of 416 chains with less than 30% sequence identity. All protein chain sets were generated using the Columba database.
SuperMimic permits two general searching approaches. Firstly, it is possible to conduct a fast scan for small molecules that mimic the structure of a given peptide or can be inserted into a given protein or peptide. Secondly, starting with a peptidomimetic structure, positions in proteins suitable for its insertion can be screened. There are several options for the screening process.
A protein structure can be imported by the user, either from the libraries of small proteins provided or by loading a PDB file. A search for peptide mimetics that fit into the backbone of the chosen protein can then be initiated. This results in a list of peptide mimetics, the position within the protein where the mimetic could be inserted, and the conformation of the mimetic that fits best.
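Reading backbone coordinates from a PDB file can be sketched as below, using the fixed column positions of ATOM records. Restricting the output to C-alpha atoms is an assumption made for brevity; the text does not state which backbone atoms SuperMimic uses. The helper names and the sample coordinates are made up.

```python
def parse_ca_atoms(pdb_text, chain_id):
    """Return (residue number, x, y, z) for each C-alpha ATOM record of a chain."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM  "):
            continue
        if line[12:16].strip() != "CA" or line[21:22] != chain_id:
            continue
        atoms.append((int(line[22:26]), float(line[30:38]),
                      float(line[38:46]), float(line[46:54])))
    return atoms

def atom_line(serial, name, res, chain, seq, x, y, z):
    """Format one fixed-column ATOM record (only used to build the sample)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

# Illustrative records, not taken from a real PDB entry.
sample = "\n".join([
    atom_line(1, " N  ", "ALA", "A", 1, 11.104, 6.134, -6.504),
    atom_line(2, " CA ", "ALA", "A", 1, 11.639, 6.071, -5.147),
    atom_line(3, " CA ", "GLY", "A", 2, 12.500, 7.250, -4.000),
    atom_line(4, " CA ", "SER", "B", 1, 0.000, 0.000, 0.000),
])
ca_atoms = parse_ca_atoms(sample, "A")
```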
Instead of scanning the whole protein structure the search can be limited to a special part of the protein, e.g. an exposed loop.
The stem positions within the protein can be fixed. In this case the position is not limited to the backbone. Arbitrary atoms can be chosen as stem atoms, including those in the protein side chains. This option can be used if the position within the protein where a mimetic structure should be fitted is known exactly.
All the searches described above can be performed within the whole mimetics library, or alternatively limited to a sublibrary of mimetics, e.g. beta-turn mimetics, or even to an individual molecule.
The structure of a mimetic can be imported by the user, either from the libraries of peptide mimetics provided or by loading the structure of a small molecule in MDL mol or sd file format. A search for proteins where the mimetic fits into the backbone can then be initiated. This results in a list of proteins, including the position within each protein where the mimetic could be inserted, and the conformation of the mimetic that fits best.
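For user-supplied small molecules, the atom block of an MDL mol file (V2000 layout) can be read with a few lines. This is a minimal sketch that ignores bonds, charges and properties; the function name and the sample molecule are illustrative and not part of SuperMimic.

```python
def parse_molfile_atoms(mol_text):
    """Return (element symbol, x, y, z) for every atom in a V2000 mol block."""
    lines = mol_text.splitlines()
    n_atoms = int(lines[3][0:3])          # counts line: atom count in columns 1-3
    atoms = []
    for line in lines[4:4 + n_atoms]:     # atom block follows the counts line
        atoms.append((line[31:34].strip(), float(line[0:10]),
                      float(line[10:20]), float(line[20:30])))
    return atoms

def mol_atom_line(x, y, z, symbol):
    """Format one atom line (only used to build the sample)."""
    return f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3s} 0  0  0  0  0  0  0  0  0  0  0  0"

# A made-up three-atom molecule.
sample_mol = "\n".join([
    "sketch molecule",
    "  hypothetical",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    mol_atom_line(0.0, 0.0, 0.0, "O"),
    mol_atom_line(0.957, 0.0, 0.0, "H"),
    mol_atom_line(-0.24, 0.927, 0.0, "H"),
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
])
mol_atoms = parse_molfile_atoms(sample_mol)
```

An sd file concatenates such records, each terminated by a `$$$$` line, so several conformers of one molecule could be read by splitting on that delimiter first.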
Instead of the whole library of small protein chains the search can be limited to a sample of proteins from the library, or to an individual protein.
All-to-all comparisons are also possible, but owing to the large number of hits they can be limited by the memory capacity of the computer. Should this situation arise, such comparisons may be restricted to samples from the protein library on one side, or to sub-families of peptide mimetics on the other.
Stem atoms have been predefined for all the libraries provided; for their own structures, users should specify them interactively. Supplying several conformers will yield better results as the search space is enlarged.
All possible combinations of protein and mimetic stem atoms are scanned and candidates fulfilling certain geometrical criteria are sorted according to the Root Mean Square Deviation (RMSD) of the stem atoms. They can be inspected visually in a graphical display. Possible clashes between atoms of the mimetic and the protein are indicated. The superposed proteins and mimetics can be exported as complexes in PDB file format; alternatively, the mimetics can be saved as MDL mol files with their atoms in the protein's coordinate system.
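How clashes might be flagged after superposition can be sketched with a simple distance test. The text does not say how SuperMimic detects clashes, so both the criterion (any mimetic atom closer than a cutoff to a protein atom) and the cutoff value of 2.0 Å are assumptions chosen for illustration.

```python
import math

def find_clashes(mimetic_atoms, protein_atoms, cutoff=2.0):
    """Return index pairs (mimetic atom, protein atom) closer than the cutoff.

    The cutoff is a hypothetical value, not SuperMimic's criterion.
    """
    return [
        (i, j)
        for i, m in enumerate(mimetic_atoms)
        for j, p in enumerate(protein_atoms)
        if math.dist(m, p) < cutoff
    ]

# Made-up coordinates, already in the protein's coordinate system.
mimetic = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
protein = [(0.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
clashes = find_clashes(mimetic, protein)
```

In practice the protein atoms belonging to the replaced fragment would have to be excluded before such a test, since the mimetic is meant to occupy their place.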
Two versions of the program can be downloaded from the SuperMimic website. With the standard version, fragments of 2–6 amino acid residues can be replaced with peptide mimetics. The extended version handles peptides up to twelve residues long. By bridging larger sequences, the search space is enlarged at the expense of computing time.
Furthermore, all the protein and peptide mimetics libraries are available on the website. Different mimetics sublibraries can be included or excluded by retaining or omitting the respective files. Library files only have to be saved in the same directory as the executable file. They are loaded automatically at subsequent program starts.
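The file-based inclusion and exclusion of sublibraries can be pictured with a small sketch. The `.lib` extension, the file names and the function name are hypothetical; the text only states that library files placed next to the executable are loaded automatically.

```python
import tempfile
from pathlib import Path

def discover_sublibraries(directory, pattern="*.lib"):
    """List the sublibraries present in a directory (hypothetical extension)."""
    return sorted(path.stem for path in Path(directory).glob(pattern))

# A temporary directory stands in for the program directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("beta_turn.lib", "gamma_turn.lib", "notes.txt"):
        (Path(tmp) / name).write_bytes(b"")
    found_all = discover_sublibraries(tmp)
    # Omitting a file excludes that sublibrary at the next start.
    (Path(tmp) / "gamma_turn.lib").unlink()
    found_after_removal = discover_sublibraries(tmp)
```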
In addition, descriptions of the peptide mimetics can be found on the website, including structures, names, classifications described in the literature and references. For support, help pages and several demonstrations explaining how to use the program are provided.
A typical search for the insertion positions of one peptide mimetic structure in the large protein library comprises a comparison of roughly 10000 protein chains, each less than 100 amino acids long, with an average of ten conformers of the mimetic. With the standard version of SuperMimic, peptides of 2–6 amino acids can be bridged, resulting in nearly 500 possible stem positions in one protein chain. Thus, 50 million geometrical comparisons are necessary. Owing to the effective ways of storing the data and pre-selecting the fitting stem positions used in SuperMimic, such a search only takes about three minutes on a low-end desktop PC (Athlon 1400).
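The estimate of 50 million comparisons can be retraced with a short calculation. The window count assumes that a bridged fragment of k residues needs one flanking residue on each side to carry the stem atoms, giving n − k − 1 windows per fragment length in a chain of n residues; this counting rule is an assumption, not taken from the text.

```python
def stem_windows(chain_length, min_gap=2, max_gap=6):
    """Count fragment positions that can be bridged in one chain.

    Assumes one flanking residue on each side of the replaced fragment.
    """
    return sum(max(0, chain_length - gap - 1)
               for gap in range(min_gap, max_gap + 1))

# Upper end of the library: a chain of 100 residues, fragments of 2-6 residues.
windows_per_chain = stem_windows(100)        # 97 + 96 + 95 + 94 + 93

# Round numbers quoted in the text: chains x conformers x stem positions.
paper_estimate = 10_000 * 10 * 500

# Same estimate with the counted windows instead of the rounded 500.
counted_estimate = 10_000 * 10 * windows_per_chain
```

Under this assumption a 100-residue chain offers 475 windows, in line with the "nearly 500" quoted above; since most chains in the library are shorter, the figure of 50 million is best read as an upper bound.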
Limiting the similarity search to four stem positions allows large sets of structures to be screened in a short time. This is possible because the positions of these four atoms can be described and compared easily using only six parameters, two of which are fixed bond lengths.
SuperMimic is a tool for finding potential non-peptidic building blocks that can replace or mimic parts of a protein, and conversely for identifying locations within a protein where such building blocks can be inserted. It allows rapid, convenient searches within the protein and peptide mimetic libraries provided, as well as using imported structures.
Availability and requirements
Project name: SuperMimic
Project home page: http://bioinformatics.charite.de/supermimic/
Operating system(s): Windows
Programming language: Delphi
Other requirements: none
Any restrictions to use by non-academics: none
We thank Kristian Rother for providing the list of PDB chains and help with Columba, and Stephan Lorenzen for critically reading the manuscript. The work was supported by the BMBF-funded Berlin Center for Genome Based Bioinformatics (BCB).
- Preissner R, Goede A, Rother K, Osterkamp F, Koert U, Froemmel C: Matching organic libraries with protein-substructures. J Comput Aid Mol Des 2001, 15: 811–817. doi:10.1023/A:1013158818807
- Smith AB 3rd, Cantin LD, Pasternak A, Guise-Zawacki L, Yao W, Charnley AK, Barbosa J, Sprengeler PA, Hirschmann R, Munshi S, Olsen DB, Schleif WA, Kuo LC: Design, Synthesis, and Biological Evaluation of Monopyrrolinone-Based HIV-1 Protease Inhibitors. J Med Chem 2003, 46: 1831–44. doi:10.1021/jm0204587
- Zutshi R, Brickner M, Chmielewski J: Inhibiting the assembly of protein-protein interfaces. Curr Opin Chem Biol 1998, 2: 62–6. doi:10.1016/S1367-5931(98)80036-7
- Cochran AG: Antagonists of protein-protein interactions. Chem Biol 2000, 7: R85–94. doi:10.1016/S1074-5521(00)00106-X
- Toogood PL: Inhibition of protein-protein association by small molecules: approaches and progress. J Med Chem 2002, 45: 1543–58. doi:10.1021/jm010468s
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28: 235–242. doi:10.1093/nar/28.1.235
- Michalsky E, Goede A, Preissner R: Loops In Proteins (LIP) – a comprehensive loop database for homology modelling. Protein Eng 2003, 16: 979–985. doi:10.1093/protein/gzg119
- Kabsch W: A solution for the best rotation to relate two sets of vectors. Acta Crystallogr 1976, A32: 922–923. doi:10.1107/S0567739476001873
- Smellie A, Stanton R, Henne R, Teig S: Conformational analysis by intersection: CONAN. J Comput Chem 2003, 24: 10–20. doi:10.1002/jcc.10175
- Trissl S, Rother K, Mueller H, Steinke T, Koch I, Preissner R, Froemmel C, Leser U: Columba: an integrated database of proteins, structures, and annotations. BMC Bioinformatics 2005, 6: 81. doi:10.1186/1471-2105-6-81
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.