PyMod: sequence similarity searches, multiple sequence-structure alignments, and homology modeling within PyMOL
© Bramucci et al.; licensee BioMed Central Ltd. 2012
Published: 28 March 2012
In recent years, an exponentially growing number of tools for protein sequence analysis, editing and modeling tasks have been made available to the scientific community. Although the vast majority of these tools have been released as open source software, their steep learning curves often discourage even the most experienced users.
A simple and intuitive interface, PyMod, between the popular molecular graphics system PyMOL and several other tools (i.e., [PSI-]BLAST, ClustalW, MUSCLE, CEalign and MODELLER) has been developed to show how integrating the individual steps required for homology modeling and sequence/structure analysis within the PyMOL framework can greatly simplify these tasks. Sequence similarity searches, the generation and editing of multiple sequence and structural alignments, and even the possibility of merging sequence and structure alignments have been implemented in PyMod, with the aim of creating a simple yet powerful tool for sequence and structure analysis and for building homology models.
PyMod represents a new tool for the analysis and manipulation of protein sequences and structures. Its key features are ease of use and integration with many sequence retrieval and alignment tools and with PyMOL, one of the most widely used molecular visualization systems.
Source code, installation instructions, video tutorials and a user's guide are freely available at the URL http://schubert.bio.uniroma1.it/pymod/index.html
Once confined to experts in bioinformatics, protein sequence retrieval, alignment and modeling tasks are now routinely approached by an increasing number of researchers, who can also take advantage of the growing number of structures deposited every day in public databases. Integrating protein sequence and structure information has therefore become imperative, especially in the field of protein structure prediction from sequence by means of homology modeling (HM) methodologies.
In recent years, a number of valuable tools related to protein sequence analysis and modeling (e.g., DeepView, MolIDE and Chimera) have been developed. These tools are in many cases easily accessible, and they have greatly simplified some of the problems most frequently encountered in sequence/structure analysis tasks (e.g., the lack of graphical user interfaces [GUIs], the need to use many programs in an integrated way, and input and output file format manipulation problems). Nevertheless, the initial difficulties and steep learning curves often encountered when mastering new software sometimes discourage first-time as well as more experienced users. On the other hand, public servers (e.g., Phyre, CPHmodels), which are able to automate some or all of the main modeling tasks, often do not offer users the ability to apply knowledge-based intervention during the analysis (e.g., sequence selection, manual refinement of multiple alignments and choice of parameters during model construction).
PyMod integrated tools
PyMod offers rich functionality, built around its core sequence alignment, clustering and editing window. These features are outlined in the following sub-sections.
PyMod provides a graphical interface for (PSI-)BLAST searches of large databases, either locally or remotely, which can also be used as a standalone tool inside the PyMOL framework.
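To make the search step more concrete, the following is a minimal sketch of how hits from a local BLAST search might be filtered before being passed on to alignment. It is not PyMod's own code. It assumes the search was run with BLAST+ tabular output (`-outfmt 6`, whose default columns are query id, subject id, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value and bit score). The function name `parse_tabular_hits`, the E-value threshold and the subject identifiers in the example are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class BlastHit(NamedTuple):
    subject_id: str
    percent_identity: float
    alignment_length: int
    evalue: float
    bit_score: float


def parse_tabular_hits(text: str, max_evalue: float = 1e-3) -> List[BlastHit]:
    """Parse BLAST+ tabular output (-outfmt 6) and keep hits at or below max_evalue.

    The threshold is an illustrative assumption, not a value taken from PyMod.
    """
    hits = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines (comments appear with -outfmt 7).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete 12-column record
        hit = BlastHit(
            subject_id=fields[1],
            percent_identity=float(fields[2]),
            alignment_length=int(fields[3]),
            evalue=float(fields[10]),
            bit_score=float(fields[11]),
        )
        if hit.evalue <= max_evalue:
            hits.append(hit)
    # Most significant hits first.
    return sorted(hits, key=lambda h: h.evalue)


# Hypothetical records; the identifiers and numbers are made up for illustration.
EXAMPLE_OUTPUT = "\n".join([
    "query\thypo_B\t31.00\t180\t120\t4\t10\t189\t3\t182\t2e-10\t60.5",
    "query\thypo_A\t62.50\t200\t75\t0\t1\t200\t5\t204\t1e-80\t250",
    "query\thypo_C\t25.00\t40\t30\t0\t50\t89\t7\t46\t5.2\t18.1",
])

selected = parse_tabular_hits(EXAMPLE_OUTPUT)
for hit in selected:
    print(hit.subject_id, hit.percent_identity, hit.evalue)
```

A graphical front end such as the one described above would typically present a list like `selected` to the user, who then chooses which sequences to import; that knowledge-based selection step is exactly what fully automated servers tend to omit.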
Alignment of sequences and structures
PyMod represents a new tool for the analysis and manipulation of protein sequences and structures. Its key features are ease of use and integration with many sequence retrieval and alignment tools and with PyMOL, one of the most widely used molecular visualization systems. We plan to release future updates of PyMod, including additional tools for secondary structure prediction, sequence retrieval and alignment, as well as other tools suggested by the user community. Finally, a tighter integration between PyMOL, MODELLER and PyMod will be a main focus of future development.
Availability and requirements
Project name: PyMod
Project home page: http://schubert.bio.uniroma1.it/pymod
Operating system(s): Windows (XP, Vista, Seven). Linux (Ubuntu) and Mac OS (10.6) will be supported in the next release.
Programming language: Python
License: Lesser General Public License (LGPL)
Other requirements: PyMOL version 1.1.1 or newer, BioPython version 1.50 or newer, standalone BLAST 2.2.25+ or newer, MUSCLE, ClustalW and MODELLER.
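Before installation, it can be useful to check which of these dependencies are already present. The snippet below is a minimal sketch of such a check and is not part of PyMod. The executable names (`blastp`, `psiblast`, `muscle`, `clustalw2`) are assumptions that vary between platforms and package versions; the module names `pymol`, `Bio` and `modeller` are the usual import names for PyMOL, BioPython and MODELLER. The sketch uses modern Python 3 standard-library calls and does not verify version numbers.

```python
import importlib.util
import shutil
from typing import Dict, Iterable, List


def find_missing(executables: Iterable[str],
                 modules: Iterable[str]) -> Dict[str, List[str]]:
    """Report which executables are not on PATH and which modules cannot be imported.

    Only top-level module names should be passed (e.g. "Bio", not "Bio.Blast").
    """
    missing_executables = [name for name in executables
                           if shutil.which(name) is None]
    missing_modules = [name for name in modules
                       if importlib.util.find_spec(name) is None]
    return {"executables": missing_executables, "modules": missing_modules}


# Assumed names; adjust them to the local installation.
ASSUMED_EXECUTABLES = ["blastp", "psiblast", "muscle", "clustalw2"]
ASSUMED_MODULES = ["pymol", "Bio", "modeller"]

report = find_missing(ASSUMED_EXECUTABLES, ASSUMED_MODULES)
print("Missing executables:", report["executables"])
print("Missing Python modules:", report["modules"])
```

An empty list in both entries only indicates that the tools were found, not that the installed versions satisfy the minimum versions listed above.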
This work was partially supported by funds of the Italian "Ministero dell'Istruzione, dell'Università e della Ricerca" and by the "Consorzio Interuniversitario per le Applicazioni di Supercalcolo per Università e Ricerca" (CASPUR, Roma, Italy) [std11-459]. This work will be submitted by EB in partial fulfillment of the requirements of the degree of "Dottorato di Ricerca in Biochimica" at Sapienza, Università di Roma.
This article has been published as part of BMC Bioinformatics Volume 13 Supplement 4, 2012: Italian Society of Bioinformatics (BITS): Annual Meeting 2011. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/13/S4.
- Arnold K, Bordoli L, Kopp J, Schwede T: The SWISS-MODEL Workspace: a web-based environment for protein structure homology modelling. Bioinformatics 2006, 22: 195–201. 10.1093/bioinformatics/bti770
- Canutescu AA, Dunbrack RL Jr: MolIDE: a homology modeling framework you can click with. Bioinformatics 2005, 21: 2914–2916. 10.1093/bioinformatics/bti438
- Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE: UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem 2004, 25: 1605–1612. 10.1002/jcc.20084
- Kelley LA, Sternberg MJE: Protein structure prediction on the web: a case study using the Phyre server. Nature Protocols 2009, 4: 363–371.
- Nielsen M, Lundegaard C, Lund O, Petersen TN: CPHmodels-3.0 - Remote homology modeling using structure guided sequence profiles. Nucleic Acids Research 2010, 38: W576-W581. 10.1093/nar/gkq535
- DeLano WL: The PyMOL Molecular Graphics System. San Carlos, CA: DeLano Scientific; 2002.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215: 403–410.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25: 3389–3402. 10.1093/nar/25.17.3389
- Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32: 1792–1797. 10.1093/nar/gkh340
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994, 22: 4673–4680. 10.1093/nar/22.22.4673
- Shindyalov IN, Bourne PE: Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng 1998, 9: 739–747.
- Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A: Comparative protein structure modeling using MODELLER. Curr Protoc Bioinformatics 2006, Chapter 5: Unit 5.6.
- Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 1970, 48: 443–453. 10.1016/0022-2836(70)90057-4
- Smith TF, Waterman MS: Identification of common molecular subsequences. Journal of Molecular Biology 1981, 147: 195–197. 10.1016/0022-2836(81)90087-5
- Remmert M, Linke D, Lupas AN, Soding J: HHomp--prediction and classification of outer membrane proteins. Nucleic Acids Res 2009, 37: W446-W451. 10.1093/nar/gkp325
- Yona G, Levitt M: Within the Twilight Zone: a sensitive profile-profile comparison tool based on information theory. J Mol Biol 2002, 315: 1257–1275. 10.1006/jmbi.2001.5293
- Notredame C, Higgins D, Heringa J: T-Coffee: A novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology 2000, 302: 205–217. 10.1006/jmbi.2000.4042
- O'Sullivan O, Suhre K, Abergel C, Higgins DG, Notredame C: 3DCoffee: combining protein sequences and structures within multiple sequence alignments. J Mol Biol 2004, 340: 385–395. 10.1016/j.jmb.2004.04.058
- Söding J, Biegert A, Lupas AN: The HHpred interactive server for protein homology detection and structure prediction. Nucleic Acids Res 2005, 33(Web Server issue):W244-W248.
- Wallner B, Elofsson A: All are not equal: a benchmark of different homology modeling programs. Protein Sci 2005, 14: 1315–27. 10.1110/ps.041253405
- Kuntal BK, Aparoy P, Reddanna P: EasyModeller: a graphical interface to MODELLER. BMC Res Notes 2010, 3: 226–330. 10.1186/1756-0500-3-226
- Mathur A, Shankaracharya, Vidyarthi AS: SWIFT MODELLER: A JAVA based GUI for molecular modeling. J Mol Model 2011, 17: 2601–2607. 10.1007/s00894-011-0960-4
- Shen M-y, Sali A: Statistical potential for assessment and prediction of protein structures. Protein Science 2006, 15: 2507–2524. 10.1110/ps.062416606
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.