UbiProt: a database of ubiquitylated proteins
© Chernorudskiy et al; licensee BioMed Central Ltd. 2007
Received: 23 October 2006
Accepted: 18 April 2007
Published: 18 April 2007
Post-translational protein modification with ubiquitin, or ubiquitylation, is one of the hottest topics in modern biology because of its dramatic impact on diverse metabolic pathways and its involvement in the pathogenesis of severe human diseases. A great number of eukaryotic proteins have been found to be ubiquitylated. However, data about particular ubiquitylated proteins are scattered across the literature.
To fill a general need for collecting and systematizing experimental data concerning ubiquitylation, we have developed a new resource, the UbiProt Database, a knowledgebase of ubiquitylated proteins. The database contains retrievable information about the overall characteristics of a particular protein, its ubiquitylation features, the related ubiquitylation and de-ubiquitylation machinery, and literature references reflecting experimental evidence of ubiquitylation. UbiProt is freely available at http://ubiprot.org.ru.
The UbiProt Database is a public resource offering comprehensive information on ubiquitylated proteins. The resource can serve as a general reference source both for researchers in the ubiquitin field and for those who work with particular ubiquitylated proteins of interest. Further development of the UbiProt Database is expected to be of common interest for research groups involved in studies of the ubiquitin system.
Ubiquitin is a small protein (76 amino acids) that can be linked to other intracellular proteins via a covalent isopeptide bond. This posttranslational modification of target proteins, named "ubiquitylation", gives rise to the various biological effects of ubiquitin. Ubiquitylation is implicated in protein degradation via an intracellular ATP-dependent proteolytic system. Additionally, it participates in several non-proteolytic events. The ubiquitin system is known to mediate a host of essential cellular processes, including cell cycle control, the inflammatory response, carcinogenesis and many others; ubiquitylation is therefore of great importance for vital cell functions.
The diversity of known biological effects of ubiquitylation is realized through modification of a huge variety of target proteins, followed by alteration of their functions. The rapidly growing body of experimental evidence on protein ubiquitylation creates a general need to collect and systematize such data. We address this problem with a new resource, the UbiProt Database, a knowledgebase of ubiquitylated proteins. This project aims to summarize a significant volume of data concerning ubiquitylation and to provide essential information on target proteins.
Construction and content
UbiProt is developed and deployed with open source software. The database management system is MySQL 4.0. The software is written in PHP 4.0.3 with additional modules such as SMARTY and PEAR. The Web interface uses the PHP+SMARTY template framework.
All included data were obtained experimentally by various research groups and can be verified using the respective references. The main content was obtained from several large-scale proteomic studies [4–7]. The remaining target proteins were taken from original research articles containing direct evidence for ubiquitylation of particular proteins. All database entries were annotated manually.
The following information block describes the respective ubiquitylation and deubiquitylation machinery, where these components are known: the cognate ubiquitin-conjugating components E2 and E3, deubiquitylating enzyme(s) (DUB), and non-enzymatic adaptor proteins (E4/AP).
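As an illustration of how a target protein, its machinery components and its literature references could be related in a relational store, the following is a minimal sketch only. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and every table name, column name and record in it is a hypothetical placeholder; it does not reproduce the actual UbiProt schema or content.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not the real UbiProt MySQL schema.
SCHEMA = """
CREATE TABLE protein (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    organism TEXT NOT NULL
);
CREATE TABLE machinery (
    protein_id INTEGER NOT NULL REFERENCES protein(id),
    role       TEXT NOT NULL CHECK (role IN ('E2', 'E3', 'DUB', 'E4/AP')),
    component  TEXT NOT NULL
);
CREATE TABLE literature_ref (
    protein_id INTEGER NOT NULL REFERENCES protein(id),
    citation   TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one placeholder entry."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Placeholder records, not real database content.
    conn.execute("INSERT INTO protein VALUES (1, 'substrate_X', 'organism_Y')")
    conn.executemany(
        "INSERT INTO machinery VALUES (?, ?, ?)",
        [(1, "E2", "e2_placeholder"), (1, "E3", "e3_placeholder")],
    )
    conn.execute("INSERT INTO literature_ref VALUES (1, 'placeholder citation')")
    conn.commit()
    return conn


def machinery_for(conn, protein_name):
    """Return the known machinery components of a protein, grouped by role."""
    rows = conn.execute(
        "SELECT m.role, m.component FROM machinery m "
        "JOIN protein p ON p.id = m.protein_id "
        "WHERE p.name = ? ORDER BY m.role",
        (protein_name,),
    ).fetchall()
    known = {}
    for role, component in rows:
        known.setdefault(role, []).append(component)
    return known
```

Roles for which no component is known (here DUB and E4/AP) are simply absent from the result, mirroring the "if these components are known" condition of the information block.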
Utility and discussion
The UbiProt Database is intended to be useful for several purposes, chiefly for the identification of proteins posttranslationally modified by ubiquitin in high-throughput studies. Computational biology (e.g. comparative and evolutionary analysis of protein ubiquitylation) is another promising field of application for the database. The collected dataset will provide insight into the ubiquitin-dependent mechanisms controlling essential cellular events and, in the future, may help to develop new compounds for treating ubiquitin-associated human diseases.
At present, information about 400 individual proteins from different organisms has been collected, and this work is still in progress. Our group continuously analyses a broad array of literature data (approximately 40 journal articles per week) in order to collect new information on the target proteins, identified ubiquitylation sites, structures of multi-ubiquitin chains and features of the ubiquitylation machinery. The corresponding fields are continuously revised, and new information blocks are added, especially concerning the biological function of ubiquitylation. The database is updated as soon as suitable, reliable information becomes available. We also plan to extend our dataset with a detailed description of upstream and downstream components of the ubiquitin system, together with comprehensive information on the domain structure of target proteins, including ubiquitin-binding domains (reviewed in ).
UbiProt is more convenient for finding information about ubiquitylated proteins than more general databases. Although databases such as Swiss-Prot and the Human Protein Reference Database also contain data about ubiquitylation, retrieving the information of interest from them raises several problems. First, anyone trying to obtain information about ubiquitylation from Swiss-Prot using the SRS system will have difficulty formulating a query, because SRS handles complex queries poorly. A query must be formulated very precisely, so it is necessary to find at least one entry before running the main search. In addition, even precise queries work well only for proteins with identified ubiquitylation site(s). The number of proteins with unknown ubiquitylation sites is much larger, and in this case the rules for query formulation may differ. Another problem is search redundancy. A search for ubiquitylated proteins in Swiss-Prot will most likely return not only ubiquitylated proteins themselves, but also numerous enzymes of the ubiquitylation cascade.
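The retrieval problems described above can be made concrete with a small sketch. The records, field names and search functions below are invented for illustration and do not reflect how Swiss-Prot, SRS or UbiProt store or query their data; the sketch only shows why a plain keyword match is redundant and why a site-based query misses substrates whose sites are unknown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    name: str
    keywords: tuple     # free-text keywords attached to the entry
    is_substrate: bool  # True if the protein itself is reported as ubiquitylated
    sites: tuple = ()   # ubiquitylation site positions; empty when unknown


# Toy placeholder records, not real database content.
ENTRIES = [
    Annotation("substrate_with_site", ("ubiquitin conjugation",), True, (48,)),
    Annotation("substrate_no_site", ("ubiquitin conjugation",), True),
    Annotation("e3_ligase", ("ubiquitin conjugation pathway",), False),
]


def keyword_search(entries, term):
    """Naive keyword match: also returns enzymes of the cascade."""
    return [e.name for e in entries if any(term in k for k in e.keywords)]


def site_search(entries):
    """Site-based query: misses substrates whose sites are unknown."""
    return [e.name for e in entries if e.sites]


def substrate_search(entries):
    """Query on an explicit substrate flag: returns only modified proteins."""
    return [e.name for e in entries if e.is_substrate]
```

With these toy records, `keyword_search(ENTRIES, "ubiquitin")` returns all three entries including the ligase, `site_search(ENTRIES)` returns only the substrate with a mapped site, and `substrate_search(ENTRIES)` returns both substrates and nothing else. A dedicated, manually annotated resource can offer the third kind of query directly.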
Besides the search problems mentioned above, the established databases lack data. Many known substrates of ubiquitylation do not appear as ubiquitylated proteins in Swiss-Prot (e.g. p53, BRCA1, MDM2, Ymer and others), and for many other proteins the precise ubiquitylation sites are not designated. The Human Protein Reference Database also cannot serve as a comprehensive reference source for ubiquitylation, because it contains data on only 18 ubiquitylated proteins, without details about the ubiquitylation type and the respective enzymes. This may be due to insufficient use of data from proteomic studies dedicated to ubiquitylation. Only two recent papers presenting results of high-throughput analysis [4, 5] are covered in Swiss-Prot at the moment. Our dataset also draws on other proteomic studies published so far [6, 7, 16].
UbiProt aims to collect data from a number of resources, including the databases cited above, and to make the information easily accessible after validation and annotation. For ubiquitylation data it is more specific and more comprehensive than general sequence databases such as Swiss-Prot.
All scientists working on protein ubiquitylation are encouraged to help keep the database up to date by submitting additional information and comments. A downloadable Excel form can be used for submitting new data.
The UbiProt Database is a comprehensive resource on ubiquitylated proteins that aims to systematize information on protein ubiquitylation and to make it available for further analysis and use. The resource is intended as a general reference source both for researchers in the ubiquitin field and for those who work with particular ubiquitylated proteins of interest. Its biological applications also include identification of proteins posttranslationally modified by ubiquitin in high-throughput studies and computational analysis of ubiquitin-protein conjugates. It may help to understand the mechanisms of ubiquitylation and to develop new compounds for manipulating ubiquitin-dependent pathways.
Further development of the UbiProt Database is expected to be of common interest for research groups involved in studies of the ubiquitin system.
Availability and requirements
UbiProt can be accessed on a public Apache-powered website at http://ubiprot.org.ru. The website was tested with most commonly used Web browsers. It is freely available for non-commercial use. For any questions regarding commercial use, please contact the corresponding author.
SRS: Sequence Retrieval System
1. Jennissen HP: Ubiquitin and the enigma of intracellular protein degradation. Eur J Biochem 1995, 231: 1–30. doi:10.1111/j.1432-1033.1995.tb20665.x
2. Glickman MH, Ciechanover A: The ubiquitin-proteasome proteolytic pathway: destruction for the sake of construction. Physiol Rev 2002, 82: 373–428.
3. Haglund K, Dikic I: Ubiquitylation and cell signaling. EMBO J 2005, 24: 3353–3359. doi:10.1038/sj.emboj.7600808
4. Hitchcock AL, Auld K, Gygi SP, Silver PA: A subset of membrane-associated proteins is ubiquitinated in response to mutations in the endoplasmic reticulum degradation machinery. Proc Natl Acad Sci USA 2003, 100: 12735–12740. doi:10.1073/pnas.2135500100
5. Peng J, Schwartz D, Elias JE, Thoreen CC, Cheng D, Marsischky G, Roelofs J, Finley D, Gygi SP: A proteomics approach to understanding protein ubiquitination. Nat Biotechnol 2003, 21: 921–926. doi:10.1038/nbt849
6. Kirkpatrick DS, Weldon SF, Tsaprailis G, Liebler DC, Gandolfi AJ: Proteomic identification of ubiquitinated proteins from human cells expressing His-tagged ubiquitin. Proteomics 2005, 5: 2104–2111. doi:10.1002/pmic.200401089
7. Mayor T, Lipford JR, Graumann J, Smith GT, Deshaies RJ: Analysis of poly-ubiquitin conjugates reveals that the Rpn10 substrate receptor contributes to the turnover of multiple proteasome targets. Mol Cell Proteomics 2005, 4: 741–751. doi:10.1074/mcp.M400220-MCP200
8. Pickart CM: Ubiquitin in chains. Trends Biochem Sci 2000, 25: 544–548. doi:10.1016/S0968-0004(00)01681-9
9. Entrez PubMed [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?DB=pubmed]
10. UniProt Knowledgebase (Swiss-Prot and TrEMBL) [http://www.expasy.org/sprot/]
11. RCSB Protein Data Bank [http://www.pdb.org/pdb/]
12. Hicke L, Schubert HL, Hill CP: Ubiquitin-binding domains. Nat Rev Mol Cell Biol 2005, 6: 610–621. doi:10.1038/nrm1701
13. Human Protein Reference Database [http://www.hprd.org/]
14. SRSWWW at ExPASy [http://www.expasy.org/srs5/]
15. Croce O, Lamarre M, Christen R: Querying the public databases for sequences using complex keywords contained in the feature lines. BMC Bioinformatics 2006, 7: 45. doi:10.1186/1471-2105-7-45
16. Vasilescu J, Smith JC, Ethier M, Figeys D: Proteomic analysis of ubiquitinated proteins from human MCF-7 breast cancer cells by immunoaffinity purification and mass spectrometry. J Proteome Res 2005, 4: 2192–2200. doi:10.1021/pr050265i
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.