dbOGAP - An Integrated Bioinformatics Resource for Protein O-GlcNAcylation
© Wang et al; licensee BioMed Central Ltd. 2011
Received: 4 November 2010
Accepted: 6 April 2011
Published: 6 April 2011
Protein O-GlcNAcylation (or O-GlcNAc-ylation) is an O-linked glycosylation involving the transfer of β-N-acetylglucosamine to the hydroxyl group of serine or threonine residues of proteins. Growing evidence suggests that protein O-GlcNAcylation is common and is analogous to phosphorylation in modulating a broad range of biological processes. However, compared to phosphorylation, the amount of protein O-GlcNAcylation data is relatively limited and its annotation in databases is scarce. Furthermore, a bioinformatics resource for O-GlcNAcylation is lacking, and an O-GlcNAcylation site prediction tool is much needed.
We developed a database of O-GlcNAcylated proteins and sites, dbOGAP, primarily based on literature published since O-GlcNAcylation was first described in 1984. The database currently contains ~800 proteins with experimental O-GlcNAcylation information, of which ~61% are human, and 172 proteins have a total of ~400 identified O-GlcNAcylation sites. The O-GlcNAcylated proteins are primarily nucleocytoplasmic, including proteins associated with membrane-bounded and non-membrane-bounded organelles. The known O-GlcNAcylated proteins exert a broad range of functions including transcriptional regulation, macromolecular complex assembly, intracellular transport, translation, and regulation of cell growth or death. The database also contains ~365 potential O-GlcNAcylated proteins inferred from known O-GlcNAcylated orthologs. Additional annotations, including other protein posttranslational modifications, biological pathways and disease information, are integrated into the database. We developed an O-GlcNAcylation site prediction system, OGlcNAcScan, based on a Support Vector Machine and trained using protein sequences with known O-GlcNAcylation sites from dbOGAP. The site prediction system achieved an area under the ROC curve of 74.3% in five-fold cross-validation. The dbOGAP website supports searching and querying O-GlcNAcylated proteins and associated literature, browsing by gene name, organism or pathway, and downloading the database. The OGlcNAcScan tool, also available from the website, presents a list of predicted O-GlcNAcylation sites for given protein sequences.
dbOGAP is the first public bioinformatics resource to allow systematic access to O-GlcNAcylated proteins, related functional information and bibliography, and an O-GlcNAcylation site prediction tool. The resource will facilitate research on O-GlcNAcylation and its proteomic identification.
O-GlcNAcylation, or O-GlcNAc-ylation to distinguish it from acylation, is an O-linked glycosylation involving the β-attachment of a single N-acetylglucosamine (GlcNAc) to serine (Ser)/threonine (Thr) residues, catalyzed by O-GlcNAc transferase (OGT); its removal is catalyzed by O-GlcNAcase (OGA). The two O-GlcNAc cycling enzymes, OGT and OGA, are each encoded by a single gene in mammalian species. Unlike N-linked or mucin-type O-linked glycosylation, O-GlcNAcylation occurs primarily in nucleocytoplasmic proteins. Analogous to phosphorylation, the modification is dynamic, and the O-GlcNAc moiety is not further extended. O-GlcNAcylation is also often reciprocal to phosphorylation at the same or adjacent Ser/Thr residues [1–3], which led to a "Yin-Yang" hypothesis in which the two post-translational modifications (PTMs) modulate protein function by competitively blocking each other's occupancy at given sites. For example, reciprocal O-GlcNAcylation and phosphorylation at Ser16 of murine estrogen receptor β (ERβ) modulate the degradation of ERβ by stabilizing or destabilizing the protein, respectively. Similarly, O-GlcNAcylation of p53 at Ser149 is associated with decreased phosphorylation at the adjacent Thr155, resulting in decreased p53 ubiquitination and subsequent degradation, thus stabilizing p53. In contrast to the enormous body of research on phosphorylation, the amount of research on O-GlcNAcylation has been disproportionately small because the O-GlcNAc group is difficult to detect, partly because it is labile, dynamic, and substoichiometric. Over 600 proteins have been reported to be O-GlcNAcylated since the modification was first identified in 1984, many of which were identified in recent years [1–3, 9–11] as a result of improved mass spectrometry technologies.
Growing evidence now suggests that O-GlcNAcylation is very common and has broad roles in physiology and disease, especially through its reciprocal interplay with phosphorylation, e.g., in the regulation of insulin signaling and transcription, and in diabetes and neurodegenerative diseases.
A number of bioinformatics databases have been developed for protein post-translational modifications, including general PTM databases, e.g., dbPTM, and databases for specific types, such as protein phosphorylation (e.g., PhosphoELM, PhosphoSite), glycosylation, ubiquitination and protease cleavage. By contrast, there has been no database dedicated to O-GlcNAcylated proteins and sites, and their annotations are also scarce in protein databases; for example, only ~100 experimental O-GlcNAcylation sites for 35 proteins are currently annotated in UniProtKB. Moreover, O-GlcNAcylation annotations have not been included in the specialized glycosylation databases (e.g., GlycoBase, the Functional Glycomics Gateway) [15, 19].
Because of growing interest in the crucial roles of O-GlcNAcylation in cell signaling and many other cellular processes, identifying site motifs and computationally predicting O-GlcNAcylation sites have become important bioinformatics tasks to assist those studies. Unlike N-linked glycosylation, which has a consensus motif of "Asn-X-Thr/Ser", O-linked glycosylation, including mucin-type O-glycosylation and O-GlcNAc glycosylation, has no well-defined sequence motif yet. Past efforts to develop prediction methods for O-glycosylation have mostly focused on the mucin type [20–23]. To the best of our knowledge, there has been only one site prediction tool for O-GlcNAcylation, YinOYang, which is an artificial neural network system trained on sequence fragments of ~40 GlcNAcylation sites available at the time. The motif of O-GlcNAcylation remains poorly defined, and there is a pressing need to develop an O-GlcNAcylation site prediction tool based on the much greater number of experimental O-GlcNAcylation sites available now.
Here we report the development of a database of O-GlcNAcylated proteins and sites (dbOGAP) for all currently known O-GlcNAcylated proteins reported in the literature, and of an O-GlcNAcylation site prediction system (OGlcNAcScan) based on nearly 400 O-GlcNAcylation sites. Both the database and the prediction system are available through the dbOGAP web site, which serves as a public bioinformatics resource to facilitate research on O-GlcNAcylated proteins and to assist proteomic identification of O-GlcNAcylation sites.
Construction and Content
1. The Database Development
2. The O-GlcNAc Site Prediction
An O-GlcNAcylation site prediction system, OGlcNAcScan, was developed based on annotated O-GlcNAcylation sites in dbOGAP using the SVMlight implementation of the Support Vector Machine (SVM). The training data set consists of 373 positive instances, which are experimental O-GlcNAcylation sites in 167 protein sequences from dbOGAP, and 29,897 negative instances, which are the remaining un-annotated Ser/Thr sites in the same protein sequences. Given a Ser/Thr site, the n upstream and n downstream amino acids were regarded as its sequence context, and the resulting 2n+1 amino acids, with the candidate Ser or Thr residue in the middle, were converted into a vector of binary values (0 or 1) using the widely used sparse encoding method described, for example, in Julenius et al. 2005. If the site is fewer than n amino acids away from a sequence terminus, an end-of-sequence symbol is padded at that terminus as many times as needed to derive a fixed-length sequence fragment. In this encoding method, each amino acid type and the end-of-sequence symbol is coded with 21 binary values (e.g., 100...0, a one followed by 20 zeros, for Ala; 010...0 for Arg; ...; and 000...1 for end-of-sequence), and the resulting feature vector consists of 21 × (2n+1) binary values. For different values of n, we trained SVM classifiers with the RBF kernel. The classifier parameters, C and γ, were optimized through 5-fold cross-validation tests, in which classifiers were trained on four-fifths of the data set and tested on the remaining one-fifth, repeated five times. We explored other sequence encoding methods, such as frequencies of amino acid types [21, 23] and gappy bi-grams/dimers, but the standard sparse encoding method with n = 5 yielded the best prediction performance.
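The sparse encoding described above can be illustrated with a minimal Python sketch. It is not the OGlcNAcScan code. The function names, the use of "-" as the end-of-sequence symbol, and the column order beyond Ala and Arg (three-letter alphabetical order is assumed here) are illustrative assumptions.

```python
# Assumed column order: the 20 standard residues in three-letter alphabetical
# order (Ala, Arg, Asn, ..., Val), followed by "-" as a stand-in for the
# end-of-sequence symbol. Only Ala-first, Arg-second and end-of-sequence-last
# are stated in the text; the rest of the order is an assumption.
ALPHABET = "ARNDCQEGHILKMFPSTWYV-"
INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def candidate_sites(seq):
    """Return the 0-based indices of all Ser/Thr residues in a sequence."""
    return [i for i, aa in enumerate(seq) if aa in "ST"]


def window(seq, pos, n=5):
    """Return the 2n+1 residue fragment centred on 0-based index pos.

    Positions that fall outside the sequence are padded with "-".
    """
    padded = "-" * n + seq + "-" * n
    return padded[pos:pos + 2 * n + 1]


def sparse_encode(fragment):
    """Concatenate one 21-value one-hot block per residue of the fragment."""
    vec = []
    for aa in fragment:
        block = [0] * len(ALPHABET)
        block[INDEX[aa]] = 1  # raises KeyError for non-standard residues
        vec.extend(block)
    return vec


if __name__ == "__main__":
    seq = "MSTAAK"  # toy sequence, not taken from dbOGAP
    for pos in candidate_sites(seq):
        frag = window(seq, pos, n=5)
        print(pos, frag, len(sparse_encode(frag)))
```

With n = 5, each candidate site yields a vector of 21 × 11 = 231 binary values, of which exactly 11 are set to one. Such vectors could then be passed to any SVM implementation that supports an RBF kernel. Handling of non-standard residue codes is left out of this sketch.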
3. The Database and the Web site Implementation
1. The Database Contents
O-GlcNAcylation and phosphorylation occurring at identical or adjacent (+/- 4 amino acids) serine/threonine (S/T) sites of O-GlcNAcylated proteins.
Identical phosphorylation S/T site
Adjacent phosphorylation S/T site
T728, S732, S735
S7, T33, S34, S55
S7, T33, S34, S55
S5, S8, S9, S10, S29, T33, S34, S51, S56
S9, S13, T14
S70, S171, S173
T169, S171, T177
S30, S31, S49
S30, S31, S34, S47, S53
S244, S254, S256, S280
S1407, S2027, S2029, T2700, T2703
S2029, S2694, T2703
T1406, S2029, T2703
S494, T495, S496, S499, S502
S1199, S1200, S1201
S613, T641, S642, S699, S703
S613, T641, S642, S703
T641, S642, S703
S707, T714, S715
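The rule used in the table above, under which a phosphorylation site counts as identical when it shares a position with an O-GlcNAcylation site and as adjacent when it lies within +/- 4 amino acids, can be sketched as follows. This is an illustrative sketch rather than the procedure used to build dbOGAP. The function names and the site lists in the example are hypothetical, and treating a site that is identical to one O-GlcNAcylation site as not also adjacent to another is an assumption.

```python
def parse_site(label):
    """Split a site label such as 'S732' into (residue, position)."""
    return label[0], int(label[1:])


def classify_phosphosites(oglcnac_sites, phospho_sites, max_distance=4):
    """Group phosphorylation sites by their relation to O-GlcNAcylation sites.

    Returns (identical, adjacent): sites at the same position as an
    O-GlcNAcylation site, and sites within max_distance residues of one.
    A site counted as identical is not also reported as adjacent (assumption).
    """
    oglcnac_positions = {parse_site(s)[1] for s in oglcnac_sites}
    identical, adjacent = [], []
    for label in phospho_sites:
        _, pos = parse_site(label)
        if pos in oglcnac_positions:
            identical.append(label)
        elif any(abs(pos - g) <= max_distance for g in oglcnac_positions):
            adjacent.append(label)
    return identical, adjacent


if __name__ == "__main__":
    # Hypothetical site lists for illustration only, not taken from dbOGAP.
    oglcnac = ["S16", "T30"]
    phospho = ["S16", "S20", "T33", "S40"]
    print(classify_phosphosites(oglcnac, phospho))
```

In this made-up example, S16 is reported as identical, S20 and T33 as adjacent, and S40 is ignored because it is more than four residues from either O-GlcNAcylation site.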
Functional profiles of O-GlcNAcylated proteins
Major GO categories of human O-GlcNAcylated proteins.
Gene Ontology (GO) Terms
GO Biological Processes
GO:0045449~regulation of transcription
GO:0051252~regulation of RNA metabolic process
GO:0006355~regulation of transcription, DNA-dependent
GO:0043933~macromolecular complex subunit organization
GO:0065003~macromolecular complex assembly
GO:0010605~negative regulation of macromolecule metabolic process
GO:0042981~regulation of apoptosis
GO:0043067~regulation of programmed cell death
GO:0010941~regulation of cell death
GO:0045184~establishment of protein localization
GO:0010604~positive regulation of macromolecule metabolic process
GO Molecular Function
GO:0032555~purine ribonucleotide binding
GO:0017076~purine nucleotide binding
GO:0030528~transcription regulator activity
GO:0032559~adenyl ribonucleotide binding
GO:0030554~adenyl nucleotide binding
GO:0001883~purine nucleoside binding
GO:0005198~structural molecule activity
GO:0042802~identical protein binding
GO Cellular Component
GO:0043232~intracellular non-membrane-bounded organelle
GO:0070013~intracellular organelle lumen
We further examined the O-GlcNAcylated proteins for enrichment of GO terms at deeper levels of the GO hierarchy. As summarized in [Additional file 1, Supplementary Table S1], the top enriched GO biological processes relate to protein translation, carbohydrate (glucose) metabolism, RNA processing/splicing, and RNA/protein transport, followed by macromolecular complex and organelle organization, regulation of cell cycle and cell death, chromosome organization and transcription, and regulation of protein and small-molecule metabolism. The enriched GO molecular functions include nucleoside, nucleotide and nucleic acid binding, transcription factor activity, protein binding and other molecular activities. The enriched GO cellular components include cytosol, organelle lumen and non-membrane-bounded organelles, nuclear compartments such as nucleoplasm, nuclear pore and nucleolus, ribosome and cytoskeleton, nuclear protein complexes and chromatin, membrane- and vesicle-associated spaces, and contractile-associated proteins. Notably, although a significant proportion of known O-GlcNAcylated proteins are associated with intracellular membranes or the inner side of the plasma membrane, only a few plasma transmembrane proteins, such as glucose transporters and the Notch receptor, were reported to be O-GlcNAcylated [30–32]. Therefore, O-GlcNAcylated proteins are primarily nucleocytoplasmic and are engaged in broad biological functions.
Pathways and disease processes related to O-GlcNAcylated proteins
Pathway profiles using GeneGo Pathway Maps analysis.
Development_Role of CDK5 in neuronal development
Development_Gastrin in cell growth and proliferation
Immune response_Gastrin in inflammatory response
Signal transduction_Activation of PKC via G-Protein coupled receptor
Cytoskeleton remodeling_Cytoskeleton remodeling
Glycolysis and gluconeogenesis (short map)
Transcription_Role of Akt in hypoxia induced HIF1 activation
Immune response_MIF - the neuroendocrine-macrophage connector
Development_Prolactin receptor signaling
Cytoskeleton remodeling_TGF, WNT and cytoskeletal remodeling
Development_Gastrin in differentiation of the gastric mucosa
Development_EGFR signaling pathway
Cytoskeleton remodeling_Regulation of actin cytoskeleton by Rho GTPases
G-protein signaling_Proinsulin C-peptide signaling
Development_Glucocorticoid receptor signaling
Development_VEGF signaling and activation
Cell adhesion_Histamine H1 receptor signaling in the interruption of cell barrier integrity
Immune response_Inhibitory action of Lipoxins on pro-inflammatory TNF-alpha signaling
Signal transduction_Calcium signaling
Regulation of CFTR activity (norm and CF)
Translation_Regulation of translation initiation
Immune response_Histamine H1 receptor signaling in immune response
Immune response_IL-2 activation and signaling pathway
Transcription_P53 signaling pathway
Development_Ligand-dependent activation of the ESR1/AP-1 pathway
Development_PDGF signaling via STATs and NF-kB
Signal transduction_AKT signaling
Development_VEGF signaling via VEGFR2 - generic cascades
Development_Thrombopoietin-regulated cell processes
Mucin expression in CF via IL-6, IL-17 signaling pathways
Development_TGF-beta-dependent induction of EMT via RhoA, PI3K and ILK.
Development_PIP3 signaling in cardiac myocytes
Cytoskeleton remodeling_Keratin filaments
Development_A3 receptor signaling
Transport_RAN regulation pathway
Immune response_IL-7 signaling in T lymphocytes
Muscle contraction_Regulation of eNOS activity in endothelial cells
Immune response_IL-6 signaling pathway
Because of the broad cellular processes and pathways in which O-GlcNAcylated proteins are known to participate, O-GlcNAcylation may play significant roles in many pathological conditions. Indeed, O-GlcNAcylation has been implicated in four categories of disease conditions, i.e., type II diabetes, neurodegenerative diseases, cardiovascular diseases, and cancers. For example, OGT regulates insulin signaling through O-GlcNAcylation of several important insulin signaling molecules, e.g., IRS-1, PI3K, PDK1, and AKT1, leading to attenuation of insulin signaling responses in glycogen synthesis, activation of gluconeogenic genes and glucose transporter GLUT4 translocation, thus contributing to insulin resistance in type II diabetes [1, 35]. Tau protein is subject to both O-GlcNAcylation and phosphorylation, and hyperphosphorylation apparently contributes to neurofibrillary tangle formation by tau in Alzheimer's disease. O-GlcNAcylation represents a key regulatory mechanism in modulating vascular reactivity, such as contractile and relaxant responses, through modification of proteins, e.g., NOS, sarcoplasmic reticulum Ca(2+)-ATPase, PKC, MAPK, and cytoskeleton and microtubule proteins. O-GlcNAcylation can mediate cardiac stress responses and has cardioprotective effects through transcription-mediated regulation as well as cardiomyocyte calcium homeostasis. O-GlcNAcylation may have general roles in cancer through its involvement in oncogenesis or tumor suppression by coupling cellular metabolic status to regulation of signal transduction, transcription, and protein degradation. For example, reducing the cellular UDP-GlcNAc level in MCF-7 cells changed the O-GlcNAcylation patterns of key proteins that control cell proliferation and differentiation, including Sp1, chaperonin TCP-1 theta, and oncogene Ets-related protein isoform 7.
Many cancer genes or tumor suppressors are known to be O-GlcNAcylated or to interact with OGT, such as c-Myc, AKT1, AKT2, ATF1, CBP, FOXO1, TOP1, p53 and HIC1. In breast cancer cells, knockdown of OGT and the resulting global reduction of O-GlcNAcylation inhibited cell proliferation and metastatic ability.
2. The O-GlcNAcylation Site Prediction
3. The dbOGAP Web Site
The O-GlcNAcylated protein entry
The O-GlcNAcScan report
The dbOGAP database download
The dbOGAP web site provides a download page (Figure 4, linked in #1) for downloading the database in several data files, including all O-GlcNAcylated proteins, sites and orthologs. A full bibliography of O-GlcNAcylated proteins can also be downloaded. The data sets for developing the OGlcNAcScan system are available to the scientific community for further development of the site prediction (Figure 4, #2).
Up to now, the amount of data published on protein O-GlcNAcylation is only a fraction of that on phosphorylation, and its biological role is much less understood. Since 2006, the identification of O-GlcNAcylated proteins and sites has been growing rapidly due to improved mass spectrometry technologies and O-GlcNAc enrichment techniques [7–9]. The dbOGAP database provides a timely bioinformatics resource that gives the community ready access to the known and potential O-GlcNAcylated proteins and sites.
While a large number of O-GlcNAcylated proteins and sites were identified in recent years, many were determined by large-scale mass spectrometry and need to be further validated. Although O-GlcNAcylation has been known to occur primarily in nucleocytoplasmic proteins, the GO profiles show that O-GlcNAcylated proteins are localized in a broad range of intracellular compartments. Interestingly, some O-GlcNAcylated proteins are of unusual classes, e.g., adenylate kinase 2 (AK2, UniProtKB: KAD2_HUMAN), localized in the mitochondrial intermembrane space, and alpha-1-inhibitor 3 (A1i3, UniProtKB: A1I3_RAT), a secreted protein. Although false positive identification of O-GlcNAcylation is not uncommon in mass spectrometry, it is possible that such proteins are indeed O-GlcNAcylated. It is known that OGT has at least three isoforms that differ in their N-terminal sequences but share an identical catalytic domain: the mitochondrial form (mOGT) and two nucleocytoplasmic forms (ncOGT and sOGT) [46, 47]. The mOGT form was shown to be associated with the mitochondrial inner membrane, which is consistent with the observation of O-GlcNAcylation of the mitochondrial protein AK2. There are a total of ~11 O-GlcNAcylated proteins in dbOGAP that are known to be secreted or to have secreted forms besides cytoplasmic ones. It is possible that only the cytoplasmic forms of some of these proteins are O-GlcNAcylated while the secreted ones are not, although experimental validation is needed. Thus, the types and/or sources of O-GlcNAcylation identification have been assigned to protein entries as evidence attribution to annotations in the dbOGAP database.
The OGlcNAcScan site prediction system provides a much needed tool for studying protein glycosylation as well as phosphorylation. Since the site prediction is primarily based on the protein sequence context, sites in some secreted proteins may be erroneously predicted even with a relatively high score, e.g., T298 in mucin 4 (UniProtKB: MUC4_HUMAN) is predicted with a score of 0.287, though it is unlikely to be O-GlcNAcylated. In such cases, a cautionary note is given to indicate that the protein sequence being predicted is known to have "secreted" form(s). With the continuing growth of O-GlcNAcylation site data, the OGlcNAcScan tool will be further enhanced by retraining the SVM model, as well as by integrating physicochemical properties and structural information into the SVM prediction model.
In conclusion, the dbOGAP database and web site are the first of their kind in the public domain to provide ready access to a curated and systematic collection of protein O-GlcNAcylation information, and to an O-GlcNAcylation site prediction system, OGlcNAcScan, to assist proteomic identification of O-GlcNAc modification sites. Thus, the dbOGAP resource should help the biological community study the broad roles of O-GlcNAcylation in physiology and diseases.
Availability and Requirements
We would like to acknowledge the support from Lombardi Comprehensive Cancer Center (LCCC) at Georgetown University Medical Center. J.W. was supported by a postdoctoral fellowship at the LCCC, and M.T., H.L. and Z.Z.H. are partially supported by NIH/NLM grant 1R01LM009959-01A1. We would also like to thank Jinesh Shah for assisting with the curation of evidence attributions reported in the literature for O-GlcNAcylation data from large-scale mass spectrometry.
- Hart GW, Housley MP, Slawson C: Cycling of O-linked beta-N-acetylglucosamine on nucleocytoplasmic proteins. Nature 2007, 446: 1017–1022. 10.1038/nature05815
- Copeland RJ, Bullen JW, Hart GW: Cross-talk between GlcNAcylation and phosphorylation: roles in insulin resistance and glucose toxicity. Am J Physiol Endocrinol Metab 2008, 295: E17–28. 10.1152/ajpendo.90281.2008
- Wang Z, Gucek M, Hart GW: Cross-talk between GlcNAcylation and phosphorylation: site-specific phosphorylation dynamics in response to globally elevated O-GlcNAc. Proc Natl Acad Sci USA 2008, 105: 13793–13798. 10.1073/pnas.0806216105
- Wells L, Kreppel LK, Comer FI, Wadzinski BE, Hart GW: O-GlcNAc transferase is in a functional complex with protein phosphatase 1 catalytic subunits. J Biol Chem 2004, 279: 38466–38470. 10.1074/jbc.M406481200
- Cheng X, Hart GW: Alternative O-glycosylation/O-phosphorylation of serine-16 in murine estrogen receptor beta: post-translational regulation of turnover and transactivation activity. J Biol Chem 2001, 276: 10570–5. 10.1074/jbc.M010411200
- Yang WH, Kim JE, Nam HW, Ju JW, Kim HS, Kim YS, Cho JW: Modification of p53 with O-linked N-acetylglucosamine regulates p53 activity and stability. Nat Cell Biol 2006, 8: 1074–1083. 10.1038/ncb1470
- Wang Z, Udeshi ND, O'Malley M, Shabanowitz J, Hunt DF, Hart GW: Enrichment and site-mapping of O-Linked N-Acetylglucosamine by a combination of chemical/enzymatic tagging, photochemical cleavage, and electron transfer dissociation (ETD) mass spectrometry. Mol Cell Proteomics 2009, 9: 153–160.
- Torres CR, Hart GW: Topography and polypeptide distribution of terminal N-acetylglucosamine residues on the surfaces of intact lymphocytes. Evidence for O-linked GlcNAc. J Biol Chem 1984, 259: 3308–3317.
- Vosseller K, Trinidad JC, Chalkley RJ, Specht CG, Thalhammer A, Lynn AJ, Snedecor JO, Guan S, Medzihradszky KF, Maltby DA, Schoepfer R, Burlingame AL: O-linked N-acetylglucosamine proteomics of postsynaptic density preparations using lectin weak affinity chromatography and mass spectrometry. Mol Cell Proteomics 2006, 5: 923–934. 10.1074/mcp.T500040-MCP200
- Nandi A, Sprung R, Barma DK, Zhao Y, Kim SC, Falck JR, Zhao Y: Global identification of O-GlcNAc-modified proteins. Anal Chem 2006, 78: 452–458. 10.1021/ac051207j
- Khidekel N, Ficarro SB, Clark PM, Bryan MC, Swaney DL, Rexach JE, Sun YE, Coon JJ, Peters EC, Hsieh-Wilson LC: Probing the dynamics of O-GlcNAc glycosylation in the brain using quantitative proteomics. Nat Chem Biol 2007, 3: 339–348. 10.1038/nchembio881
- Lee TY, Huang HD, Hung JH, Huang HY, Yang YS, Wang TH: dbPTM: an information repository of protein post-translational modification. Nucleic Acids Res 2006, (34 Database):D622–627. 10.1093/nar/gkj083
- Diella F, Cameron S, Gemünd C, Linding R, Via A, Kuster B, Sicheritz-Pontén T, Blom N, Gibson TJ: Phospho.ELM: A database of experimentally verified phosphorylation sites in eukaryotic proteins. BMC Bioinformatics 2004, 5: 79. 10.1186/1471-2105-5-79
- Hornbeck PV, Chabra I, Kornhauser JM, Skrzypek E, Zhang B: PhosphoSite: A bioinformatics resource dedicated to physiological protein phosphorylation. Proteomics 2004, 4: 1551–1561. 10.1002/pmic.200300772
- Gupta R, Birch H, Rapacki K, Brunak S, Hansen JE: O-GLYCBASE version 4.0: a revised database of O-glycosylated proteins. Nucleic Acids Res 1999, 27: 370–372. 10.1093/nar/27.1.370
- Chernorudskiy AL, Garcia A, Eremin EV, Shorina AS, Kondratieva EV, Gainullin MR: UbiProt: a database of ubiquitylated proteins. BMC Bioinformatics 2007, 8: 126. 10.1186/1471-2105-8-126
- Rawlings ND, Barrett AJ, Bateman A: MEROPS: the peptidase database. Nucleic Acids Res 2010, (38 Database):D227–233. 10.1093/nar/gkp971
- UniProt Consortium: The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res 2010, (38 Database):D142–148.
- Consortium for Functional Glycomics (CFG)[http://www.functionalglycomics.org/]
- Hansen JE, Lund O, Tolstrup N, Gooley AA, Williams KL, Brunak S: NetOglyc: prediction of mucin type O-glycosylation sites based on sequence context and surface accessibility. Glycoconj J 1998, 15: 115–130. 10.1023/A:1006960004440
- Julenius K, Mølgaard A, Gupta R, Brunak S: Prediction, conservation analysis, and structural characterization of mammalian mucin-type O-glycosylation sites. Glycobiology 2005, 15: 153–164. 10.1093/glycob/cwh151
- Chen YZ, Tang YR, Sheng ZY, Zhang Z: Prediction of mucin-type O-glycosylation sites in mammalian proteins using the composition of k-spaced amino acid pairs. BMC Bioinformatics 2008, 9: 101. 10.1186/1471-2105-9-101
- Torii M, Liu H, Hu ZZ: Support vector machine-based mucin-type O-glycosylation site prediction using enhanced sequence feature encoding. AMIA Annu Symp Proc 2009, 640–644.
- Gupta R, Brunak S: Prediction of glycosylation across the human proteome and the correlation to protein function. Pacific Symposium on Biocomputing 2002, 310–322.
- PIR Blast neighbors[http://pir.georgetown.edu/pirwww/search/]
- Wu CH, Huang H, Nikolskaya A, Hu Z, Barker WC: The iProClass integrated database for protein functional analysis. Comput Biol Chem 2004, 28: 87–96. 10.1016/j.compbiolchem.2003.10.003
- Joachims T: Text Categorization with Support Vector Machines: Learning with Many Relevant Features. European Conference on Machine Learning 1998, 137–142.
- Huang da W, Sherman BT, Tan Q, Kir J, Liu D, Bryant D, Guo Y, Stephens R, Baseler MW, Lane HC, Lempicki RA: DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res 2007, (35 Web Server):W169–75. 10.1093/nar/gkm415
- Buse MG, Robinson KA, Marshall BA, Hresko RC, Mueckler MM: Enhanced O-GlcNAc protein modification is associated with insulin resistance in GLUT1-overexpressing muscles. Am J Physiol Endocrinol Metab 2002, 283: E241–250.
- Park SY, Ryu J, Lee W: O-GlcNAc modification on IRS-1 and Akt2 by PUGNAc inhibits their phosphorylation and induces insulin resistance in rat primary adipocytes. Exp Mol Med 2005, 37: 220–229.
- Matsuura A, Ito M, Sakaidani Y, Kondo T, Murakami K, Furukawa K, Nadano D, Matsuda T, Okajima T: O-linked N-acetylglucosamine is present on the extracellular domain of notch receptors. J Biol Chem 2008, 283: 35486–35495. 10.1074/jbc.M806202200
- Ekins S, Nikolsky Y, Bugrim A, Kirillov E, Nikolskaya T: Pathway mapping tools for analysis of high content data. Methods Mol Biol 2007, 356: 319–350.
- Slawson C, Copeland RJ, Hart GW: O-GlcNAc signaling: a metabolic link between diabetes and cancer? Trends Biochem Sci 2010, 35: 547–555. 10.1016/j.tibs.2010.04.005
- Yang X, Ongusaha PP, Miles PD, Havstad JC, Zhang F, So WV, Kudlow JE, Michell RH, Olefsky JM, Field SJ, Evans RM: Phosphoinositide signalling links O-GlcNAc transferase to insulin resistance. Nature 2008, 451: 964–969. 10.1038/nature06668
- Liu F, Shi J, Tanimukai H, Gu J, Gu J, Grundke-Iqbal I, Iqbal K, Gong CX: Reduced O-GlcNAcylation links lower brain glucose metabolism and tau pathology in Alzheimer's disease. Brain 2009, 132: 1820–1832. 10.1093/brain/awp099
- Lima VV, Rigsby CS, Hardy DM, Webb RC, Tostes RC: O-GlcNAcylation: a novel post-translational mechanism to alter vascular cellular signaling in health and disease: focus on hypertension. J Am Soc Hypertens 2009, 3: 374–387. 10.1016/j.jash.2009.09.004
- Chatham JC, Marchase RB: The role of protein O-linked beta-N-acetylglucosamine in mediating cardiac stress responses. Biochim Biophys Acta 2010, 1800: 57–66.
- Lefebvre T, Pinte S, Guérardel C, Deltour S, Martin-Soudant N, Slomianny MC, Michalski JC, Leprince D: The tumor suppressor HIC1 (hypermethylated in cancer 1) is O-GlcNAc glycosylated. Eur J Biochem 2004, 271: 3843–3854. 10.1111/j.1432-1033.2004.04316.x
- Donadio AC, Lobo C, Tosina M, de la Rosa V, Martín-Rufián M, Campos-Sandoval JA, Matés JM, Márquez J, Alonso FJ, Segura JA: Antisense glutaminase inhibition modifies the O-GlcNAc pattern and flux through the hexosamine pathway in breast cancer cells. J Cell Biochem 2008, 103: 800–811. 10.1002/jcb.21449
- Caldwell SA, Jackson SR, Shahriari KS, Lynch TP, Sethi G, Walker S, Vosseller K, Reginato MJ: Nutrient sensor O-GlcNAc transferase regulates breast cancer tumorigenesis through targeting of the oncogenic transcription factor FoxM1. Oncogene 2010, 29: 2831–2842. 10.1038/onc.2010.41
- Vacic V, Iakoucheva LM, Radivojac P: Two Sample Logo: a graphical representation of the differences between two sets of sequence alignments. Bioinformatics 2006, 22: 1536–1537. 10.1093/bioinformatics/btl151
- Coppock DS: Why Lift? Data Modeling and Mining. Information Management Online 2002. [http://www.information-management.com/news/5329–1.html] Last visited October 11, 2010
- Nandi A, Sprung R, Barma DK, Zhao Y, Kim SC, Falck JR, Zhao Y: Global identification of O-GlcNAc-modified proteins. Anal Chem 2006, 78: 452–458. 10.1021/ac051207j
- Cieniewski-Bernard C, Bastide B, Lefebvre T, Lemoine J, Mounier Y, Michalski JC: Identification of O-linked N-acetylglucosamine proteins in rat skeletal muscle using two-dimensional gel electrophoresis and mass spectrometry. Mol Cell Proteomics 2004, 3: 577–585. 10.1074/mcp.M400024-MCP200
- Love DC, Kochan J, Cathey RL, Shin SH, Hanover JA: Mitochondrial and nucleocytoplasmic targeting of O-linked GlcNAc transferase. J Cell Sci 2003, 116(Pt 4):647–654. 10.1242/jcs.00246
- Lazarus BD, Love DC, Hanover JA: Recombinant O-GlcNAc transferase isoforms: identification of O-GlcNAcase, yes tyrosine kinase, and tau as isoform-specific substrates. Glycobiology 2006, 16: 415–421. 10.1093/glycob/cwj078
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.