Moremen KW, Tiemeyer M, Nairn AV. Vertebrate protein glycosylation: diversity, synthesis and function. Nat Rev Mol Cell Biol. 2012;13(7):448–62.
Kristic J, Lauc G. Ubiquitous importance of protein glycosylation. Methods Mol Biol. 2017;1503:1–12.
Zhang X, Wang Y. Glycosylation quality control by the Golgi structure. J Mol Biol. 2016;428(16):3183–93.
Ohtsubo K, Marth JD. Glycosylation in cellular mechanisms of health and disease. Cell. 2006;126(5):855–67.
Dwek RA. Biological importance of glycosylation. Dev Biol Stand. 1998;96:43–7.
Veillon L, Zhou S, Mechref Y. Quantitative Glycomics: a combined analytical and bioinformatics approach. Methods Enzymol. 2017;585:431–77.
Aoki-Kinoshita KF, Kanehisa M. Bioinformatics approaches in glycomics and drug discovery. Curr Opin Mol Ther. 2006;8(6):514–20.
von der Lieth CW, Bohne-Lang A, Lohmann KK, Frank M. Bioinformatics for glycomics: status, methods, requirements and perspectives. Brief Bioinform. 2004;5(2):164–78.
Pinho SS, Reis CA. Glycosylation in cancer: mechanisms and clinical implications. Nat Rev Cancer. 2015;15(9):540–55.
Xu C, Ng DT. Glycosylation-directed quality control of protein folding. Nat Rev Mol Cell Biol. 2015;16(12):742–52.
Bao W, Yuan C-A, Zhang Y, Han K, Nandi AK, Honig B, Ds H. Multi-features prediction of protein translational modification sites. IEEE/ACM Trans Comput Biol Bioinform. 2017.
Li F, Fan C, Marquez-Lago TT, Leier A, Revote J, Jia C, Zhu Y, Smith AI, Webb GI, et al. PRISM: a comprehensive 3D structure database for post-translational modifications and mutations with functional impact. bioRxiv. 2019:523308.
Neelofar K, Ahmad J. Glycosylation gap in patients with diabetes with chronic kidney disease and healthy participants: a comparative study. Indian J Endocrinol Metab. 2017;21(3):410–4.
Sadurni A, Kehr G, Ahlqvist M, Peilot Sjogren H, Kankkonen C, Knerr L, Gilmour R. Fluorine-directed glycosylation enables the stereocontrolled synthesis of selective SGLT2 inhibitors for type II diabetes. Chemistry. 2017.
Wolff SP, Dean RT. Glucose autoxidation and protein modification. The potential role of 'autoxidative glycosylation' in diabetes. Biochem J. 1987;245(1):243–50.
Drabik A, Bodzon-Kulakowska A, Suder P, Silberring J, Kulig J, Sierzega M. Glycosylation changes in serum proteins identify patients with pancreatic cancer. J Proteome Res. 2017;16(4):1436–44.
Ferreira JA, Magalhaes A, Gomes J, Peixoto A, Gaiteiro C, Fernandes E, Santos LL, Reis CA. Protein glycosylation in gastric and colorectal cancers: toward cancer detection and targeted therapeutics. Cancer Lett. 2017;387:32–45.
Magalhaes A, Duarte HO, Reis CA. Aberrant glycosylation in cancer: a novel molecular mechanism controlling metastasis. Cancer Cell. 2017;31(6):733–5.
Oliveira-Ferrer L, Legler K, Milde-Langosch K. Role of protein glycosylation in cancer metastasis. Semin Cancer Biol. 2017;44:141–52.
Roberts JD, Klein JL, Palmantier R, Dhume ST, George MD, Olden K. The role of protein glycosylation inhibitors in the prevention of metastasis and therapy of cancer. Cancer Detect Prev. 1998;22(5):455–62.
Steentoft C, Vakhrushev SY, Joshi HJ, Kong Y, Vester-Christensen MB, Schjoldager KT, Lavrsen K, Dabelsteen S, Pedersen NB, Marcos-Silva L, et al. Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology. EMBO J. 2013;32(10):1478–88.
Vergroesen RD, Slot LM, Hafkenscheid L, Koning MT, Scherer HU, Toes REM. Response to: 'Acquiring new N-glycosylation sites in variable regions of immunoglobulin genes by somatic hypermutation is a common feature of autoimmune diseases' by Visser et al. Ann Rheum Dis. 2017.
Visser A, Hamza N, Kroese FGM, Bos NA. Acquiring new N-glycosylation sites in variable regions of immunoglobulin genes by somatic hypermutation is a common feature of autoimmune diseases. Ann Rheum Dis. 2017.
Gupta R, Brunak S. Prediction of glycosylation across the human proteome and the correlation to protein function. Pac Symp Biocomput. 2002:310–22.
Caragea C, Sinapov J, Silvescu A, Dobbs D, Honavar V. Glycosylation site prediction using ensembles of support vector machine classifiers. BMC Bioinformatics. 2007;8:438.
Hamby SE, Hirst JD. Prediction of glycosylation sites using random forests. BMC Bioinformatics. 2008;9:500.
Chauhan JS, Rao A, Raghava GP. In silico platform for prediction of N-, O- and C-glycosites in eukaryotic protein sequences. PLoS One. 2013;8(6):e67008.
Pejaver V, Hsu WL, Xin FX, Dunker AK, Uversky VN, Radivojac P. The structural and functional signatures of proteins that undergo multiple events of post-translational modification. Protein Sci. 2014;23(8):1077–93.
Li F, Li C, Wang M, Webb GI, Zhang Y, Whisstock JC, Song J. GlycoMine: a machine learning-based approach for predicting N-, C- and O-linked glycosylation in the human proteome. Bioinformatics. 2015;31(9):1411–9.
Li F, Li C, Revote J, Zhang Y, Webb GI, Li J, Song J, Lithgow T. GlycoMine(struct): a new bioinformatics tool for highly accurate mapping of the human N-linked and O-linked glycoproteomes by incorporating structural features. Sci Rep. 2016;6:34595.
De Comité F, Denis F, Gilleron R, Letouzey F: Positive and unlabeled examples help learning. In: Algorithmic Learning Theory: 1999. Springer: 219–230.
Niu G, du Plessis MC, Sakai T, Ma Y, Sugiyama M: Theoretical comparisons of positive-unlabeled learning against positive-negative learning. In: Advances in neural information processing systems: 2016. 1199–1207.
Menon A, Rooyen BV, Ong CS, Williamson B: Learning from corrupted binary labels via class-probability estimation. In: Proceedings of the 32nd International Conference on Machine Learning: 2015. Edited by Bach F, Blei D. PMLR: 125–134.
Jain S, White M, Radivojac P: Recovering true classifier performance in positive-unlabeled learning. In: AAAI: 2017. 2066–2072.
Pejaver V, Urresti J, Lugo-Martinez J, Pagel KA, Lin GN, Nam H-J, Mort M, Cooper DN, Sebat J, Iakoucheva LM, et al. MutPred2: inferring the molecular and phenotypic impact of amino acid variants. bioRxiv. 2017.
Elkan C, Noto K: Learning classifiers from only positive and unlabeled data. In: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining: 2008. ACM: 213–220.
Chang S, Zhang Y, Tang J, Yin D, Chang Y, Hasegawa-Johnson MA, Huang TS: Positive-unlabeled learning in streaming networks. In: Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining: 2016. ACM: 755–764.
Xu YY, Yang F, Zhang Y, Shen HB. Bioimaging-based detection of mislocalized proteins in human cancers by semi-supervised learning. Bioinformatics. 2015;31(7):1111–9.
The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2017;45(D1):D158–69.
Peng HC, Long FH, Ding C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans Pattern Anal Mach Intell. 2005;27(8):1226–38.
Li F, Li C, Marquez-Lago TT, Leier A, Akutsu T, Purcell AW, Smith AI, Lithgow T, Daly RJ, Song J, et al. Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome. Bioinformatics. 2018.
Chen Z, Liu X, Li F, Li C, Marquez-Lago T, Leier A, Akutsu T, Webb GI, Xu D, Smith AI, et al. Large-scale comparative assessment of computational predictors for lysine post-translational modification sites. Brief Bioinform. 2018.
Song J, Li F, Leier A, Marquez-Lago TT, Akutsu T, Haffari G, Chou KC, Webb GI, Pike RN, Hancock J. PROSPERous: high-throughput prediction of substrate cleavage sites for 90 proteases with improved accuracy. Bioinformatics. 2018;34(4):684–7.
Song J, Wang Y, Li F, Akutsu T, Rawlings ND, Webb GI, Chou K-C. iProt-Sub: a comprehensive package for accurately mapping and predicting protease-specific substrates and cleavage sites. Brief Bioinform. 2018:bby028.
Li F, Wang Y, Li C, Marquez-Lago TT, Leier A, Rawlings ND, Haffari G, Revote J, Akutsu T, Chou K-C, et al. Twenty years of bioinformatics research for protease-specific substrate and cleavage site prediction: a comprehensive revisit and benchmarking of existing methods. Brief Bioinform. 2018:bby077.
Meng F, Na I, Kurgan L, Uversky VN. Compartmentalization and functionality of nuclear disorder: intrinsic disorder and protein-protein interactions in intra-nuclear compartments. Int J Mol Sci. 2015;17(1):24.
Huang KY, Su MG, Kao HJ, Hsieh YC, Jhong JH, Cheng KH, Huang HD, Lee TY. dbPTM 2016: 10-year anniversary of a resource for post-translational modification of proteins. Nucleic Acids Res. 2016;44(D1):D435–46.
Hornbeck PV, Zhang B, Murray B, Kornhauser JM, Latham V, Skrzypek E. PhosphoSitePlus, 2014: mutations, PTMs and recalibrations. Nucleic Acids Res. 2015;43(Database issue):D512–20.
Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28(23):3150–2.
Cheng X, Zhao S-G, Lin W-Z, Xiao X, Chou K-C. pLoc-mAnimal: predict subcellular localization of animal proteins with both single and multiple sites. Bioinformatics. 2017;33(22):3524–31.
Nakashima H, Nishikawa K, Ooi T. The folding type of a protein is relevant to the amino acid composition. J Biochem. 1986;99(1):153–62.
Chou KC. Prediction of protein cellular attributes using pseudo-amino acid composition. Proteins: Structure, Function, and Bioinformatics. 2001;43(3):246–55.
Feng Z-P, Zhang C-T. Prediction of membrane protein types based on the hydrophobic index of amino acids. J Protein Chem. 2000;19(4):269–75.
Chen Z, Zhao P, Li F, Leier A, Marquez-Lago TT, Wang Y, Webb GI, Smith AI, Daly RJ, Chou KC, et al. iFeature: a python package and web server for features extraction and selection from protein and peptide sequences. Bioinformatics. 2018.
Horne DS. Prediction of protein helix content from an autocorrelation analysis of sequence hydrophobicities. Biopolymers. 1988;27(3):451–77.
Sokal RR, Thomson BA. Population structure inferred by local spatial autocorrelation: an example from an Amerindian tribal population. Am J Phys Anthropol. 2006;129(1):121–31.
Dubchak I, Muchnik I, Mayor C, Dralyuk I, Kim SH. Recognition of a protein fold in the context of the SCOP classification. Proteins: Structure, Function, and Bioinformatics. 1999;35(4):401–7.
Chou K-C. Prediction of protein subcellular locations by incorporating quasi-sequence-order effect. Biochem Biophys Res Commun. 2000;278(2):477–83.
Kawashima S, Pokarowski P, Pokarowska M, Kolinski A, Katayama T, Kanehisa M. AAindex: amino acid index database, progress report 2008. Nucleic Acids Res. 2008;36(Database issue):D202–5.
Jowkar GH, Mansoori EG. Perceptron ensemble of graph-based positive-unlabeled learning for disease gene identification. Comput Biol Chem. 2016;64:263–70.
Yang P, Li X, Chua HN, Kwoh CK, Ng SK. Ensemble positive unlabeled learning for disease gene identification. PLoS One. 2014;9(5):e97079.
Yang P, Li XL, Mei JP, Kwoh CK, Ng SK. Positive-unlabeled learning for disease gene identification. Bioinformatics. 2012;28(20):2640–7.
Jiang M, Cao JZ. Positive-unlabeled learning for Pupylation sites prediction. Biomed Res Int. 2016;2016:4525786.
Nan X, Bao L, Zhao X, Zhao X, Sangaiah AK, Wang GG, Ma Z. EPuL: an enhanced positive-unlabeled learning algorithm for the prediction of pupylation sites. Molecules. 2017;22(9).
Yang P, Humphrey SJ, James DE, Yang YH, Jothi R. Positive-unlabeled ensemble learning for kinase substrate prediction from dynamic phosphoproteomics data. Bioinformatics. 2016;32(2):252–9.
Xu Y-Y, Yang F, Shen H-B. Incorporating organelle correlations into semi-supervised learning for protein subcellular localization prediction. Bioinformatics. 2016;32(14):2184–92.
Hameed PN, Verspoor K, Kusljic S, Halgamuge S. Positive-unlabeled learning for inferring drug interactions based on heterogeneous attributes. BMC Bioinformatics. 2017;18(1):140.
Quinlan JR. C4.5: programs for machine learning. Elsevier; 2014.
Langley P, Iba W, Thompson K: An analysis of Bayesian classifiers. In: AAAI: 1992. 223–228.
Denis F, Gilleron R, Letouzey F. Learning from positive and unlabeled examples. Theor Comput Sci. 2005;348(1):70–83.
He J, Zhang Y, Li X, Wang Y: Bayesian classifiers for positive unlabeled learning. In: International Conference on Web-Age Information Management: 2011. Springer: 81–93.
Li F, Song J, Li C, Akutsu T, Zhang Y: PAnDE: Averaged n-Dependence Estimators for Positive Unlabeled Learning. ICIC Express Letters Part B: Applications, 8(9):11.
Webb GI, Boughton JR, Zheng F, Ting KM, Salem H. Learning by extrapolation from marginal to full-multivariate probability distributions: decreasingly naive Bayesian classification. Mach Learn. 2012;86(2):233–72.
Jain S, White M, Trosset MW, Radivojac P: Nonparametric semi-supervised learning of class proportions. arXiv preprint arXiv:1601.01944. 2016.
Jain S, White M, Radivojac P: Estimating the class prior and posterior from noisy positives and unlabeled data. In: Advances in Neural Information Processing Systems: 2016. 2693–2701.
Denis F, Laurent A, Gilleron R, Tommasi M: Text classification and co-training from positive and unlabeled examples. In: Proceedings of the ICML 2003 workshop: the continuum from labeled to unlabeled data: 2003. 80–87.
Webb GI, Pazzani MJ: Adjusted probability naive Bayesian induction. In: Australian Joint Conference on Artificial Intelligence: 1998. Springer: 285–295.
Friedman N, Geiger D, Goldszmidt M. Bayesian network classifiers. Mach Learn. 1997;29(2–3):131–63.
Su J, Zhang H: Full Bayesian network classifiers. In: Proceedings of the 23rd international conference on Machine learning: 2006. ACM: 897–904.
Xie HL, Fu L, Nie XD. Using ensemble SVM to identify human GPCRs N-linked glycosylation sites based on the general form of Chou's PseAAC. Protein Eng Des Sel. 2013;26(11):735–42.
Song J, Li F, Takemoto K, Haffari G, Akutsu T, Chou KC, Webb GI. PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural, and network features in a machine-learning framework. J Theor Biol. 2018;443:125–37.
Wei L, Hu J, Li F, Song J, Su R, Zou Q. Comparative analysis and prediction of quorum-sensing peptides using feature representation learning and machine learning algorithms. Brief Bioinform. 2018.
Zhang M, Li F, Marquez-Lago TT, Leier A, Fan C, Kwoh CK, Chou KC, Song J, Jia C. MULTiPly: a novel multi-layer predictor for discovering general and specific types of promoters. Bioinformatics. 2019.
Witten IH, Frank E, Hall MA, Pal CJ: Data mining: practical machine learning tools and techniques: Morgan Kaufmann; 2016.
Abe N, Zadrozny B, Langford J: Outlier detection by active learning. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining: 2006. ACM: 504–509.
Sebert DM. Outliers in statistical data. J Qual Technol. 1997;29(2):230.
Manevitz LM, Yousef M. One-class SVMs for document classification. J Mach Learn Res. 2001;2(Dec):139–54.
Hempstalk K, Frank E, Witten IH: One-class classification by combining density and class probability estimation. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases: 2008. Springer: 505–519.