Volume 18 Supplement 7
In silico re-identification of properties of drug target proteins
© The Author(s) 2017
Published: 31 May 2017
Computational approaches in the identification of drug targets are expected to reduce time and effort in drug development. Advances in genomics and proteomics provide the opportunity to uncover properties of druggable genomes. Although several studies have been conducted for distinguishing drug targets from non-drug targets, they mainly focus on the sequences and functional roles of proteins. Many other properties of proteins have not been fully investigated.
Using the DrugBank database (version 3.0), which contains 6,816 drug entries including 760 FDA-approved drugs and 1822 of their targets, together with the human UniProt/Swiss-Prot databases, we defined 1578 non-redundant drug target proteins and 17,575 non-drug target proteins. To select these non-redundant protein datasets, we built four datasets (A, B, C, and D) by considering clustering of paralogous proteins.
We first reassessed the widely used properties of drug target proteins. We confirmed and extended the findings that drug target proteins (1) are likely to have more hydrophobic residues, fewer polar residues, fewer PEST sequences, and more signal peptide sequences and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. We found that (1) drug targetability and protein essentiality are decoupled, (2) druggable proteins have high expression levels and tissue specificity, and (3) functional post-translational modification residues are enriched in drug target proteins. In addition, to predict the drug targetability of proteins, we exploited two machine learning methods (Support Vector Machine and Random Forest). When we predicted drug targets by combining previously known protein properties and the proposed new properties, an F-score of 0.8307 was obtained.
When the newly proposed properties are integrated, the prediction performance is improved and these properties are related to drug targets. We believe that our study will provide a new aspect in inferring drug-target interactions.
Keywords: Drug target, Bioinformatics, Proteomics
With the rapid accumulation of drug-related data in public databases, much attention has been paid to developing computational approaches to identify new drug candidates and to reposition existing drugs because computational tools help reduce the time and costs of drug development. Along with drug-related data, significant increases in proteomics data encourage researchers to focus on computational approaches in drug development. Similarities in amino acid sequences with existing drug targets and in functional roles of target proteins, including G-protein-coupled receptors (GPCRs), enzymes, and ion channels, have been the main resources for inferring drug-target interactions, and many predictions have been performed within each functional category. Recently, more resources, including side effects of drugs, drug-drug interactions, and protein-protein interactions, have been incorporated for predicting new drug targets [3, 4].
Such prediction efforts will be advanced if more properties of drug targets can be revealed. Over the last two decades, there have been several efforts to curate drug targets and to categorize them [5–8]. When Hopkins and Groom identified 399 non-redundant molecular targets, the targets were contained in only 130 protein families, half of which fall into just six gene families, including GPCRs and serine/threonine and tyrosine protein kinases. At that time, they predicted that the number of genes in the druggable genome and the number of drug targets would be approximately 3,000 and around 600–1500, respectively. Imming et al. listed 218 targets and classified them based on “mechanism of action”, such as enzymes, substrates, metabolites, proteins, receptors, ion channels, transport proteins, DNA, RNA, ribosome, and targets of monoclonal antibodies. Recently, information about drugs and their targets has been systematically deposited in public databases. The DrugBank database, launched in 2006, is a systematic collection of drug-protein interactions containing information on more than 760 Food and Drug Administration (FDA)-approved drugs and around 2000 drug target proteins. Moreover, this database contains drug-target interactions with gene annotations from Swiss-Prot.
With the availability of various proteomics data, more comprehensive analysis of drug targets has become possible. Bakheet and Doig defined 148 proteins as drug targets from the DrugBank database to analyze the protein target properties. They identified several features to distinguish targets from non-targets: all amino acid compositions, the length of proteins, hydrophobicity, secondary structure of proteins, transmembrane helices, and others. Bull and Doig extended the protein properties from Bakheet and Doig by proposing additional properties: protein-protein interactions, expression levels, and germline variants. However, these features were not strong indicators for distinguishing targets from non-targets. They also applied machine learning approaches such as support vector machine (SVM) and random forest (RF) to predict drug target proteins [11–13].
Identification of drug target proteins
We used the DrugBank database (version 3.0) to define drug target proteins. It contains 6816 drug entries, including 760 FDA-approved drugs and 1822 of their targets, including 1661 proteins, 226 enzymes, 110 carriers, and 19 transporters. Using the human UniProt/Swiss-Prot databases (release 2014.02), 1578 non-redundant drug target proteins were defined and named human drug target proteins, or hDP +. The remaining 17,575 human proteins were assigned to non-drug target proteins (named hDP 0).
To consider the possibility that the relevance of drug target protein properties may be over or underestimated depending on their gene family size, we built four datasets (A, B, C, and D). The first dataset A is composed of an initial 1,578 hDP + and 17,575 hDP 0. The second dataset B, derived from dataset A, contains only one representative protein from each gene family and thus has 792 hDP + and 8,361 hDP 0. For dataset C and D, if members of a gene family are derived from both hDP + and hDP 0, all genes in this gene family were excluded from the hDP 0 set. Thus, the third dataset C, derived from dataset A, has 1578 hDP + and 15,691 hDP 0, and the fourth dataset D, derived from dataset B, has 792 hDP + and 7949 hDP 0. In cases where a gene family has multiple members, the longest coding sequences (CDS) were selected to represent the gene family.
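The construction of the four datasets can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record fields (`id`, `family`, `is_target`, `cds_length`) and the function names are hypothetical, and it assumes that one representative (the longest CDS) is chosen per gene family and that the representative keeps its own target or non-target label.

```python
from collections import defaultdict

def family_key(protein):
    # Proteins without a gene family annotation are treated as their own group.
    if protein["family"] is None:
        return ("singleton", protein["id"])
    return protein["family"]

def build_datasets(proteins):
    """Return datasets A-D as lists of protein records (illustrative only)."""
    by_family = defaultdict(list)
    for protein in proteins:
        by_family[family_key(protein)].append(protein)

    # Families whose members include both targets and non-targets.
    mixed = {key for key, members in by_family.items()
             if len({m["is_target"] for m in members}) == 2}

    def drop_mixed_non_targets(dataset):
        # Remove non-targets that share a family with at least one target.
        return [p for p in dataset
                if p["is_target"] or family_key(p) not in mixed]

    set_a = list(proteins)
    # One representative per family: the member with the longest CDS.
    set_b = [max(members, key=lambda m: m["cds_length"])
             for members in by_family.values()]
    return {
        "A": set_a,
        "B": set_b,
        "C": drop_mixed_non_targets(set_a),
        "D": drop_mixed_non_targets(set_b),
    }

# Toy input: one mixed family (F1), one non-target family (F2), two singletons.
toy = [
    {"id": "P1", "family": "F1", "is_target": True, "cds_length": 900},
    {"id": "P2", "family": "F1", "is_target": False, "cds_length": 1200},
    {"id": "P3", "family": "F2", "is_target": False, "cds_length": 500},
    {"id": "P4", "family": "F2", "is_target": False, "cds_length": 700},
    {"id": "P5", "family": None, "is_target": True, "cds_length": 300},
    {"id": "P6", "family": None, "is_target": False, "cds_length": 400},
]
datasets = build_datasets(toy)
ids = {name: sorted(p["id"] for p in members) for name, members in datasets.items()}
```

Under these assumptions, datasets C and D keep every target of A and B, respectively, and only shrink the non-target side, which is the pattern seen in the reported dataset sizes.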
Widely studied properties of drug target proteins
All properties (simple sequence properties, primary enzyme commission number, gene ontology terms, subcellular location, signal peptide cleavage, transmembrane helices, PEST regions that are rich in proline (P), glutamic acid (E), serine (S), and threonine (T), and secondary structure) tested in Bakheet and Doig, except for glycosylation, phosphorylation, and subcellular location, were reinvestigated for our four drug target datasets using the same bioinformatics tools and databases.
To analyze post-translational protein modifications (PTMs) more accurately and quantitatively, we used the PhosphoSitePlus database (March 4, 2014), which is a manually curated collection of PTMs. It has collected 212,556 PTM sites, and we used the top three PTM types for this study: phosphorylation (160,338; 75.4%), ubiquitination (34,293; 16.1%), and acetylation (17,925; 8.4%).
Because the Swiss-Prot database annotates the subcellular location of only about 18% of human proteins, we used two additional subcellular localization databases: (1) the manually curated LOCATE database, generated from a high-throughput immunofluorescence-based assay and peer-reviewed literature, and (2) the comprehensively annotated Cell-PLoc database, which uses gene ontology, functional domain, and evolutionary conservation information. As a result, subcellular locations were assigned to about 43% of human proteins; the locations of the remaining proteins were still unknown. For these, we used five prediction programs (CELLO, pTarget, Proteome Analyst, WoLFPSORT, and MultiLoc), and a subcellular location was assigned if it was supported by at least three prediction tools. In this study, we exploited the following ten subcellular location terms used in the LOCATE database: cytoplasm, cytoskeleton, endoplasmic reticulum, extracellular, Golgi apparatus, lysosome, mitochondrion, nucleus, peroxisome, and plasma membrane.
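The rule of accepting a predicted location only when at least three of the five tools agree can be sketched as a simple vote. This is an illustrative sketch; the function name and the input format (a mapping from tool name to predicted location) are assumptions, not the interface of any of the tools named above.

```python
from collections import Counter

def consensus_location(predictions, min_votes=3):
    """Return the location supported by at least `min_votes` tools, else None.

    `predictions` maps a tool name to its predicted location, or to None
    when the tool made no prediction (hypothetical input format).
    """
    votes = Counter(loc for loc in predictions.values() if loc is not None)
    if not votes:
        return None
    location, count = votes.most_common(1)[0]
    return location if count >= min_votes else None

supported = consensus_location({
    "CELLO": "nucleus",
    "pTarget": "nucleus",
    "Proteome Analyst": "cytoplasm",
    "WoLFPSORT": "nucleus",
    "MultiLoc": "cytoplasm",
})
unsupported = consensus_location({
    "CELLO": "nucleus",
    "pTarget": "nucleus",
    "Proteome Analyst": "cytoplasm",
    "WoLFPSORT": "cytoplasm",
    "MultiLoc": "lysosome",
})
```

With five tools and a threshold of three, at most one location can reach the threshold, so no tie-breaking rule is needed.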
Newly proposed properties of drug target proteins
We downloaded the gene annotations for gene families through BioMart in the ENSEMBL database (release 75), and a gene family was defined as a family with at least two members.
Human essential and non-essential genes were obtained from Georgi et al., who exploited genes with lethal and non-lethal phenotypes in the Mouse Genome Database. The dataset included 2472 essential genes and 3811 non-essential genes.
SABLE was used to predict the solvent accessibility of each amino acid in the protein sequences. The SABLE score ranges from 0 to 99; values close to 0 indicate fully buried (i.e., solvent inaccessible) residues and values close to 99 indicate fully exposed (i.e., solvent accessible) residues. We used the average SABLE value of a protein as its solvent accessibility score.
To determine whether the properties differed significantly between hDP + and hDP 0, we performed two statistical tests: (1) a chi-square test and (2) a Wilcoxon rank-sum test for properties measured as discrete and continuous values, respectively.
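The two tests can be illustrated with a small standard-library sketch. It is not the authors' code: the function names are hypothetical, the chi-square test is shown for a 2×2 table without continuity correction, and the rank-sum test uses the normal approximation without a tie correction, so statistical packages that apply such corrections by default may give slightly different values. The example counts are the enzyme counts reported for dataset D in the Results (453 of 792 hDP + and 1211 of 7949 hDP 0).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]] (1 degree of freedom)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

def rank_sum_test(x, y):
    """Wilcoxon rank-sum test, two-sided, normal approximation."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i, n = 0, len(combined)
    while i < n:
        # Find the block of tied values starting at position i.
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    z = (u - n1 * n2 / 2) / math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p_value

# Enzyme vs. non-enzyme counts in hDP + and hDP 0 (dataset D).
chi2, chi2_p = chi_square_2x2(453, 792 - 453, 1211, 7949 - 1211)
# Toy continuous values for two groups (made-up numbers).
u, z, rank_p = rank_sum_test([1, 2, 3], [4, 5, 6])
```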
Predicting drug targets
We predicted drug targets by classifying proteins into two groups: hDP 0 proteins and hDP + proteins. For prediction, the properties of proteins were used as features for two machine learning approaches, SVM and RF, which were implemented with Liblinear and the R package randomForest, respectively. Feature values were scaled to the range from 0 to 1 by calculating $X' = (X - \min_i)/(\max_i - \min_i)$, where $X$ is a value of the $i$th attribute and $\min_i$ and $\max_i$ are, respectively, the minimum and maximum values of the $i$th attribute. When constructing the SVM and RF classifiers, we made the number of proteins in the two groups the same by randomly reducing the number of proteins in hDP 0. To construct the SVM classifier, the L2-regularized L2-loss support vector classification was used. The optimal error parameter (C) and radial bias parameter (ε) were set to 1.3 and 0.01, respectively. For SVM, we chose the parameter C with the “-C” option provided by Liblinear, which repeatedly selects the optimal value with training data. Although the parameter C was recalculated during each cross-validation for all four datasets (A, B, C, and D), the same value was obtained. For the parameter ε, the default value was used. For RF, the size of the random subset of features evaluated at each node was calculated by $mtry = \log_2(\text{number of features} + 1)$, and the number of trees was set to 100. In general, accuracy increases with the number of trees. However, the improvement diminishes as the number of trees becomes large, so that the gain in prediction performance eventually becomes smaller than the cost in computation time of learning the additional trees.
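The preprocessing steps described above (min-max scaling, balancing the two classes by random undersampling, and the mtry rule) can be sketched in plain Python. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, mapping a constant feature to 0.0 and truncating mtry to an integer are assumptions, and the classifiers themselves (Liblinear and randomForest) are not reproduced here.

```python
import math
import random

def min_max_scale(rows):
    """Scale each feature (column) of a list-of-lists matrix to [0, 1]."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    return [
        # Assumption: a constant feature (max == min) is mapped to 0.0.
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]

def undersample_negatives(positives, negatives, seed=0):
    """Randomly keep as many non-targets as there are targets.

    Raises ValueError if there are fewer negatives than positives.
    """
    rng = random.Random(seed)
    return rng.sample(negatives, len(positives))

def mtry(n_features):
    """Number of features tried per split: log2(number of features + 1)."""
    return max(1, int(math.log2(n_features + 1)))

scaled = min_max_scale([[1, 10], [3, 10], [5, 10]])
balanced_negatives = undersample_negatives(["t1", "t2"], ["n1", "n2", "n3", "n4", "n5"])
features_per_split = mtry(31)
```

In a cross-validation setting, the minimum and maximum used for scaling would normally be taken from the training folds only; the text does not state how this was handled, so the sketch scales a single matrix.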
We performed cross-validation to measure the accuracy of the SVM and RF classifiers based on widely used (W) and newly proposed (N) properties. In addition, we performed classification using statistically significant widely used (W ′) and newly proposed (N ′) features. Using only the training datasets, we selected statistically significant features with a p-value less than 0.05 at each cross-validation step. Recall, precision, and F-score were used as measurements: recall = TP/(TP+FN), precision = TP/(TP+FP), and F1 = 2 × recall × precision/(recall + precision), where TP, FN, TN, and FP represent true positive (correctly predicted as hDP +), false negative (incorrectly predicted as hDP 0), true negative (correctly predicted as hDP 0), and false positive (incorrectly predicted as hDP +), respectively.
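The three measures follow directly from the confusion-matrix counts. The sketch below is illustrative: the function name is hypothetical, the counts in the example are made up, and returning 0.0 when a denominator is zero is an assumption that the text does not specify.

```python
def classification_metrics(tp, fp, fn):
    """Return (recall, precision, F1) from true/false positive and false negative counts."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if recall + precision == 0:
        return recall, precision, 0.0
    f1 = 2 * recall * precision / (recall + precision)
    return recall, precision, f1

# Made-up counts: 80 targets found, 20 missed, 20 non-targets wrongly flagged.
recall, precision, f1 = classification_metrics(tp=80, fp=20, fn=20)
```

TN does not enter any of the three formulas, so these measures describe only how well the hDP + class is recovered.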
Results and discussion
Number of proteins for each dataset
Dataset | Number of hDP + | Number of hDP 0
A | 1578 | 17,575
B | 792 | 8361
C | 1578 | 15,691
D | 792 | 7949
Widely used properties of drug target proteins
Because drug metabolism is closely related to enzymes, we examined whether hDP + contain relatively more enzymes than hDP 0 and which enzyme classes are dominant in hDP +. As expected, more than half (453 out of 792, 57.1%) of hDP + are involved in enzyme activity, whereas only 15.2% (1211 out of 7949) of hDP 0 are. All six enzyme classes account for a significantly higher proportion of hDP + than of hDP 0 (Fig. 2b, Additional file 3: Figure S2), which is inconsistent with Bakheet and Doig’s results. This inconsistency might have been caused by the use of distributions among enzymes only rather than proportions of enzymes among all target or non-target proteins.
We next investigated whether hDP + specifically include signal peptide sequences, which play an important role in pharmacokinetics. The frequency of signal peptide sequences in hDP + (347 out of 792) was significantly higher (0.452 vs. 0.226, $P = 1.49 \times 10^{-4}$) than that in hDP 0 (1796 out of 7949), suggesting that hDP + are more likely to be secreted. Thus, we further explored which subcellular locations are preferentially associated with hDP +. Among the top five subcellular locations with a proportion > 10% in hDP +, the plasma membrane, extracellular region, and mitochondrion were significantly favored as hDP + locations. In contrast, hDP 0 were frequently located in the nucleus and cytoplasm (Fig. 2c, Additional file 4: Figure S3).
Newly proposed properties of drug target proteins
Predicting drug targets
Result for drug target protein prediction using machine learning methods
[Table rows: Sets A, B, C, and D, each with feature sets W, W ′, W+N, and W ′ + N ′; the row labels are listed twice, and the performance values are not preserved.]
Bull and Doig and Huang et al. also predicted drug targets. Bull and Doig employed the RF method with protein properties extended from those of Bakheet and Doig, and Huang et al. used the SVM method with the same protein properties as those in Bakheet and Doig. Bull and Doig reported an F-score of 0.8237, and Huang et al. reported a G-mean of 0.7813. Because the datasets used in Bull and Doig, Huang et al., and this study were somewhat different due to different versions of DrugBank, it is hard to directly compare their results with ours. However, the F-scores of our approach incorporating the newly proposed properties were higher than the values from the previous two approaches. In addition, because the approach in Huang et al. was similar to that of our study using dataset A with features of W ′, we can infer that dataset C with features of W ′ + N ′ outperforms the approach in Huang et al.
In this study, we proposed new properties (essentiality, expression pattern, PTMs, and solvent accessibility) for effectively identifying drug target proteins. To this end, we performed a highly controlled in silico study in order to minimize statistical biases due to redundant duplicated genes. Although essential proteins are known to be indispensable to the viability of an organism, and the loss of just one of them is sufficient to lead to lethality or infertility [41, 42], we observed, intriguingly, that drug targetability and protein essentiality are decoupled. We also revealed that druggable proteins have high expression levels and tissue specificity. To investigate whether drug target proteins tend to carry PTMs, we used, unlike previous studies [11, 12], a large manually curated collection of PTMs with protein structure information. For three major types of PTM (phosphorylation, acetylation, and ubiquitination), functional PTM residues are enriched in drug target proteins. We also reassessed the widely used properties of drug target proteins. Using a more comprehensive and refined set of protein properties with more powerful methodologies, we confirmed and extended the findings that drug target proteins (1) are likely to have more hydrophobic residues, fewer polar residues, fewer PEST sequences, no preference in the proportion of small amino acids, greater length, and more signal peptide sequences and (2) are more involved in enzyme catalysis, oxidation and reduction in cellular respiration, and operational genes. To build a classifier distinguishing between drug target and non-drug target proteins, we utilized both the newly proposed properties and the widely used properties, and we achieved higher accuracy than with the widely used properties alone. As a result, we expect that our new properties as well as the extended existing ones will help to infer drug-target interactions more reliably.
This research was supported by the Bio-Synergy Research Project (NRF-2016M3A9C4939665 and NRF-2015M3A9C-4075820) of the Ministry of Science, ICT and Future Planning through the National Research Foundation. The publication charges for this article were funded by the Bio-Synergy Research Project (NRF-2016M3A9C4939665).
Availability of data and materials
The datasets used and/or analysed during the current study are available at http://gcancer.org/drugtarget/.
All the authors shared the responsibility in this paper. BK conducted data collection, analysis, and prediction model experiments on drug target protein properties. JJ analyzed and plotted data on protein properties. CP and HL initiated the study. All the authors participated in writing the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
About this supplement
This article has been published as part of BMC Bioinformatics Volume 18 Supplement 7, 2017: Proceedings of the Tenth International Workshop on Data and Text Mining in Biomedical Informatics. The full contents of the supplement are available online at https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume-18-supplement-7.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Sliwoski G, Kothiwale S, Meiler J, Lowe EW. Computational methods in drug discovery. Pharmacol Rev. 2014; 66(1):334–95.
- Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M. Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics. 2008; 24(13):232–40.
- Campillos M, Kuhn M, Gavin AC, Jensen LJ, Bork P. Drug target identification using side-effect similarity. Science. 2008; 321(5886):263–6.
- Kim S, Jin D, Lee H. Predicting drug-target interactions using drug-drug interactions. PLoS ONE. 2013; 8(11):80129. doi:10.1371/journal.pone.0080129.
- Hopkins AL, Groom CR. The druggable genome. Nat Rev Drug Discov. 2002; 1(9):727–30.
- Imming P, Sinning C, Meyer A. Drugs, their targets and the nature and number of drug targets. Nat Rev Drug Discov. 2006; 5(10):821–34.
- Overington JP, Al-Lazikani B, Hopkins AL. How many drug targets are there? Nat Rev Drug Discov. 2006; 5(12):993–6.
- Zheng C, Han L, Yap C, Ji Z, Cao Z, Chen Y. Therapeutic targets: progress of their exploration and investigation of their characteristics. Pharmacol Rev. 2006; 58(2):259–79.
- Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J. Drugbank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res. 2006; 34(suppl 1):668–72.
- Magrane M, Consortium U, et al. Uniprot knowledgebase: a hub of integrated protein data. Database. 2011; 2011:009.
- Bakheet TM, Doig AJ. Properties and identification of human protein drug targets. Bioinformatics. 2009; 25(4):451–7.
- Bull SC, Doig AJ. Properties of protein drug target classes. PLoS ONE. 2015; 10(3):0117955. doi:10.1371/journal.pone.0117955.
- Huang C, Zhang R, Chen Z, Jiang Y, Shang Z, Sun P, Zhang X, Li X. Predict potential drug targets from the ion channel proteins based on svm. J Theor Biol. 2010; 262(4):750–6.
- Knox C, Law V, Jewison T, Liu P, Ly S, Frolkis A, Pon A, Banco K, Mak C, Neveu V, et al. Drugbank 3.0: a comprehensive resource for ‘omics’ research on drugs. Nucleic Acids Res. 2011; 39(suppl 1):1035–41.
- Hornbeck PV, Kornhauser JM, Tkachev S, Zhang B, Skrzypek E, Murray B, Latham V, Sullivan M. Phosphositeplus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res. 2011; 40(D1):D261–70.
- Sprenger J, Fink JL, Karunaratne S, Hanson K, Hamilton NA, Teasdale RD. Locate: a mammalian protein subcellular localization database. Nucleic Acids Res. 2008; 36(suppl 1):230–3.
- Chou KC, Shen HB. Cell-ploc: a package of web servers for predicting subcellular localization of proteins in various organisms. Nat Protoc. 2008; 3(2):153–62.
- Sprenger J, Fink JL, Teasdale RD. Evaluation and comparison of mammalian subcellular localization prediction methods. BMC Bioinforma. 2006; 7(Suppl 5):3.
- Cunningham F, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S, et al. Ensembl 2015. Nucleic Acids Res. 2015; 43(D1):D662–9.
- Georgi B, Voight BF, Bućan M. From mouse to human: evolutionary genomics analysis of human orthologs of essential genes. PLoS Genet. 2013; 9(5):1003484.
- Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci USA. 2004; 101(16):6062–7.
- Liao BY, Scott NM, Zhang J. Impacts of gene essentiality, expression pattern, and gene compactness on the evolutionary rate of mammalian proteins. Mol Biol Evol. 2006; 23(11):2072–80.
- Chen SC-C, Chen FC, Li WH. Phosphorylated and nonphosphorylated serine and threonine residues evolve at different rates in mammals. Mol Biol Evol. 2010; 27(11):2548–54.
- Fan RE, Chang KW, Hsieh CJ, Wang XR, Lin CJ. Liblinear: A library for large linear classification. J Mach Learn Res. 2008; 9(Aug):1871–4.
- Oshiro TM, Perez PS, Baranauskas JA. How many trees in a random forest? In: Perner P, editor. Machine Learning and Data Mining in Pattern Recognition. MLDM, Lecture Notes in Computer Science, vol 7376. Berlin: Springer: 2012. p. 154–68.
- Rice P, Longden I, Bleasby A. Emboss: The european molecular biology open software suite. Trends Genet. 2000; 16(6):276–7.
- Rogers S, Wells R, Rechsteiner M. Amino acid sequences common to rapidly degraded proteins: the pest hypothesis. Science. 1986; 234(4774):364–8.
- Copeland RA, Harpel MR, Tummino PJ. Targeting enzyme inhibitors in drug discovery. Expert Opin Ther Targets. 2007; 11(7):967–78.
- Giacomini KM, Huang SM, Tweedie DJ, Benet LZ, Brouwer KL, Chu X, Dahlin A, Evers R, Fischer V, Hillgren KM, et al. Membrane transporters in drug development. Nat Rev Drug Discov. 2010; 9(3):215–36.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. Nat Genet. 2000; 25(1):25–9.
- Huang DW, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using david bioinformatics resources. Nat Protoc. 2008; 4(1):44–57.
- Rivera MC, Jain R, Moore JE, Lake JA. Genomic evidence for two functionally distinct gene classes. Proc Natl Acad Sci. 1998; 95(11):6239–44.
- Grotenbreg G, Ploegh H. Chemical biology: dressed-up proteins. Nature. 2007; 446(7139):993–5.
- Geiss-Friedlander R, Melchior F. Concepts in sumoylation: a decade on. Nat Rev Mol Cell Biol. 2007; 8(12):947–56.
- Wang YC, Peterson SE, Loring JF. Protein post-translational modifications and regulation of pluripotency in human stem cells. Cell Res. 2014; 24(2):143–60.
- Walsh CT. Protein phosphorylation by protein kinases. Posttranslational modification of proteins: Expanding nature’s inventory. Englewood: Roberts and Company Publishers; 2006.
- Lu CT, Huang KY, Su MG, Lee TY, Bretaña NA, Chang WC, Chen YJ, Chen YJ, Huang HD. Dbptm 3.0: an informative resource for investigating substrate site specificity and functional association of protein post-translational modifications. Nucleic Acids Res. 2013; 41(D1):D295–305.
- Li J, Jia J, Li H, Yu J, Sun H, He Y, Lv D, Yang X, Glocker MO, Ma L, et al. Sysptm 2.0: an updated systematic resource for post-translational modification. Database. 2014; 2014:025.
- Zielinska DF, Gnad F, Wiśniewski JR, Mann M. Precision mapping of an in vivo n-glycoproteome reveals rigid topological and sequence constraints. Cell. 2010; 141(5):897–907.
- Landry CR, Levy ED, Michnick SW. Weak functional constraints on phosphoproteomes. Trends Genet. 2009; 25(5):193–7.
- He X, Zhang J. Why do hubs tend to be essential in protein networks. PLoS Genet. 2006; 2(6):88.
- Yıldırım MA, Goh KI, Cusick ME, Barabási AL, Vidal M. Drug—target network. Nat Biotechnol. 2007; 25(10):1119–26.