Cluster analysis of networks generated through homology: automatic identification of important protein communities involved in cancer metastasis
© Jonsson et al; licensee BioMed Central Ltd. 2006
Received: 6 October 2005
Accepted: 6 January 2006
Published: 6 January 2006
Protein-protein interactions have traditionally been studied on a small scale, using classical biochemical methods to investigate the proteins of interest. More recently large-scale methods, such as two-hybrid screens, have been utilised to survey extensive portions of genomes. Current high-throughput approaches have a relatively high rate of errors, whereas in-depth biochemical studies are too expensive and time-consuming to be practical for extensive studies. As a result, there are gaps in our knowledge of many key biological networks, for which computational approaches are particularly suitable.
We constructed networks, or 'interactomes', of putative protein-protein interactions in the rat proteome – the rat being an organism extensively used for cancer studies. This was achieved by integrating experimental protein-protein interaction data from many species and translating this data into the reference frame of the rat. The putative rat protein interactions were given confidence scores based on their homology to proteins that have been experimentally observed to interact. The confidence score was furthermore weighted according to the extent of the experimental evidence, giving a higher weight to more frequently observed interactions. The scoring function was subsequently validated and networks constructed around key proteins, identified as being highly up- or down-regulated in rat cell lines of high metastatic potential. Using clustering methods on the networks, we have identified key protein communities involved in cancer metastasis.
The protein network generation and subsequent network analysis used here, were shown to be useful for highlighting key proteins involved in metastasis. This approach, in conjunction with microarray expression data, can be extended to other species, thereby suggesting possible pathways around proteins of interest.
Microarray experiments provide information about gene expression within the cells under study.
Expression patterns can be uncovered from large-scale microarray data by systematically grouping genes with the help of clustering methods. Co-clustering of genes can indicate that the genes in question have a similar function or that they participate in the same cellular process [1, 2]. Nevertheless, microarray experiments typically yield hundreds of significantly differentially-expressed genes, making it difficult to draw biological conclusions. Furthermore, although microarray experiments can show correlations between the expression of genes, they do not reveal the exact protein interaction mechanism.
Protein network analysis is dependent on a reliable assignment of protein-protein interactions. Protein-protein interactions are commonly studied using biochemical methods, and several different experimental methods are currently in use. Two-hybrid screens have, to date, yielded the bulk of available data [3, 4]; however their level of accuracy is not particularly high and should be supported by additional evidence [5, 6]. Advances in other techniques, such as tandem-affinity purification and mass spectroscopy, have also made large-scale studies increasingly feasible [7, 8].
A number of computational methods, either based on sequence or structural features, have been developed to complement experimental approaches to predicting protein-protein interactions [9, 10]. An increasing emphasis has been on deducing and exploring the protein-protein interaction networks that are reflected in expression data; gene networks have been inferred from gene expression data using mathematical analysis such as Bayesian regression [11–14]. Moreover, networks have been derived by complementing gene expression data with data from different sources, such as gene ontologies, phenotypic profiling and functional similarities [15–18].
Alternative techniques to network construction have also been taken, see e.g. Cabusora et al. , where a protein interaction map was created based upon the principle that interacting protein modules in one organism may be fused into a single chain in another, and Calvano et al.  who constructed the network by literature searches for information pertaining to interacting protein pairs from closely related organisms. These methods do not utilise gene expression explicitly in the network generation, rather the expression data is used as a tool to focus on the network.
Previous studies have mapped expression data of different systems onto experimentally-based networks. Ideker et al.  used gene expression changes in response to perturbation to highlight clusters within a yeast network, and Sohler et al.  made use of statistical analysis to highlight significant sub-clusters, also within a yeast network. Moreover, the dynamic aspect of yeast networks have been highlighted by de Lichtenberg and coworkers , who combined temporal cell cycle expression data with protein-protein interaction networks.
Here we have taken an extensive multi-genome approach, utilising a homology-based method for predicting interacting proteins  and further extended it by developing a scoring function, based upon sequence similarity and the amount of experimental data supporting each interaction. This scoring function has subsequently been extensively validated. In contrast to the above methodologies we go beyond data integration by considering orthologous relationships and are therefore able to create a more extensive protein interaction network – or 'interactome' – for a higher eukaryote, the rat.
In order to demonstrate the utility of our predicted interactions, expression data on tumour progression resulting in rat sarcomas with high metastatic potential were mapped onto our interactome, creating protein networks around key proteins involved in the metastatic process.
Results and Discussion
Validation of the scoring function
The protein networks are composed of predicted individual interactions, each of which is assigned a score which indicates the strength of the prediction. Before examining the networks in detail it is necessary to assess the quality of the predictions and to validate the method.
Selection of cut-off value for the scoring function
Identification of highly reliable interactions
Many methods for detecting protein-protein interactions can yield either false positive or false negative results, but X-ray crystal structures of complexed proteins can be considered to be a gold standard for proof. To examine the validity of our scoring function we looked at the interaction scores of rat proteins that have either been crystallised together in a complex or have a very high homology to one that has been. These scores were then compared to ones without any crystallographic evidence, i.e. those that do not interact or have not been proven to do so by crystallography.
We found that highly reliable interactions, identified by X-ray crystallography, score higher than those without crystallographic evidence, with median scores 128 and 16 respectively. This difference was significant according to a χ2-test (p ≪ 0.0001), indicating that true interactions score higher than those whose association has not been confirmed by crystallography.
Distribution of protein-protein interaction scores. Interaction scores of X-ray crystal structures (n = 377) compared to the scores of all (genome-wide) predicted interactions.
Percentage of interactions
Interaction score 0 – 10
Interaction score > 10
X-ray crystal structures
Community participation and cellular localisation
Another way of estimating the quality of the scoring function is to look at proteins participating in the same cellular process and compare them with proteins that are not thought to interact directly in a pathway. We used a clique percolation method to identify 'communities' within the network that show high interconnectivity. This yielded 37 communities of tightly interconnected proteins that will be described later. One can assume that interactions within communities are more likely to be true than interactions between communities, i.e. higher scores would be expected for intra-community interactions . We found this to be true; the average score for interactions within a community was 26.2 (n = 2038) and the average score for interactions between communities was 13.5 (n = 502). This is significant at a 95% confidence level (p = 3.1 × 10-30).
Lastly, the protein interaction scores were examined in the context of cellular localisation. We assume that for true interactions, interacting proteins would co-localise in the same cellular compartment, at least during the time of interaction, and thus would expect predicted interactions between proteins in separate cellular compartments to be less reliable and receive a lower score. Localisation data from the Gene Ontology Consortium  were used, where available, for proteins within the thirty-seven protein communities. Of the protein interactions predicted, 681 (94%) were considered co-localised, with an average score of 25.8 and 41 (6%) were annotated as not sharing cellular localisation, with an average score of 13.1. The score difference is statistically significant (p = 0.001 at a 95% confidence level).
Metastatic network communities identified by cluster analysis
Metastasis is a key event that is associated with a poor prognosis in cancer patients. Metastasising cancer cells have the ability to break away from the primary tumour and move to different organs, making the cancer more difficult to treat. Much is unknown about the molecular biology of metastasis, but it culminates in the cancerous cells acquiring several properties, such as increased motility and invasiveness. This involves a network of cascading protein-protein interactions which have to be unravelled if an effective treatment is to be developed.
As a starting point, we used data from a microarray analysis of cell lines with different metastatic potentials (see Methods). We took the highest up- and down-regulated genes (≥ 4-fold up- or down-expression), and constructed networks around these, extending two generations from the starting point, i.e. initially including proteins that interact directly with the originating protein and then going on to include the proteins that interact with them. This subset of the rat interactome contained 10,628 interactions. We then performed a cluster analysis in order to highlight areas in the protein networks that are involved in the metastatic process. The clustering is based on a clique percolation method  that seeks to identify 'communities' of highly interconnected proteins that make up the essential structural units of the networks.
The community definition is based on the observation that a typical member in a community is linked to many, but not necessarily all other nodes in the community. In other words, a community can be regarded as a union of smaller, complete, fully-connected subgraphs that share nodes (see Methods section). Palla et al.  have shown that clique clustering analysis is a powerful tool to identify communities of proteins participating in the same cellular processes. Furthermore, it has been shown that subnetworks of proteins involved in a defined cellular process are more heavily interconnected by direct protein interaction than would be expected by chance . Highly connected proteins are also more likely to be essential to cellular processes .
Domain frequency within the clustered communities. The table shows the most frequently observed domains in the metastasis-related cluster communities (observed frequencies) alongside the expected domain frequencies, based on the domain composition of the whole rat genome. The n-fold difference was calculated from the frequency percentages (numbers within parentheses).
Observed frequency (%)
Expected frequency (%)
IQ calmodulin-binding motif
Protein kinase domain
Calponin homology (CH) domain
Proteasome A-type and B-type
Transforming growth factor β-like domain
The intracellular signalling cascade
It is not the aim here to explore every member of each community, instead, with the automatic identification of metastatic-related protein communities being the primary focus, we will illustrate the value of our approach by describing a key section of the regulation pathway. The intracellular signalling cascade constitutes the head of a chain of communities (Figure 3), and as such warrants a closer investigation.
Three separate groups of proteins are distinguishable: vascular endothelial growth factors (Vegfa, Vegfc, Figf) and the receptor (Kdr), which play a principle role in tumour progression and angiogenesis  and which have also been associated with tumour metastasis ; insulin-like growth factors and receptors (Igf1, Igf1r and Grb 7/14); and JAK/STAT proteins (JAK2, STATSb).
The figure shows the three ligands, Vegfa and Vegfc and Figf, at different levels of expression, all of which can bind to kinase insert domain protein receptor Kdr, a VEGF receptor, which in turn induces mitogenesis and differentation of vascular endothelial cells .
The interaction between Kdr and Socs1, an SH2 domain-containing suppressor of cytokine signaling 1, is plausible as Kdr has a tyrosine protein kinase domain which in a mouse homologue has been shown to interact with Socs1 . Furthermore, up-regulation of Socsl has been linked with the suppression of cytokine signalling and the JAK/STAT inflammatory signalling [36–38], which is shown here further down the network; here also, Socs1 is up-regulated and JAK/STAT down-regulated.
The proposed Ptpn11-Lck interaction is based on orthology to an interaction between Ptprc and Lck in mouse. Ptpn11 and Ptprc both have tyrosine specific protein phosphatase activity. Ptpn11 is phosphorylated by tyrosine protein kinases, contains two SH2 domains and therefore could be phosphorylated by Lck.
Higher up the network are the insulin-like growth factor 1 and its receptor (Igf1 and Igf1r, respectively) which are highly implicated in different cancers [39–41]. The insulin-like growth factors are involved in several cellular processes, such as regulation of proliferation, migration, survival, size control, and differentiation [42–45]. Igf1r is overexpressed in most malignant tumours, where it functions as an anti-apoptotic agent by enhancing cell survival. Igf1 has also been shown to enhance adhesion and motility of cancer cells [46, 47]; however, the exact role of Igf1r in the metastatic process has not been established. The network shown here suggests a link between the insulin-like growth factor receptor and the vascular endothelial growth factors through the highly up-regulated phospholipase delta 4 (Plcd4) and phospholipase gamma 1/2 (Plcg 1/2). The Plcg 1/2 and Igf1r interaction is based on the fact that the phospholipase has been shown to interact with insulin receptor, a close homologue of the insulin-like receptor.
Another distinguishing feature in the network is the highly down-regulated protein tyrosine phosphatase (Ptpn13). It has been reported that a protein tyrosine phosphatase, Ptp61F, negatively regulates the JAK/STAT pathway in Drosophila melanogaster . Our networks suggest that the signalling protein tyrosine phosphatase, Ptpn13, may act on the JAK/STAT pathway similarly, through the dephosphorylation of the growth hormone receptor Ghr.
The few examples shown here illustrate the value of the approach in terms of revealing potential pathways and interactions that play a part in cancer metastasis, but further experimental work will be needed to confirm the validity of these predictions.
Network view of gene expression
Extracting meaningful information from microarray expression data is often difficult, especially when looking at a complex process involving a large number of genes and unknown mechanisms. Clustering of genes may be of use when trying to find genes in a common pathway and genes with related function, but it often has limitations, such as in identifying negative feedback loops . Furthermore, even if key proteins are highlighted through microarray analysis, the expression data rarely reveals all proteins involved in a particular pathway.
The connectivity of up- and down-regulated proteins. Observed and expected frequencies of pairwise protein interactions, categorised by their expression: N-N (non-expressed protein interacting with non-expressed protein), U-U (up-regulated protein interacting with up-regulated protein), D-D (down-regulated protein interacting with down-regulated protein) and U-D (up-regulated interacting with down-regulated). For the purpose of the classification, up-regulated proteins are those up-regulated more than 20% and down-regulated proteins down-regulated more than 20%. Expected values were calculated based on a random distribution of the expression data on the network (p < 0.001 for a χ2-test).
Expression data has previously been put into a network context, for example by incorporating gene ontology data  and protein interactions , but here we generated the networks first, mapped the expression on top, and then performed a clustering. This approach allows us to bypass some of the obstacles involved in traditional microarray analysis, such as clustering of gene expression patterns; as demonstrated here, interactions of up-up and down-down regulated genes are not necessarily co-localised. To focus on the parts of the genome-wide interaction network relevant to metastasis we first selected subnetworks around highly up- and down-regulated genes and then utilised the clique method, which highlights hubs of highly interconnected protein communities. This allows us to examine the most complex parts of the network but as a result simple linear pathways do not get included.
Although we believe the general approach is of value in protein network analysis, there remain some shortcomings. Most importantly, transient protein-protein interactions are unlikely to be captured by our approach. This is a direct consequence of transient not being as well documented as non-transient interactions. Moreover, the method cannot distinguish between true and false positives, for which there is limited experimental data – however, these problems will be alleviated as more high-throughput proteomic studies are completed.
The system-level approach taken here, combining information on how proteins interact with each other and how genes are expressed, is a particularly appealing way to gain understanding of complex biological processes, such as metastasis. Although not discussed here in great detail several interesting groups of interactions, at the domain level, have been highlighted as potentially important players in the metastatic process. Further dissection of these is the subject of ongoing studies and consequently to be confirmed experimentally.
This method of using homologous protein interaction data to infer protein-protein information could be particularly useful for proteins for which there is no definite binding partner information. Moreover the relative expression levels of neighbouring proteins may prove an important consideration, when protein networks are to be subsequently modulated in conjunction with disease analysis, for example by targeting the expression of a particular gene by short interfering RNA (siRNA) .
The approaches described in this work are readily transferable to other species and cellular processes.
Protein interaction prediction
In order to identify homologous interaction pairs for which there is experimental data, BLAST searches were run for the rat genome  against all proteins in the DIP  and MIPS Mammalian Protein-Protein Interaction databases .
Experimental sources for building the interactome. Summary of the experiments used as a foundation for building the interactome, from most frequent (top) to least frequent (bottom). The percentage of the total is listed after each value.
Two hybrid test
Tandem Affinity Purification (TAP)
In vitro binding
Gel filtration chromatography
In vivo kinase activity assay
Gel retardation assays
Native gel electrophoresis
The score, S, was calculated for each putative interaction according to the following:
where and are sequence similarity bit scores to proteins a i and b i , respectively, which have experimentally been shown to interact; n is the number of experiments linking protein a i to protein b i ; and N is the total number of instances where the same pair of proteins is identified as interacting through different homologues. As mentioned in the Introduction, two-hybrid experiments are prone to giving false-positive results. Although most of the interactions created here are derived through yeast two-hybrid links, it has been shown that confidence is higher for interactions detected in multiple independent yeast two-hybrid experiments . This fact is reflected in the additive nature of the score, where a protein interaction that shows up repeatedly in independent two-hybrid experiments gets a higher score.
In order to test the scoring function, we created a subset of data from the RCSB Protein Data Bank  that specifically concentrates on stable functional protein interactions, rather than transient. Protein chains with high sequence homology to the Norwegian rat were considered (e ≤ 1 × 10-10). We distinguished biologically functional complexes (where multimeric protein chains are permanently bound and essential for the complex function) from transient ones (where protein chains may be bound to a complex but may also act as a separate functional protein on its own), by applying a method proposed by Ofran and Rost . This yielded 377 binary chain interactions.
Gene ontology cellular compartments. A simplified representation of gene ontology cellular compartments. Protein accessibility between compartments is represented by ones and zeros: the former indicates the possibility of interaction between respective compartments and the latter excludes any interactions.
Creation of networks around up/down-regulated genes
The number of protein communities at different clustering threshold values. The number of protein communities vary as the k-value for clustering is changed. The table shows the total number of separate protein communities for each k-value.
Clustering threshold value
Number of protein communities
k = 3
k = 4
k = 5
k = 6
k = 7
k = 8
k = 9
k = 10
k = 11
Microarray expression data for metastatic rat cells
To investigate genes that may be important in the development of metastases, we used a rat sarcoma model in which the cell populations K2, T15, A297 and A311 have 0, 40, 90 and 100% incidence of metastasis, respectively. We performed Affymetrix microarray analysis on the four cell populations and the primary tumours that formed when the cells were injected subcutaneously into rats. All experiments were performed in triplicate, using Affymetrix rat 230A GeneChip oligonucleotide arrays . Total RNA was extracted from each sample and used to prepare biotinylated target RNA; 10 μ g of RNA was used to generate first-strand cDNA by using a T7-linked oligo(dT) primer. After second-strand synthesis, in vitro transcription was performed with biotinylated UTP and CTP (Enzo Diagnostics), resulting in approximately 100-fold amplification of RNA. A complete description of the procedures is included in The Paterson Institute's Affymetrix GeneChip systems protocols .
The target cRNA generated from each sample was processed as per the manufacturer's recommendation using an Affymetrix GeneChip Instrument System . Briefly, spike controls were added to 10 μ g fragmented cDNA before overnight hybridisation, arrays were washed and stained with streptavidin-phycoerythrin, and scanned on an Affymetrix GeneChip scanner. The procedure is further described in The Paterson Institute's RNA Hybridisation protocols . The median fluorescence intensity value of each GeneChip was calculated and used to normalise the chips. Gene expression was considered in terms of fold-changes between non-metastatic and the median of the three metastatic samples.
This work was funded by Cancer Research UK. The authors would like to thank members of the Biomolecular Modelling Laboratory, Ian Kerr and Holger Gerhardt at Cancer Research UK for helpful discussions.
- Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. P Natl Acad Sci USA 1998, 95: 14863–14868. 10.1073/pnas.95.25.14863View ArticleGoogle Scholar
- Niehrs C, Pollet N: Synexpression groups in eukaryotes. Nature 1999, 402: 483–487. 10.1038/990025View ArticlePubMedGoogle Scholar
- Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon D, Narayan V, Srinivasan M, Pochart P: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 2000, 403: 623–627. 10.1038/35001009View ArticlePubMedGoogle Scholar
- Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K: Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 2002, 415: 180–183. 10.1038/415180aView ArticlePubMedGoogle Scholar
- Sprinzak E, Sattath S, Margalit H: How Reliable are Experimental Protein-Protein Interaction Data? J Mol Biol 2003, 327: 919–923. 10.1016/S0022-2836(03)00239-0View ArticlePubMedGoogle Scholar
- Bader GD, Hogue CWV: Analyzing yeast protein-protein interaction data obtained from different sources. Nat Biotechnol 2002, 20: 991–997. 10.1038/nbt1002-991View ArticlePubMedGoogle Scholar
- Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 2002, 415: 141–147. 10.1038/415141aView ArticlePubMedGoogle Scholar
- Mann M, Hendrickson RC, Pandey A: Analysis of proteins and proteomes by mass spectrometry. Annu Rev Biochem 2001, 70: 437–473. 10.1146/annurev.biochem.70.1.437View ArticlePubMedGoogle Scholar
- Park J, Lappe M, Teichmann SA: Mapping Protein Family Interactions: Intramolecular and Intermolecular Protein Family Interaction Repertoires in the PDB and Yeast. J Mol Biol 2001, 307: 329–938. 10.1006/jmbi.2001.4526View ArticleGoogle Scholar
- Valencia A, Pazos F: Computational methods for the prediction of protein interactions. Curr Opin Struc Biol 2002, 12: 368–373. 10.1016/S0959-440X(02)00333-0View ArticleGoogle Scholar
- Bader JS, Chaudhuri A, Rothberg JM, Chant J: Gaining confidence in high-throughput protein interaction networks. Nat Biotechnol 2004, 22: 78–85. 10.1038/nbt924View ArticlePubMedGoogle Scholar
- Brazhnik P, de la Fuente A, Mendes P: Gene networks: how to put the function in genomics. Trends Biotechnol 2002, 20: 467–472. 10.1016/S0167-7799(02)02053-XView ArticlePubMedGoogle Scholar
- Rogers S, Girolami M: A Bayesian regression approach to the inference of regulatory networks from gene expression data. Bioinformatics 2005, 21: 3131–3137. 10.1093/bioinformatics/bti487View ArticlePubMedGoogle Scholar
- Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, Emili A, Snyder M, Greenblatt JF, Gerstein M: A Bayesian Networks Approach for Predicting Protein-Protein Interactions. Science 2002, 302: 449–453. 10.1126/science.1087361View ArticleGoogle Scholar
- Jansen R, Lan N, Qian J, Gerstein M: Integration of genomic datasets to predict protein complexes in yeast. J Struct Funct Genomics 2002, 2: 71–81. 10.1023/A:1020495201615View ArticlePubMedGoogle Scholar
- Gunsalus KC, Ge H, Schetter AJ, Goldberg DS, Han JD, Hao T, Berriz GF, Bertin N, Huang J, Chuang LS, Li N, Mani R, Hyman AA, Sonnichsen B, Echeverri CJ, Roth FP, Vidal M, Piano F: Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. Nature 2005, 436: 861–865. 10.1038/nature03876View ArticlePubMedGoogle Scholar
- Lu LJ, Xia Y, Paccanaro A, Yu H, Gerstein M: Assessing the limits of genomic data integration for predicting protein networks. Genome Res 2005, 15: 945–953. 10.1101/gr.3610305PubMed CentralView ArticlePubMedGoogle Scholar
- Rhodes DR, Tomlins SA, Varambally S, Mahavisno V, Barrette T, Kalyana-Sundaram S, Ghosh D, Pandey A, Chinnaiyan AM: Probabilistic model of the human protein-protein interaction network. Nat Biotechnol 2005, 23: 951–959. 10.1038/nbt1103View ArticlePubMedGoogle Scholar
- Cabusora L, Sutton E, Fulmer A, Forst CV: Differential network expression during drug and stress response. Bioinformatics 2005, 21: 2898–2905. 10.1093/bioinformatics/bti440View ArticlePubMedGoogle Scholar
- Calvano SE, Xiao W, Richards DR, Felciano RM, Baker HV, Cho RJ, Chen RO, Brownstein BH, Cobb JP, Tschoeke SK, Miller-Graziano C, Moldawer LL, Mindrinos MN, Davis RW, Tompkins RG, Lowry SF: A network-based analysis of systemic inflammation in humans. Nature 2005, 437(7061):1032–7. 10.1038/nature03985View ArticlePubMedGoogle Scholar
- Ideker T, Ozier O, Schwikowski B, Siegel AF: Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 2002, 18: S233-S240. 10.1093/bioinformatics/18.suppl_1.S233View ArticlePubMedGoogle Scholar
- Sohler F, Hanisch D, Zimmer R: New methods for joint analysis of biological networks and expression data. Bioinformatics 2004, 20: 1517–1521. 10.1093/bioinformatics/bth112View ArticlePubMedGoogle Scholar
- de Lichtenberg U, Jensen LJ, Brunak S, Bork P: Dynamic complex formation during the yeast cell cycle. Science 2005, 307: 724–727. 10.1126/science.1105103View ArticlePubMedGoogle Scholar
- Goffard N, Garcia V, Iragne F, Groppi A, de Daruvar A: IPPRED:server for proteins interactions inference. Bioinformatics 2003, 19: 903–904. 10.1093/bioinformatics/btg091View ArticlePubMedGoogle Scholar
- PIP: Potential Interactions of Proteins[http://www.bmm.icnet.uk/~pip]
- Aloy P, Pichaud M, Russell RB: Protein complexes: structure prediction challenges for the 21st century. Curr Opin Struc Biol 2005, 15: 15–22. 10.1016/j.sbi.2005.01.012View ArticleGoogle Scholar
- Palla G, Derényi I, Farkas I, Vicsek T: Uncovering the overlapping community structure of complex networks in nature and society. Nature 2005, 435: 814–818. 10.1038/nature03607View ArticlePubMedGoogle Scholar
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000, 25: 25–29. 10.1038/75556PubMed CentralView ArticlePubMedGoogle Scholar
- Derenyi I, Palla G, Vicsek T: Clique percolation in random networks. Phys Rev Lett 2005, 94: 160202. 10.1103/PhysRevLett.94.160202View ArticlePubMedGoogle Scholar
- Jeong H, Mason SP, Barabasi AL, Oltvai ZN: Lethality and centrality in protein networks. Nature 2001, 411: 41–42. 10.1038/35075138View ArticlePubMedGoogle Scholar
- Contreras-Moreira B, Bates PA: Domain fishing: a first step in protein comparative modelling. Biomformatics 2002, 18: 1141–1142. 10.1093/bioinformatics/18.8.1141View ArticleGoogle Scholar
- Ferrara N, Gerber HP, LeCouter J: The biology of VEGF and its receptors. Nature Med 2003, 9: 669–676. 10.1038/nm0603-669View ArticlePubMedGoogle Scholar
- Hirakawa S, Kodama S, Kunstfeld R, Kajiya K, Brown LF, Detmar M: VEGF-A induces tumor and sentinel lymph node lymphangiogenesis and promotes lymphatic metastasis. J Exp Med 2005, 201: 1089–1099. 10.1084/jem.20041896PubMed CentralView ArticlePubMedGoogle Scholar
- Takahashi T, Ueno H, Shibuya M: EGF activates protein kinase C-dependent, but Ras-independent Raf-MEK-MAP kinase pathway for DNA synthesis in primary endothelial cells. Oncogene 1999, 18: 2221–2230. 10.1038/sj.onc.1202527View ArticlePubMedGoogle Scholar
- Bourette RP, De Sepulveda P, Arnaud S, Dubreuil P, Rottapel R, Mouchiroud G: Suppressor of cytokine signaling 1 interacts with the macrophage colony-stimulating factor receptor and negatively regulates its proliferation signal. J Biol Chem 2001, 276: 22133–22139. 10.1074/jbc.M101878200View ArticlePubMedGoogle Scholar
- Alexander WS, Hilton DJ: The role of suppressors of cytokine signaling (SOCS) proteins in regulation of the immune response. Annu Rev Immunol 2004, 22: 503–529. 10.1146/annurev.immunol.22.091003.090312View ArticlePubMedGoogle Scholar
- Park EJ, Park SY, Joe EH, Jou I: 15d-PGJ2 and rosiglitazone suppress Janus kinase-STAT inflammatory signaling through induction of suppressor of cytokine signaling 1 (SOCS1) and SOCS3 in glia. J Biol Chem 2003, 278: 14747–14752. 10.1074/jbc.M210819200View ArticlePubMedGoogle Scholar
- Ali S, Nouhi Z, Chughtai N, Ali S: SHP-2 regulates SOCS-1-mediated Janus kinase-2 ubiquitination/degradation downstream of the prolactin receptor. J Biol Chem 2003, 278: 52021–52031. 10.1074/jbc.M306758200View ArticlePubMedGoogle Scholar
- Furukawa M, Raffeld M, Mateo C, Sakamoto A, Moody TW, Ito T, Venzon D, Serrano J, Jensen R: Increased expression of insulin-like growth factor I and/or its receptor in gastrinomas is associated with low curability, increased growth, and development of metastases. Clin Cancer Res 2005, 11: 3233–3242. 10.1158/1078-0432.CCR-04-1915View ArticlePubMedGoogle Scholar
- Hofmann F, García-Echeverríaon C: Blocking insulin-like growth factor-I receptor as a strategy for targeting cancer. Drug Discov Today 2005, 10: 1041–1047. 10.1016/S1359-6446(05)03512-9View ArticlePubMedGoogle Scholar
- All-Ericsson C, Girnita L, Seregard S, Bartolazzi A, Jager MJ, Larsson O: Insulin-like growth factor-1 receptor in uveal melanoma: a predictor for metastatic disease and a potential therapeutic target. Invest Ophthalmol Vis Sci 2002, 43: 1–8.PubMedGoogle Scholar
- LeRoith D, Werner H, Beitner-Johnson D, Roberts CT: Molecular and cellular aspects of the insulin-like growth factor I receptor. Endocr Rev 1995, 16: 143–163. 10.1210/er.16.2.143View ArticlePubMedGoogle Scholar
- Yenush L, White MF: The IRS-signalling system during insulin and cytokine action. Bioessays 1997, 19: 491–500. 10.1002/bies.950190608View ArticlePubMedGoogle Scholar
- Massagué J, Czech MP: The Subunit Structures of Two Distinct Receptors for Insulin-like Growth Factors I and I1 and Their Relationship to the Insulin Receptor. J Biol Chem 1982, 257: 5038–5045.PubMedGoogle Scholar
- Ullrich A, Gray A, Tam AW, Yang-Feng T, Tsubokawa M, Collins C, Henzel W, Le Bon T, Kathuria S, Chen E: Insulin-like growth factor I receptor primary structure: comparison with insulin receptor suggests structural determinants that define functional specificity. EMBO J 1986, 5: 2503–2512.PubMed CentralPubMedGoogle Scholar
- Dunn SE, Ehrlich M, Sharp NJ, Reiss K, Solomon G, Hawkins R, Baserga R, Barrett JC: A dominant negative mutant of the insulin-like growth factor-I receptor inhibits the adhesion, invasion, and metastasis of breast cancer. Cancer Res 1998, 58: 3353–3361.PubMedGoogle Scholar
- Andre F, Janssens B, Bruyneel E, van Roy F, Gespach C, Mareel M, Bracke M: Alpha-catenin is required for IGF-I-induced cellular migration but not invasion in human colonic cancer cells. Oncogene 2004, 23: 1177–1186. 10.1038/sj.onc.1207238View ArticlePubMedGoogle Scholar
- Müller P, Kuttenkeuler D, Gesellchen V, Zeidler MP, Boutros M: Identification of JAK/STAT signalling components by genome-wide RNA interference. Nature 2005, 436: 871–875. 10.1038/nature03869View ArticlePubMedGoogle Scholar
- Armstrong NJ, van de Wiel MA: Microarray data analysis: from hypotheses to conclusions using gene expression data. Cell Oncol 2004, 26: 279–290.PubMedGoogle Scholar
- Segal E, Wang H, Koller D: Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics 2003, 19: i264-i272. 10.1093/bioinformatics/btg1037View ArticlePubMedGoogle Scholar
- Karagiannis TC, El-Osta A: RNA interference and potential theraputic applications of short interfering RNAs. Cancer Gene Ther 2005, 12: 787–795. 10.1038/sj.cgt.7700857View ArticlePubMedGoogle Scholar
- Pruitt KD, Tatusova T, Maglott DR: NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res 2005, 33: D501-D504. 10.1093/nar/gki025PubMed CentralView ArticlePubMedGoogle Scholar
- Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D: The Database of Interacting Proteins: 2004 update. Nucleic Acids Res 2004, 32: D449-D451. 10.1093/nar/gkh086PubMed CentralView ArticlePubMedGoogle Scholar
- Pagel P, Kovac S, Oesterheld M, Brauner B, Dunger-Kaltenbach I, Frishman G, Montrone C, Mark P, Stumpflen V, Mewes HW, Frishman D: The MIPS mammalian protein – protein interaction database. Bioinformatics 2005, 21: 832–834. 10.1093/bioinformatics/bti115View ArticlePubMedGoogle Scholar
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, E BP: The Protein Data Bank. Nucleic Acids Res 2000, 28: 235–242. 10.1093/nar/28.1.235PubMed CentralView ArticlePubMedGoogle Scholar
- Ofran Y, Rost B: Analysing six types of protein-protein interfaces. J Mol Biol 2003, 325: 377–387. 10.1016/S0022-2836(02)01223-8View ArticlePubMedGoogle Scholar
- Affymetrix genechip rat expression set 230[http://www.affymetrix.com/support/technical/datasheets/rat230_datasheet.pdf]
- The Paterson Institute's target preparation for Affymetrix genechip systems protocols[http://bioinf.picr.man.ac.uk/mbcf/downloads/GeneChip_Target_Prep_Protocol-CR-UK_v2.pdf]
- Affymetrix expression analysis technical manual[http://www.affymetrix.com/support/technical/manual/expression_manual.affx]
- The Paterson Institute's RNA hybridisation protocols[http://bioinf.picr.man.ac.uk/mbcf/downloads/GeneChip_Hyb_Wash_Scan_Protocol-CR-UK_v2.pdf]
- North S, Gansner E, Ellson J: Graphviz.1998. [http://www.graphviz.org]Google Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.