Prediction of heterotrimeric protein complexes by two-phase learning using neighboring kernels

Ruan, Peiying; Hayashida, Morihiro; Maruyama, Osamu; Akutsu, Tatsuya

doi:10.1186/1471-2105-15-S2-S6

Volume 15 Supplement 2

Selected articles from the Twelfth Asia Pacific Bioinformatics Conference (APBC 2014): Bioinformatics

Proceedings
Open access
Published: 24 January 2014

BMC Bioinformatics volume 15, Article number: S6 (2014)

Abstract

Background

Protein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find protein complexes of size greater than three. It is known, however, that smaller protein complexes account for a large part of all known complexes in several species. In our previous work, we developed a method with several feature space mappings and the domain composition kernel for prediction of heterodimeric protein complexes, which outperforms existing methods.

Results

We propose methods for prediction of heterotrimeric protein complexes by extending techniques in the previous work on the basis of the idea that most heterotrimeric protein complexes are not likely to share the same protein with each other. We make use of the discriminant function in support vector machines (SVMs), and design novel feature space mappings for the second phase. As the second classifier, we examine SVMs and relevance vector machines (RVMs). We perform 10-fold cross-validation computational experiments. The results suggest that our proposed two-phase methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler for prediction of heterotrimeric protein complexes.

Conclusions

We propose two-phase prediction methods with the extended features, the domain composition kernel, SVMs and RVMs. The two-phase method with the extended features and the domain composition kernel using SVM as the second classifier is particularly useful for prediction of heterotrimeric protein complexes.

Background

Identifying a set of proteins as a functional protein complex is essential for understanding molecular systems in living cells. Some proteins form a complex that works as a transcription factor, whereas other proteins work as enzymes. Hence, identifying the proteins that constitute such complexes is useful for uncovering gene regulatory networks and metabolic pathways. Many computational methods have been developed for predicting protein complexes from protein-protein interaction networks [1, 2]. Enright et al. developed the Markov cluster (MCL) algorithm [3], which repeatedly applies two operators called expansion and inflation to a matrix whose elements represent the transition probabilities from one protein to another. The expansion operation takes the power of the matrix, and the inflation operation takes the Hadamard power of the matrix. MCL is fast and efficient because of these operations. Macropol et al. developed the repeated random walks (RRW) method [4], which iteratively expands a cluster depending on the steady-state probabilities of random walks with restarts. Maruyama and Chihara improved the RRW method by weighting the restart probabilities and proposed the node-weighted expansion (NWE) method [5]. Bader and Hogue developed the molecular complex detection (MCODE) method [6], which uses a modified clustering coefficient defined by edge density in a subset of the original and adjacent vertices to find densely connected regions. King et al. developed the restricted neighborhood search clustering (RNSC) method [7], which selects clusters generated by a cost function according to the cluster size, density and functional homogeneity. Altaf-Ul-Amin et al. developed DPClus [8], which tries to find densely connected regions. Chua et al. developed the protein complex prediction (PCP) method [9], which finds maximal cliques using the functional similarity weight based on indirect interactions. Liu et al. developed the clustering based on maximal cliques (CMC) method [10], which generates all maximal cliques from the protein-protein interaction network and assembles highly overlapped clusters based on their interconnectivity. Wu et al. developed the core-attachment based (COACH) method [11]. Most of these methods focus on finding densely connected subgraphs in protein-protein interaction networks. Hence, they are considered to have difficulty detecting small protein complexes because, for instance, the edge density of two interacting proteins is always 1.0 even if the proteins do not form a complex.
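The remark about edge density can be checked with a small sketch. The function name `edge_density` and the toy network below are illustrative assumptions, not part of any of the cited methods; the point is only that a pair of interacting proteins always has density 1.0, so density alone cannot separate true heterodimers from ordinary interacting pairs.

```python
from itertools import combinations


def edge_density(proteins, edges):
    """Fraction of possible protein pairs in `proteins` that are joined by an edge."""
    members = list(proteins)
    possible = len(members) * (len(members) - 1) // 2
    if possible == 0:
        return 0.0
    present = sum(
        1 for a, b in combinations(members, 2) if frozenset((a, b)) in edges
    )
    return present / possible


# Hypothetical toy network: a simple path A - B - C - D.
edges = {frozenset(pair) for pair in [("A", "B"), ("B", "C"), ("C", "D")]}

pair_density = edge_density(["A", "B"], edges)       # any interacting pair
path_density = edge_density(["A", "B", "C"], edges)  # two of three possible edges
```

Every interacting pair reaches the maximum density regardless of whether it is a complex, whereas larger vertex sets can take a range of values.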

However, small protein complexes account for a large part of all known protein complexes. CYC2008 is a comprehensive catalogue of 408 manually curated yeast protein complexes [12]. In the catalogue, 172 complexes (42%) are heterodimeric, and 87 complexes (21%) are heterotrimeric, as also reported in [13]. In our previous study, we therefore developed a method using our proposed kernel for predicting heterodimeric protein complexes [14], which outperforms an existing method using the naive Bayes classifier [15]. In this paper, we propose prediction methods for heterotrimeric protein complexes by extending techniques in our previous method on the basis of the idea that heterotrimeric protein complexes are not likely to share the same protein with other heterotrimeric protein complexes. For that purpose, we apply supervised learning methods, such as the support vector machine (SVM) [16] and the relevance vector machine (RVM) [17], twice. Tatsuke and Maruyama developed the proteins' partition sampler (PPSampler) method based on the Metropolis-Hastings algorithm, which generates clusters whose sizes follow a power-law distribution, and outperforms other existing methods in F-measure over all protein complexes [13]. For prediction of heterotrimeric protein complexes, they reported that the F-measure of NWE was better than those of the existing methods MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler. We perform 10-fold cross-validation and calculate the average F-measure. The results suggest that our proposed methods outperform the existing method NWE.

Methods

In this section, we propose prediction methods for heterotrimeric protein complexes. More precisely, we consider the following problem: given a network of protein-protein interactions weighted by some reliability, determine whether or not three distinct proteins that are connected in the protein-protein interaction network form a protein complex. Let G(V, E) be an undirected graph with a set V of vertices and a set E of edges, representing the protein-protein interaction network. Here, a vertex represents a protein, an edge (i, j) represents an interaction between proteins P_i and P_j, and the weight w_ij represents the reliability and strength of the interaction between P_i and P_j. In this paper, we take the edge weights from the WI-PHI database [1], in which they were calculated from heterogeneous biological experimental data. We call P_i a neighboring protein to P_j if (i, j) ∈ E. Our proposed methods use the support vector machine (SVM), its discriminant function, and the relevance vector machine (RVM).

Support and relevance vector machine

We briefly review the support and relevance vector machines [16, 17]. Suppose that N training data {x_i, t_i} with targets t_i ∈ {-1, 1} are given. For our purpose, x_i corresponds to a set of three distinct proteins, and t_i = 1 corresponds to the case in which the set forms a heterotrimeric protein complex. Then, we consider linear models represented by the form

$$y(x) = \sum_{i=1}^{M} a_{i} \phi_{i}(x) + b, \tag{1}$$

where ϕ_i denotes a basis function, M denotes the number of basis functions, a_i denotes the coefficient, and b denotes the bias parameter. In the SVM, ϕ_i(x) is implicitly defined as K(x_i, x) with a positive semidefinite kernel function K, M is equal to N, and a_i and b are determined by maximizing the margin. New sets x of proteins are classified according to the sign of y(x). We make use of this discriminant function y(x) in our proposed methods.
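As a minimal sketch of how the discriminant function in Eq. (1) is evaluated once the SVM has been trained, the following code computes y(x) with ϕ_i(x) = K(x_i, x) and classifies by its sign. The function names, the linear kernel, and the toy coefficients are assumptions made for illustration; in the paper the coefficients a_i and b come from libsvm's margin maximization, which is not reproduced here.

```python
def linear_kernel(u, v):
    """Inner product of two feature vectors (a simple positive semidefinite kernel)."""
    return sum(a * b for a, b in zip(u, v))


def discriminant(x, training_points, coefficients, bias, kernel=linear_kernel):
    """Evaluate y(x) = sum_i a_i K(x_i, x) + b as in Eq. (1)."""
    return sum(
        a * kernel(x_i, x) for a, x_i in zip(coefficients, training_points)
    ) + bias


def classify(x, training_points, coefficients, bias, kernel=linear_kernel):
    """Return +1 or -1 according to the sign of the discriminant value."""
    return 1 if discriminant(x, training_points, coefficients, bias, kernel) > 0 else -1


# Hypothetical trained model with two training points.
training_points = [(1.0, 0.0), (0.0, 1.0)]
coefficients = [0.5, -0.5]
bias = 0.1

value = discriminant((2.0, 1.0), training_points, coefficients, bias)
label = classify((2.0, 1.0), training_points, coefficients, bias)
```

The real-valued output `value`, rather than only its sign, is what the second phase of the proposed methods reuses.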

The RVM is a Bayesian sparse kernel technique for classification and regression, and shares some characteristics of the SVM. As in the SVM, the basis functions of the RVM are given by kernels, although these are not required to be positive semidefinite. It is known, however, that the training time of the RVM is in general longer than that of the SVM. In the RVM, a hyperparameter γ_i for each parameter a_i and a prior distribution over the parameters a_i are introduced to obtain a sparse model. For classification, the model in Eq. (1) is transformed as σ(y(x)), where σ(y) denotes the logistic sigmoid function 1/(1 + e^{-y}), and a_i and b are determined by maximizing the marginal log-likelihood with respect to γ.

Extension of feature space mapping

In our previous study, we proposed seven feature space mappings for prediction of heterodimeric protein complexes [14]. These are based on the idea that the reliability of the interaction in a heterodimer should be high and, conversely, the reliability of the interaction between a protein in a heterodimer and a protein not in the heterodimer should be low. We extend the feature space mappings for two interacting proteins to mappings for three proteins. Table 1 shows the detailed extended mappings for three distinct proteins P_i, P_j, and P_k that are connected in the protein-protein interaction network. Here, the fifth mapping in the previous study is omitted because, as the number of neighboring proteins increases, the maximum of the differences becomes close to the maximum of the neighboring weights, denoted by (F3). (F1) and (F2) denote the maximum and minimum of the weights of interactions between P_i, P_j, and P_k, respectively. The first feature in the previous study is the weight of the interaction between two proteins. Since there are at least two interactions among the three proteins of interest, and the weights cannot all be used directly as elements of the feature vector, we take the maximum and minimum of the weights (see Figure 1). In addition, the proteins in a heterotrimer should interact with each other, and (F2), which is the minimum of the weights, is expected to be high. (F3) and (F4) denote the maximum and minimum of the weights of interactions between any of P_i, P_j, P_k and a neighboring protein P_r, respectively, where r ≠ i, j, k and (i, r) ∈ E, (j, r) ∈ E, or (k, r) ∈ E. It is considered that (F3), which is the maximum of the neighboring weights of a heterotrimer, should be lower than the weights of interactions in the heterotrimer. Consider the case in which a protein P_r interacts with two of the proteins P_i, P_j, and P_k, where P_r is not any of P_i, P_j, and P_k (see Figure 1). If the weights of both interactions are large, these proteins including P_r may form a complex. We therefore introduce the maximum of the smaller weights of interactions with such neighboring proteins P_r, denoted by (F5). (F6) and (F7) denote the maximum and the minimum of the numbers of domains contained in P_i, P_j, and P_k, respectively. The number of domains in a protein complex is expected to be large because domains are considered to be mediators of protein-protein interactions.
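The seven mappings described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name `extended_features`, the dictionary-based graph representation, the default value 0.0 when a triple has no neighboring protein, and the use of the minimum weight when a neighbor interacts with all three proteins are all choices made here because the text does not specify them.

```python
from itertools import combinations


def extended_features(triple, weights, domains):
    """Return (F1, ..., F7) for three proteins.

    weights: dict mapping frozenset({a, b}) to the interaction weight w_ab.
    domains: dict mapping a protein to the set of domains it contains.
    """
    members = set(triple)

    # (F1), (F2): maximum and minimum weight among interactions inside the triple.
    inner = [
        weights[frozenset(pair)]
        for pair in combinations(sorted(members), 2)
        if frozenset(pair) in weights
    ]

    # Weights of interactions between a member and each outside neighbor P_r.
    outer = {}
    for pair, weight in weights.items():
        if len(pair & members) == 1:
            (r,) = pair - members
            outer.setdefault(r, []).append(weight)
    all_outer = [w for ws in outer.values() for w in ws]

    # (F3), (F4): maximum and minimum neighboring weight (assumed 0.0 if none).
    f3 = max(all_outer, default=0.0)
    f4 = min(all_outer, default=0.0)

    # (F5): over neighbors touching at least two members, the largest of the
    # smaller weights (assumed 0.0 if no such neighbor exists).
    f5 = max((min(ws) for ws in outer.values() if len(ws) >= 2), default=0.0)

    # (F6), (F7): maximum and minimum number of domains per protein.
    counts = [len(domains.get(p, ())) for p in members]

    return (max(inner), min(inner), f3, f4, f5, max(counts), min(counts))


# Hypothetical toy data: triple A, B, C with outside neighbors R and S.
weights = {
    frozenset(("A", "B")): 0.9, frozenset(("B", "C")): 0.8,
    frozenset(("A", "C")): 0.7, frozenset(("A", "R")): 0.3,
    frozenset(("B", "R")): 0.2, frozenset(("C", "S")): 0.1,
}
domains = {"A": {"d1", "d2"}, "B": {"d1"}, "C": {"d3", "d4", "d5"}}
features = extended_features(("A", "B", "C"), weights, domains)
```

For this toy triple, R touches two members with weights 0.3 and 0.2, so (F5) picks up the smaller of the two, while S contributes only to (F3) and (F4).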

Table 1 Feature space mapping from three distinct proteins P_i, P_j, P_k.

In addition to the extended features, we examine the domain composition kernel developed in our previous study [14]. We defined the equivalence =_d between two proteins P_i and P_j as the condition that P_i consists of the same domains as P_j, and defined the equivalence =_c between two sets x_i and x_j that consist of $\{P_{i_{1}}, \cdots, P_{i_{n}}\}$ and $\{P_{j_{1}}, \cdots, P_{j_{n}}\}$, respectively, as $\exists \sigma \in \mathfrak{S}_{n} \; \forall k \; (P_{i_{k}} =_{d} P_{j_{\sigma(k)}})$, where $\mathfrak{S}_{n}$ denotes the symmetric group of degree n on the set {1, ⋯, n}. Then, the domain composition kernel K_c was defined by

$$K_{c}(x_{i}, x_{j}) = \begin{cases} 1 & (\text{if } x_{i} =_{c} x_{j}), \\ 0 & (\text{otherwise}). \end{cases} \tag{2}$$
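A compact way to evaluate Eq. (2) is to note that a permutation σ matching the proteins by domain composition exists exactly when the two sets have the same multiset of domain compositions. The sketch below uses that observation; the function name and the representation of each protein's domains as a plain set (ignoring repeated copies of a domain) are assumptions made for illustration.

```python
from collections import Counter


def domain_composition_kernel(set_i, set_j, domains):
    """Return 1 if the protein sets can be matched one-to-one by identical
    domain composition, and 0 otherwise (Eq. (2))."""
    composition_i = Counter(frozenset(domains.get(p, ())) for p in set_i)
    composition_j = Counter(frozenset(domains.get(p, ())) for p in set_j)
    return 1 if composition_i == composition_j else 0


# Hypothetical domain annotations.
domains = {
    "A": {"d1"}, "B": {"d2", "d3"}, "C": {"d4"},
    "X": {"d4"}, "Y": {"d1"}, "Z": {"d3", "d2"}, "W": {"d5"},
}
same = domain_composition_kernel(("A", "B", "C"), ("X", "Y", "Z"), domains)
different = domain_composition_kernel(("A", "B", "C"), ("X", "Y", "W"), domains)
```

Comparing multisets avoids enumerating all n! permutations, although for n = 3 either approach is inexpensive.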

Two-phase learning approach

Our proposed methods take a two-phase learning approach. The basic idea for designing our methods is that heterotrimeric protein complexes are not likely to share the same protein with other heterotrimeric protein complexes. In the first phase, we estimate the model parameters of the SVM using the training data, and predict whether or not the training data and the neighboring sets sharing at least one protein with the training data are heterotrimeric protein complexes. Then, the second-phase predictor makes use of the discriminant values obtained by the first-phase predictor. It is expected that the discriminant values for a target set of proteins and its neighboring sets do not become large together if heterotrimeric protein complexes do not share the same protein.

Suppose that the training data set comprises N sets x_i of three distinct proteins with the corresponding labels t_i ∈ {-1, 1}. For each x_i, we calculate the 7-dimensional feature vector f⁽¹⁾(x_i) using (F1),…,(F7) shown in Table 1 and the kernel matrix whose (i, j)-th element is 〈f⁽¹⁾(x_i), f⁽¹⁾(x_j)〉 + αK_c(x_i, x_j), where α is a constant and 〈·, ·〉 denotes the inner product. Then, we obtain the model parameters in Eq. (1) by applying the SVM to the training data set. Let $N(x)$ be the set of all sets of three distinct proteins that are neighboring to x and connected in the protein-protein interaction network, where we call x_i a neighboring set to x_j if x_i and x_j share at least one protein and x_i is not x_j (see Figure 2). For each x_i, we calculate the discriminant values y(x_i) and y(x) for all $x \in N(x_{i})$. Since the discriminant values may include outliers, by taking the averages of positive and negative discriminant values separately, we define four feature space mappings for x_i,

$$f^{(2s)}(x_{i}) = y(x_{i}), \tag{3}$$

$$f^{(2p)}(x_{i}) = \frac{1}{|\{x \in N(x_{i}) \mid y(x) > 0\}|} \sum_{x \in N(x_{i}),\; y(x) > 0} y(x), \tag{4}$$

$$f^{(2n)}(x_{i}) = \frac{1}{|\{x \in N(x_{i}) \mid y(x) < 0\}|} \sum_{x \in N(x_{i}),\; y(x) < 0} y(x), \tag{5}$$

$$f^{(2a)}(x_{i}) = \frac{1}{|N(x_{i})|} \sum_{x \in N(x_{i})} y(x), \tag{6}$$

where |S| denotes the number of elements in the set S. Here, we define $f^{(2p)}(x_{i}) = 0$ if $|\{x \in N(x_{i}) \mid y(x) > 0\}| = 0$, $f^{(2n)}(x_{i}) = 0$ if $|\{x \in N(x_{i}) \mid y(x) < 0\}| = 0$, and $f^{(2a)}(x_{i}) = 0$ if $|N(x_{i})| = 0$. We compose the 11-dimensional feature vector f⁽²⁾(x_i) from f⁽¹⁾, $f^{(2s)}$, $f^{(2p)}$, $f^{(2n)}$, and $f^{(2a)}$, calculate the kernel matrix with the (i, j)-th element 〈f⁽²⁾(x_i), f⁽²⁾(x_j)〉 + αK_c(x_i, x_j), and apply a supervised learning method. It should be noted that our methods use only training data to estimate model parameters. For test data x, we calculate 〈f⁽²⁾(x_i), f⁽²⁾(x)〉 + αK_c(x_i, x) for training data x_i, and determine whether or not x is a heterotrimeric protein complex according to the second classifier.
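The second-phase features can be sketched as follows, assuming a first-phase discriminant function `y` is already available as a Python callable. The helper names and the toy network are hypothetical. A connected triple always contains a protein adjacent to the other two, which is how `connected_triples` enumerates candidates; averages over empty sets are defined as 0, as stated in the text.

```python
from itertools import combinations


def connected_triples(edges):
    """All sets of three distinct proteins that are connected in the network."""
    adjacency = {}
    for pair in edges:
        a, b = tuple(pair)
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    triples = set()
    for center, neighbors in adjacency.items():
        for a, b in combinations(sorted(neighbors), 2):
            triples.add(frozenset((center, a, b)))
    return triples


def neighboring_sets(x, triples):
    """N(x): connected triples other than x that share at least one protein with x."""
    return [t for t in triples if t != x and t & x]


def mean_or_zero(values):
    """Average of the values, or 0.0 when there are none."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def second_phase_features(x, triples, y):
    """Return (f2s, f2p, f2n, f2a) of Eqs. (3)-(6) for the triple x."""
    scores = [y(t) for t in neighboring_sets(x, triples)]
    return (
        y(x),
        mean_or_zero(s for s in scores if s > 0),
        mean_or_zero(s for s in scores if s < 0),
        mean_or_zero(scores),
    )


# Hypothetical path network A - B - C - D - E and made-up first-phase scores.
edges = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]}
triples = connected_triples(edges)
toy_scores = {frozenset("ABC"): 2.0, frozenset("BCD"): 1.5, frozenset("CDE"): -1.0}
features = second_phase_features(frozenset("BCD"), triples, toy_scores.__getitem__)
```

Appending these four values to the seven first-phase features gives the 11-dimensional vector f⁽²⁾ used by the second classifier.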

Computational experiments

Data and implementation

To evaluate our proposed methods, we performed computational experiments and compared them with the existing method NWE [5]. As input interaction weights, we used the WI-PHI database [1], which contains 49607 interacting protein pairs excluding self-interactions and is available at the supporting information web page of that paper. The weights were obtained from high-throughput yeast two-hybrid data [18, 19] and several biological databases such as BioGRID [2] and BIND [20] by applying a log-likelihood score (LLS) to each dataset and the socioaffinity (SA) index [21], which measures the log-odds score of the number of times that two proteins are observed to interact relative to the expected value from the dataset.

We prepared datasets using heterotrimeric protein complexes in the CYC2008 protein complex catalogue [12], which contains 87 heterotrimeric protein complexes and is available at http://wodaklab.org/cyc2008/. We restricted positive and negative examples to sets of three distinct proteins that form a single connected component in the input protein-protein interaction network. Thus, 7 heterotrimers were eliminated, and we used 80 heterotrimers as positive examples. For negative examples, we extracted 32647 sets of three proteins included in CYC2008 protein complexes of size greater than three, and we selected 100 distinct examples at random from these sets because our methods require many neighboring sets of three proteins for each example in the second phase. It is considered that negative examples selected from such sets are more difficult to classify than those selected from all sets of three proteins except heterotrimers.
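The construction of negative examples can be sketched as below. The function names, the toy catalogue, and the fixed random seed are assumptions made for illustration, and the sketch does not reproduce details the text leaves open (for example, how a triple that is both inside a large complex and a known heterotrimer would be handled).

```python
import random
from itertools import combinations


def is_connected_triple(triple, edges):
    """Three proteins are connected when at least two of the three pairs interact."""
    a, b, c = triple
    present = sum(frozenset(p) in edges for p in ((a, b), (a, c), (b, c)))
    return present >= 2


def candidate_negatives(complexes, edges):
    """Connected triples drawn from complexes with more than three proteins."""
    candidates = set()
    for members in complexes:
        if len(members) > 3:
            for triple in combinations(sorted(members), 3):
                if is_connected_triple(triple, edges):
                    candidates.add(frozenset(triple))
    return candidates


def sample_negatives(candidates, k, seed=0):
    """Draw k distinct negative examples at random (without replacement)."""
    ordered = sorted(candidates, key=sorted)  # fixed order for reproducibility
    return random.Random(seed).sample(ordered, k)


# Hypothetical catalogue: one complex of size four and one heterotrimer.
complexes = [{"A", "B", "C", "D"}, {"E", "F", "G"}]
edges = {frozenset(p) for p in [("A", "B"), ("B", "C"), ("C", "D")]}
candidates = candidate_negatives(complexes, edges)
negatives = sample_negatives(candidates, k=2)
```

In the experiments described here, the candidate pool has 32647 triples and k is 100.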

For NWE, we set the options related to the size of complexes so that NWE outputs protein complexes of size two or more from the WI-PHI protein-protein interaction network in the same way as [13], and extracted only protein complexes of size three from the result.

For measuring the performance, we used accuracy, precision, recall, and F-measure defined by

$$\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{7}$$

$$\mathrm{precision} = \frac{TP}{TP + FP}, \tag{8}$$

$$\mathrm{recall} = \frac{TP}{TP + FN}, \tag{9}$$

$$\text{F-measure} = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}, \tag{10}$$

where TP, TN, FP, and FN denote the numbers of true positive, true negative, false positive, and false negative examples, respectively.
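Eqs. (7)-(10) translate directly into code. In the sketch below, the function name and the convention of returning 0.0 when a denominator is zero are assumptions; the counts in the example are made up and are not results from this paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, F-measure) as in Eqs. (7)-(10)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return accuracy, precision, recall, f_measure


# Made-up confusion counts for one hypothetical test fold.
accuracy, precision, recall, f_measure = classification_metrics(tp=6, tn=8, fp=2, fn=4)
```

With these counts, accuracy is 14/20, precision is 6/8, recall is 6/10, and the F-measure is their harmonic mean, 2/3.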

We used 'libsvm' (version 3.11) [22] and 'SparseBayes' package (version 2.0) [23] as implementations of SVM and RVM, respectively.

Results

We performed 10-fold cross-validation, and took the averages of accuracy, precision, recall, and F-measure. Furthermore, we repeated this procedure 10 times for other datasets with randomly selected negative examples, and took the average. Table 2 shows the average accuracy, precision, recall, and F-measure obtained by our proposed methods and NWE. 'SVM+SVM' and 'SVM+RVM' denote the two-phase methods using SVM and RVM as the second classifier, respectively. 'SVM' denotes the usual SVM using only the features f⁽¹⁾. α denotes the coefficient of the domain composition kernel K_c. We examined α = 0.5 because this value was best for prediction of heterodimeric protein complexes in our previous study [14]. NWE predicted 54 protein complexes of size three from the WI-PHI protein-protein interaction network, and 19 of them were actual heterotrimeric protein complexes in the CYC2008 protein complex catalogue. We can see from the table that the F-measures of SVM+SVM, SVM+RVM, and SVM for both α = 0 and α = 0.5 were higher than that of NWE. Furthermore, the accuracy and F-measure of the two-phase method SVM+SVM were higher than those of the usual SVM with f⁽¹⁾. The accuracy and F-measure of SVM+RVM, however, were lower than those of SVM. This implies that RVMs may be less useful than SVMs for problems to which SVMs can be applied. Thus, the results suggest that our proposed methods SVM+SVM, SVM+RVM, and SVM outperform the existing method NWE. The results also suggest the usefulness of the second phase.
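The evaluation protocol relies on splitting the 180 examples (80 positive and 100 negative) into 10 folds. A minimal sketch of such a split is given below; the function name, the shuffling with a fixed seed, and the absence of stratification by class are assumptions, since the text does not describe how the folds were formed.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation over n examples."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# 80 positive and 100 negative examples, as in the datasets described above.
splits = list(k_fold_indices(180, k=10, seed=1))
```

Each example appears in exactly one test fold; the reported values are averages over the 10 folds and then over the 10 datasets with different random negative examples.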

Table 2 Results on the average of accuracy, precision, recall, and F-measure by our proposed methods and NWE.

Conclusions

We proposed prediction methods by two-phase learning for heterotrimeric protein complexes. In the methods, we extended the feature space mappings in our previous study for prediction of heterodimeric protein complexes, and made use of the discriminant function for neighboring sets of three proteins. To validate our proposed methods, we performed 10-fold cross-validation computational experiments. The results suggest that our two-phase prediction methods and SVM with the extended features outperform the existing method NWE, which was reported to outperform many other existing methods such as MCL, MCODE, DPClus, CMC, COACH, RRW, and PPSampler, although our methods are limited to prediction of heterotrimeric protein complexes. For further evaluation, we would like to perform computational experiments for other datasets if such data become available.

There is room to further improve the prediction accuracy. For instance, sequence information, in addition to the domains contained in proteins, can be used for designing feature space mappings. In addition, a probabilistic model such as a conditional or Markov random field can be introduced for neighboring sets of three proteins, whereas in this paper we considered kernels between neighboring sets.

References

1. Kiemer L, Costa S, Ueffing M, Cesareni G: WI-PHI: A weighted yeast interactome enriched for direct physical interactions. Proteomics. 2007, 7: 932-943. 10.1002/pmic.200600448.
2. Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M: BioGRID: a general repository for interaction datasets. Nucleic Acids Research. 2006, 34: D535-D539. 10.1093/nar/gkj109.
3. Enright A, Dongen SV, Ouzounis C: An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research. 2002, 30: 1575-1584. 10.1093/nar/30.7.1575.
4. Macropol K, Can T, Singh A: Repeated random walks on genome-scale protein networks for local cluster discovery. BMC Bioinformatics. 2009, 10: 283. 10.1186/1471-2105-10-283.
5. Maruyama O, Chihara A: NWE: Node-weighted expansion for protein complex prediction using random walk distances. Proteome Science. 2011, 9 (Suppl 1): S14. 10.1186/1477-5956-9-S1-S14.
6. Bader GD, Hogue CW: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003, 4: 2. 10.1186/1471-2105-4-2.
7. King A, Pržulj N, Jurisica I: Protein complex prediction via cost-based clustering. Bioinformatics. 2004, 20: 3013-3020. 10.1093/bioinformatics/bth351.
8. Altaf-Ul-Amin M, Shinbo Y, Mihara K, Kurokawa K, Kanaya S: Development and implementation of an algorithm for detection of protein complexes in large interaction networks. BMC Bioinformatics. 2006, 7: 207. 10.1186/1471-2105-7-207.
9. Chua H, Ning K, Sung WK, Leong H, Wong L: Using indirect protein-protein interactions for protein complex prediction. Journal of Bioinformatics and Computational Biology. 2008, 6: 435-466. 10.1142/S0219720008003497.
10. Liu G, Wong L, Chua HN: Complex discovery from weighted PPI networks. Bioinformatics. 2009, 25: 1891-1897. 10.1093/bioinformatics/btp311.
11. Wu M, Li X, Kwoh C, Ng S: A core-attachment based method to detect protein complexes in PPI networks. BMC Bioinformatics. 2009, 10: 169. 10.1186/1471-2105-10-169.
12. Pu S, Wong J, Turner B, Cho E, Wodak S: Up-to-date catalogues of yeast protein complexes. Nucleic Acids Research. 2009, 37: 825-831. 10.1093/nar/gkn1005.
13. Tatsuke D, Maruyama O: Sampling strategy for protein complex prediction using cluster size frequency. Gene. 2013, 7: 152-158.
14. Ruan P, Hayashida M, Maruyama O, Akutsu T: Prediction of heterodimeric protein complexes from weighted protein-protein interaction networks using novel features and kernel functions. PLoS ONE. 2013, 8 (6): e65265. 10.1371/journal.pone.0065265.
15. Maruyama O: Heterodimeric protein complex identification. ACM Conference on Bioinformatics, Computational Biology and Biomedicine 2011. 2011, 499-501.
16. Vapnik V: Statistical Learning Theory. 1998, Wiley-Interscience.
17. Tipping ME: The relevance vector machine. Advances in Neural Information Processing Systems. 2000, 652-658.
18. Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y: A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA. 2001, 98: 4569-4574. 10.1073/pnas.061034498.
19. Uetz P, Giot L, Cagney G, Mansfield T, Judson R, Knight J, Lockshon D, Narayan V, Srinivasan M, Pochart P, Qureshi-Emili A, Li Y, Godwin B, Conover D, Kalbfleisch T, Vijayadamodar G, Yang M, Johnston M, Fields S, Rothberg J: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature. 2000, 403: 623-627. 10.1038/35001009.
20. Alfarano C, Andrade CE, Anthony K, Bahroos N, Bajec M: The Biomolecular Interaction Network Database and related tools 2005 update. Nucleic Acids Research. 2005, 33: D418-D424.
21. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dümpelfeld B, Edelmann A, Heurtier MA, Hoffman V, Hoefert C, Klein K, Hudak M, Michon AM, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick JM, Kuster B, Bork P, Russell RB, Superti-Furga G: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440: 631-636. 10.1038/nature04532.
22. Chang CC, Lin CJ: LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology. 2011, 2: 27:1-27:27.
23. Tipping ME, Faul AC: Fast marginal likelihood maximisation for sparse Bayesian models. Proceedings of the Ninth International Workshop on Artificial Intelligence and Statistics. 2003.

Acknowledgements

This work was partially supported by Grants-in-Aid #22240009 and #24500361 from MEXT, Japan.

Declarations

The publication costs for this article were funded by JSPS, Japan (Grants-in-Aid #22240009).

This article has been published as part of BMC Bioinformatics Volume 15 Supplement 2, 2014: Selected articles from the Twelfth Asia Pacific Bioinformatics Conference (APBC 2014): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/15/S2.

Author information

Authors and Affiliations

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, Japan
Peiying Ruan, Morihiro Hayashida & Tatsuya Akutsu
Institute of Mathematics for Industry, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka, Japan
Osamu Maruyama

Corresponding author

Correspondence to Morihiro Hayashida.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

PR and MH developed and implemented the methods. MH drafted the manuscript. OM and TA participated in the discussions during the development of the methods and helped draft the manuscript. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver ( https://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Ruan, P., Hayashida, M., Maruyama, O. et al. Prediction of heterotrimeric protein complexes by two-phase learning using neighboring kernels. BMC Bioinformatics 15 (Suppl 2), S6 (2014). https://doi.org/10.1186/1471-2105-15-S2-S6
