MTAGCN: predicting miRNA-target associations in Camellia sinensis var. assamica through graph convolution neural network
BMC Bioinformatics volume 23, Article number: 271 (2022)
Abstract
Background
MicroRNAs (miRNAs) play a central role in diverse biological processes of Camellia sinensis var. assamica (CSA), including growth, development and stress response, through their associations with target mRNAs. However, although the experimental methods for identifying CSA miRNA-target associations are costly and time-consuming, few computational methods have been developed to tackle the CSA miRNA-target association prediction problem.
Results
In this paper, we constructed a heterogeneous network for CSA miRNAs and targets by integrating rich biological information, including a miRNA similarity network, a target similarity network, and a miRNA-target association network. We then proposed a deep learning framework of graph convolution networks with a layer attention mechanism, named MTAGCN. In particular, MTAGCN uses the attention mechanism to combine the embeddings of multiple graph convolution layers and employs the integrated embedding to score the unobserved CSA miRNA-target associations.
Discussion
Comprehensive experimental results on two tasks (a balanced task and an unbalanced task) demonstrated that our proposed model achieved better performance than classic machine learning methods and existing graph convolution network-based methods. The analysis of these results could offer valuable information for understanding complex CSA miRNA-target association mechanisms and could contribute to precision plant breeding.
Introduction
Tea, produced from the dried leaves of the tea plant, Camellia sinensis, is one of the most widely consumed drinks in the world and has great economic, medicinal and cultural significance [1]. Many studies have demonstrated that the characteristic secondary metabolites in tea leaves, such as polyphenols, caffeine, theanine and vitamins, have numerous health and medical benefits for humans [2, 3]. Plant microRNAs (miRNAs) are highly conserved and play an important role in gene expression regulation by targeting specific mRNAs [4]. Furthermore, miRNAs have been shown to be involved in the developmental processes, stress responses and biosynthesis of secondary metabolites in Camellia sinensis var. assamica (CSA) [5, 6]. Thus, the identification of CSA miRNAs can improve the understanding not only of miRNA-targeted gene regulation but also of the evolution of miRNAs.
Although experimental methods for identifying CSA miRNA-target associations have high accuracy [7], they are time-consuming, laborious and expensive. As a result, it is necessary to develop computational methods for predicting miRNA-target associations. Machine learning and deep learning-based methods have been widely adopted to solve various association pair prediction problems in biology. For example, many classification algorithms first regard the associations as samples and use the feature vectors of the edges to represent these samples. The classifiers are then trained to recognize the truly existing associations in the graph [8, 9]. Nevertheless, these machine learning methods depend heavily on negative data sampling and feature extraction. Therefore, more advanced machine learning methods, such as label propagation [10], regularized least squares [11], semi-supervised graph cut [12], sparse subspace learning [13], matrix factorization [14] and matrix completion [15, 16], have been introduced to solve these kinds of problems. Matrix completion and matrix factorization methods are popular in the community because of their flexibility in aggregating a priori information [17]. However, deploying them on high-dimensional data is challenging because of the high computational complexity of matrix operations.
Deep learning methods have recently shown excellent performance in many fields, such as perception, planning, localization, and control [18]. Their ability to learn representations from complicated data makes them well suited to predicting association pairs in biology. Graph neural networks (GNNs) use different node neighborhood aggregation schemes and represent significant progress in directly processing network/graph-structured data [19]. Each node feature is updated by aggregating the features of its neighboring nodes during layer propagation, so the node embedding naturally captures the graph structure. GNNs have been extensively applied to a wide range of problems, achieving superior performance in biological tasks such as disease-gene association identification [20, 21], drug-drug interaction prediction [22, 23] and miRNA-disease association prediction [24, 25]. As an extension of the convolutional neural network for processing graph data, the graph convolution network (GCN) [26], an important branch of GNNs, has achieved excellent performance in different tasks. It is an end-to-end architecture that captures the graph structural information through messages passed between graph nodes, thereby retaining explainability. In recent years, it has shown superior performance in biological network analysis [27, 28].
In this paper, we developed a graph convolutional network model (MTAGCN) for predicting CSA miRNA-target associations. First, we constructed a heterogeneous network by exploiting the CSA miRNA-target associations, the miRNA-miRNA similarity matrix and the target-target similarity matrix. Next, the graph convolution operation was conducted on the heterogeneous network to learn the embeddings of CSA miRNAs and targets. Considering that the embeddings from different convolution layers represent the proximity of nodes in the network at different levels [29], we introduced the attention mechanism [30] to combine useful neighborhood representations adaptively and dynamically. Finally, we defined a score function based on the integrated embedding, which gives predictive scores for unobserved miRNA-target associations. Comprehensive experimental results on two tasks, i.e. the balanced task and the unbalanced task, showed that our proposed MTAGCN model performed better than five machine learning methods and three existing state-of-the-art methods.
In summary, our main contributions are as follows:

We constructed a heterogeneous network to effectively integrate rich biological information, including CSA miRNA-target associations, CSA miRNA information and CSA target information.

We proposed MTAGCN, a novel GCN-based method for predicting CSA miRNA-target associations. To our knowledge, this is the first work to apply a deep learning method to CSA miRNA-target association prediction.

We designed an attention mechanism to integrate the embedding information from multiple convolutional layers, leading to more useful representations of miRNAs and targets.
Methods and materials
Data
The data used in this study were collected from the 2020 version of the CSA miRNA-target associations released by Suo et al. [7]. This dataset contains 5264 relationships between CSA miRNAs and targets, involving 356 miRNAs and 4041 targets. Because some miRNA sequences and target information were missing, we removed the affected relationships, which involved 66 miRNAs and 1166 targets. The resulting dataset contains 3745 miRNA-target pairs, covering 290 different CSA miRNAs and 2876 targets. Then, we acquired the CSA target gene locations from http://teacon.wchoda.com, a database of gene co-expression networks for the CSA plant [31]. According to the CSA target gene locations, we extracted the target sequences from the CSA whole-genome data in the Tea Plant Information Archive [32]. The details are shown in Table 1.
To perform fivefold cross validation, we developed a balanced and an unbalanced dataset to evaluate the CSA miRNA-target prediction models. In the training dataset, four-fifths of the positive samples and all the negative samples are used. For the test set in the unbalanced task, we used the remaining one-fifth of the positive samples (749 positive samples) and drew 20 times as many negative samples (14,980 negative samples). The negative class then vastly outnumbers the positive class, causing a class imbalance problem (Table 2). For the test set in the balanced task, we used the same number of negative samples as positive samples (Table 2). In addition, to acquire CSA miRNA similarities and target similarities, K-mer [33], an algorithm based on nucleic acid composition, was used to transform the CSA target sequences and miRNA sequences into feature vectors.
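The K-mer transformation can be illustrated with a minimal sketch. The value k = 3, the RNA alphabet, the function name `kmer_presence` and the toy sequence are all assumptions made for illustration; the text does not state which k or encoding was actually used.

```python
from itertools import product

import numpy as np


def kmer_presence(seq, k=3, alphabet="ACGU"):
    """Binary vector with one entry per possible k-mer: 1 if it occurs in seq."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vec = np.zeros(len(index), dtype=np.int8)
    for start in range(len(seq) - k + 1):
        pos = index.get(seq[start:start + k])
        if pos is not None:  # k-mers containing other symbols are skipped
            vec[pos] = 1
    return vec


# Toy sequence, not taken from the dataset.
features = kmer_presence("AUGGCUAUGC", k=3)
```

For target sequences written with T instead of U, the same function could be called with `alphabet="ACGT"`.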
Construction of heterogeneous network
Construction of similarity network
As mentioned above, we used K-mer to obtain CSA miRNA and target features. In the binary feature vector of a miRNA or target, each element indicates whether the corresponding feature descriptor is present or absent. In this work, we adopted the Jaccard index [27], a prevailing measure for calculating similarity based on such features, to calculate the miRNA-miRNA and target-target similarities, and thus constructed the miRNA similarity matrix and the target similarity matrix. The Jaccard index between two vectors \(x_{i}\) and \(x_{j}\) is defined as follows:
\[ J\left( x_{i}, x_{j} \right) = \frac{\left| x_{i} \cap x_{j} \right|}{\left| x_{i} \cup x_{j} \right|} \quad (1) \]
where \(\left| {x_{i} \cap x_{j} } \right|\) denotes the number of features for which the elements of both \(x_{i}\) and \(x_{j}\) equal 1, and \(\left| {x_{i} \cup x_{j} } \right|\) denotes the number of features for which the element of either \(x_{i}\) or \(x_{j}\) equals 1.
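As a small worked example of the Jaccard index on binary feature vectors, the sketch below computes the full pairwise similarity matrix with NumPy. The function name `jaccard_matrix` and the convention of returning 0 when both vectors are all-zero are assumptions, not details given in the text.

```python
import numpy as np


def jaccard_matrix(X):
    """Pairwise Jaccard index between the rows of a binary matrix X."""
    X = np.asarray(X, dtype=np.int64)
    intersection = X @ X.T                                   # |x_i ∩ x_j|
    row_sums = X.sum(axis=1)
    union = row_sums[:, None] + row_sums[None, :] - intersection  # |x_i ∪ x_j|
    # Assumed convention: similarity is 0 when the union is empty.
    return np.divide(intersection, union,
                     out=np.zeros(union.shape, dtype=float),
                     where=union > 0)


X = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 0, 0, 1]])
S = jaccard_matrix(X)
```

Rows 0 and 1 share one feature out of three present in either, so `S[0, 1]` is 1/3, while rows 0 and 2 share none.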
Herein, we also considered other similarity measures for constructing the similarity networks, including cosine similarity, Gaussian kernel-based similarity, and Pearson similarity. These measures are widely used in constructing similarity networks and have achieved good performance in many biological prediction tasks [22, 34, 35].
Heterogeneous network for CSA miRNAs and targets
The heterogeneous network is constructed based on miRNA-target associations, miRNA-miRNA similarity and target-target similarity.
The miRNA-target associations are denoted as an adjacency matrix \(A \in \left\{ {0,1} \right\}^{M \times N}\), where M and N represent the numbers of miRNAs and targets, respectively. If a CSA miRNA \(r_{i}\) is associated with a target \(t_{j}\), \(A_{ij} = 1\); otherwise \(A_{ij} = 0\). The miRNA-miRNA similarity network is derived from the CSA miRNA similarity matrix \(S^{m}\) with \(S_{ij}^{m}\) as its (i, j)th element, and the target-target similarity network is derived from the CSA target similarity matrix \(S^{n}\) with \(S_{ij}^{n}\) as its (i, j)th element. Furthermore, we adopt \(\tilde{S}^{m} = D_{m}^{-\frac{1}{2}} S^{m} D_{m}^{-\frac{1}{2}}\) and \(\tilde{S}^{n} = D_{n}^{-\frac{1}{2}} S^{n} D_{n}^{-\frac{1}{2}}\) to normalize the similarity matrices, where \(D_{m} = {\text{diag}}\left( {\sum\nolimits_{j} S_{ij}^{m} } \right)\) and \(D_{n} = {\text{diag}}\left( {\sum\nolimits_{j} S_{ij}^{n} } \right)\). Finally, the heterogeneous network is defined by the adjacency matrix
\[ A_{H} = \begin{bmatrix} \tilde{S}^{m} & A \\ A^{T} & \tilde{S}^{n} \end{bmatrix} \quad (2) \]
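The construction above can be sketched in a few lines of NumPy: each similarity matrix is symmetrically normalized by its row sums and the blocks are then stacked into an (M + N) × (M + N) matrix. The function names and the toy matrices are hypothetical; the block layout is an assumption based on the description of the three component networks.

```python
import numpy as np


def sym_normalize(S):
    """Return D^{-1/2} S D^{-1/2}, with D the diagonal matrix of row sums of S."""
    deg = S.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    return inv_sqrt[:, None] * S * inv_sqrt[None, :]


def build_heterogeneous(A, S_m, S_n):
    """Stack normalized similarities and associations into one adjacency matrix."""
    return np.block([[sym_normalize(S_m), A],
                     [A.T, sym_normalize(S_n)]])


# Toy sizes: M = 2 miRNAs, N = 3 targets (values are made up).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
S_m = np.array([[1.0, 0.5],
                [0.5, 1.0]])
S_n = np.array([[1.0, 0.2, 0.0],
                [0.2, 1.0, 0.4],
                [0.0, 0.4, 1.0]])
A_H = build_heterogeneous(A, S_m, S_n)
```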
Graph convolution
Regarding the known associations between miRNAs and targets as a bipartite graph, the prediction problem in this paper can be defined as a semi-supervised link prediction task on such a graph.
We assume a bipartite graph \(G = \left( \nu, \varepsilon \right)\) with \(\nu = \left( \nu_{m}, \nu_{t} \right)\), including \(n_{m}\) miRNA nodes and \(n_{t}\) target nodes, which have numerical features \(X_{m} = \left[ {x_{m}^{1} ,x_{m}^{2} , \ldots ,x_{m}^{{n_{m} }} } \right]^{T} \in R^{{n_{m} \times M}}\) and \(X_{t} = \left[ {x_{t}^{1} ,x_{t}^{2} , \ldots ,x_{t}^{{n_{t} }} } \right]^{T} \in R^{{n_{t} \times N}}\), respectively. Supposing that some of the links (denoted as \(\varepsilon\) in G) are labeled, our goal is to predict whether there are potential links between miRNAs and targets that have not been determined previously. Thus, how to effectively utilize both the graph topology and the attribute information of the nodes is the problem we need to address.
There have recently been some attempts to apply deep learning techniques to graph-based data analysis. A graph convolutional network (GCN) was proposed by Kipf et al. [26]. Graph convolution is defined on a graph as the multiplication of an input signal with a filter \(g_{\theta }\) in the Fourier domain [19]. Given an adjacency matrix A with its Laplacian \(L = D - A\) and the attributes of each node on the graph (denoted as s), spectral graph convolution decomposes s onto the spectral components. We assume that L can be decomposed as \(L = U\Lambda U^{T}\), where U is the eigenvector matrix and \(\Lambda\) is the diagonal matrix of eigenvalues. Hence, \(g_{\theta } * s = Ug_{\theta } U^{T} s\), where \(U^{T} s\) is the graph Fourier transform of s. To avoid the computationally costly eigendecomposition of L, Defferrard et al. approximated the spectral filter with a truncated expansion in terms of Chebyshev polynomials \(T_{k}\) [36] up to the \(K\)th order. Writing \(\tilde{L} = \frac{2}{\lambda_{\max}} L - I\) for the Laplacian rescaled by its largest eigenvalue \(\lambda_{\max}\), the expansion is
\[ g_{\theta^{\prime}} * s \approx \sum_{k = 0}^{K} \theta^{\prime}_{k} T_{k}\left( \tilde{L} \right) s \quad (3) \]
where \(\theta^{\prime}\) is a vector of Chebyshev coefficients and \(T_{k}\) is the kth Chebyshev polynomial. A further study simplified this definition by approximating the largest eigenvalue of L, which leads to Formula (4) [26]. The convolution operator is
\[ g_{\theta } * s \approx \theta \left( I + D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \right) s \quad (4) \]
Prediction framework of the proposed MTAGCN
The workflow of our model is shown in Fig. 1. Our proposed MTAGCN model consists of three parts, i.e., similarity network integration, encoder construction and decoder construction. We integrate the similarity networks by combining rich biological information to construct the heterogeneous network. The encoder is a GCN model with a layer attention mechanism, which captures the network structure information. The decoder is a fully connected layer that transforms the features back into the original space.
For graph convolution, we adopted the simplified definition. As mentioned above, the prediction of associations between CSA miRNAs and targets can be considered a semi-supervised link prediction problem. However, current GCN-based approaches tackle the node classification problem on homogeneous networks and are not directly applicable to association prediction. Thus, we extend the current graph convolution idea to solve the link prediction problem defined on heterogeneous, bipartite, attributed networks. To this end, we proposed the GCN-based framework called MTAGCN. Algorithm 1 shows the detailed training steps of MTAGCN for predicting CSA miRNA-target associations.
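GCN is a multilayer connected neural network and its propagation rule is defined as follows:
\[ H^{\left( l + 1 \right)} = \sigma \left( D^{-\frac{1}{2}} A D^{-\frac{1}{2}} H^{\left( l \right)} W^{\left( l \right)} \right) \quad (5) \]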
where \(\sigma\) is an adjustable activation function, D is the diagonal degree matrix, A is the adjacency matrix, \(H^{\left( l \right)}\) is the node embedding in the lth layer and \(W^{\left( l \right)}\) is the layer-wise trainable weight.
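A single propagation step of this kind can be written directly in NumPy. The sketch below is illustrative only: `gcn_layer` is a hypothetical name, the toy graph is made up, and `tanh` is just a placeholder default activation.

```python
import numpy as np


def gcn_layer(A, H, W, activation=np.tanh):
    """One propagation step: activation(D^{-1/2} A D^{-1/2} H W)."""
    deg = A.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    A_hat = inv_sqrt[:, None] * A * inv_sqrt[None, :]  # normalized adjacency
    return activation(A_hat @ H @ W)


# Toy star graph with node 0 in the centre.
A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 0.0],
              [1.0, 0.0, 0.0]])
H0 = np.eye(3)            # one-hot node features
W0 = np.ones((3, 2))      # fixed weights so the result is easy to check
H1 = gcn_layer(A, H0, W0, activation=lambda z: z)
```

With the identity activation, each row of `H1` is the degree-normalized sum of the neighbours' features, which is the message-passing view described in the Introduction.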
To construct the encoder of MTAGCN, we consider how to fully use the CSA miRNA-miRNA similarity network, the CSA target-target similarity network and the miRNA-target associations through the graph convolution network on the heterogeneous graph \(A_{H}\). Specifically, we set the input graph G as
\[ G = \begin{bmatrix} \mu S^{m} & A \\ A^{T} & \mu S^{n} \end{bmatrix} \quad (6) \]
where \(\mu\) is a penalty factor that controls the contribution of the similarity in MTAGCN’s propagation process, \(S^{m}\) is the CSA miRNA-miRNA similarity matrix and \(S^{n}\) is the target-target similarity matrix. To initialize the embeddings, we introduce graph convolution into the latent factor model in light of the nature of ‘miRNA-target’ associations, and the embedding matrix is initialized as
\[ H^{\left( 0 \right)} = \begin{bmatrix} 0 & A \\ A^{T} & 0 \end{bmatrix} \quad (7) \]
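With the above setting, the MTAGCN encoder for the first layer can be defined as
\[ H^{\left( 1 \right)} = \sigma \left( D^{-\frac{1}{2}} G D^{-\frac{1}{2}} H^{\left( 0 \right)} W^{\left( 0 \right)} \right) \quad (8) \]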
where \(H^{\left( 1 \right)} \in {\text{R}}^{{\left( {M + N} \right) \times k}}\) denotes the first-layer node embeddings in the heterogeneous matrix \(A_{H}\), k is the embedding dimensionality and \(W^{\left( 0 \right)} \in {\text{R}}^{{\left( {M + N} \right) \times k}}\) is the trainable weight matrix of the first layer. The MTAGCN encoders for subsequent layers follow Formula (5), with the input graph G defined above. After L iterations, L k-dimensional CSA miRNA and target embeddings are obtained. Furthermore, we introduce SELU (scaled exponential linear unit) [37] as the activation function in the MTAGCN graph convolution layers to accelerate the learning procedure and enhance generalization performance.
Different layers of embeddings capture different structural information. For example, the first layer obtains direct edge information, and the other layers obtain multi-hop neighbor information by iteratively updating the embeddings [38, 39]. Considering that the embeddings in different layers contribute differently, we introduce a self-attention mechanism that adaptively combines them and yields the final embeddings of CSA miRNAs and targets as \(\begin{bmatrix} H_{I} \\ H_{G} \end{bmatrix} = \sum\nolimits_{l} a_{l} H^{\left( l \right)}\), where \(H_{I} \in R^{M \times k}\) is the final embedding of the miRNAs, \(H_{G} \in R^{N \times k}\) is the final embedding of the targets, and \(a_{l}\) is learned automatically by a single-layer feedforward network.
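The encoder described in this subsection can be sketched as a NumPy forward pass. Several details here are assumptions rather than the authors' implementation: the attention weights are modelled as a softmax over one learnable scalar per layer (`att_logits`), the weights are random instead of trained, dropout is omitted, and the similarity matrices are passed in as given.

```python
import numpy as np

SELU_ALPHA, SELU_SCALE = 1.6732632423543772, 1.0507009873554805


def selu(x):
    # expm1 is only evaluated on the non-positive part to avoid overflow.
    return SELU_SCALE * np.where(x > 0, x, SELU_ALPHA * np.expm1(np.minimum(x, 0)))


def normalize(G):
    deg = G.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    return inv_sqrt[:, None] * G * inv_sqrt[None, :]


def encode(A, S_m, S_n, weights, att_logits, mu=0.06):
    M, N = A.shape
    G = np.block([[mu * S_m, A], [A.T, mu * S_n]])                  # input graph
    H = np.block([[np.zeros((M, M)), A], [A.T, np.zeros((N, N))]])  # H^(0)
    G_hat = normalize(G)
    layers = []
    for W in weights:                     # one weight matrix per layer
        H = selu(G_hat @ H @ W)
        layers.append(H)
    a = np.exp(att_logits - att_logits.max())
    a = a / a.sum()                       # attention weights sum to 1
    H_final = sum(a_l * H_l for a_l, H_l in zip(a, layers))
    return H_final[:M], H_final[M:], a


rng = np.random.default_rng(0)
M, N, k, L = 2, 3, 4, 3
A = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
weights = [rng.normal(scale=0.1, size=(M + N, k))]
weights += [rng.normal(scale=0.1, size=(k, k)) for _ in range(L - 1)]
H_I, H_G, a = encode(A, np.eye(M), np.eye(N), weights, att_logits=np.zeros(L))
```

With all attention logits equal, the combination reduces to a plain average of the layer embeddings; training would move the weights away from this uniform starting point.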
To reconstruct the adjacency matrix for CSA miRNA-target associations, a bilinear decoder \(A^{\prime} = {\text{f}}\left( {H_{I} ,H_{G} } \right)\) is built as follows:
\[ A^{\prime} = {\text{sigmoid}}\left( H_{I} W^{\prime} H_{G}^{T} \right) \quad (9) \]
where \(W^{\prime} \in R^{k \times k}\) is the trainable matrix. We denote by \(A_{ij}^{\prime}\) the predicted score of the association between CSA miRNA \(r_{i}\) and target \(t_{j}\), which is given by the (i, j)th entry of \(A^{\prime}\).
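A bilinear decoder of this shape is short enough to show in full. The sketch assumes a sigmoid output so that the scores lie in (0, 1) and can be used with a cross-entropy loss; the function name and the random toy embeddings are hypothetical.

```python
import numpy as np


def bilinear_decoder(H_I, H_G, W_prime):
    """Score every miRNA-target pair: sigmoid(H_I W' H_G^T), shape (M, N)."""
    logits = H_I @ W_prime @ H_G.T
    return 1.0 / (1.0 + np.exp(-logits))


rng = np.random.default_rng(1)
M, N, k = 2, 3, 4
H_I = rng.normal(size=(M, k))      # toy miRNA embeddings
H_G = rng.normal(size=(N, k))      # toy target embeddings
W_prime = rng.normal(scale=0.1, size=(k, k))
A_pred = bilinear_decoder(H_I, H_G, W_prime)
```

Entry `A_pred[i, j]` plays the role of the predicted score for miRNA i and target j.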
Optimization
In the dataset with M CSA miRNAs and N targets, the known miRNA-target association pairs are taken as the set of positive pairs \(\gamma^{+}\) and all other pairs as the set of negative pairs \(\gamma^{-}\). Although differentiating the two types of miRNA-target pairs is a binary classification problem, the number of negative miRNA-target pairs is much higher than that of positive pairs. Therefore, MTAGCN learns its parameters by minimizing a weighted cross-entropy loss function:
\[ Loss = - \frac{1}{M \times N}\left( \lambda \sum_{\left( i,j \right) \in \gamma^{+}} \log A_{ij}^{\prime} + \sum_{\left( i,j \right) \in \gamma^{-}} \log \left( 1 - A_{ij}^{\prime} \right) \right) \quad (10) \]
where (i, j) is the instance for CSA miRNA \(r_{i}\) and target \(t_{j}\), \(\lambda = \frac{\left| \gamma^{-} \right|}{\left| \gamma^{+} \right|}\), and \(\left| \gamma^{+} \right|\) and \(\left| \gamma^{-} \right|\) denote the numbers of positive and negative pairs, respectively. The balance factor \(\lambda\) emphasizes the known associations and decreases the impact of data imbalance.
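The effect of the balance factor can be checked numerically with a small sketch. The normalization by the number of pairs and the clipping constant `eps` are assumptions; only the weighting of positive pairs by λ = |γ⁻| / |γ⁺| is taken from the text.

```python
import numpy as np


def weighted_cross_entropy(A, A_pred, eps=1e-12):
    """Cross-entropy in which positive pairs are up-weighted by |neg| / |pos|."""
    pos = A == 1
    lam = (~pos).sum() / pos.sum()              # balance factor lambda
    P = np.clip(A_pred, eps, 1.0 - eps)         # avoid log(0)
    total = lam * np.log(P[pos]).sum() + np.log(1.0 - P[~pos]).sum()
    return -total / A.size


A = np.array([[1, 0],
              [0, 0]])
A_pred = np.full(A.shape, 0.5)
loss = weighted_cross_entropy(A, A_pred)
```

Here λ = 3, so the single positive pair contributes as much to the loss as the three negative pairs together.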
The Xavier initialization method [40] is used to randomly initialize all trainable weight matrices. Then, as shown in Algorithm 1 (row 9), we use the Adam optimizer [41] for the optimization. To balance the training speed and the experimental result, we also use a simple cyclic learning rate [42] during the optimization, which varies between 0.01 and 0.1. Furthermore, we introduce fine-grained edge dropout [43] and coarse-grained node dropout [44] in the graph convolution layers to prevent overfitting. The fine-grained edge dropout is applied to the convolution layers and dense layers and randomly drops out edges, while the coarse-grained node dropout efficiently enforces dropout at the node level.
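One common form of cyclic learning rate is the triangular policy, sketched below for the range 0.01 to 0.1 mentioned above. The triangular shape and the half-cycle length of 50 steps are assumptions chosen for illustration; the text does not specify the exact schedule.

```python
import math


def triangular_lr(step, base_lr=0.01, max_lr=0.1, half_cycle=50):
    """Learning rate that rises linearly to max_lr and falls back to base_lr."""
    cycle = math.floor(1 + step / (2 * half_cycle))
    x = abs(step / half_cycle - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# One value per epoch for a 500-epoch run.
schedule = [triangular_lr(epoch) for epoch in range(500)]
```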
Negative sampling
Recent work usually focuses on positive sampling, while strategies for negative sampling remain insufficiently explored. However, many studies have theoretically proved that negative sampling is as important as positive sampling in determining the optimization objective and the resulting variance [45]. Hence, negative sampling has been widely applied in many fields for its simplicity and efficiency, such as natural language processing [46], computer vision [47], recommender systems [48] and graph embedding [49]. Inspired by a previous study [50], we adopted three negative sampling strategies: random negative sampling, sampling by CSA miRNA (SCM) and sampling by CSA target (SCT).
For random negative sampling, the negative samples were drawn at random from all negative samples. Furthermore, we proposed two negative sampling methods, SCM and SCT, which are similar to each other; the details are shown in Algorithm 2. For SCM/SCT, we first counted the positive samples of each miRNA/target. The negative samples were then drawn for the corresponding miRNA/target. For the unbalanced task, we executed SCM/SCT in a loop to obtain enough negative samples. Unlike random negative sampling, which has no regularity, the other two methods sample with respect to each CSA miRNA or target. It is worth pointing out that SCM/SCT ensures that every miRNA/target is sampled, greatly increasing the sampling range of negative samples. In the following, we compare the above sampling strategies.
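Since Algorithm 2 is not reproduced here, the following is only one possible reading of SCM and SCT: for every miRNA (or target), draw as many unassociated partners as it has known associations. The function names, the cap when too few candidates exist and the toy matrix are assumptions.

```python
import numpy as np


def sample_by_row(A, rng):
    """For each row entity, draw as many negatives as it has positives."""
    negatives = []
    for i in range(A.shape[0]):
        n_pos = int(A[i].sum())
        candidates = np.flatnonzero(A[i] == 0)      # unassociated partners
        n_draw = min(n_pos, candidates.size)
        if n_draw == 0:
            continue
        for j in rng.choice(candidates, size=n_draw, replace=False):
            negatives.append((i, int(j)))
    return negatives


def scm(A, rng):
    """Sampling by miRNA: rows of A are miRNAs."""
    return sample_by_row(A, rng)


def sct(A, rng):
    """Sampling by target: reuse the row sampler on the transpose."""
    return [(i, j) for j, i in sample_by_row(A.T, rng)]


rng = np.random.default_rng(0)
A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 0]])
neg_scm = scm(A, rng)
neg_sct = sct(A, rng)
```

Both functions return (miRNA index, target index) pairs. For the 1:20 unbalanced setting, the sampler could be called repeatedly and the results merged, mirroring the loop mentioned above.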
Results and discussions
In this section, we first briefly introduce the experimental setup. Next, we evaluate the performance of the proposed MTAGCN model and the effect of the layer attention mechanism, and then compare our model with five machine learning methods and three existing link/association prediction methods on the balanced and unbalanced tasks (Table 2).
Experimental setting
To evaluate the effectiveness of our model, we performed fivefold cross validation on the two tasks. We randomly divided the known miRNA-target associations into five subsets of equal size. In each fold, 80% of the known miRNA-target associations were used for training and the remaining 20% for testing. We employed the AUPR (area under the precision-recall curve) and the AUC (area under the ROC curve), which are widely used for pairwise link prediction [51], as the primary metrics during cross validation. In addition, we calculated other metrics, i.e. recall, specificity, precision, ACC and F1-score.
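The fold construction can be sketched as follows: the positive pairs are shuffled once and cut into five parts, each serving as the test positives in turn. The function name, the fixed seed and the toy matrix are assumptions; negative samples would be added to each fold separately.

```python
import numpy as np


def five_fold_positive_splits(A, n_folds=5, seed=0):
    """Yield (train_pairs, test_pairs) index arrays over the known associations."""
    rng = np.random.default_rng(seed)
    pos = np.argwhere(A == 1)          # one (miRNA, target) row per association
    rng.shuffle(pos)                   # shuffles rows in place
    folds = np.array_split(pos, n_folds)
    for f in range(n_folds):
        train = np.concatenate([folds[g] for g in range(n_folds) if g != f])
        yield train, folds[f]


A = np.zeros((4, 5), dtype=int)
A.flat[:10] = 1                        # toy matrix with 10 known associations
splits = list(five_fold_positive_splits(A))
```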
We set the embedding dimensionality k to 64 based on the parameter sensitivity analysis. The layer number L, the initial learning rate lr, the coarse-grained node dropout α and the fine-grained edge dropout β were set to 3, 0.01, 0.6 and 0.6, respectively. In addition, the total number of training epochs γ was set to 500, and the penalty factor μ was set to 0.06. Our experiment code was implemented in the open-source machine learning framework TensorFlow. All experiments were conducted on an Ubuntu 20.04 operating system with an NVIDIA GeForce RTX 3090 GPU and 32 GB of memory.
The influence of different heterogeneous networks
MTAGCN takes advantage of the CSA miRNA-target heterogeneous network to construct the model. We built the heterogeneous network by aggregating miRNA-miRNA similarities, target-target similarities and known miRNA-target associations. Since we took into account four similarity measures, MTAGCN could be trained on various heterogeneous networks, which may affect its predictive ability.
MTAGCN models based on heterogeneous networks with different miRNA-miRNA and target-target similarities were evaluated by fivefold cross validation, and Table 3 shows the corresponding results. The Jaccard index achieved slightly better performance than the other similarity measures, and the small differences among the measures indicate that our model is robust to the choice of similarity. Based on this analysis, we ultimately employed the Jaccard index to calculate the CSA target-target similarity and the CSA miRNA-miRNA similarity. In the following study, the heterogeneous network was constructed by fusing the two similarity networks and the CSA miRNA-target associations.
Analysis of negative sampling
As mentioned above, we adopted three negative sampling strategies motivated by previous studies. We tested our model with these sampling strategies and examined how they influence the performance of MTAGCN. Table 4 shows that SCT achieves better results than both SCM and random negative sampling. This may be because the number of targets is much larger than that of miRNAs, resulting in a larger sampling range and reducing the sampling imbalance. We therefore used the SCT strategy, setting the ratio of positive to negative samples to 1:1 (balanced task) and 1:20 (unbalanced task) in the follow-up test sets.
Results of MTAGCN
To develop MTAGCN, we built models that use the embeddings of individual layers, denoted as MTAGCN-L1, MTAGCN-L2 and MTAGCN-L3. Table 5 shows the performance of these models under fivefold cross validation. MTAGCN-L1 and MTAGCN-L2 performed better than MTAGCN-L3, showing that the lower layers capture more information than the higher layer because of over-smoothing. However, MTAGCN, which combines the embeddings of all three layers, produced the best results on the balanced task.
The lth layer of MTAGCN captures the lth-order proximity between nodes, and the attention weights represent the relative contributions of the corresponding convolution layers. We implemented 20 runs of 5-CV, and Fig. 2 visualizes the attention weights of the convolution layers. Different convolution layers have different weights, and the weight of the lower layer is greater than those of the higher layers, revealing that lower-order proximity is more important than higher-order proximity. This also helps to explain the performance of MTAGCN-L1, MTAGCN-L2 and MTAGCN-L3 (Table 5).
Furthermore, we considered MTAGCN-AVE and MTAGCN-CON, which integrate the embeddings from different convolution layers. MTAGCN-AVE averages the embeddings of the different layers with equal weights, whereas MTAGCN-CON directly stacks the embeddings of the three layers. The results in Table 5 indicate that MTAGCN with the attention mechanism achieved better performance than MTAGCN-AVE and MTAGCN-CON. Additional file 1: Table S1 shows the results for the unbalanced task, from which similar conclusions can be drawn.
Comparison with the machine learning methods
To investigate the performance of our proposed MTAGCN model for CSA miRNA-target association prediction, we compared it with several classic machine learning algorithms, including random forest (RF), extremely randomized tree (ERT), decision tree (DT), Gaussian naïve Bayes (GNBS) and deep neural network (DNN). The results of these machine learning models on the balanced and unbalanced tasks are shown in Fig. 3.
According to the results, MTAGCN outperforms all the classic machine learning methods, most strikingly on the balanced task in Fig. 3 (A). On the unbalanced task, the accuracy and specificity of MTAGCN are lower than those of most classic machine learning models, but the primary metrics AUPR and AUC are higher. Accuracy and specificity are threshold-based metrics, which are greatly affected by data imbalance [52]. Overall, MTAGCN performs better than the compared methods. The classic machine learning methods all have low AUPR, F1 and recall, which indicates that the proposed model produces more robust performance across the two tasks.
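The sensitivity of accuracy and specificity to class imbalance can be seen with simple arithmetic on the unbalanced test set sizes given earlier (749 positives, 14,980 negatives). The classifier below is a hypothetical degenerate one that labels every pair as negative; it is not one of the compared methods.

```python
n_pos, n_neg = 749, 14_980

# Confusion counts when every pair is predicted negative.
tp, fn = 0, n_pos
tn, fp = n_neg, 0

accuracy = (tp + tn) / (n_pos + n_neg)      # 20/21, about 0.952
specificity = tn / (tn + fp)                # 1.0
recall = tp / (tp + fn)                     # 0.0
```

Such a classifier reaches about 95% accuracy and perfect specificity while finding no true association, which is why ranking-based metrics such as AUPR and AUC are treated as primary on the unbalanced task.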
Comparison with the state-of-the-art methods
As mentioned before, few existing methods have been developed specifically to solve the CSA miRNA-target association prediction problem. Therefore, we compared MTAGCN with three state-of-the-art approaches proposed to address other association prediction tasks in computational biology.

GCMDR [53] constructed a graph convolutional network-based model to identify miRNA-drug resistance relationships.

GATMDA [54] proposed a graph attention network model with inductive matrix completion to predict human microbe-disease associations.

GCNMDA [55] deployed a conditional random field on the graph convolution network to predict human microbe-drug associations.
We compared them with our proposed model under the same experimental conditions, on both the balanced and unbalanced tasks. The results are shown in Fig. 4. Among all the methods, MTAGCN achieves the best performance on the balanced task in Fig. 4 (A). On the unbalanced task, although GCMDR and GCNMDA have slightly higher accuracy and specificity than MTAGCN, MTAGCN performs better on the other metrics in Fig. 4 (B). As mentioned above, accuracy and specificity are greatly affected by data imbalance. In addition, MTAGCN outperformed all the compared deep learning methods on most evaluation metrics. Furthermore, to explain why GCMDR obtained such low F1, recall and precision, we examined its outputs and found that the predicted scores of the true positive samples are almost all close to 0. This suggests that the GCMDR model is not robust for the CSA miRNA-target prediction problem.
Parameter sensitivity
Several important parameters influence the performance of our model, such as the coarse-grained node dropout rate α, the fine-grained edge dropout rate β, the embedding dimensionality k and the total number of training epochs T. To assess the parameter sensitivity, we evaluated the influence of each parameter using fivefold CV on the balanced task. The node dropout rate α plays an important role in our model. We varied α from 0.1 to 0.6 with a step of 0.1. As shown in Fig. 5, the best performance is achieved when α = 0.6, and a small value of α harms the model performance. β is the regular dropout rate of the edges. We evaluated the model by varying β from 0.1 to 0.6 with a step of 0.1. From Fig. 5, we can conclude that this parameter has only a slight influence on the model performance, which indicates that our model is robust to the regular dropout rate β. In addition, we used k to control the dimensionality of the embeddings. In our experiment, we chose k from {8, 16, 32, 64, 128, 256}. Fig. 5 shows that the best performance is achieved when k is 64 and that the performance decreases as k increases further. Lastly, we considered the influence of the total number of training epochs T. The results in Fig. 5 show that our model is robust to the number of training epochs: the performance first increases slightly and then decreases, with 500 epochs achieving the best performance.
Case study
To verify the performance of the proposed model on the CSA miRNA-target prediction task, we conducted case studies on CSA miRNAs and their associated targets. csnMIR156j_5p is a conserved miRNA in the leaf and root degradomes of CSA, which plays an important role in organ/tissue-specific physiological and developmental processes [7]. It is highly expressed and is associated with photosynthesis and transmembrane transport, regulating the targets CSA019508.1 and CSA015924.1, respectively. Some studies have also shown that this miRNA can bind to the target CSA018458.1 to inhibit the secretion of resistance proteins [56]. csnMIR319e_5p is a miRNA that influences the expression of genes related to ATPase activity and the ATP metabolic process. For example, it can combine with the targets TEA00633.1 and TEA000574.1 to reduce CSA respiration [31]. csnMIR390b_3p is a conserved miRNA related to the structural constituent of ribosome and to oxygen-containing compounds. Recent studies show a close relationship between this miRNA and CSA photosynthesis when it targets CSA024193.1 and CSA016339.1 [56]. However, revealing the mechanisms of miRNAs remains a huge challenge because of their functional complexity. Table 6 lists the results of the three case studies. All three show strong prediction performance, demonstrating that the proposed MTAGCN model is capable of predicting undiscovered potential miRNA-target associations for CSA miRNAs and targets.
Conclusion
In this paper, we proposed a novel deep learning framework, named MTAGCN, based on a graph convolution network with layer attention for CSA miRNA-target association prediction. Compared with existing methods that utilize topological graphs, MTAGCN integrates the graph information of the heterogeneous network built from CSA miRNA-target associations, the CSA miRNA-miRNA similarity network and the CSA target-target similarity network. Furthermore, MTAGCN adaptively combines the embeddings of different convolution layers. Extensive experimental results demonstrate that MTAGCN outperforms the existing link/association prediction methods in predicting CSA miRNA-target associations.
Although our model achieves good prediction performance, there is still room to improve MTAGCN. Because the features extracted from the similarity networks are noisy, the prediction results can be further improved. Graph construction and multi-source feature fusion are fast-growing research areas, and advances in these methods continue to boost model performance. For later versions of MTAGCN, we aim to work closely with other research groups and to develop the model on more experimentally verified CSA miRNA-target associations.
Availability of data and materials
The source code and processed data are freely available at https://github.com/haisonF/MTAGCN. The CSA miRNA information is available in Supplementary Table 23 of Suo et al., https://ars.elscdn.com/content/image/1s2.0S0888754320320188mmc1.xlsx. The CSA target information is available online at http://teacon.wchoda.com.
References
Namita P, Mukesh R, Vijay KJ. Camellia sinensis (green tea): a review. Global J Pharmacol. 2012;6:52–9.
Gong AD, Lian SB, Wu NN, Zhou YJ, Zhao SQ, Zhang LM, et al. Integrated transcriptomics and metabolomics analysis of catechins, caffeine and theanine biosynthesis in tea plant (Camellia sinensis) over the course of seasons. BMC Plant Biol. 2020;2:20.
Chen D, Chen G, Sun Y, Zeng X, Ye H. Physiological genetics, chemical composition, health benefits and toxicology of tea (Camellia sinensis L.) flower: a review. Food Res Int. 2020;137:109584.
Xia EH, Tong W, Wu Q, Wei S, Zhao J, Zhang ZZ, et al. Tea plant genomics: achievements, challenges and perspectives. Hortic Res. 2020;7:7.
Li H, Lin Q, Yan M, Wang M, Wang P, Zhao H, et al. Relationship between secondary metabolism and miRNA for important flavor compounds in different tissues of tea plant (Camellia sinensis) as revealed by genome-wide miRNA analysis. J Agric Food Chem. 2021;69:2001–12.
Liu ZW, Li H, Liu JX, Wang Y, Zhuang J. Integrative transcriptome, proteome, and microRNA analysis reveals the effects of nitrogen sufficiency and deficiency conditions on theanine metabolism in the tea plant (Camellia sinensis). Hortic Res. 2020;7:44.
Suo A, Lan Z, Lu C, Zhao Z, Pu D, Wu X, et al. Characterizing microRNAs and their targets in different organs of Camellia sinensis var. assamica. Genomics. 2021;113:159–70.
Gottlieb A, Stein GY, Ruppin E, Sharan R. PREDICT: a method for inferring novel drug indications with application to personalized medicine. Mol Syst Biol. 2011;7:496.
Oh M, Ahn J, Yoon Y. A network-based classification model for deriving novel drug-disease associations and assessing their molecular actions. PLoS ONE. 2014;9:e111668.
Zhang W, Yue X, Chen Y, Lin W, Li B, Liu F, et al. Predicting drug-disease associations based on the known association bipartite network. In: 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE; 2017. p. 503–9.
Lu L, Yu H. DR2DI: a powerful computational tool for predicting novel drug-disease associations. J Comput Aided Mol Des. 2018;32:633–42.
Wu G, Liu J, Wang C. Semi-supervised graph cut algorithm for drug repositioning by integrating drug, disease and genomic associations. In: 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE; 2016. p. 223–8.
Liang X, Zhang P, Yan L, Fu Y, Peng F, Qu L, et al. LRSSL: predict and interpret drug-disease associations based on data integration using sparse subspace learning. Bioinformatics. 2017;33:1187–96.
Zhang W, Yue X, Lin W, Wu W, Liu R, Huang F, et al. Predicting drug-disease associations by using similarity constrained matrix factorization. BMC Bioinform. 2018;19:233.
Yang M, Luo H, Li Y, Wu FX, Wang J. Overlap matrix completion for predicting drug-associated indications. PLoS Comput Biol. 2019;15:e1007541.
Luo H, Li M, Wang S, Liu Q, Li Y, Wang J. Computational drug repositioning using low-rank matrix approximation and randomized algorithms. Bioinformatics. 2018;34:1904–12.
Li Y, Wu FX, Ngom A. A review on machine learning principles for multi-view biological data integration. Brief Bioinform. 2018;19:325–40.
Carrio A, Sampedro C, Rodriguez-Ramos A, Campoy P. A review of deep learning methods and applications for unmanned aerial vehicles. J Sens. 2017;2017:44589.
Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS. A Comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst. 2021;32:4–24.
Han P, Yang P, Zhao P, Shang S, Liu Y, Zhou J, et al. GCN-MF: disease-gene association identification by graph convolutional networks and matrix factorization. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2019. p. 705–13.
Zhang Y, Chen L, Li S. CIPHER-SC: disease-gene association inference using graph convolution on a context-aware network with single-cell data. IEEE/ACM Trans Comput Biol Bioinform. 2020;2:8886.
Purkayastha S, Mondal I, Sarkar S, Goyal P, Pillai JK. Drug-drug interactions prediction based on drug embedding and graph autoencoder. In: 2019 IEEE 19th International Conference on Bioinformatics and Bioengineering (BIBE). IEEE; 2019. p. 547–52.
Tran T, Kavuluru R, Kilicoglu H. Attention-gated graph convolutions for extracting drug interaction information from drug labels. ACM Trans Comput Healthc. 2021;2:44589.
Li C, Liu H, Hu Q, Que J, Yao J. A novel computational model for predicting microRNA–disease associations based on heterogeneous graph convolutional networks. Cells. 2019;8:977.
Li J, Zhang S, Liu T, Ning C, Zhang Z, Zhou W. Neural inductive matrix completion with graph convolutional networks for miRNA-disease association prediction. Bioinformatics. 2020;36:2538–46.
Kipf TN, Welling M. Semi-Supervised Classification with Graph Convolutional Networks. In: International Conference on Learning Representations (ICLR). 2017.
Yu Z, Huang F, Zhao X, Xiao W, Zhang W. Predicting drug–disease associations through layer attention graph convolutional network. Brief Bioinform. 2021;22:243.
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, et al. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform. 2021. https://doi.org/10.1093/bib/bbab165.
Yang JH, Chen CM, Wang CJ, Tsai MF. HOP-rec: high-order proximity for implicit recommendation. In: Proceedings of the 12th ACM Conference on Recommender Systems. 2018. p. 140–4.
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. In: Advances in Neural Information Processing Systems. 2017. p. 5998–6008.
Zhang R, Ma Y, Hu X, Chen Y, He X, Wang P, et al. TeaCoN: a database of gene co-expression network for tea plant (Camellia sinensis). BMC Genomics. 2020;21:461.
Xia EH, Li FD, Tong W, Li PH, Wu Q, Zhao HJ, et al. Tea Plant Information Archive: a comprehensive genomics and bioinformatics platform for tea plant. Plant Biotechnol J. 2019;17:1938–53.
Chen Z, Zhao P, Li F, Marquez-Lago TT, Leier A, Revote J, et al. iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data. Brief Bioinform. 2020;21:1047–57.
Huang F, Qiu Y, Li Q, Liu S, Ni F. Predicting drug-disease associations via multi-task learning based on collective matrix factorization. Front Bioeng Biotechnol. 2020;218:44005.
Fu H, Huang F, Liu X, Qiu Y, Zhang W. MVGCN: data integration through multi-view graph convolutional network for predicting links in biomedical bipartite networks. Bioinformatics. 2022;38:426–34.
Defferrard M, Bresson X, Vandergheynst P. Convolutional neural networks on graphs with fast localized spectral filtering. Adv Neural Inf Process Syst. 2016;29:3844–52.
Trottier L, Giguere P, Chaib-Draa B. Parametric exponential linear unit for deep convolutional neural networks. In: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE; 2017. p. 207–14.
Wang X, He X, Wang M, Feng F, Chua TS. Neural graph collaborative filtering. In: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2019. p. 165–74.
He X, Deng K, Wang X, Li Y, Zhang Y, Wang M. LightGCN: Simplifying and powering graph convolution network for recommendation. In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval. 2020. p. 639–48.
Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings; 2010. p. 249–56.
Kingma DP, Ba J. Adam: A Method for Stochastic Optimization. In: International Conference on Learning Representations (ICLR). 2015. p. 13.
Smith LN. Cyclical learning rates for training neural networks. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE; 2017. p. 464–72.
Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J Mach Learn Res. 2014;15:1929–58.
van den Berg R, Kipf TN, Welling M. Graph Convolutional Matrix Completion. 2017.
Yang Z, Ding M, Zhou C, Yang H, Zhou J, Tang J. Understanding negative sampling in graph representation learning. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2020. p. 1666–76.
Chen L, Yuan F, Jose JM, Zhang W. Improving negative sampling for word representation using self-embedded features. In: Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. 2018. p. 99–107.
Wu CY, Manmatha R, Smola AJ, Krahenbuhl P. Sampling matters in deep embedding learning. In: Proceedings of the IEEE International Conference on Computer Vision. 2017. p. 2840–8.
Ding J, Quan Y, He X, Li Y, Jin D. Reinforced Negative Sampling for Recommendation with Exposure Data. In: IJCAI. 2019. p. 2230–6.
Grover A, Leskovec J. node2vec: Scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016. p. 855–64.
Chen T, Sun Y, Shi Y, Hong L. On sampling strategies for neural network-based collaborative filtering. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017. p. 767–76.
Zhang Y, Chen M, Li A, Cheng X, Jin H, Liu Y. LDAI-ISPS: LncRNA–disease associations inference based on integrated space projection scores. Int J Mol Sci. 2020;21:1508.
Mellor A, Boukir S, Haywood A, Jones S. Exploring issues of training data imbalance and mislabelling on random forest performance for large area land cover classification using the ensemble margin. ISPRS J Photogramm Remote Sens. 2015;105:155–68.
Huang Y, Hu P, Chan KCC, You ZH. Graph convolution for predicting associations between miRNA and drug resistance. Bioinformatics. 2020;36:851–8.
Long Y, Luo J, Zhang Y, Xia Y. Predicting human microbe-disease associations via graph attention networks with inductive matrix completion. Brief Bioinform. 2021;2:22.
Long Y, Wu M, Kwoh CK, Luo J, Li X. Predicting human microbe-drug associations via graph convolutional network with conditional random field. Bioinformatics. 2020;36:4918–27.
Zhang Y, Zhu X, Chen X, Song C, Zou Z, Wang Y, et al. Identification and characterization of cold-responsive microRNAs in tea plant (Camellia sinensis) and their targets using high-throughput sequencing and degradome analysis. BMC Plant Biol. 2014;14:1–18.
Acknowledgements
We would like to thank all authors of the cited references.
Funding
This work was supported by the grants from the National Natural Science Foundation of China (62102004), the Natural Science Young Foundation of Anhui (2008085QF293), the Natural Science Young Foundation of Anhui Agricultural University (2019zd12) and the Introduction and Stabilization of Talent Project of Anhui Agricultural University (yj201932).
Author information
Contributions
HF: data curation, conceptualization, methodology, visualization, writing original draft. YX: methodology, visualization, conceptualization. XW: data curation, methodology. WX: visualization, conceptualization. ZY: supervision and editing, conceptualization, funding acquisition. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
Authors have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1. Performance of MTAGCN based on different embeddings for the unbalanced task.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Feng, H., Xiang, Y., Wang, X. et al. MTAGCN: predicting miRNA-target associations in Camellia sinensis var. assamica through graph convolution neural network. BMC Bioinformatics 23, 271 (2022). https://doi.org/10.1186/s12859-022-04819-3
DOI: https://doi.org/10.1186/s12859-022-04819-3
Keywords
 CSA miRNA-target association prediction
 Deep learning
 Graph convolution network
 Layer attention mechanism