 Research
 Open access
Predicting miRNA-disease associations based on PPMI and attention network
BMC Bioinformatics volume 24, Article number: 113 (2023)
Abstract
Background
With the development of biotechnology and the accumulation of theoretical knowledge, many studies have found that microRNAs (miRNAs) play an important role in various diseases. Uncovering the potential associations between miRNAs and diseases helps to better understand the pathogenesis of complex diseases. However, traditional biological experiments are expensive and time-consuming. Therefore, it is necessary to develop more efficient computational methods for exploring underlying disease-related miRNAs.
Results
In this paper, we present a new computational method, called PATMDA, based on positive pointwise mutual information (PPMI) and attention network to predict miRNA-disease associations (MDAs). Firstly, we construct the heterogeneous MDA network and multiple similarity networks of miRNAs and diseases. Secondly, we perform random walk with restart and PPMI on the different similarity network views to get multi-order proximity features, and then obtain high-order proximity representations of miRNAs and diseases by applying a convolutional neural network to fuse the learned proximity features. Then, we design an attention network with neural aggregation to integrate the representations of a node and its heterogeneous neighbor nodes according to the MDA network. Finally, an inner product decoder is adopted to calculate the relationship scores between miRNAs and diseases.
Conclusions
PATMDA achieves superior performance over six state-of-the-art methods, with an area under the receiver operating characteristic curve of 0.933 and 0.946 on the HMDD v2.0 and HMDD v3.2 datasets, respectively. The case studies further demonstrate the validity of PATMDA for discovering novel disease-associated miRNAs.
Background
MicroRNAs (miRNAs) are a class of small endogenous non-coding RNAs. MiRNAs are approximately 22 nt in length and bind to the \(3^{\prime }\) untranslated region of target mRNAs mainly through sequence-specific base pairing, thereby participating in the regulation of target mRNA expression at the post-transcriptional level [1,2,3,4]. A growing number of studies have shown that mutation or abnormal expression of miRNAs is often linked to the development and progression of complex human diseases such as cancer [5, 6]. For example, miR-143 and miR-145 consistently show reduced stable levels of mature miRNAs in the adenoma and carcinoma stages of colorectal cancer [7]. In lung cancer, high hsa-mir-155 and low hsa-let-7a-2 expression are associated with poor survival, and both may be potential prognostic markers [8]. Therefore, it is necessary to reveal more underlying associations between miRNAs and diseases in order to understand pathogenesis and develop personalized therapies.
Experimental methods such as qRT-PCR [9], northern blotting [10], and microarray profiling [11] have been used to identify novel miRNAs associated with diseases. Although experimental methods yield highly accurate results, they usually require a relatively large investment of time and money, which makes them inefficient. Therefore, to facilitate the discovery of potential disease-related miRNAs, computational methods have been developed, which can be classified into three main types: similarity-based methods, machine learning-based methods, and deep learning-based methods.
Similarity-based methods rest on the assumption that functionally similar miRNAs tend to correlate with similar diseases and vice versa. Jiang et al. [12] put forward the first computational method for miRNA-disease association (MDA) prediction, which uses hypergeometric probability distributions to explore disease-related miRNAs. However, it was oversimplified in using a Boolean network to reflect the associations between diseases or between miRNAs, which may result in loss of information. Subsequently, HDMP [13] was proposed to assess the functional similarity between two miRNAs based on known MDAs and the semantic similarity of diseases, and to predict MDA scores based on the weighted k most similar neighbors. This approach overcomes the drawback of the Boolean network by calculating the similarities of miRNAs and diseases, but both methods consider only the direct neighbor information (local information) of the network and ignore its global information. Further, Chen et al. [14] devised a method based on global network similarity, which identifies potential disease-associated miRNAs by performing random walk with restart (RWR) on the disease similarity network. Shi et al. [15] combined a protein-protein interaction (PPI) network with RWR to predict underlying associations between miRNAs and diseases, which takes advantage of the association information between miRNAs or diseases and genes. Although these methods have gradually improved the performance of MDA prediction, they have difficulty predicting miRNAs associated with new diseases that have no known relevant miRNAs. To address this problem, Chen et al. [16] proposed employing regularized least squares to uncover novel MDAs, which is a semi-supervised and global approach.
Although some similarity-based methods attempt to improve the performance of identifying new MDAs, including the potential associations for new diseases and new miRNAs [17,18,19], they are sensitive to the quality of the constructed networks; for example, different similarity calculation methods may yield different results.
Machine learning-based methods are another class of computational methods that are often used for predicting MDAs. For example, EGBMMDA [20] employed the extreme gradient boosting machine to obtain the probability scores of relationships between miRNAs and diseases, and it was the first decision tree learning-based model for inferring candidate miRNAs. Chen et al. [21] designed a computational model that uses a filter-based method to select important features and employs random forest to discover disease-associated miRNAs. In addition, Zhang et al. [22] proposed a graph regularized generalized matrix factorization method to screen novel miRNAs that are related to diseases, which takes into account the neighborhood information of each node. NCMCMDA [23] combined neighborhood constraints with matrix completion to reconstruct the relationship matrix between miRNAs and diseases. However, the models above extract and fuse features at shallow levels and are therefore unable to learn complex latent associations from multi-source data.
In the last few years, deep learning has achieved satisfactory results in many domains, and as a result, using deep learning to predict molecular associations has become a hot topic. For example, SAEMDA [24] pre-trained a stacked autoencoder (SAE) with all MDA pairs, and then fine-tuned the SAE with an equal number of known and unobserved MDA pairs to determine potential disease-related miRNAs. Although deep learning-based methods [25,26,27] have achieved good performance on intermolecular relationship prediction, some of them ignore the information interaction between nodes on the heterogeneous network composed of different biological entities. Recently, to exploit known molecular relationship pairs for information fusion between different types of nodes, Long et al. [28] used graph attention networks with talking-heads to learn embeddings of microbes and diseases based on the microbe-disease association network, and GAEMDA [29], a graph autoencoder method, aggregates the neighborhood information of nodes based on known MDAs via an aggregator function and a multi-layer perceptron, which achieves heterogeneous information fusion. Nevertheless, most of these models consider only the first-order proximity of the nodes in simple integrated similarity networks, while ignoring the multi-hop neighborhood information in different similarity networks. Some studies [25, 30, 31] have shown that high-order neighborhood information in networks is important for learning embedding representations of nodes on homogeneous/heterogeneous networks. Therefore, to learn high-order proximity representations of nodes from different similarity networks and efficiently fuse information of different types of nodes, we develop a new end-to-end computational approach based on positive pointwise mutual information (PPMI) and attention network for predicting MDAs, called PATMDA. Specifically, our main contributions are summarized as follows:

We construct the MDA network and multiple miRNA and disease similarity networks, which are based on disease semantic similarity, miRNA functional similarity, and Gaussian interaction profile (GIP) kernel similarity for miRNAs and diseases.

To learn global structural information from the similarity network views, RWR and PPMI are utilized to obtain multi-order proximity features. Furthermore, we combine the high-order proximity representations obtained with a convolutional neural network (CNN) and the first-order proximity representations, which carry direct neighbor information.

To efficiently integrate structural features of different types of nodes, we design an attention network with neural aggregation, which learns the final representations of miRNA and disease nodes by fusing the representations of nodes and their heterogeneous neighbors based on the MDA network.

Our experimental results show PATMDA outperforms baseline methods in exploring novel MDAs.
Results and discussion
Datasets
In this work, we obtain human MDAs from HMDD v2.0 [32], which are confirmed by experimental evidence in the literature and comprise 5430 known MDAs among 495 miRNAs and 383 diseases. In addition, the newest version, HMDD v3.2 [33], is used to further validate the performance of the model; as in [25], 12 446 observed MDAs involving 853 miRNAs and 591 diseases are extracted. Directed acyclic graphs (DAGs) describing the semantic trees of diseases are downloaded from the Medical Subject Headings (MeSH) (https://www.nlm.nih.gov/mesh/).
Experimental setup
The PATMDA model is implemented in the PyTorch framework. The Xavier normal distribution is employed for the initialization of the transformation matrices. For the hyperparameters of the model, we set the number of RWR transition steps K to 3, the number of CNN filters \(C_{out}\) for miRNA and disease to 256, the dimensionality of the transformed feature \(f_{tran}\) to 256, the number of attention heads L to 2, and the learning rate to 0.0001.
In addition, common classification evaluation metrics are used to evaluate the performance of PATMDA for predicting MDAs, including the area under the receiver operating characteristic (ROC) curve (AUC), the area under the precision/recall (P–R) curve (AUPR), the area under the TruePositiveRate@k (TPR@k) curve (AUTPR@k), accuracy (Acc.), precision (Prec.), recall and F1-score. We also plot the ROC curve, P–R curve, and TPR@k curve, where the TPR@k curve depicts the proportion of all positive samples that are correctly predicted within the top k under different values of k [34]. It is worth noting that the calculation of evaluation metrics such as precision requires either negative samples or some assumptions about unknown samples. Since there are no proven uncorrelated miRNA-disease pairs, we assume that MDA pairs that have not been verified are negative samples. In the experiments, we take all known MDA pairs as positive samples and randomly select an equal number of unconfirmed MDA pairs as negative samples. We use 5-fold cross-validation (5CV) to evaluate the PATMDA model. Specifically, all samples are randomly divided into 5 equal parts, and each part in turn is treated as the test set while the others are used for training. In each round of 5CV, the GIP kernel similarity of miRNAs and diseases is recalculated based on the training set.
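The sampling and splitting protocol described above can be illustrated with a minimal sketch. The toy association matrix, the function names `balanced_samples` and `five_fold_splits`, and the seeds are illustrative assumptions and are not taken from the PATMDA implementation.

```python
import random

def balanced_samples(assoc, seed=0):
    """Return (positives, negatives) with equally many pairs of each kind.

    assoc is a 0/1 matrix (rows: miRNAs, columns: diseases). Unverified
    pairs are treated as negatives, following the assumption in the text.
    """
    rng = random.Random(seed)
    pos = [(i, j) for i, row in enumerate(assoc) for j, v in enumerate(row) if v == 1]
    unknown = [(i, j) for i, row in enumerate(assoc) for j, v in enumerate(row) if v == 0]
    neg = rng.sample(unknown, len(pos))  # requires len(unknown) >= len(pos)
    return pos, neg

def five_fold_splits(samples, n_folds=5, seed=0):
    """Yield (train, test) lists; every fold is the test set exactly once."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    folds = [shuffled[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        train = [s for g in range(n_folds) if g != f for s in folds[g]]
        yield train, folds[f]

# Hypothetical toy data: 3 miRNAs x 4 diseases with 5 known associations.
toy_assoc = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 0],
]
pos, neg = balanced_samples(toy_assoc)
labelled = [(pair, 1) for pair in pos] + [(pair, 0) for pair in neg]
splits = list(five_fold_splits(labelled))
```

In a faithful reproduction, the test-fold associations would be masked in the association matrix before the GIP kernel similarity is recomputed for each round, so that no test information leaks into the similarity views.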
Performance evaluation
Here, we evaluate the performance of PATMDA on the HMDD v2.0 dataset using 5CV. As shown in Table 1, PATMDA obtains a mean Acc. of 85.78\(\%\), Prec. of 85.27\(\%\), recall of 86.53\(\%\), and F1-score of 85.88\(\%\). In addition, Figs. 1, 2 and 3 show the ROC, P–R and TPR@k curves of the PATMDA model, respectively. PATMDA obtains a mean AUTPR@k of 72\(\%\), a mean AUC of 93.3\(\%\), which is the mean of the fold values 92.3\(\%\), 93.73\(\%\), 93.65\(\%\), 93.37\(\%\) and 93.47\(\%\), and a mean AUPR of 93.4\(\%\), which is the mean of 92.72\(\%\), 93.54\(\%\), 93.88\(\%\), 93.14\(\%\) and 93.73\(\%\).
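For reference, the threshold-based metrics and the fold averages quoted above can be reproduced with a short sketch. The function names are illustrative; the rank-based AUC shown here is the standard pairwise formulation and is not necessarily how the authors computed it.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return acc, prec, rec, f1

def auc_by_ranking(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Per-fold values as reported in the text (percent).
fold_auc = [92.3, 93.73, 93.65, 93.37, 93.47]
fold_aupr = [92.72, 93.54, 93.88, 93.14, 93.73]
mean_auc = sum(fold_auc) / len(fold_auc)
mean_aupr = sum(fold_aupr) / len(fold_aupr)
```

The fold averages computed this way round to the 93.3 and 93.4 reported in the text.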
Comparison with state-of-the-art methods
To further evaluate the performance of our proposed model, we compare the PATMDA model with six state-of-the-art computational models for predicting MDAs using 5CV on the HMDD v2.0 and HMDD v3.2 datasets, including MDHGI [35], ABMDA [36], NIMCGCN [27], DANEMDA [37], SAEMDA [24] and MINIMDA [31]. MDHGI is a method for identifying potential disease-related miRNAs by using matrix decomposition and heterogeneous graph inference [35]. ABMDA is an adaptive boosting-based method for uncovering underlying associations between miRNAs and diseases [36]. NIMCGCN is a computational method that combines neural inductive matrix completion and graph convolutional network to predict MDAs [27], and DANEMDA reveals latent MDAs based on deep attributed network embedding [37]. SAEMDA is a stacked autoencoder-based approach for prioritizing disease-related miRNAs [24], and MINIMDA discovers potential relationships between miRNAs and diseases by integrating mixed neighborhood information in multimodal networks [31]. For a fair comparison, we adopt the default parameters of the baseline models provided by the authors to obtain their AUC and AUPR.
As shown in Figs. 4 and 5, PATMDA achieves competitive performance on both datasets. On the HMDD v2.0 dataset, PATMDA achieves the highest AUC and AUPR, 93.3% and 93.4%, respectively, which may be because PATMDA fuses multi-order proximity representations from multiple similarity network views via CNN and aggregates heterogeneous structural information of miRNA and disease nodes via the attention network with neural aggregation. Compared with MDHGI, ABMDA, and SAEMDA, PATMDA takes into account the high-order proximity representations of nodes in different similarity networks and enhances the information interaction between heterogeneous nodes, instead of using only direct neighbor information from the integrated similarity network as these comparison methods do. Although DANEMDA considers the interaction between the association network structure and the similarity (attribute) information of miRNAs and diseases captured from diverse degrees of proximity, its performance is not as good as that of PATMDA, which suggests that the design of our model is more reasonable. In addition, although NIMCGCN utilizes high-order information from the integrated similarity network, its performance is not as good as that of PATMDA, which may be because NIMCGCN ignores the information fusion between miRNA and disease nodes. Though MINIMDA considers high-order information of nodes and information fusion between heterogeneous nodes, its performance is still worse than that of PATMDA, which may be because PATMDA can better capture and fuse information of nodes in different networks and has better representation ability. Furthermore, Fig. 5 shows that PATMDA achieves the highest AUC and AUPR, 94.6% and 94.68%, respectively, on the HMDD v3.2 dataset. In conclusion, the experimental results show that PATMDA is effective in exploring potential disease-related miRNAs.
Ablation experiments
We use an attention network with neural aggregation to efficiently aggregate information of the nodes and their heterogeneous neighbors, and obtain structural features of nodes containing first-order and high-order proximity from multiple similarity network views. To analyze the importance of the main components of our model, we design three variants of PATMDA (PATMDA_NP, PATMDA_Int, PATMDA_Gat) as comparison methods. PATMDA_NP does not consider the high-order proximity representations of nodes. PATMDA_Int, like most methods, uses GIP kernel similarity to fill in missing values of the other similarity, instead of considering each similarity view separately. PATMDA_Gat replaces the attention network with neural aggregation module by the standard graph attention network (GAT) [38], which does not effectively consider the importance of the nodes themselves. Figure 6 shows the evaluation results of PATMDA and its variant models under 5CV on the HMDD v2.0 dataset. Except for recall, where PATMDA is lower than PATMDA_NP, all indicators of PATMDA are significantly higher than those of the variant models. Comparing PATMDA_NP with PATMDA, the model obtains more structural information after combining high-order proximity information than when considering first-order proximity information alone. This result shows that high-order and first-order proximity representations contain different structural features, which means that high-order information can be used as a complement to first-order information. Comparing PATMDA_Int with PATMDA, it is more beneficial to consider the structural information in each similarity network separately when extracting the feature representations of miRNAs and diseases.
Comparing PATMDA_Gat with PATMDA, the model obtains more informative representations when neural aggregation is used to enhance the information interaction between a node and its neighbor nodes than when only the information of the neighbor nodes is considered, which indicates that the information of the node itself is also very important.
Case studies
To further validate the ability of the PATMDA model to discover potential disease-related miRNAs in practical applications, we conduct two different types of case studies based on the HMDD v2.0 dataset, and the dbDEMC [39] and HMDD v3.2 [33] datasets are utilized to identify the top 20 new associations between miRNAs and diseases.
In the first case study, we focus on detecting novel MDAs. Specifically, for each specific disease, we use all known MDA pairs and an equal number of association pairs randomly selected from unknown MDA pairs except those associated with the disease to train the PATMDA model, and further predict probability scores for unknown relationship pairs related to the disease. Furthermore, we rank potential miRNAs in descending order based on the predicted scores. We establish the case study for three diseases, including lymphoma, prostate neoplasms, and esophageal neoplasms. Tables 2, 3 and 4 respectively show the top 20 miRNA candidates associated with three diseases, which can be identified and validated by dbDEMC or HMDD v3.2 datasets.
In the second case study, we attempt to validate the usability of the PATMDA model for new diseases without observed associated miRNAs, where we take the relationship pairs between a specific disease and all miRNAs as the test set, and use the remaining known relationship pairs and a randomly selected equal number of unobserved relationship pairs related to other diseases as the training set. Similarly, we prioritize the top 20 underlying miRNAs according to the relationship scores predicted by PATMDA. Here we predict the associations between miRNAs and breast neoplasms, one of the most common malignancies in women. As shown in Table 5, the top 20 predicted breast neoplasm-related miRNAs are all confirmed by dbDEMC, and 12 of them are also verified by HMDD v3.2.
The results of the above case studies demonstrate that PATMDA performs well in screening latent MDAs and miRNA candidates associated with new diseases.
Conclusion
In this paper, we propose a novel computational approach named PATMDA, which combines PPMI and attention network with neural aggregation to identify unobserved associations between miRNAs and diseases. PATMDA not only considers the first-order neighbor information in different similarity network views, but also efficiently extracts high-order neighbor information from similarity views by using PPMI and CNN. To obtain more informative representations, we use an attention network with neural aggregation to integrate the structural information of heterogeneous nodes according to the MDA network. Comprehensive experiments show that our proposed PATMDA model is reliable and efficient in retrieving potential miRNA candidates for diseases, which may contribute to guiding biological experiments.
However, there are still some limitations that need to be further investigated in the future. First, although we consider the topology features from multiple similarity network views, how to maintain the consistency and complementarity of features learned from different similarity views is a topic worthy of future research. Second, in the similarity calculation of diseases and miRNAs, we hope to introduce more information to discover disease-related miRNAs; for example, miRNA sequence similarity and gene-based functional similarity of miRNAs and diseases may help in MDA prediction. In conclusion, more and more biological data sources provide convenience for predicting MDAs, but how to apply information from different data sources more effectively and rationally to improve the performance of methods for inferring potential miRNAs for diseases requires further exploration. In addition, we will also try to use PATMDA to identify non-coding RNAs such as lncRNAs and circRNAs that are related to diseases, and further design a more general method for predicting the relationships between non-coding RNAs and diseases.
Methods
In this work, we put forward a deep learning model based on PPMI and attention network for MDA prediction. As shown in Fig. 7, PATMDA mainly consists of the following parts: (i) RWR and PPMI are applied to various similarity network views to learn multi-order proximity representations, and a CNN is then employed to obtain high-order proximity representations by fusing the learned multi-order proximity features; (ii) based on the combined first-order and high-order proximity representations, the information of the nodes and their heterogeneous neighbors is integrated by the attention network with neural aggregation to obtain the final embeddings; (iii) an inner product decoder is used to predict the association probability scores between miRNAs and diseases.
Human MDAs
Based on known human MDAs, we get an association matrix \(A\in R^{nm*nd}\) between miRNAs and diseases, where nm and nd denote the number of miRNAs and diseases, respectively. When there is a verified relationship between miRNA i and disease j, \(A_{ij}\) is equal to 1, otherwise, it is 0. Further, we construct the heterogeneous association network including miRNA and disease nodes based on the relationship matrix, whose adjacency matrix \(G_{A}\) can be defined as follows:
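A minimal sketch of this construction, assuming the usual bipartite block form \(G_A = \begin{bmatrix} 0 & A \\ A^{T} & 0 \end{bmatrix}\) with miRNA nodes indexed before disease nodes. The node ordering and the function name `heterogeneous_adjacency` are illustrative assumptions.

```python
def heterogeneous_adjacency(assoc):
    """Build the (nm + nd) x (nm + nd) adjacency matrix of the MDA network.

    assoc is the nm x nd 0/1 association matrix A. miRNA i maps to node i
    and disease j maps to node nm + j (an assumed ordering).
    """
    nm, nd = len(assoc), len(assoc[0])
    size = nm + nd
    g = [[0] * size for _ in range(size)]
    for i in range(nm):
        for j in range(nd):
            if assoc[i][j]:
                g[i][nm + j] = 1   # upper-right block: A
                g[nm + j][i] = 1   # lower-left block: A transposed
    return g

# Hypothetical toy matrix: 3 miRNAs x 2 diseases.
toy_A = [[1, 0], [0, 1], [1, 1]]
toy_G = heterogeneous_adjacency(toy_A)
```

The two diagonal blocks stay zero because the association network contains only miRNA-disease edges; similarities within each node type are kept in the separate similarity views.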
Similarity measures
Disease semantic similarity
Disease semantic similarity can be calculated based on MeSH descriptors [13], in which the relationships between diseases can be described by DAGs. Many studies [24, 27, 40] have used DAGs to generate disease semantic similarities (DSSs). There are two different approaches to calculating DSSs. DSS1 is obtained based on the assumption that two diseases are more similar to each other if they share more ancestral nodes in the DAGs. Further considering that the disease that appears in more (or less) DAGs is more common (or specific), DSS2 assigns different semantic contribution values to the diseases in the same layer of the DAG. We compute these two DSSs between diseases according to the previous method [13, 41] and obtain the adjacency matrix \(G_d^{(1)}\) of the disease semantic similarity network by averaging them, which means the edge weight between two disease nodes is equal to their semantic similarity value.
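To make the first approach (DSS1) concrete, the sketch below assumes the widely used formulation in which a disease contributes 1 to itself and each ancestor receives the largest child contribution multiplied by a decay factor (0.5 is assumed here). The toy DAG, the decay value and the function names are illustrative assumptions; DSS2, which weights terms by how rarely they appear across DAGs, is not shown.

```python
def semantic_contributions(disease, parents, delta=0.5):
    """Contribution of every term in DAG(disease) to the disease's semantics.

    parents maps a term to its direct parent terms. The disease itself
    contributes 1.0; an ancestor keeps the largest decayed value reaching it.
    """
    contrib = {disease: 1.0}
    frontier = [disease]
    while frontier:
        next_frontier = []
        for node in frontier:
            for parent in parents.get(node, ()):
                value = delta * contrib[node]
                if value > contrib.get(parent, 0.0):
                    contrib[parent] = value
                    next_frontier.append(parent)
        frontier = next_frontier
    return contrib

def semantic_similarity(d1, d2, parents, delta=0.5):
    """Shared-ancestor contributions divided by the total semantic values."""
    c1 = semantic_contributions(d1, parents, delta)
    c2 = semantic_contributions(d2, parents, delta)
    shared = set(c1) & set(c2)
    numerator = sum(c1[t] + c2[t] for t in shared)
    return numerator / (sum(c1.values()) + sum(c2.values()))

# Hypothetical toy hierarchy: Root -> A -> A1 and Root -> B.
toy_parents = {"A": ["Root"], "B": ["Root"], "A1": ["A"]}
sim_siblings = semantic_similarity("A", "B", toy_parents)
sim_parent_child = semantic_similarity("A1", "A", toy_parents)
```

In this toy hierarchy the parent-child pair scores higher than the sibling pair, which matches the intuition that diseases sharing more of their ancestry are more similar.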
MiRNA functional similarity
On the basis of the hypothesis that functionally similar miRNAs tend to correlate with phenotypically similar diseases and vice versa, miRNA functional similarity can be calculated according to the method in a previous study [41]. That is, miRNA functional similarity is derived from the semantic similarity between the sets of diseases related to each miRNA. For fairness, for the HMDD v2.0 dataset, we acquire the miRNA functional similarity directly from https://www.cuilab.cn/files/images/cuilab/misim.zip, as in many studies, whereas for the HMDD v3.2 dataset, we generate the functional similarity between miRNAs following [41]. In turn, we obtain the adjacency matrix \(G_m^{(1)}\) of the miRNA functional similarity network, where the similarity value between two miRNAs determines the edge weight between the two nodes.
GIP kernel similarity for diseases and miRNAs
Based on the assumption that similar diseases and miRNAs have analogous modes of interaction and non-interaction and vice versa [24], GIP kernel similarity can be used to measure the relationships between miRNAs and between diseases. miRNA GIP kernel similarity and disease GIP kernel similarity can be calculated by the approach in the previous study [42]. Similarly, based on the GIP kernel similarity of miRNAs and diseases, we can obtain the adjacency matrices \(G_m^{(2)}\) and \(G_d^{(2)}\) of their corresponding GIP similarity networks.
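As an illustration, the sketch below assumes the commonly used form of the GIP kernel, \(K(i, j) = \exp(-\gamma \Vert A_i - A_j \Vert ^2)\), with the bandwidth \(\gamma\) set to \(\gamma '\) divided by the mean squared norm of the interaction profiles. The value \(\gamma ' = 1\), the toy matrix and the function name `gip_kernel` are assumptions rather than settings confirmed by the text.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of `profiles`.

    Each row is the 0/1 interaction profile of one entity. The bandwidth
    is normalised by the mean squared profile norm (assumes that at least
    one profile is non-zero).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel

# Hypothetical toy association matrix: 3 miRNAs x 3 diseases.
toy_assoc = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
mirna_gip = gip_kernel(toy_assoc)                               # rows of A
disease_gip = gip_kernel([list(col) for col in zip(*toy_assoc)])  # columns of A
```

miRNA profiles are the rows of the association matrix and disease profiles are its columns, so the same function serves both views.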
Learning high-order proximity representations
As described in [43, 44], the edge weight between two nodes determines the first-order proximity between them, that is, the first-order proximity indicates the degree of similarity between two nodes. The high-order proximity between two nodes represents neighborhood similarity, meaning that two nodes are similar if they share similar neighbors. The high-order proximity includes the global structure information of the network, and the first-order proximity contains the local structure information of the network, which gives the direct neighbor information. Each similarity network view of miRNAs (diseases) gives first-order proximity information of miRNAs (diseases) from a different perspective. Next, we introduce how to obtain the high-order proximity of nodes according to the different similarity views.
Multi-order proximity representations by PPMI
Since only local structural information is contained in the similarity network, motivated by [45, 46], we adopt RWR [47] to capture the global topological information of different similarity network views. Specifically, at each step, the random walk continues with probability \(\alpha\) and returns to the initial node to restart with probability \(1-\alpha\). We denote the s-th similarity view of miRNA and the q-th similarity view of disease as \(G_m^{(s)}\) and \(G_d^{(q)}\), respectively. For example, for the view s of miRNA, RWR can be represented as the following iterative process:
where \(P_k^{(s)}\) denotes the transfer probability matrix after k steps in view s. \(P_0^{(s)}\) is an identity matrix, and \(\widehat{G}_m^{(s)}\) represents the one-step probability transition matrix obtained by row-wise normalization of the similarity weight matrix \(G_m^{(s)}\). After K steps, we obtain network structure information of view s at different orders of proximity through the probability transition matrices \(\left\{ P_1^{(s)}, P_2^{(s)}, \cdots , P_K^{(s)}\right\}\), which characterize the probability of co-occurrence of miRNA nodes on view s at different degrees.
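As a hedged illustration of this iteration, the sketch below assumes the standard RWR recurrence \(P_k^{(s)} = \alpha P_{k-1}^{(s)} \widehat{G}_m^{(s)} + (1-\alpha ) P_0^{(s)}\), which matches the definitions given here. The value of \(\alpha\), the toy similarity matrix and the helper names are illustrative assumptions; only K = 3 is taken from the experimental setup.

```python
def row_normalize(matrix):
    """Divide each row by its sum to get a one-step transition matrix."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]

def rwr_steps(similarity, alpha=0.9, K=3):
    """Return [P_1, ..., P_K] for one similarity view.

    alpha is the probability of continuing the walk; 0.9 is an arbitrary
    illustrative value, not a setting reported in the text.
    """
    n = len(similarity)
    g_hat = row_normalize(similarity)
    p0 = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    p, steps = p0, []
    for _ in range(K):
        walked = matmul(p, g_hat)
        p = [[alpha * walked[i][j] + (1 - alpha) * p0[i][j] for j in range(n)]
             for i in range(n)]
        steps.append(p)
    return steps

# Hypothetical 3-node similarity view.
toy_similarity = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 1.0]]
toy_steps = rwr_steps(toy_similarity)
```

Because both \(P_{k-1}^{(s)}\) and \(\widehat{G}_m^{(s)}\) are row-stochastic, every \(P_k^{(s)}\) produced this way keeps rows that sum to one.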
Next, according to the multi-order proximity of miRNAs on view s, we obtain multi-order representations of miRNA nodes by computing a shifted PPMI matrix [48]. For the k-th step structure proximity \(P_k^{(s)}\), the PPMI matrix is calculated as follows:
where \(k=1,2, \cdots , K\) and \(X_{k, i j}^{(s)}\) is the j-th feature of miRNA i at the k-th step in view s. According to the view s of miRNAs, we get the multi-order representations of miRNAs as follows:
Analogously, we can obtain multi-order feature representations of diseases on the q-th network view \(G_d^{(q)}\):
where \(Y_k^{(q)}\) is the k-th order representation of diseases obtained by applying RWR and PPMI to the normalized similarity weight matrix \(G_d^{(q)}\) of view q, \(k=1,2, \cdots , K\), and K denotes the total number of RWR transition steps.
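A minimal sketch of the PPMI step, assuming the common definition \(\max \left( \log \frac{P_{ij} \sum _{a,b} P_{ab}}{\sum _b P_{ib} \sum _a P_{aj}} - \log b, 0\right)\), where the shift b is a hypothetical parameter defaulting to 1 (no shift). The exact shift used by the authors is not specified here, so this is an illustration rather than their formula.

```python
import math

def ppmi(p, shift=1.0):
    """Shifted positive pointwise mutual information of a non-negative matrix.

    p[i][j] is treated as a co-occurrence weight between node i and node j.
    Entries with zero weight, or with non-positive shifted PMI, become 0.
    """
    total = sum(sum(row) for row in p)
    row_sums = [sum(row) for row in p]
    col_sums = [sum(col) for col in zip(*p)]
    features = []
    for i, row in enumerate(p):
        out_row = []
        for j, value in enumerate(row):
            if value > 0 and row_sums[i] > 0 and col_sums[j] > 0:
                pmi = math.log(value * total / (row_sums[i] * col_sums[j]))
                out_row.append(max(pmi - math.log(shift), 0.0))
            else:
                out_row.append(0.0)
        features.append(out_row)
    return features

# Hypothetical 2-node transition matrix standing in for one P_k.
toy_p = [[0.8, 0.2], [0.3, 0.7]]
toy_x = ppmi(toy_p)
```

Applying `ppmi` to each of the K transition matrices of a view gives K feature matrices for that view, one per order of proximity.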
As shown in Fig. 7a, by applying RWR and PPMI to preprocess the multiple views of miRNAs and diseases, we can get the representations of miRNAs and diseases from diverse perspectives, which contain the multi-order proximity information from different views. These features for miRNAs of S views and diseases of Q views can be expressed as:
Multi-order proximity fusion by CNN
For miRNAs and diseases, multiple feature matrices from different views can be regarded as multiple channels of an image. CNNs use convolutional filters to generate feature maps and have driven major breakthroughs in computer vision. Therefore, we use a CNN to further extract features and obtain high-order proximity representations of miRNAs and diseases. Given the miRNA channel embedding \(X_m=\left[ x_1, x_2, \cdots , x_{C_m^{in}}\right]\), the final high-order proximity representation \(\widehat{X}_m\) is calculated as follows:
where \(C_m^{in}=S \times K\), \(\otimes\) represents the convolution operator, and \(W_m{ }^t\) and \(b_m{ }^t\) are the t-th convolution filter and bias vector, respectively. \(output_t\) is the representation from the t-th output channel, where \(t=1,2, \cdots , C_{out}\) and \(C_{out}\) is the number of CNN filters. By stacking the representations from all output channels, the final high-order proximity representation of miRNA \(\hat{X}_m \in R^{n m * C_{out}}\) is obtained. Similarly, the disease high-order proximity representation \(\widehat{Y}_d\) can be obtained.
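The shape bookkeeping of this fusion step can be sketched as follows. The kernel shape is an assumption: each filter is taken to span all input channels and the full row width, so that one filter yields one value per node and stacking \(C_{out}\) filters gives an \(nm \times C_{out}\) matrix. No activation is applied, and the function name `fuse_channels` and the toy values are illustrative.

```python
def fuse_channels(channels, filters, biases):
    """Fuse C_in feature matrices (each n x n) into one n x C_out matrix.

    channels: list of C_in matrices, one per (view, order) pair.
    filters:  list of C_out kernels, each of shape C_in x n (assumed shape).
    biases:   list of C_out scalars.
    """
    n = len(channels[0])
    fused = [[0.0] * len(filters) for _ in range(n)]
    for t, (kernel, bias) in enumerate(zip(filters, biases)):
        for i in range(n):
            acc = bias
            for c, channel in enumerate(channels):
                # Slide the kernel over node i: weight row i of every channel.
                acc += sum(w * x for w, x in zip(kernel[c], channel[i]))
            fused[i][t] = acc
    return fused

# Hypothetical toy input: C_in = 2 channels over n = 2 nodes, C_out = 2 filters.
toy_channels = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
toy_filters = [[[1, 1], [1, 1]], [[1, 0], [0, 0]]]
toy_biases = [0.0, 0.5]
toy_fused = fuse_channels(toy_channels, toy_filters, toy_biases)
```

In PyTorch the same operation would typically be expressed with a 2D convolution layer whose kernel covers one full row; the pure-Python loop here only makes the index arithmetic explicit.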
In order to preserve the local and global structural features in the similarity network views, we combine the first-order and high-order representations of miRNA and disease, respectively, as their structure embeddings, which can be defined as:
where \(\Vert\) denotes the concatenation operation, and \(\tilde{X}_m\) and \(\tilde{Y}_d\) are respectively the embeddings of miRNA and disease obtained from similarity views.
Attention network with neural aggregation
GAT [38] is a powerful graph neural network with good performance in processing graph-structured data. GAT learns the embedding of the central node by assigning different weights to distinct neighbors, and aggregates the information of the neighbors to generate a useful representation of the central node. Therefore, inspired by [38, 49], to achieve the effective fusion of heterogeneous information, we design an attention network with neural aggregation to aggregate the structural features (that is, \(\tilde{X}_m\) and \(\tilde{Y}_d\), which include direct and indirect neighbor information) from nodes of different types based on the MDA network. Specifically, since different kinds of nodes have different feature spaces, we first use node-type transformation matrices to map them to the same feature space, which is expressed as follows:
where \(h^{(l)} \in R^{(n m+n d) * f_{tran}}\) denotes the transformed high-level features in the l-th attention head, and \(W_m^{(l)}\) and \(W_d^{(l)}\) denote the transformation matrices of miRNA nodes and disease nodes, respectively. \(f_{tran}\) is the feature dimension of miRNA and disease nodes after the transformation.
Furthermore, we use an attention mechanism to learn the importance of neighbor nodes of each node, and then fuse the representations of neighbor nodes according to the attention score to enhance the representation of the center node. For example, the attention score \(e_{i j}^{(l)}\) between miRNA i and disease j can be defined as:
where \(h_i^{(l)}\) and \(h_j^{(l)}\) are respectively the transformed features of miRNA i and disease j, and \({\text {LeakyReLU}}\) is the activation function. Then, we obtain the attention coefficient \(a_{i j}^{(l)}\) by applying the \({\text {softmax}}\) function to normalize the attention score, which is shown as follows:
where \(\mathcal {N}_i\) denotes all neighbors of miRNA i in the adjacency matrix \(G_A\). After obtaining the importance of each neighbor node to the central node i, we can obtain the heterogeneous neighbor representation of miRNA i by integrating neighbor features according to the attention coefficient:
where \(\sigma\) represents the nonlinear activation function. Similarly, we can get the heterogeneous neighbor representations of disease nodes.
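The three attention steps above (score, softmax normalization over \(\mathcal {N}_i\), weighted aggregation) can be sketched for a single head as follows. Several choices here are assumptions, not details confirmed by the text: the score uses the standard GAT [38] form \({\text {LeakyReLU}}(a^{\top}[h_i \Vert h_j])\) with a hypothetical attention vector `a`, \(\sigma\) is taken to be ELU, and \(G_A\) is built as a bipartite block matrix from a toy association matrix `A`.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def elu(x):
    # expm1 is applied only to the non-positive part to avoid overflow.
    return np.where(x > 0, x, np.expm1(np.minimum(x, 0.0)))

def attention_aggregate(h, adj, a, slope=0.2):
    """One attention head: returns (alpha, h_neigh).

    h:   (n, f) transformed node features.
    adj: (n, n) binary adjacency; adj[i, j] = 1 if j is a neighbor of i.
    a:   (2 * f,) attention vector (assumed, as in standard GAT).
    """
    f = h.shape[1]
    # a^T [h_i || h_j] splits into a term for node i plus a term for node j.
    e = leaky_relu((h @ a[:f])[:, None] + (h @ a[f:])[None, :], slope)
    e = np.where(adj > 0, e, -np.inf)            # softmax over neighbors only
    e_max = e.max(axis=1, keepdims=True)
    e_max = np.where(np.isfinite(e_max), e_max, 0.0)   # isolated nodes
    w = np.exp(e - e_max)                        # exp(-inf) = 0 for non-neighbors
    denom = w.sum(axis=1, keepdims=True)
    alpha = np.divide(w, denom, out=np.zeros_like(w), where=denom > 0)
    return alpha, elu(alpha @ h)

# Toy association matrix: 4 miRNAs x 3 diseases; the last miRNA has no known disease.
A = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0],
              [0, 0, 0]])
n_m, n_d = A.shape
G_A = np.block([[np.zeros((n_m, n_m)), A],
                [A.T, np.zeros((n_d, n_d))]])

rng = np.random.default_rng(0)
h = rng.normal(size=(n_m + n_d, 8))
a = rng.normal(size=16)
alpha, h_neigh = attention_aggregate(h, G_A, a)
```

Because \(G_A\) is bipartite, every neighbor of a miRNA is a disease and vice versa, so `h_neigh` carries only heterogeneous information; a node without any known association receives a zero neighbor representation in this sketch.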
Since \(h_{\mathcal {N}}^{(l)}\) only integrates the representations of heterogeneous neighbor nodes and neglects the representation of the node itself, we design a neural aggregator to integrate the node representation \(h^{(l)}\) and its heterogeneous neighbor representation \(h_{\mathcal {N}}^{(l)}\), which facilitates the information interaction between a node and its heterogeneous neighbors by using fully connected layers (FCLs). The enhanced feature \(Z^{(l)}\) is represented as follows:
where \(W_1^{(l)}\) and \(W_2^{(l)}\) are trainable weight matrices, and \(b_1^{(l)}\) and \(b_2^{(l)}\) are the corresponding trainable bias terms.
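To make the role of the neural aggregator concrete, the sketch below fuses a node's own features with its neighbor representation through two fully connected layers. The exact functional form is not reproduced in this text, so the form used here (concatenate \(h^{(l)}\) and \(h_{\mathcal {N}}^{(l)}\), then apply two stacked FCLs with ReLU) is purely an assumption, as are the layer sizes and the function name `neural_aggregator`.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def neural_aggregator(h, h_neigh, W1, b1, W2, b2):
    """Fuse node features with heterogeneous neighbor features.

    ASSUMED form (not taken from the paper):
        Z = ReLU(ReLU([h || h_neigh] W1 + b1) W2 + b2)
    h, h_neigh: (n, f); W1: (2f, f_hid); b1: (f_hid,);
    W2: (f_hid, f_out); b2: (f_out,).
    """
    hidden = relu(np.concatenate([h, h_neigh], axis=1) @ W1 + b1)
    return relu(hidden @ W2 + b2)

rng = np.random.default_rng(1)
n, f, f_hid, f_out = 7, 8, 8, 4                # toy sizes
h = rng.normal(size=(n, f))                    # node's own features
h_neigh = rng.normal(size=(n, f))              # neighbor representation
W1, b1 = rng.normal(size=(2 * f, f_hid)), np.zeros(f_hid)
W2, b2 = rng.normal(size=(f_hid, f_out)), np.zeros(f_out)

Z = neural_aggregator(h, h_neigh, W1, b1, W2, b2)
print(Z.shape)  # (7, 4)
```

Whatever the precise form, the point of the aggregator is that the node's own representation re-enters the computation, which plain neighbor averaging over a bipartite graph would otherwise discard.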
Finally, similar to standard GAT [38], we use the following multi-head mechanism to obtain the final embeddings of miRNAs and diseases:
where L is the number of independent attention heads.
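For orientation, the two head-combination rules used in standard GAT [38] are given below; which of the two PATMDA applies is not stated in this text and is left open here. The first concatenates the L head outputs, the second averages them. In either case, assuming miRNA nodes are ordered first, the first \(n_m\) rows of \(\tilde{Z}\) would form \(\tilde{Z}_m\) and the remaining \(n_d\) rows would form \(\tilde{Z}_d\).

\[
\tilde{Z} = \mathop{\Vert}_{l=1}^{L} Z^{(l)}
\qquad \text{or} \qquad
\tilde{Z} = \frac{1}{L} \sum_{l=1}^{L} Z^{(l)}
\]

Concatenation multiplies the embedding width by L, whereas averaging keeps it fixed; both leave the inner-product decoder applicable because miRNA and disease embeddings end up with the same dimension.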
Association prediction for miRNAs and diseases
According to the obtained feature representation \(\tilde{Z}_m\) for miRNAs and feature representation \(\tilde{Z}_d\) for diseases, we simply use their inner product to predict the probability scores of associations between miRNAs and diseases, which is described as follows:
where a larger value of \(\hat{A}_{i j}\) indicates that miRNA i is more likely to be associated with disease j, and a smaller value indicates that the association is less likely.
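A minimal sketch of such an inner-product decoder is shown below. Because the text speaks of probability scores, a sigmoid is applied to the inner products; that squashing function, the toy embeddings, and the name `decode_scores` are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def decode_scores(Z_m, Z_d):
    """Inner-product decoder: score[i, j] = sigmoid(<Z_m[i], Z_d[j]>).

    Z_m: (n_m, f) miRNA embeddings, Z_d: (n_d, f) disease embeddings.
    Returns an (n_m, n_d) matrix of association scores in (0, 1).
    """
    return sigmoid(Z_m @ Z_d.T)

Z_m = np.array([[1.0, 0.0],
                [0.0, 1.0]])          # 2 toy miRNA embeddings
Z_d = np.array([[2.0, 0.0],
                [0.0, -2.0],
                [0.0, 0.0]])          # 3 toy disease embeddings

A_hat = decode_scores(Z_m, Z_d)
print(A_hat.shape)   # (2, 3)
print(A_hat[0, 2])   # 0.5, since the inner product is 0
```

Aligned embeddings (miRNA 0 with disease 0) score above 0.5, opposed embeddings (miRNA 1 with disease 1) score below it, and orthogonal pairs sit exactly at 0.5.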
Finally, we optimize the parameters of the PATMDA model by minimizing the cross-entropy loss between the true labels and the predicted values, where the loss function is defined as follows:
where \(A_{i j}\) denotes the true label of the association between miRNA i and disease j. The Adam optimizer [50] is used to train the model.
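The cross-entropy objective can be illustrated with the short NumPy sketch below. Averaging over all miRNA-disease pairs and clipping predictions with `eps` are assumptions of this example; the paper's loss may instead sum the terms or restrict them to the sampled training pairs.

```python
import numpy as np

def bce_loss(A, A_hat, eps=1e-12):
    """Binary cross-entropy between true labels A and predicted scores A_hat.

    loss = -mean( A * log(A_hat) + (1 - A) * log(1 - A_hat) )
    Predictions are clipped to [eps, 1 - eps] to keep the logarithms finite.
    """
    A_hat = np.clip(A_hat, eps, 1.0 - eps)
    return -np.mean(A * np.log(A_hat) + (1 - A) * np.log(1 - A_hat))

A = np.array([[1, 0],
              [0, 1]])                       # toy ground-truth associations
confident = np.array([[0.9, 0.1],
                      [0.1, 0.9]])           # predictions close to the labels
uninformed = np.full((2, 2), 0.5)            # predictions carrying no information

print(bce_loss(A, confident) < bce_loss(A, uninformed))  # True
```

For the confident predictions every term equals \(-\log 0.9\), while the uninformed predictions give \(\log 2\), so the loss decreases as the decoder scores move toward the true labels.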
Availability of data and materials
The datasets that support the findings of this study are available in https://github.com/xxpaaa/PATMDA.
Abbreviations
miRNA: MicroRNA
PPMI: Positive pointwise mutual information
MDA: miRNA-disease association
RWR: Random walk with restart
GIP: Gaussian interaction profile
CNN: Convolutional neural network
GAT: Graph attention network
FCL: Fully connected layer
5CV: 5-fold cross-validation
AUC: Area under the receiver operating characteristic (ROC) curve
AUPR: Area under the precision/recall (P–R) curve
References
Ambros V. The functions of animal microRNAs. Nature. 2004;431(7006):350–5.
Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 2004;116(2):281–97.
Lee RC, Ambros V. An extensive class of small RNAs in Caenorhabditis elegans. Science. 2001;294(5543):862–4.
Liu B, Fang L, Liu F, Wang X, Chen J, Chou KC. Identification of real microRNA precursors with a pseudo structure status composition approach. PLoS ONE. 2015;10(3):0121501.
Meltzer PS. Small RNAs with big impacts. Nature. 2005;435(7043):745–6.
Chen X, Xie D, Zhao Q, You ZH. MicroRNAs and complex diseases: from experimental results to computational models. Brief Bioinform. 2019;20(2):515–39.
Michael MZ, O’Connor SM, van Holst Pellekaan NG, Young GP, James RJ. Reduced accumulation of specific microRNAs in colorectal neoplasia. Mol Cancer Res. 2003;1(12):882–91.
Yanaihara N, Caplen N, Bowman E, Seike M, Kumamoto K, Yi M, Stephens RM, Okamoto A, Yokota J, Tanaka T, et al. Unique microRNA molecular profiles in lung cancer diagnosis and prognosis. Cancer Cell. 2006;9(3):189–98.
Freeman WM, Walker SJ, Vrana KE. Quantitative RT-PCR: pitfalls and potential. Biotechniques. 1999;26(1):112–25.
Pall GS, Hamilton AJ. Improved northern blot method for enhanced detection of small RNA. Nat Protoc. 2008;3(6):1077–84.
Baskerville S, Bartel DP. Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes. RNA. 2005;11(3):241–7.
Jiang Q, Hao Y, Wang G, Juan L, Zhang T, Teng M, Liu Y, Wang Y. Prioritization of disease microRNAs through a human phenome-microRNAome network. BMC Syst Biol. 2010;4(1):1–9.
Xuan P, Han K, Guo M, Guo Y, Li J, Ding J, Liu Y, Dai Q, Li J, Teng Z, et al. Prediction of microRNAs associated with human diseases based on weighted k most similar neighbors. PLoS ONE. 2013;8(8):70204.
Chen H, Zhang Z. Prediction of associations between OMIM diseases and MicroRNAs by random walk on OMIM disease similarity network. Sci World J 2013;2013.
Shi H, Xu J, Zhang G, Xu L, Li C, Wang L, Zhao Z, Jiang W, Guo Z, Li X. Walking the interactome to identify human miRNA-disease associations through the functional link between miRNA targets and disease genes. BMC Syst Biol. 2013;7(1):1–12.
Chen X, Yan GY. Semi-supervised learning for potential human microRNA-disease associations inference. Sci Rep. 2014;4(1):1–10.
Chen M, Lu X, Liao B, Li Z, Cai L, Gu C. Uncover miRNA-disease association by exploiting global network similarity. PLoS ONE. 2016;11(12):0166509.
Chen X, Yan CC, Zhang X, You ZH, Deng L, Liu Y, Zhang Y, Dai Q. WBSMDA: within and between score for miRNA-disease association prediction. Sci Rep. 2016;6(1):1–9.
Ma Y, He T, Ge L, Zhang C, Jiang X. MiRNA-disease interaction prediction based on kernel neighborhood similarity and multi-network bidirectional propagation. BMC Med Genom. 2019;12(10):1–14.
Chen X, Huang L, Xie D, Zhao Q. EGBMMDA: extreme gradient boosting machine for miRNA-disease association prediction. Cell Death Dis. 2018;9(1):1–16.
Chen X, Wang CC, Yin J, You ZH. Novel human miRNA-disease association inference based on random forest. Mol Ther Nucleic Acids. 2018;13:568–79.
Zhang ZC, Zhang XF, Wu M, OuYang L, Zhao XM, Li XL. A graph regularized generalized matrix factorization model for predicting links in biomedical bipartite networks. Bioinformatics. 2020;36(11):3474–81.
Chen X, Sun LG, Zhao Y. NCMCMDA: miRNA-disease association prediction through neighborhood constraint matrix completion. Brief Bioinform. 2021;22(1):485–96.
Wang CC, Li TH, Huang L, Chen X. Prediction of potential miRNA-disease associations based on stacked autoencoder. Brief Bioinform. 2022;23(2):021.
Tang X, Luo J, Shen C, Lai Z. Multi-view multichannel attention graph convolutional network for miRNA-disease association prediction. Brief Bioinform. 2021;22(6):174.
Ding Y, Lei X, Liao B, Wu FX. Predicting miRNA-disease associations based on multi-view variational graph autoencoder with matrix factorization. IEEE J Biomed Health Inform. 2021;26(1):446–57.
Li J, Zhang S, Liu T, Ning C, Zhang Z, Zhou W. Neural inductive matrix completion with graph convolutional networks for miRNA-disease association prediction. Bioinformatics. 2020;36(8):2538–46.
Long Y, Luo J, Zhang Y, Xia Y. Predicting human microbe-disease associations via graph attention networks with inductive matrix completion. Brief Bioinform. 2021;22(3):146.
Li Z, Li J, Nie R, You ZH, Bao W. A graph autoencoder model for miRNA-disease associations prediction. Brief Bioinform. 2021;22(4).
Jiang L, Sun J, Wang Y, Ning Q, Luo N, Yin M. Identifying drug-target interactions via heterogeneous graph attention networks combined with cross-modal similarities. Brief Bioinform. 2022;23(2):016.
Lou Z, Cheng Z, Li H, Teng Z, Liu Y, Tian Z. Predicting miRNA-disease associations via learning multimodal networks and fusing mixed neighborhood information. Brief Bioinform. 2022.
Li Y, Qiu C, Tu J, Geng B, Yang J, Jiang T, Cui Q. HMDD v2.0: a database for experimentally supported human microRNA and disease associations. Nucleic Acids Res. 2014;42(D1):1070–4.
Huang Z, Shi J, Gao Y, Cui C, Zhang S, Li J, Zhou Y, Cui Q. HMDD v3.0: a database for experimentally supported human microRNA-disease associations. Nucleic Acids Res. 2019;47(D1):1013–7.
Barracchia EP, Pio G, D’Elia D, Ceci M. Prediction of new associations between ncRNAs and diseases exploiting multi-type hierarchical clustering. BMC Bioinform. 2020;21(1):1–24.
Chen X, Yin J, Qu J, Huang L. MDHGI: matrix decomposition and heterogeneous graph inference for miRNA-disease association prediction. PLoS Comput Biol. 2018;14(8):1006418.
Zhao Y, Chen X, Yin J. Adaptive boosting-based computational model for predicting potential miRNA-disease associations. Bioinformatics. 2019;35(22):4730–8.
Ji BY, You ZH, Wang Y, Li ZW, Wong L. DANE-MDA: predicting microRNA-disease associations via deep attributed network embedding. iScience. 2021;24(6):102455.
Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)
Yang Z, Wu L, Wang A, Tang W, Zhao Y, Zhao H, Teschendorff AE. dbDEMC 2.0: updated database of differentially expressed miRNAs in human cancers. Nucleic Acids Res. 2017;45(D1):812–8.
You ZH, Huang ZA, Zhu Z, Yan GY, Li ZW, Wen Z, Chen X. PBMDA: a novel and effective path-based computational model for miRNA-disease association prediction. PLoS Comput Biol. 2017;13(3):1005455.
Wang D, Wang J, Lu M, Song F, Cui Q. Inferring the human microRNA functional similarity and functional network based on microRNA-associated diseases. Bioinformatics. 2010;26(13):1644–50.
Van Laarhoven T, Nabuurs SB, Marchiori E. Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics. 2011;27(21):3036–43.
Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q. LINE: large-scale information network embedding. In: Proceedings of the 24th international conference on world wide web; 2015. p. 1067–77.
Cao S, Lu W, Xu Q. GraRep: learning graph representations with global structural information. In: Proceedings of the 24th ACM international on conference on information and knowledge management; 2015. p. 891–900.
Cao S, Lu W, Xu Q. Deep neural networks for learning graph representations. In: Proceedings of the AAAI conference on artificial intelligence, vol 30. 2016.
Dong K, Huang T, Zhou L, Wang L, Chen H. Deep attributed network embedding based on the PPMI. In: International conference on database systems for advanced applications. 2021. p. 251–66. Springer
Tong H, Faloutsos C, Pan JY. Fast random walk with restart and its applications. In: Sixth international conference on data mining (ICDM’06). 2006. p. 613–22. IEEE.
Bullinaria JA, Levy JP. Extracting semantic representations from word co-occurrence statistics: a computational study. Behav Res Methods. 2007;39(3):510–26.
Li Z, Zhong T, Huang D, You ZH, Nie R. Hierarchical graph attention network for miRNA-disease association prediction. Mol Ther. 2022;30(4):1775–86.
Kingma DP, Ba J. Adam: a method for stochastic optimization. 2014. arXiv preprint arXiv:1412.6980.
Acknowledgements
Not applicable.
Funding
This work was supported by the National Natural Science Foundation of China (No. 62072212), the Development Project of Jilin Province of China (No. 20220508125RC), National Key R&D Program (No. 2018YFC2001302), and the Jilin Provincial Key Laboratory of Big Data Intelligent Cognition (No. 20210504003GH).
Author information
Contributions
XX and YW conceived the prediction method, carried out experiments and the result analysis, and wrote the manuscript; KH and NS performed experiments and revised the paper. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Xie, X., Wang, Y., He, K. et al. Predicting miRNA-disease associations based on PPMI and attention network. BMC Bioinformatics 24, 113 (2023). https://doi.org/10.1186/s12859-023-05152-z