Skip to main content

DNI-MDCAP: improvement of causal MiRNA-disease association prediction based on deep network imputation



MiRNAs are involved in the occurrence and development of many diseases. Extensive literature studies have demonstrated that miRNA-disease associations are stratified and encompass ~ 20% causal associations. Computational models that predict causal miRNA-disease associations provide effective guidance in identifying novel interpretations of disease mechanisms and potential therapeutic targets. Although several predictive models for miRNA-disease associations exist, it is still challenging to discriminate causal miRNA-disease associations from non-causal ones. Hence, there is a pressing need to develop an efficient prediction model for causal miRNA-disease association prediction.


We developed DNI-MDCAP, an improved computational model that incorporated additional miRNA similarity metrics, deep graph embedding learning-based network imputation and semi-supervised learning framework. Through extensive predictive performance evaluation, including tenfold cross-validation and independent test, DNI-MDCAP showed excellent performance in identifying causal miRNA-disease associations, achieving an area under the receiver operating characteristic curve (AUROC) of 0.896 and 0.889, respectively. Regarding the challenge of discriminating causal miRNA-disease associations from non-causal ones, DNI-MDCAP exhibited superior predictive performance compared to existing models MDCAP and LE-MDCAP, reaching an AUROC of 0.870. Wilcoxon test also indicated significantly higher prediction scores for causal associations than for non-causal ones. Finally, the potential causal miRNA-disease associations predicted by DNI-MDCAP, exemplified by diabetic nephropathies and hsa-miR-193a, have been validated by recently published literature, further supporting the reliability of the prediction model.


DNI-MDCAP is a dedicated tool to specifically distinguish causal miRNA-disease associations with substantially improved accuracy. DNI-MDCAP is freely accessible at

Peer Review reports


MicroRNAs (MiRNAs) are small endogenous RNAs that play important gene-regulatory roles by interacting with the mRNAs of protein-coding genes to direct their post-transcriptional regulations [1, 2]. With the development of biological technologies, rapid accumulating research articles have demonstrated the involvement of miRNAs in various biological processes including but not limited to cell proliferation [3], metabolism [4], embryonic development [5], and diverse diseases like breast cancer [6], diabetic complications [7, 8] and heart failure [9]. Notably, the roles of miRNAs in diseases can be stratified into causal and non-causal ones, as they can either directly promote/inhibit disease progression [10], or serve as molecules that accompany the changes in disease status [11]. Causal miRNAs can be involved in pathogenic mechanisms and developments of multiple diseases such as Parkinson’s disease [12] and cardiovascular diseases [13, 14]. Also, a group of potentially pathogenic miRNAs have been identified in the human brain and central nervous system [15]. That is to say, the associations between miRNAs and diseases can be further categorized based on causality rather than simply considered binary. MiRNAs with causal associations to diseases can actively engage in diverse biological processes through targeting disease genes and other mechanisms. Once the regulation of a disease-causative miRNA is disrupted, the normal physiological regulation will be shifted to the pathological one, directly contributing to the onset and progression of the diseases [16]. Hence, causal miRNA-disease associations are of great prominence in investigating the molecular mechanisms of diseases and identifying miRNA candidates that can serve as novel targets for disease treatment.

Current experimental methods to determine miRNA-disease associations are often labor-intensive and time-consuming. Consequently, there is an urgent need to develop computational models that can more effectively predict the relation on a large scale. Exploiting various machine learning and graph-based methods, several previous studies have proposed computational models that can accurately identified general diseases-related miRNAs. For instance, Fu et al. [17] developed DeepMDA that extracted high-level features from similarity information using stacked autoencoders and then predicted miRNA-disease associations by adopting a 3-layer neural network. A sophisticated deep ensemble model proposed by Chen et al. [18] revealed potential miRNA-disease associations based on decision tree. Many other computational models, such as RWRMDA [19], NetCBI [20], similarly relied on complex graph algorithms to estimate the similarity links between miRNA and disease networks, thereby constructing models for miRNA-disease association prediction with the assumption that similar miRNAs tend to be associated with similar diseases. Liu et al. generated SMALF [21], learning latent miRNA and disease features through stacked autoencoders and utilizing XGBoost to predict miRNA-disease associations. Li et al. [22] advanced a new computational framework GATMDA to discover unknown miRNA-disease associations based on graph attention network with multi-source information, which effectively fuses linear and non-linear features. A novel method CFSAEMDA [23] captured the interactive features of miRNA and disease, applying stacked autoencoder and modified cascade forest model to complete the final prediction. It is not surprising that predicting disease-related miRNAs is an ongoing focus of research, but our previous benchmark study [24] has shown that most of the existing models were not fully capable of distinguishing causal miRNA-disease associations from non-causal associations. Causal inference has become an emerging and important topic in bioinformatics, like casual genetic association inference [25] and casual gene regulatory network inference [26]. To fill the gap of causal miRNA-disease association prediction, efforts to design dedicated computational models aimed at discerning causal miRNA-disease associations remain an urgent priority.

In the latest Human MicroRNA Disease Database (HMDD) v3.2 [27], Gao et al. [28] systematically annotated the miRNA-disease associations by manual reviewing of the literature, and finally 4294 causal miRNA-disease associations were labeled, which made it possible to train a computational model for causal miRNA-disease association prediction. Indeed, Gao et al. proposed the MiRNA-Disease Causal Association Predictor (MDCAP), which exploited class label propagation algorithm to predict potential miRNA disease causality. MDCAP showed a reliable predictive performance on distinguishing between miRNA-disease associations and unrelated miRNA-disease pairs (AUROC > 0.9). Nonetheless, the performance of MDCAP decreased significantly on the challenges in discriminating causal and non-causal miRNA-disease associations (AUROC = 0.695). To this ends, we have previously [29] introduced the Levenshtein-distance Enhanced MiRNA-Disease Causal Association Predictor (LE-MDCAP) that was built on Levenshtein distance estimation and matrix decomposition algorithm. Although this model demonstrated the ability to discriminate potential causal miRNAs-disease associations from non-causal ones (AUROC = 0.820), further improvements are still needed to improve the predictive performance. Wang et al. also developed DisiMiR [30], using network influence and miRNA conservation to predict causal miRNA for four specific diseases (breast cancer, Alzheimer’s disease, gastric cancer, and hepatocellular cancer). Nevertheless, additional efforts are necessary to establish a universally applicable predictive model for causal miRNA-disease associations.

In this study, we devised an improved prediction model for causal miRNA-disease association prediction, Deep Network Imputation-assisted MiRNA-Disease Causal Association Predictor (DNI-MDCAP), by the means of integrating multiple miRNA features and applying the deep graph embedding based-network imputation as well as graph semi-supervised learning algorithm. To demonstrate the effectiveness of our proposed approach, we conducted thorough evaluations through tenfold cross-validation and independent test, providing comprehensive measurements of our model’s performance. In addition, case studies were conducted to compare the prediction results with the latest experimental evidence that has not been recorded in the HMDD v3.2 database, further validating the reliability of our model.


Overview of DNI-MDCAP workflow

To improve causal miRNA-disease association prediction, we adopted an integrated approach that combined various miRNA features and leveraged the network imputation method based on the deep graph embedding learning model, which finally led to the development of Deep Network Imputation-assisted MiRNA-Disease Causal Association Predictor (DNI-MDCAP) for predicting potential associations.

The overall workflow of DNI-MDCAP is illustrated in Fig. 1. The workflow was consisted of several steps. Firstly, we obtained miRNA-related data from various databases, including sequence, transcription factor (TF), target gene, expression, pathway, and disease association information. According to the above miRNA-related data, we measured the similarity between any two miRNAs by Levenshitein distance, Tanimoto coefficient or Gaussian interaction profile kernel, and then integrated these miRNA similarity metrics into a final miRNA similarity matrix. The disease semantic similarity matrix and the known causal miRNA-disease association matrix were obtained from the hierarchical relationships between disease terms recorded in Medical Subject Headings (MeSH) database [31] and the known causality annotation from HMDD v3.2 [27, 28], respectively. Next, because current knowledge of causal miRNA-disease associations is still sparse, we employed the deep graph embedding learning-based link prediction algorithm to impute the missing links within the miRNA similarity and the causal miRNA-disease association networks. Lastly, we implemented the graph semi-supervised learning method to build models for each miRNA similarity matrix. The prediction scores from these models were then combined via a weighted sum approach to derive the final prediction results of DNI-MDCAP.

Fig. 1
figure 1

Workflow of DNI-MDCAP

Calculation of miRNA similarity

On accounting of our previous experience in establishing LE-MDCAP [29], Levenshtein distance was an informative method to calculate some of miRNA similarity metrics. Levenshtein distance is a measure of the degree of difference between two feature strings, also known as the edit distance. The similarity score \(MS ({m}_{1}, {m}_{2})\) of miRNA \({m}_{1}\) and \({m}_{2}\) based on Levenshtein distance can be calculated as Eq. (1), where \(LD{\prime}({m}_{1}, {m}_{2})\) indicates the minimum edit times to convert \({m}_{1}\) to \({m}_{2}\), and \(len\) represents the length of the miRNA feature string.


The advantage of utilizing the Levenshtein distance as a measure of miRNA similarity, derived from Eq. (2), lies in its ability to effectively compare features in different scales.

$$\begin{array}{c}0\le L{D}^{\mathrm{^{\prime}}}\left({m}_{1},{m}_{2}\right)\le {\text{len}}\left({m}_{1}\right)+{\text{len}}\left({m}_{2}\right)\end{array}$$

Notably, \(MS ({m}_{1}, {m}_{2}\)) should be in the range of 0.5 to 1, as in this study we only considered the unidirectional editing distance from \({m}_{1}\) to \({m}_{2}\) in this study. The higher the score, the more similar the two miRNAs are.

For sequence-based miRNA similarity, we collected sequence information from the miRBase v22.1 [32] and employed the Levenshtein distance to measure the similarity between any two miRNAs at three levels: miRNA precursors, mature miRNAs and miRNA seed sequences. Accordingly, three sequence based-miRNA similarity metrics were obtained, namely \({MS}_{SP}\), \({MS}_{SM}\), and \({MS}_{SS}\). The resulting sequence-based miRNA similarity matrix was calculated on the weighted sum of the above three metrics, where the weight sum was 1 and optimized with a step size of 0.05, with the final selected combination weight \({MS}_{S}\) = 0.80 \({MS}_{SP}\)+ 0.15 \({MS}_{SM}\)+ 0.05 \({MS}_{SS}\).

In terms of TF-based and target gene-based miRNA similarity metrics, we fetched human TF-miRNA regulation data from TransmiR v2.0 [33] database and the experimentally validated microRNA-target interactions data from miRTarBase v8.0 [34] database. For each miRNA-TF/target pair, a score of 1 was used to indicate the existence of a known regulation between miRNA and a TF/target gene, and 0 otherwise. Again, Levenshtein distance was used to calculate the similarity between any pair of miRNAs represented by the binary strings of TF and target gene regulations, which are denoted as \({MS}_{tf}\) and \({MS}_{gene}\), respectively.

As for expression-based miRNA similarity, we obtained miRNA expression data for 137 cell types by organizing the research results of Lorenzi [35], and normalized the expressions to eliminate the scaling differences. For pathway-based miRNA similarity, we downloaded the p-values of enrichment for miRNA target genes from the miRPathDB v2.0 [36] and retained pathways with at least three miRNAs with p-values of less than 0.05. Notably, the expression and pathway information are not binary but exhibits a continuous distribution. Therefore, we used the Tanimoto coefficient to calculate the similarity between miRNAs, denoted as \({MS}_{E}\) and \({MS}_{P}\). The similarity between miRNA \({m}_{i}\) and miRNA \({m}_{j}\) was calculated as Eq. (3):

$$\begin{array}{c}{S}_{ij}=\frac{{m}_{i}\cdot {m}_{j}}{{\Vert {m}_{i}\Vert }^{2}+{\Vert {m}_{j}\Vert }^{2}-{m}_{i}\cdot {m}_{j}}\end{array}$$

Unlike many other similarity measurement methods that rely solely on the magnitude of differences, the Tanimoto coefficient takes into account the distribution and overlap of values within the data, making it particularly suitable for capturing similarities in continuous data with different ranges and distributions. Moreover, we evaluated the influence of different miRNA similarity measurement methods on the predictive performance of the model and chose the similarity metrics mainly based on its performance in casual-versus-non-causal discrimination (Additional file 1: Table S1).

Finally, based on previous research [37, 38], we also constructed the commonly used Gaussian interaction profile kernel similarity matrix \(GM\) for miRNAs as a baseline method for measuring miRNA similarity.

Known causal miRNA-disease associations

The human causal miRNA-disease association dataset was downloaded directly from HMDD v3.2 database. To facilitate the comparison of performance between DNI-MDCAP and MDCAP, we used the same dataset, containing 4228 experimentally validated causal associations between 535 miRNAs and 302 diseases. We constructed a binary adjacency matrix \(MD\) of \(nm\times nd\) to better represent the causal relationship of miRNA and diseases, where \(nm\) and \(nd\) represent the number of miRNAs and diseases, respectively. Specifically, if miRNA \(m\left(i\right)\) was confirmed to have a causal association with disease \(d(i)\), the value of \(MD(i,j)\) was 1, and 0 otherwise.

Calculation of disease semantic similarity

We introduced the well-known Wang’s diseases semantic similarity [39] to construct a disease similarity matrix \(DS\), with the help of the semantic topological relationships between diseases recorded in the MeSH database. In MeSH, the topology of disease can be described as a directed acyclic graph (DAG), in other words, \({DAG}_{D}= (D, {T}_{D}, {E}_{D})\), where \({T}_{D}\) represents the node including disease \(D\) and its ancestral disease, \({E}_{D}\) represents the edges of all relationships of the \({DAG}_{D}\). The contribution of disease \(d\) to the semantic value of disease \(D\) can be represented by the Eq. (4), where \(\Delta\) is the semantic contribution factor and is usually set to 0.5.

$$\left\{\begin{array}{cc}{D}_{D}(D)=1& \text{if }d=D\\ {D}_{D}(d)=max\left\{{\Delta }^{*}{D}_{D}\left({d}^{\mathrm{^{\prime}}}\right)\mid {d}^{\mathrm{^{\prime}}}\in \text{ children of }d\right\}& \text{if }d\ne D\end{array}\right.$$

The semantic value \(DC\left(D\right)\) of the disease \(D\) can be obtained by summing all the contributions of the ancestral disease and disease \(D\), as Eq. (5).

$$\begin{array}{c}DC\left(D\right)=\sum_{d\in T\left(D\right)} {D}_{D}\left(d\right)\end{array}$$

Thus, the semantic similarity of diseases \({D}_{i}\) and \({D}_{j}\) was calculated as Eq. (6).

$$\begin{array}{c}{\text{DSS}}\left({D}_{i},{D}_{j}\right)=\frac{\sum_{d\in T\left({D}_{i}\right)\cap T\left({D}_{j}\right)} \left({D}_{{D}_{i}}\left(d\right)+{D}_{{D}_{j}}\left(d\right)\right)}{DC\left({D}_{i}\right)+DC\left({D}_{j}\right)}\end{array}$$

In the light of the above calculations, diseases that have more largely shared DAG structures will have higher semantic similarity scores.

Deep network imputation

We applied the Node2Vec algorithm, which is based on deep graph embedding learning and random walk [40] to predict uncharted associations in the miRNA similarity network and miRNA-disease association network to enhance and complete the connections between nodes in these two types of networks. Particularly, we transformed each miRNA similarity matrix into a binary adjacency matrix by truncating at a certain threshold \(\xi\). For each network, \(\xi\) is optimized according to the best AUROC when distinguishing causal miRNA-disease associations in cross-validation. The thresholds for \({MS}_{S}\), \({MS}_{E}\), \(GM\), \({MS}_{P}\), \({MS}_{gene}\), \({MS}_{tf}\) were optimized as 0.75, 0.91, 0.3, 0.60, 0.97, 0.95, respectively.

The objective function of deep network imputation is shown in Eq. (7), where \(f\) is the mapping function of node \(u\) mapped to an embedded vector, and \(N (u)\) is the set of nearest neighbors of nodes \(u\) sampled by the sampling strategy \(S\).

$$\begin{array}{c}\underset{f}{max} \sum_{u\in V} {\text{logPr}}\left({N}_{S}\left(u\right)\mid f\left(u\right)\right)\end{array}$$

Optimization procedure for the specific objective function was described in detail in the article by Mikolov et al. [41], and is not repeatedly described here. Finally, the learned deep graph embedding features were used to predict new links, thus reconstructing the re-linked miRNA similarity networks and known causal miRNA-disease association network. To effectively showcase the advantages of network imputation, we compared the model performance of LE-MDCAP and DNI-MDCAP with and without network imputation (Additional file 1: Table S2). For both model, network imputation was helpful for performance improvement.

Semi-supervised learning model

As mentioned above, we acquired the re-linked causal miRNA-disease association matrix \(MD\) and six miRNA similarity matrices, namely \({MS}_{S}\), \({MS}_{E}\), \(GM\), \({MS}_{P}\), \({MS}_{gene}\), \({MS}_{tf}\) by deep network imputation. By combining each miRNA similarity matrix MS with the causal miRNA-disease association matrix MD and the disease semantic similarity matrix \(DS\), the semi-supervised learning method proposed by Liang et al. [42] can be used to separately predict the causal association scores of miRNAs and diseases.

Suppose \(n\) and \(m\) represent the number of miRNAs and diseases in the dataset, respectively. For the miRNA space, given the causal miRNA-disease association matrix \(MD\) and the miRNA similarity matrix \(MS\), an incidence matrix \(Qm\) can be obtained to reflect the causal association probabilities between certain miRNAs and diseases. Then an objective function based on the norm and trace (Tr) was defined as Eq. (8), where \({q}_{m}^{i}\) and \({q}_{m}^{j}\) denote the i-th and j-th columns of \({Q}_{m}\). \({U}_{m}\) is a diagonal matrix in which the i-th diagonal element controls the effect of the initial association of \(MD\). Intuitively, the objective here is to ensure that similar miRNAs have similar causal disease associations while optimizing the consistency between the predicted causal miRNA-disease associations with the known ones.

$$\begin{array}{c}\underset{{Q}_{m}}{min} \sum_{i,j=1}^{n} M{S}_{ij}^{m}{\Vert {q}_{m}^{i}-{q}_{m}^{j}\Vert }_{2}+{\text{Tr}}{\left({Q}_{m}-MD\right)}^{T}{U}_{m}\left({Q}_{m}-MD\right)\end{array}$$

For the disease space, the objective function was defined as Eq. (9). Here \(Qd\) is a label matrix to be solved.

$$\begin{array}{c}\underset{{Q}_{d}}{min} \sum_{i,j=1}^{m} D{S}_{ij}^{d}{\Vert {q}_{d}^{i}-{q}_{d}^{j}\Vert }_{2}+{\text{Tr}}{\left({Q}_{d}-M{D}^{T}\right)}^{T}{U}_{d}\left({Q}_{d}-M{D}^{T}\right)\end{array}$$

These two parts of the objective function were solved by an iterative algorithm until convergence, which was described in details in in the original article by Liang et al. [42] and will not be repeated here. We also compared the matrix factorization with semi-supervised learning method to demonstrate their impact on model prediction powers (Additional file 1: Table S2). Briefly, even with the same miRNA similarity network, semi-supervised learning (adopted by DNI-MDCAP) showed better performance than matrix factorization (adopted by LE-MDCAP) in distinguishing between causal and non-causal miRNA-disease associations. Besides, semi-supervised learning seems more compatible to the network imputation, as a noticeable gap of performance can be observed for DNI-MDCAP with and without network imputation, while such gap is much smaller for LE-MDCAP.

The combined DNI-MDCAP prediction score

For each miRNA similarity matrix input \(MS\), the corresponding \({Q}_{m}\) and \({Q}_{d}\) obtained above were finally integrated to generate a predictive correlation score matrix \({MD\prime}\), as specified in Eq. (10).


Notably, in this process, only one miRNA similarity matrix will be considered at a time. Since six miRNA similarity matrices \({MS}_{S}\), \({MS}_{E}\), \(GM\), \({MS}_{P}\), \({MS}_{gene}\), \({MS}_{tf}\) were considered, this process was conducted six times with different input \(MS\) to obtain six prediction score matrices, that is, \({MD{\prime}}_{S}\), \({MD{\prime}}_{E}\), \({MD{\prime}}_{GM}\), \({MD{\prime}}_{P}\), \({MD{\prime}}_{gene}\), \({MD{\prime}}_{tf}\). Finally, the combined predictive score matrix \({MD{\prime}}_{combined}\) was obtained based on the weighted sum of the above six predictive scores. The wights were optimized with a step size of 0.05. The final combined predictive score matrix was calculated as \({MD{\prime}}_{combined}\) = 0.15 \({MD{\prime}}_{S}\) + 0.1 \({MD{\prime}}_{E}\) + 0.05 \({MD{\prime}}_{GM}\) + 0.25 \({MD{\prime}}_{P}\) + 0.2 \({MD{\prime}}_{gene}\) + 0.25 \({MD{\prime}}_{tf}\).

Model evaluation and online server construction

To assess the predictive accuracy of DNI-MDCAP, we conducted both independent test and ten-fold cross-validation. It is important to clarify that the independent test dataset was not used in feature selection and score integration weight optimization processes to avoid overfitting. Additionally, we compared DNI-MDCAP with previously established prediction models, MDCAP and LE-MDCAP, in discriminating causal and non-causal miRNA-disease associations. DNI-MDCAP, MDCAP and LE-MDCAP were trained and tested on the same training and test samples. We also compared DNI-MDCAP with recent models designed for general (not for causal) miRNA-disease association. In this performance comparison, all models were re-trained and tested on the filtered training and testing sets of DNI-MDCAP. More specifically, compared to the original training and testing sets, in order to ensure to a fair comparison, we removed all miRNAs and diseases that have not been considered by the previous models (among both positive and negative samples), and fixed the positive-to-negative ratio to 1:5 in the causal-versus-non-disease test.

We plotted the receiver operating characteristic (ROC) curve by calculating the true positive rate (TPR) and false positive rate (FPR) across various thresholds and measured the performance via area under the ROC curve (AUROC).

For the users’ convenience, DNI-MDCAP is accessible as an online server that was implemented by HTML + PHP + Apache framework.


Overall performance evaluation of DNI-MDCAP

We first assessed the ability of DNI-MDCAP to distinguish causal miRNA-disease associations (causal) from miRNA-disease pairs with no association (non-disease) using both ten-fold cross-validation and independent test. The evaluation was based on known causal miRNA-disease associations in the HMDD v3.2 database. We randomly selected approximately one-fifth of the known miRNA-disease causal associations as the independent test set (before network imputation), while the remaining four-fifths were used as the training set. Similarly, in each round of ten-fold cross-validation, the known miRNA-disease causal associations were divided into consistent proportions for the training and the test sets. To prevent data leakage, the model’s parameter selection was only based on the training samples. Then, AUROC was introduced as the measure of the predictive model’s performance. DNI-MDCAP achieved an AUROC value of 0.896 in the ten-fold cross-validation (Fig. 2a) and an AUROC value of 0.889 in the independent test (Fig. 2b). These results have demonstrated the competitive performance of our method in predicting potential causal miRNA-disease associations.

Fig. 2
figure 2

ROC curves performed by DNI-MDCAP: a ROC curve of tenfold cross-validation. b ROC curve of independent test

DNI-MDCAP accurately distinguishes causal and non-causal miRNA-disease associations

Discriminating causal miRNA-disease associations from non-causal ones is more challenging compared to distinguishing causal associations from unrelated miRNA-disease pairs. To validate the DNI-MDCAP model, we categorized all miRNA-disease pairs in the dataset into three groups: causal miRNA-disease associations (causal), non-causal miRNA-disease associations (non-causal), and unrelated miRNA-disease pairs (non-disease). To test the DNI-MDCAP’s ability to discern causal associations from non-causal ones, in this sub-section, we considered ‘causal’ as positive samples and ‘non-causal’ as negative samples for method evaluation. To ensure the fairness of the performance comparison, the above DNI-MDCAP prediction scores were directly applied to this dataset, and no re-training or re-optimization of the model was conducted during this evaluation. The results indicated that the previously published MDCAP model exhibited limited discriminative power (AUROC = 0.695), whereas our previous predictive model LE-MDCAP showed significant improvement (AUROC = 0.820). Notably, DNI-MDCAP outperformed these two existing models in distinguishing between causal and non-causal miRNA-disease associations, achieving an AUROC of 0.870. To further verify the usefulness of deep network imputation, we also performed an ablation test using the raw miRNA similarity network and miRNA-disease network instead of the computationally imputed ones. The results suggested that the AUROC of DNI-MDCAP decreased to 0.821 without network imputation (Fig. 3a). Besides, we assessed the statistical significance of the variation in prediction scores between the three miRNA-disease groups by performing the Wilcoxon rank-sum test. The analysis revealed a statistically highly significant distinction (p = 4.95e-247) between the predicted scores of the causal and non-causal associations (Fig. 3b). In summary, the above results collectively manifested that DNI-MDCAP was highly effective in distinguishing causal miRNA-disease associations from non-causal associations, exhibiting excellent predictive performance. Because our previous benchmarking was performed in 2019 [24], we also compared the efficacy of DNI-MDCAP with those of recently published miRNA-disease prediction models, including SMALF [21], GATMDA [22], and CFESMDA [23]. Several previous models show better AUROC in the casual-versus-non-disease test, but they only show weak to moderate prediction performance in the casual-versus-non-causal test (Additional file 1: Fig. S1a, b). We also used violin plots to depict the distribution of the prediction scores of different models between the causal, non-causal and non-disease groups (Additional file 1: Fig. S1c–f). The results suggest that DNI-MDCAP recognizes causal miRNA-disease associations with significantly higher prediction scores than non-causal associations, while the differences in prediction score between causal and non-causal associations are not so obvious for other previous models. Together, the performance comparison results suggest that, as an ad hoc model for causal miRNA-disease association prediction, DNI-MDCAP indeed showed superior performances in distinguishing causal miRNA-disease associations from non-causal and non-disease associations.

Fig. 3
figure 3

Improved predictive performance of DNI-MDCAP: a ROC curves of DNI-MDCAP (with or without imputation) and previous models in discriminating causal miRNA-disease associations from the non-causal associations. b Violin plots of DNI-MDCAP showing the distribution of model prediction scores in the causal, non-causal and non-disease groups

Case study of DNI-MDCAP prediction results with latest literature

We conducted case studies by examining the top predictions of DNI-MDCAP and validating these predictions with recent literature records that were not included in the HMDD v3.2 dataset. These new literature records were not included in the training or test dataset of DNI-MDCAP to ensure their independence. Therefore, they provided an additional chance to evaluate DNI-MDCAP’s performance, complementing the conventional AUROC evaluation. To start with, we investigated whether the predictions from DNI-MDCAP are helpful in identifying new causal miRNAs for a specific disease. Diabetic nephropathy is a common complication in diabetic patients, often leading to end-stage renal failure and posing severe health risks [43]. Understanding the molecular mechanisms underlying diabetic nephropathy is crucial for the development of effective therapeutic interventions to alleviate its symptoms. In this study, we utilized the prediction scores generated by DNI-MDCAP to identify causal miRNAs that may potentially involve in the pathogenic pathways of diabetic nephropathy. Table 1 indicates that all of the top five causal miRNA-diabetic nephropathies associations predicted by DNI-MDCAP were supported by recent literature evidence. Furthermore, seven out of the top ten prediction results were supported by the literature. Among all potentially causal miRNAs, hsa-mir-30c had the highest score of 0.3931. Notably, miR-30c-5p was reported to effectively inhibit epithelial-mesenchymal transition and kidney fibrosis, playing a pivotal role in preventing the progression of diabetic nephropathy [44]. Another study suggested that miR-155 promoted hyperglycemia-induced podocyte inflammation by targeting SIRT1, leading to impaired renal function and exacerbated renal pathological changes [45]. Similarly, miR-29a has been suggested to contribute to the pathogenesis of tubulointerstitial fibrosis by modulating the CB1R pathway, thereby accelerating the progression of diabetic nephropathy [46]. A publication in 2020 demonstrated that the ablation of miR-214 from renal proximal tubules prevented a decrease in ULK1 expression and autophagy impairment in diabetic kidneys, resulting in less renal hypertrophy and albuminuria [8]. Furthermore, miR-20a was indicated to target CXCL6 and modulate JAK/STAT3 signaling, exerting an inhibitory effect on diabetic nephropathy [47]. All these articles confirmed that the top causal miRNA-diabetic nephropathies associations predicted by DNI-MDCAP were practical and instructive.

Table 1 Top 10 causal miRNA-diabetic nephropathies associations predictions by DNI-MDCAP and their literature verification

In addition, we conducted another case study to evaluate if DNI-MDCAP could help identify potential causal associations involving a specific miRNA of interest. Previous studies have implicated hsa-mir-193a in several disease processes, including the progression of non-alcoholic fatty liver disease [48] and tumor cell metabolism [49] among others. Recognizing its relevance, we delved further into more causal relationships between hsa-mir-193a and diseases, as revealed by the top predictions from DNI-MDCAP (Table 2). Out of the top five predictions, four have been validated by recent literature evidence. Moreover, among the top ten prediction results, eight have been validation by the recent literature. The association with hepatocellular carcinoma (HCC) achieved the highest score of 0.6887 among all potential diseases. In HCC cells, miR-193a was regulated by Mig-6, leading to autophagy inhibition, which presents a potential therapeutic target for HCC treatment [50]. In colorectal cancer, miR-193a functioned through the miR-193a-5p/PIK3R3/AKT axis, playing a role in the initiation and progression of the disease [51]. Furthermore, a study in 2020 unveiled the involvement of miR-193a in a competitive endogenous RNA regulatory pathway in breast cancer, suggesting a novel strategy for breast cancer treatment [52]. In bladder cancer cells, miR-193a was identified as an upstream target of ZNFX1-AS1, promoting tumor cell proliferation, migration, and invasion [53]. These articles collectively verified the involvement of hsa-mir-193a in the onset and progression of diseases predicted by DNI-MDCAP, and supported the causality of the predicted associations.

Table 2 Top ten causal hsa-mir-193a-diseases predictions by DNI-MDCAP and their literature verification

DNI-MDCAP server

We have developed a user-friendly web server interface (Fig. 4) for DNI-MDCAP (available at to facilitate easy querying. Users can access prediction results by entering miRNA names or disease keywords, supporting both precise and fuzzy search modes. Furthermore, users have the flexibility to choose the ranking criteria for the prediction results, including miRNA rank (miRNA), disease rank (Disease), or score rank (Score, default), enabling convenient identification and prioritization of the most likely causal miRNAs associated with specific diseases.

Fig. 4
figure 4

The query interface and sample result of DNI-MDCAP server


In recent years, the field of bioinformatics has witnessed a surge in studies exploring the connections between miRNAs and diseases. Many of the identified miRNAs are found to undergo changes in expression or localization as the body transitions from a physiological to a pathological state [54, 55]. These miRNAs are often observed in a passive role during disease progression, and their association with diseases are referred as non-causal miRNA-disease associations. Despite not being directly involved in the onset and progression of diseases, these miRNAs are widely used as biomarkers in clinical settings for purposes such as increasing diagnostic sensitivity, evaluating treatment response, and predicting prognosis [56]. However, non-causal miRNA-disease associations do not offer provide insight into the underlying mechanisms of the disease, nor do they serve as viable therapeutic targets. It is crucial to identify causal miRNAs that directly contribute to the development of diseases and are actively involved in pathogenesis. To address the need for the identification of disease causal miRNAs, we here proposed the prediction method DNI-MDCAP, which enriched existing models for predicting the causal miRNA-disease associations. Notably, DNI-MDCAP exhibited significant advantages in distinguishing causal miRNA-disease from non-causal ones, outperforming MDCAP and our previously proposed LE-MDCAP. The results of the case studies provided supplemental confirmation of the predictive reliability of DNI-MDCAP. Ablation experiment also suggested both the extended miRNA similarity metrics, network imputation and semi-supervised learning substantially contribute to the performance of DNI-MDCAP. Another advantage of DNI-MDCAP is that it performs causal-versus-non-disease and causal-versus-non-causal discrimination using the same model, avoid complicated model retraining and design. In summary, DNI-MDCAP has displayed dependable predictive performance in distinguishing causal and non-causal miRNA disease associations, enabling optimal selection of causal miRNA-disease associations.

Nonetheless, there is still room for improvement in DNI-MDCAP. One practical limitation of DNI-MDCAP is its limited ability to predict diseases, as it relies on known causal miRNA-disease association datasets. As a result, it may not be applicable to novel diseases for which there are few even no known causally associated miRNAs. Nevertheless, as more disease-causal miRNA annotation data become available, it can be anticipated that the predictive performance and disease coverage of DNI-MDCAP will be improved in future studies. Secondly, in the ten-fold cross-validation and independent test, MDCAP predicted an AUROC of 0.928 and 0.925 for miRNA-disease association, respectively, which was superior to DNI-MDCAP (AUROC of 0.896 and 0.889, respectively). As a consequence, the algorithm model needs further upgrade in the future to better balance the predictive performance of causal-versus-non-disease and causal-versus-non-causal discrimination. An alternative solution should be applying distinct computational frameworks for these two tasks to further improve the performances in the future, in order to optimize the performance of each task, respectively. Integrating miRNA and disease feature dimensions through deep learning may also contribute to more accurate similarity measurements, enhancing the model’s predictive performance. Finally, causal miRNAs show bidirectional impacts on diseases, i.e., either promoting or inhibiting disease progression. Therefore, we believe that in future predictions of miRNA-disease causality, it would be beneficial to consider the prediction of causal association categories (i.e., disease promoting or inhibiting). This addition would help identify the precise direction for intervening in disease treatment targets, leading to more accurate and effective therapeutic interventions.


DNI-MDCAP is a computational model based on diverse miRNA similarity metrics, deep graph embedding assisted-network imputation and semi-supervised learning algorithm, designed to predict potential causal miRNA-disease associations. It exhibited reliable predictive performance, effectively distinguishing between causal and non-causal miRNA-disease associations as well as causal and unrelated ones. The accuracy of its predictions has been validated by recently published literature. This advance is helpful to refine our undemanding of causality in miRNA-disease associations and provides meaningful clues for potential causal associations between miRNAs and diseases. Consequently, it provides useful guidance for exploring novel targets for disease treatment.

Availability and requirements

  • Project name: DNI-MDCAP

  • Project home page:

  • Operating system: Platform independent

  • Programming language: Python, R

  • Other requirements: None

  • License: GNU GPL

  • Any restrictions to use by non-academics: None.

Availability of data and materials

DNI-MDCAP web server is freely available at The data and code used in the current study are available at:





Human MiRNA Disease Database


MiRNA-Disease Causal Association Predictor


Levenshtein-distance Enhanced MiRNA-Disease Causal Association Predictor


Deep Network Imputation-assisted MiRNA-Disease Causal Association Predictor


Transcription factor


Directed acyclic graph


Receiver operating characteristic


True positive rate


False positive rate


Area under the receiver operating characteristic curve


  1. Bartel DP. MicroRNAs: target recognition and regulatory functions. Cell. 2009;136(2):215–33.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  2. Lu TX, Rothenberg ME. MicroRNA. J Allergy Clin Immunol. 2018;141(4):1202–7.

    Article  CAS  PubMed  Google Scholar 

  3. Peng Y, Chen FF, Ge J, Zhu JY, Shi XE, Li X, Yu TY, Chu GY, Yang GS. miR-429 inhibits differentiation and promotes proliferation in porcine preadipocytes. Int J Mol Sci. 2016;17(12):2047.

    Article  PubMed  PubMed Central  Google Scholar 

  4. Fan L, Lai R, Ma N, Dong Y, Li Y, Wu Q, Qiao J, Lu H, Gong L, Tao Z, et al. miR-552-3p modulates transcriptional activities of FXR and LXR to ameliorate hepatic glycolipid metabolism disorder. J Hepatol. 2021;74(1):8–19.

    Article  CAS  PubMed  Google Scholar 

  5. Guo FH, Guan YN, Guo JJ, Zhang LJ, Qiu JJ, Ji Y, Chen AF, Jing Q. Single-Cell transcriptome analysis reveals embryonic endothelial heterogeneity at spatiotemporal level and multifunctions of microRNA-126 in mice. Arterioscler Thromb Vasc Biol. 2022;42(3):326–42.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  6. Zhao D, Wu K, Sharma S, Xing F, Wu SY, Tyagi A, Deshpande R, Singh R, Wabitsch M, Mo YY, et al. Exosomal miR-1304-3p promotes breast cancer progression in African Americans by activating cancer-associated adipocytes. Nat Commun. 2022;13(1):7734.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. Zhang Y, Cai Y, Zhang H, Zhang J, Zeng Y, Fan C, Zou S, Wu C, Fang S, Li P, et al. Brown adipose tissue transplantation ameliorates diabetic nephropathy through the miR-30b pathway by targeting Runx1. Metabolism. 2021;125:154916.

    Article  CAS  PubMed  Google Scholar 

  8. Ma Z, Li L, Livingston MJ, Zhang D, Mi Q, Zhang M, Ding HF, Huo Y, Mei C, Dong Z. p53/microRNA-214/ULK1 axis impairs renal tubular autophagy in diabetic kidney disease. J Clin Invest. 2020;130(9):5011–26.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Zhang X, Ji R, Liao X, Castillero E, Kennel PJ, Brunjes DL, Franz M, Mobius-Winkler S, Drosatos K, George I, et al. MicroRNA-195 regulates metabolism in failing myocardium via alterations in Sirtuin 3 expression and mitochondrial protein acetylation. Circulation. 2018;137(19):2052–67.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  10. Kandettu A, Radhakrishnan R, Chakrabarty S, Sriharikrishnaa S, Kabekkodu SP. The emerging role of miRNA clusters in breast cancer progression. Biochim Biophys Acta Rev Cancer. 2020;1874(2):188413.

    Article  CAS  PubMed  Google Scholar 

  11. Zhao W, Zhang R, Zang CY, Zhang LF, Zhao R, Li QC, Yang ZJ, Feng Z, Zhang W, Cui RT. Exosome derived from mesenchymal stem cells alleviates pathological scars by inhibiting the proliferation, migration and protein expression of fibroblasts via delivering miR-138-5p to target SIRT1. Int J Nanomed. 2022;17:4023–38.

    Article  CAS  Google Scholar 

  12. Uppala SN, Tryphena KP, Naren P, Srivastava S, Singh SB, Khatri DK. Involvement of miRNA on epigenetics landscape of Parkinson’s disease: from pathogenesis to therapeutics. Mech Ageing Dev. 2023;213:111826.

    Article  CAS  PubMed  Google Scholar 

  13. Arroyo AB, Aguila S, Fernandez-Perez MP, Reyes-Garcia AML, Reguilon-Gallego L, Zapata-Martinez L, Vicente V, Martinez C, Gonzalez-Conejero R. miR-146a in cardiovascular diseases and sepsis: an additional burden in the inflammatory balance? Thromb Haemost. 2021;121(9):1138–50.

    Article  PubMed  Google Scholar 

  14. Zhang MW, Shen YJ, Shi J, Yu JG. MiR-223-3p in cardiovascular diseases: a biomarker and potential therapeutic target. Front Cardiovasc Med. 2020;7:610561.

    Article  CAS  PubMed  Google Scholar 

  15. Pogue AI, Lukiw WJ. microRNA-146a-5p, neurotropic viral infection and prion disease (PrD). Int J Mol Sci. 2021;22:17.

    Article  Google Scholar 

  16. Mori MA, Ludwig RG, Garcia-Martin R, Brandao BB, Kahn CR. Extracellular miRNAs: from biomarkers to mediators of physiology and disease. Cell Metab. 2019;30(4):656–73.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Fu L, Peng Q. A deep ensemble model to predict miRNA-disease association. Sci Rep. 2017;7(1):14482.

    Article  PubMed  PubMed Central  Google Scholar 

  18. Chen X, Zhu CC, Yin J. Ensemble of decision tree reveals potential miRNA-disease associations. PLoS Comput Biol. 2019;15(7):e1007209.

    Article  PubMed  PubMed Central  Google Scholar 

  19. Chen X, Liu MX, Yan GY. RWRMDA: predicting novel human microRNA-disease associations. Mol Biosyst. 2012;8(10):2792–8.

    Article  CAS  PubMed  Google Scholar 

  20. Chen H, Zhang Z. Similarity-based methods for potential human microRNA-disease association prediction. BMC Med Genom. 2013;6:12.

    Article  CAS  Google Scholar 

  21. Liu D, Huang Y, Nie W, Zhang J, Deng L. SMALF: miRNA-disease associations prediction based on stacked autoencoder and XGBoost. BMC Bioinf. 2021;22(1):219.

    Article  CAS  Google Scholar 

  22. Li G, Fang T, Zhang Y, Liang C, Xiao Q, Luo J. Predicting miRNA-disease associations based on graph attention network with multi-source information. BMC Bioinf. 2022;23(1):244.

    Article  CAS  Google Scholar 

  23. Hu X, Yin Z, Zeng Z, Peng Y. Prediction of miRNA-disease associations by cascade forest model based on stacked autoencoder. Molecules. 2023;28:13.

    Article  Google Scholar 

  24. Huang Z, Liu L, Gao Y, Shi J, Cui Q, Li J, Zhou Y. Benchmark of computational methods for predicting microRNA-disease associations. Genome Biol. 2019;20(1):202.

    Article  PubMed  PubMed Central  Google Scholar 

  25. Xue H, Shen X, Pan W. Causal inference in transcriptome-wide association studies with invalid instruments and GWAS summary data. J Am Stat Assoc. 2023;118(543):1525–37.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  26. Belyaeva A, Squires C, Uhler C. DCI: learning causal differences between gene regulatory networks. Bioinformatics. 2021;37(18):3067–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  27. Huang Z, Shi J, Gao Y, Cui C, Zhang S, Li J, Zhou Y, Cui Q. HMDD v3.0: a database for experimentally supported human microRNA-disease associations. Nucleic Acids Res. 2019;47(D1):D1013–7.

    Article  CAS  PubMed  Google Scholar 

  28. Gao Y, Jia K, Shi J, Zhou Y, Cui Q. A computational model to predict the causal miRNAs for diseases. Front Genet. 2019;10:935.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  29. Huang Z, Han Y, Liu L, Cui Q, Zhou Y. LE-MDCAP: a computational model to prioritize causal miRNA-disease associations. Int J Mol Sci. 2021;22(24):13607.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Wang KR, McGeachie MJ. DisiMiR: predicting pathogenic miRNAs using network influence and miRNA conservation. Noncod RNA. 2022;8(4):45.

    Google Scholar 

  31. Medical Subject Headings database []

  32. Kozomara A, Birgaoanu M, Griffiths-Jones S. miRBase: from microRNA sequences to function. Nucleic Acids Res. 2019;47(D1):D155–62.

    Article  CAS  PubMed  Google Scholar 

  33. Tong Z, Cui Q, Wang J, Zhou Y. TransmiR v20: an updated transcription factor-microRNA regulation database. Nucleic Acids Res. 2019;47(D1):D253–8.

    Article  CAS  PubMed  Google Scholar 

  34. Huang HY, Lin YC, Li J, Huang KY, Shrestha S, Hong HC, Tang Y, Chen YG, Jin CN, Yu Y, et al. miRTarBase 2020: updates to the experimentally validated microRNA-target interaction database. Nucleic Acids Res. 2020;48(D1):D148–54.

    CAS  PubMed  Google Scholar 

  35. Lorenzi L, Chiu HS, Cobos FA, Gross S, Volders PJ, Cannoodt R, Nuytens J, Vanderheyden K, Anckaert J, Lefever S, et al. The RNA Atlas expands the catalog of human non-coding RNAs. Nat Biotechnol. 2021;39(11):1453–65.

    Article  CAS  PubMed  Google Scholar 

  36. Kehl T, Kern F, Backes C, Fehlmann T, Stockel D, Meese E, Lenhof HP, Keller A. miRPathDB 2.0: a novel release of the miRNA Pathway Dictionary Database. Nucleic Acids Res. 2020;48(D1):D142–7.

    Article  CAS  PubMed  Google Scholar 

  37. Lu M, Zhang Q, Deng M, Miao J, Guo Y, Gao W, Cui Q. An analysis of human microRNA and disease associations. PLoS ONE. 2008;3(10):e3420.

    Article  PubMed  PubMed Central  Google Scholar 

  38. van Laarhoven T, Nabuurs SB, Marchiori E. Gaussian interaction profile kernels for predicting drug-target interaction. Bioinformatics. 2011;27(21):3036–43.

    Article  PubMed  Google Scholar 

  39. Wang D, Wang J, Lu M, Song F, Cui Q. Inferring the human microRNA functional similarity and functional network based on microRNA-associated diseases. Bioinformatics. 2010;26(13):1644–50.

    Article  CAS  PubMed  Google Scholar 

  40. Wu XB, Zhou Y. GE-Impute: graph embedding-based imputation for single-cell RNA-seq data. Brief Bioinform. 2022;23(5):bbac313.

    Article  PubMed  Google Scholar 

  41. Mikolov T SI, Chen K, Corrado G, Dean J.: Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th international conference on neural information processing systems. Lake Tahoe, Nevada: Curran Associates Inc; 2013. p. 3111–9.

  42. Liang C, Yu S, Wong KC, Luo J. A novel semi-supervised model for miRNA-disease association prediction based on [Formula: see text]-norm graph. J Transl Med. 2018;16(1):357.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  43. Thomas MC, Brownlee M, Susztak K, Sharma K, Jandeleit-Dahm KA, Zoungas S, Rossing P, Groop PH, Cooper ME. Diabetic kidney disease. Nat Rev Dis Primers. 2015;1:15018.

    Article  PubMed  PubMed Central  Google Scholar 

  44. Gao BH, Wu H, Wang X, Ji LL, Chen C. MiR-30c-5p inhibits high glucose-induced EMT and renal fibrogenesis by down-regulation of JAK1 in diabetic nephropathy. Eur Rev Med Pharmacol Sci. 2020;24(3):1338–49.

    PubMed  Google Scholar 

  45. Wang X, Gao Y, Yi W, Qiao Y, Hu H, Wang Y, Hu Y, Wu S, Sun H, Zhang T. Inhibition of miRNA-155 alleviates high glucose-induced podocyte inflammation by targeting SIRT1 in diabetic mice. J Diabetes Res. 2021;2021:5597394.

    Article  PubMed  PubMed Central  Google Scholar 

  46. Tung CW, Ho C, Hsu YC, Huang SC, Shih YH, Lin CL. MicroRNA-29a attenuates diabetic glomerular injury through modulating cannabinoid receptor 1 signaling. Molecules. 2019;24(2):264.

    Article  PubMed  PubMed Central  Google Scholar 

  47. Wang SZ, Zhang YL, Shi HB. Potential repressive impact of microRNA-20a on renal tubular damage in diabetic kidney disease by targeting C-X-C motif chemokine ligand 6. Arch Med Res. 2021;52(1):58–68.

    Article  CAS  PubMed  Google Scholar 

  48. Johnson K, Leary PJ, Govaere O, Barter MJ, Charlton SH, Cockell SJ, Tiniakos D, Zatorska M, Bedossa P, Brosnan MJ, et al. Increased serum miR-193a-5p during non-alcoholic fatty liver disease progression: diagnostic and mechanistic relevance. JHEP Rep. 2022;4(2):100409.

    Article  PubMed  Google Scholar 

  49. Asl ER, Amini M, Najafi S, Mansoori B, Mokhtarzadeh A, Mohammadi A, Lotfinejad P, Bagheri M, Shirjang S, Lotfi Z, et al. Interplay between MAPK/ERK signaling pathway and MicroRNAs: a crucial mechanism regulating cancer cell metabolism and tumor progression. Life Sci. 2021;278:119499.

    Article  CAS  PubMed  Google Scholar 

  50. Qu L, Tian Y, Hong D, Wang F, Li Z. Mig-6 inhibits autophagy in HCC cell lines by modulating miR-193a-3p. Int J Med Sci. 2022;19(2):338–51.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  51. Xu H, Liu Y, Cheng P, Wang C, Liu Y, Zhou W, Xu Y, Ji G. CircRNA_0000392 promotes colorectal cancer progression through the miR-193a-5p/PIK3R3/AKT axis. J Exp Clin Cancer Res. 2020;39(1):283.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  52. Li J, Zeng T, Li W, Wu H, Sun C, Yang F, Yang M, Fu Z, Yin Y. Long non-coding RNA SNHG1 activates HOXA1 expression via sponging miR-193a-5p in breast cancer progression. Aging (Albany NY). 2020;12(11):10223–34.

    Article  CAS  PubMed  Google Scholar 

  53. Wu JP, Zhang GY, Sun XZ. LncRNA ZNFX1-AS1 targeting miR-193a-3p/SDC1 regulates cell proliferation, migration and invasion of bladder cancer cells. Eur Rev Med Pharmacol Sci. 2020;24(9):4719–28.

    PubMed  Google Scholar 

  54. Kai K, Dittmar RL, Sen S. Secretory microRNAs as biomarkers of cancer. Semin Cell Dev Biol. 2018;78:22–36.

    Article  CAS  PubMed  Google Scholar 

  55. Wiedrick JT, Phillips JI, Lusardi TA, McFarland TJ, Lind B, Sandau US, Harrington CA, Lapidus JA, Galasko DR, Quinn JF, et al. Validation of MicroRNA biomarkers for Alzheimer’s Disease in human cerebrospinal fluid. J Alzheimers Dis. 2019;67(3):875–91.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  56. Zhang HD, Jiang LH, Sun DW, Hou JC, Ji ZL. CircRNA: a novel type of biomarker for cancer. Breast Cancer Tokyo. 2018;25(1):1–7.

    Article  Google Scholar 

Download references


We thank Dr. Yuanxu Gao and Dr. Chunmei Cui for their assistance in dataset preparation.


This study was supported by the National Key Research and Development Program of China (2021YFF1201201 to Y.Z.) and National Natural Science Foundation of China (32222020 to Y.Z., 62072154 to J.L.).

Author information

Authors and Affiliations



YZ conceptualized the study; YZ, YH, QZ and JL designed the methodology; YH and QZ performed the analysis; YH, QZ and LL established the online software; QZ and YH drafted the manuscript; YZ and JL revised the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yuan Zhou.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Fig. S1.

Comparison of predictive performance between DNI-MDCAP and other previous models: All models were re-trained and tested on the filtered training and testing sets of DNI-MDCAP. More specifically, compared to the original training and testing sets, in order to ensure to a fair comparison, we removed all miRNAs and diseases that have not been considered by the previous models (among both positive and negative samples), and fixed the positive-to-negative ratio to 1:5 in the causal-versus-non-disease test. It is also noteworthy that the previous models were designed for general miRNA-disease association prediction without a specification of causality. a ROC curves of DNI-MDCAP and the previous models in discriminating causal miRNA-disease associations from the non-causal associations. b ROC curves of DNI-MDCAP and the previous models in discriminating causal miRNA-disease associations from the non-disease associations. cf Violin plots showing the distribution of prediction scores of different models, between the causal, non-causal and non-disease groups. Table S1. Performance comparison using different miRNA similarity metrics. Table S2. Ablation experiments comparing the components in the computational frameworks of LE-MDCAP and DNI-MDCAP.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Han, Y., Zhou, Q., Liu, L. et al. DNI-MDCAP: improvement of causal MiRNA-disease association prediction based on deep network imputation. BMC Bioinformatics 25, 22 (2024).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: