### Known disease-lncRNA associations

Because the number of lncRNA-disease associations is limited and many heterogeneous biological datasets have been constructed, we collected 8842 known disease-lncRNA associations from the MNDR dataset (http://www.bioinformatics.ac.cn/mndr/index.html) and 2934 known disease-lncRNA associations from the LncRNADisease dataset (http://www.cuilab.cn/lncrnadisease). Since the disease names in the LncRNADisease database differ from those in the MNDR dataset, we mapped the diseases in these two disease-lncRNA association datasets to their MeSH descriptors. After eliminating diseases without any MeSH descriptors, merging the diseases with the same MeSH descriptors and removing the lncRNAs that were not present in the lncRNA-miRNA dataset (*DS*_{4}) used in this paper, 583 known lncRNA-disease associations (*DS*_{1}) were obtained from the LncRNADisease dataset (see Additional file 1), and 702 known lncRNA-disease associations (*DS*_{2}) were obtained from the MNDR dataset (see Additional file 2). Furthermore, after integrating the *DS*_{1} and *DS*_{2} datasets and removing the duplicate associations, we obtained the *DS*_{3} dataset, which included 1073 disease-lncRNA associations (see Additional file 3).

### Known lncRNA-miRNA associations

To construct the lncRNA-miRNA network, we downloaded the lncRNA-miRNA association dataset from the starBase v2.0 database (http://starbase.sysu.edu.cn/) on February 2, 2017; this database provided the most comprehensive experimentally confirmed lncRNA-miRNA interactions based on large-scale CLIP-Seq data. After data pre-processing (including the elimination of duplicate values, erroneous data, and disorganized data), removing the lncRNAs that did not exist in the *DS*_{3} dataset, and merging the miRNA copies that produced the same mature miRNA, we finally obtained 1883 lncRNA-miRNA associations (*DS*_{4}) (see Additional file 4).

### Known disease-miRNA associations

To validate the performance of DCSMDA, the known human miRNA-disease associations were downloaded from the latest version of the HMDD database, which is considered the gold-standard dataset. After eliminating the duplicate associations and the miRNA-disease associations involving diseases or miRNAs not contained in *DS*_{3} or *DS*_{4}, we finally obtained 3252 high-quality miRNA-disease associations (*DS*_{5}) (see Additional file 5).

### Construction of the disease-lncRNA-miRNA interaction network

To clearly demonstrate the process of constructing the disease-lncRNA-miRNA interaction network, we use the disease-lncRNA dataset *DS*_{3} and the lncRNA-miRNA dataset *DS*_{4} as examples. We defined *L* to represent all the different lncRNA terms in *DS*_{3} and *DS*_{4} and then constructed the disease-lncRNA-miRNA interactive network based on *DS*_{3} and *DS*_{4} according to the following 3 steps:

Step 1 (Construction of the disease-lncRNA network): Let *D* and *L* be the numbers of different diseases and lncRNAs obtained from *DS*_{3}, respectively. *S*_{D} = {*d*_{1}, *d*_{2}, ..., *d*_{D}} represents the set of all *D* different diseases in *DS*_{3}, and *S*_{L} = {*l*_{1}, *l*_{2}, ..., *l*_{L}} represents the set of all *L* different lncRNAs in *DS*_{3}. For any given *d*_{i} ∈ *S*_{D} and *l*_{j} ∈ *S*_{L}, we can construct the *D* × *L* matrix *KAM1* as follows:

$$ KAM1\left(i,j\right)=\Big\{{\displaystyle \begin{array}{c}1\kern0.5em if\kern0.2em {d}_i\kern0.2em is\kern0.34em related\kern0.34em to\kern0.2em {l}_j\kern0.2em in\kern0.2em {DS}_3\\ {}0\kern7.8em otherwise\end{array}} $$

(1)
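As an illustration of Eq. (1), the following minimal sketch builds a 0/1 association matrix from a list of known pairs. The helper name `build_association_matrix` and the toy identifiers (`d1`, `l1`, ...) are hypothetical stand-ins for the entries of *DS*_{3}; they are not taken from the original datasets.

```python
import numpy as np

def build_association_matrix(pairs, row_items, col_items):
    """Return a 0/1 matrix whose entry (i, j) is 1 when
    (row_items[i], col_items[j]) appears in `pairs`, and 0 otherwise."""
    row_index = {name: i for i, name in enumerate(row_items)}
    col_index = {name: j for j, name in enumerate(col_items)}
    mat = np.zeros((len(row_items), len(col_items)), dtype=int)
    for row_name, col_name in pairs:
        mat[row_index[row_name], col_index[col_name]] = 1
    return mat

# Hypothetical toy data standing in for DS3 (disease, lncRNA) pairs.
diseases = ["d1", "d2", "d3"]
lncrnas = ["l1", "l2"]
ds3_pairs = [("d1", "l1"), ("d2", "l1"), ("d2", "l2")]

KAM1 = build_association_matrix(ds3_pairs, diseases, lncrnas)  # shape (D, L)
```

The same helper could be reused for *KAM2* by passing miRNA identifiers as rows and lncRNA identifiers as columns.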

Step 2 (Construction of the lncRNA-miRNA network): Let *M* be the number of different miRNAs obtained from *DS*_{4}. *S*_{M} = {*m*_{1}, *m*_{2}, ..., *m*_{M}} represents the set of all *M* different miRNAs in *DS*_{4}, and for any given *m*_{i} ∈ *S*_{M} and *l*_{j} ∈ *S*_{L}, we can construct the *M* × *L* matrix *KAM2* as follows:

$$ KAM2\left(i,j\right)=\left\{\begin{array}{c}1\kern0.5em if\ {m}_i\ is\ related\ to\ {l}_j\ in\ {DS}_4\\ {}0\kern5.25em otherwise\end{array}\right. $$

(2)

Step 3 (Construction of the disease-lncRNA-miRNA interactive network): Based on the disease-lncRNA network and the lncRNA-miRNA network, we can obtain the undirected graph *G*_{3} = (*V*_{3}, *E*_{3}), where *V*_{3} = *S*_{D} ∪ *S*_{L} ∪ *S*_{M} = {*d*_{1}, *d*_{2}, ..., *d*_{D}, *l*_{D + 1}, *l*_{D + 2}, ..., *l*_{D + L}, *m*_{D + L + 1}, *m*_{D + L + 2}, ..., *m*_{D + L + M}} is the set of vertices and *E*_{3} is the edge set of *G*_{3}. For *d*_{i} ∈ *S*_{D}, *l*_{j} ∈ *S*_{L} and *m*_{k} ∈ *S*_{M}, an edge exists between *d*_{i} and *l*_{j} in *E*_{3} if *KAM1*(*d*_{i}, *l*_{j}) = 1, and an edge exists between *l*_{j} and *m*_{k} in *E*_{3} if *KAM2*(*m*_{k}, *l*_{j}) = 1. Then, for any given *a*, *b* ∈ *V*_{3}, we can define the Strong Correlation (*SC*) between *a* and *b* as follows:

$$ SC\left(a,b\right)=\left\{\begin{array}{c}1\kern0.5em if\kern0.34em there\kern0.34em is\kern0.34em an\kern0.34em edge\kern0.34em between\kern0.2em a\kern0.2em and\kern0.2em b\\ {}0\kern11em otherwise\end{array}\right. $$

(3)

Notably, although we did not use any known disease-miRNA associations, the diseases and miRNAs can still be indirectly linked in *G*_{3} through the edges between the disease nodes and the lncRNA nodes and the edges between the lncRNA nodes and the miRNA nodes.

### Disease semantic similarity

We downloaded the MeSH descriptors of the diseases from the National Library of Medicine (http://www.nlm.nih.gov/), which introduced the concept of Categories and Subcategories and provided a strict system for disease classification. The topology of each disease was visualized as a Directed Acyclic Graph (DAG) in which the nodes represented the disease MeSH descriptors, and all MeSH descriptors in the DAG were linked from more general terms (parent nodes) to more specific terms (child nodes) by a direct edge (see Fig. 4). Let *DAG(A)* = (*A, T*(*A*)*, E*(*A*)), where *A* represents disease *A*, *T*(*A*) represents the node set, including node *A* and its ancestor nodes, and *E*(*A*) represents the corresponding edge set. Then, we defined the contribution of disease term *d* in *DAG*(*A*) to the semantic value of disease *A* as follows:

$$ \left\{\begin{array}{c}{D}_A(d)=1\kern16.8em if\kern0.3em d=A\\ {}{D}_A(d)=\max \left\{0.5\ast {D}_A\left({d}^{\ast}\right)|{d}^{\ast}\in children\kern0.3em of\kern0.3em d\right\}\kern0.3em if\kern0.3em d\ne A\end{array}\right. $$

(4)

For example, the semantic value of the disease ‘Gastrointestinal Neoplasms’ shown in Fig. 4 is calculated by summing the weighted contributions of ‘Neoplasms’ (0.125), ‘Neoplasms by Site’ (0.25), ‘Digestive System Diseases’ (0.25), ‘Digestive System Neoplasms’ (0.5) and ‘Gastrointestinal Diseases’ (0.5) to ‘Gastrointestinal Neoplasms’, together with the contribution of ‘Gastrointestinal Neoplasms’ to itself (1).

Then, the semantic value of disease *A* can be obtained by summing the contributions of all disease terms in *DAG*(*A*), and the semantic similarity between two diseases *d*_{i} and *d*_{j} can be calculated as follows:

$$ SSD\left({d}_i,{d}_j\right)=\frac{\sum \limits_{d\in \left(T\left({d}_i\right)\cap T\left({d}_j\right)\right)}\left({D}_{d_i}(d)+{D}_{d_j}(d)\right)}{\sum \limits_{d\in T\left({d}_i\right)}{D}_{d_i}(d)+{\sum}_{d\in T\left({d}_j\right)}{D}_{d_j}(d)} $$

(5)

where *SSD* is the disease semantic similarity matrix.
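The following sketch illustrates Eqs. (4) and (5) on the ‘Gastrointestinal Neoplasms’ example. The parent-to-child edges in `children` are an assumption chosen to be consistent with the contribution values quoted above (Fig. 4 itself is not reproduced here), and the function names are hypothetical.

```python
def semantic_contributions(target, children, decay=0.5):
    """Compute D_A(d) for every term d in T(A), following Eq. (4).

    `children` maps each term of DAG(target) to its child terms
    inside that DAG; the target term itself has no children.
    """
    memo = {target: 1.0}

    def contrib(term):
        if term not in memo:
            memo[term] = max(decay * contrib(child) for child in children[term])
        return memo[term]

    for term in children:
        contrib(term)
    return memo

def semantic_similarity(contrib_a, contrib_b):
    """Eq. (5): shared contributions divided by the two semantic values."""
    shared = contrib_a.keys() & contrib_b.keys()
    numerator = sum(contrib_a[d] + contrib_b[d] for d in shared)
    denominator = sum(contrib_a.values()) + sum(contrib_b.values())
    return numerator / denominator

# Assumed DAG edges (parent -> children), consistent with the quoted values.
children = {
    "Neoplasms": ["Neoplasms by Site"],
    "Neoplasms by Site": ["Digestive System Neoplasms"],
    "Digestive System Diseases": ["Digestive System Neoplasms",
                                  "Gastrointestinal Diseases"],
    "Digestive System Neoplasms": ["Gastrointestinal Neoplasms"],
    "Gastrointestinal Diseases": ["Gastrointestinal Neoplasms"],
    "Gastrointestinal Neoplasms": [],
}

contributions = semantic_contributions("Gastrointestinal Neoplasms", children)
semantic_value = sum(contributions.values())
```

Under these assumed edges, the six contributions are 1, 0.5, 0.5, 0.25, 0.25 and 0.125, so the semantic value of ‘Gastrointestinal Neoplasms’ is 2.625.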

### MiRNA Gaussian interaction profile kernel similarity

Based on the assumption that similar miRNAs tend to show similar interaction and non-interaction patterns with lncRNAs, we introduce the Gaussian interaction profile kernel to calculate the network topological similarity between miRNAs, and we use the vector *MLP*(*m*_{i}) to denote the *i*th row of the adjacency matrix *KAM2*. Then, the Gaussian interaction profile kernel similarity for all investigated miRNAs can be calculated as follows:

$$ MGS\left({m}_i,{m}_j\right)=\exp \left(-\frac{M\ast {\left\Vert MLP\left({m}_i\right)- MLP\left({m}_j\right)\right\Vert}^2}{\sum \limits_{i=1}^M{\left\Vert MLP\left({m}_i\right)\right\Vert}^2}\right) $$

(6)

where parameter *M* is the number of miRNAs in *DS*_{4}.
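A minimal NumPy sketch of Eq. (6) is given below. The function name `gaussian_profile_similarity` and the toy matrix are hypothetical; the sketch assumes that at least one profile is non-zero, since otherwise the normalizing sum in the denominator vanishes.

```python
import numpy as np

def gaussian_profile_similarity(profiles):
    """Gaussian interaction profile kernel over the rows of `profiles`.

    `profiles` is an (n, p) 0/1 matrix with one interaction profile per row.
    """
    profiles = np.asarray(profiles, dtype=float)
    n = profiles.shape[0]
    squared_norms = (profiles ** 2).sum(axis=1)
    gamma = n / squared_norms.sum()  # n / sum_i ||profile_i||^2
    # Pairwise squared Euclidean distances between all rows.
    diff = profiles[:, None, :] - profiles[None, :, :]
    squared_dist = (diff ** 2).sum(axis=2)
    return np.exp(-gamma * squared_dist)

# Hypothetical toy KAM2 with M = 3 miRNAs (rows) and L = 3 lncRNAs (columns).
KAM2 = np.array([[1, 0, 1],
                 [1, 0, 1],
                 [0, 1, 0]])
MGS = gaussian_profile_similarity(KAM2)
```

Because the kernel acts on rows, column-wise profiles such as *LDP* or *LMP* could be handled by passing the transposed matrix (for example, `KAM2.T`).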

### Disease Gaussian interaction profile kernel similarity

Based on the assumption that similar diseases tend to show similar interaction and non-interaction patterns with lncRNAs, the Gaussian interaction profile kernel similarity for all investigated diseases can be calculated as follows:

$$ DGS\left({d}_i,{d}_j\right)=\exp \left(-\frac{D\ast {\left\Vert DLP\left({d}_i\right)- DLP\left({d}_j\right)\right\Vert}^2}{\sum \limits_{i=1}^D{\left\Vert DLP\left({d}_i\right)\right\Vert}^2}\right) $$

(7)

where parameter *D* is the number of diseases in *DS*_{3}, and *DLP*(*d*_{i}) represents the *i*th row of the matrix *KAM1*. Then, based on previous work [46], we can improve the predictive accuracy by applying a logistic function transformation as follows:

$$ FDGS\left({d}_i,{d}_j\right)=\frac{1}{1+{e}^{-15\ast DGS\left({d}_i,{d}_j\right)+\log (9999)}} $$

(8)
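The effect of the transformation in Eq. (8) can be checked numerically with the short sketch below. It assumes that log denotes the natural logarithm; the function and parameter names are hypothetical.

```python
import numpy as np

def logistic_transform(dgs, slope=15.0, offset=np.log(9999.0)):
    """Apply 1 / (1 + exp(-slope * DGS + offset)) element-wise.

    Assumption: `offset` uses the natural logarithm of 9999.
    """
    dgs = np.asarray(dgs, dtype=float)
    return 1.0 / (1.0 + np.exp(-slope * dgs + offset))

# Kernel similarities lie in (0, 1]; evaluate the transform at a few values.
FDGS_values = logistic_transform([0.0, 0.5, 1.0])
```

Under this assumption, a kernel similarity of 0 maps to 1/10000, and the output increases monotonically with the input similarity.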

### lncRNA Gaussian interaction profile kernel similarity

Based on the assumption that similar lncRNAs tend to show similar interaction and non-interaction patterns with miRNAs and similar lncRNAs tend to show similar interaction and non-interaction patterns with diseases, the Gaussian interaction profile kernel similarity matrix for all investigated lncRNAs in *DS*_{3} can be computed in a similar way as that for disease, as follows:

$$ LGS1\left({l}_i,{l}_j\right)=\exp \left(-\frac{L\ast {\left\Vert LDP\left({l}_i\right)- LDP\left({l}_j\right)\right\Vert}^2}{\sum \limits_{i=1}^L{\left\Vert LDP\left({l}_i\right)\right\Vert}^2}\right) $$

(9)

where parameter *L* is the number of lncRNAs in *DS*_{3}, and *LDP*(*l*_{i}) represents the *i*th column of the matrix *KAM1*.

Similarly, the Gaussian interaction profile kernel similarity for all investigated lncRNAs in *DS*_{4} can be computed as follows:

$$ LGS2\left({l}_i,{l}_j\right)=\exp \left(-\frac{L\ast {\left\Vert LMP\left({l}_i\right)- LMP\left({l}_j\right)\right\Vert}^2}{\sum \limits_{i=1}^L{\left\Vert LMP\left({l}_i\right)\right\Vert}^2}\right) $$

(10)

where *LMP*(*l*_{i}) represents the *i*th column of the matrix *KAM2*.

### Disease functional similarity based on the lncRNAs

To calculate the functional similarity of the diseases, we first constructed the undirected graph *G*_{1} = (*V*_{1}, *E*_{1}) based on *KAM1*, where *V*_{1} = *S*_{D} ∪ *S*_{L} = {*d*_{1}, *d*_{2}, …, *d*_{D}, *l*_{D + 1}, *l*_{D + 2}, …, *l*_{D + L}} is the set of vertices, *E*_{1} is the set of edges, and for any two nodes *a*, *b* ∈ *V*_{1}, an edge exists between *a* and *b* in *E*_{1} if *KAM1*(*a*, *b*) = 1. Based on the assumption that similar diseases tend to show similar interaction and non-interaction patterns with lncRNAs, we can calculate the similarity between two disease nodes by comparing and integrating the similarities of the lncRNA nodes associated with these two disease nodes. The procedure used to calculate the disease functional similarity is shown in Fig. 5.

Because different lncRNA terms in *DS*_{3} may relate to several diseases, assigning the same contribution value to all lncRNAs is not suitable; therefore, we defined the contribution value of each lncRNA as follows:

$$ C\left({l}_i\right)=\frac{\mathrm{The}\kern0.34em \mathrm{number}\kern0.34em \mathrm{of}\kern0.2em {l}_i-\mathrm{related}\kern0.34em \mathrm{edges}\ \mathrm{in}\ {E}_1}{\mathrm{The}\ \mathrm{number}\ \mathrm{of}\ \mathrm{all}\ \mathrm{edges}\ \mathrm{in}\ {E}_1} $$

(11)

Based on the definition of *C*(*l*_{i}), we can define the contribution value of each lncRNA to the functional similarity of each disease pair as follows:

$$ {CD}_{ij}\left({l}_k\right)=\left\{\begin{array}{ll}1, & \text{if lncRNA } {l}_k \text{ is related to } {d}_i \text{ and } {d}_j \text{ simultaneously}\\ {}C\left({l}_k\right), & \text{if lncRNA } {l}_k \text{ is related only to } {d}_i \text{ or } {d}_j\end{array}\right. $$

(12)

Finally, we can define the functional similarity between diseases *d*_{i} and *d*_{j} by integrating the lncRNAs related to *d*_{i}, *d*_{j} or both as follows:

$$ FSD\left({d}_i,{d}_j\right)=\frac{\sum \limits_{l_k\in \left(D\left({d}_i\right)\cup D\left({d}_j\right)\right)}C{D}_{ij}\left({l}_k\right)}{\mid D\left({d}_i\right)\mid +\mid D\left({d}_j\right)\mid -\mid D\left({d}_i\right)\cap D\left({d}_j\right)\mid } $$

(13)

where *D*(*d*_{i}) and *D*(*d*_{j}) represent the sets of all lncRNAs related to *d*_{i} and *d*_{j} in *E*_{1}, respectively.
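Equations (11) to (13) can be sketched as follows. The function name and the toy matrix are hypothetical, and returning 0 for a pair in which neither disease has any related lncRNA is an assumption, since Eq. (13) is undefined in that case.

```python
import numpy as np

def functional_similarity(kam):
    """Functional similarity between the row entities of a 0/1 matrix
    whose columns are lncRNAs, following Eqs. (11)-(13)."""
    kam = np.asarray(kam, dtype=int)
    # Eq. (11): share of all edges that involve each lncRNA.
    contribution = kam.sum(axis=0) / kam.sum()
    n = kam.shape[0]
    fs = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            both = (kam[i] == 1) & (kam[j] == 1)
            either = (kam[i] == 1) | (kam[j] == 1)
            only_one = either & ~both
            if either.any():  # assumption: similarity stays 0 otherwise
                # Eq. (12): 1 for shared lncRNAs, C(l_k) for the others.
                total = both.sum() + contribution[only_one].sum()
                # Eq. (13): the denominator equals the size of the union.
                fs[i, j] = total / either.sum()
    return fs

# Hypothetical toy KAM1 with 3 diseases (rows) and 3 lncRNAs (columns).
KAM1 = np.array([[1, 1, 0],
                 [1, 0, 1],
                 [0, 0, 1]])
FSD = functional_similarity(KAM1)
```

In this toy example the five edges give *C* = (0.4, 0.2, 0.4), so the first two diseases, which share one lncRNA and have one unshared lncRNA each, obtain (1 + 0.2 + 0.4) / 3.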

### MiRNA functional similarity based on lncRNAs

Based on the assumption that similar miRNAs tend to show similar interaction and non-interaction patterns with lncRNAs, we can also calculate the miRNA functional similarity in the lncRNA-miRNA interactive network. Similar to the procedure used to calculate the disease functional similarity, we first constructed the undirected graph *G*_{2} = (*V*_{2}, *E*_{2}), where *V*_{2} = *S*_{M} ∪ *S*_{L} = {*m*_{1}, *m*_{2}, …, *m*_{M}, *l*_{M + 1}, *l*_{M + 2}, …, *l*_{M + L}} is the set of vertices, *E*_{2} is the set of edges, and for any two nodes *a*, *b* ∈ *V*_{2}, an edge exists between *a* and *b* in *E*_{2} if *KAM2*(*a*, *b*) = 1. Then, we defined the contribution of each lncRNA to the functional similarity of each miRNA pair as follows:

$$ {CM}_{ij}\left({l}_k\right)=\left\{\begin{array}{ll}1, & \text{if lncRNA } {l}_k \text{ is related to } {m}_i \text{ and } {m}_j \text{ simultaneously}\\ {}C\left({l}_k\right), & \text{if lncRNA } {l}_k \text{ is related only to } {m}_i \text{ or } {m}_j\end{array}\right. $$

(14)

Additionally, we can define the functional similarity between *m*_{i} and *m*_{j} as follows:

$$ FSM\left({m}_i,{m}_j\right)=\frac{\sum \limits_{l_k\in \left(D\left({m}_i\right)\cup D\left({m}_j\right)\right)}C{M}_{ij}\left({l}_k\right)}{\mid D\left({m}_i\right)\mid +\mid D\left({m}_j\right)\mid -\mid D\left({m}_i\right)\cap D\left({m}_j\right)\mid } $$

(15)

where *D*(*m*_{i}) and *D*(*m*_{j}) represent the sets of all lncRNAs related to *m*_{i} and *m*_{j} in *E*_{2}, respectively.

### Integrated similarity

The processes used to calculate the integrated similarities of the diseases, lncRNAs and miRNAs are illustrated in Fig. 6. Combining the disease semantic similarity, the disease Gaussian interaction profile kernel similarity and the disease functional similarity mentioned above, we can construct the disease integrated similarity matrix *FDD* as follows:

$$ FDD=\frac{SSD+ FDGS+ FSD}{3} $$

(16)

Additionally, based on the miRNA Gaussian interaction profile kernel similarity and the miRNA functional similarity, we can construct the miRNA integrated similarity matrix *FMM* as follows:

$$ FMM=\frac{MGS+ FSM}{2} $$

(17)

Furthermore, based on the Gaussian interaction profile kernel similarity matrices *LGS1* and *LGS2*, we can construct the lncRNA integrated similarity matrix *FLL* as follows:

$$ FLL=\frac{LGS1+ LGS2}{2} $$

(18)

### Prediction of disease-miRNA associations based on a distance correlation set

In this section, we developed a novel computational method, i.e., DCSMDA, to predict potential disease-miRNA associations by introducing a distance correlation set based on the following assumptions: similar diseases tend to show similar interaction and non-interaction patterns with lncRNAs, and similar lncRNAs tend to show similar interaction and non-interaction patterns with miRNAs. As illustrated in Fig. 7, the DCSMDA procedure consists of the following 5 major steps:

Step 1 (Construction of the adjacency matrix based on *G*_{3}): First, we construct a (*D* + *L* + *M*) × (*D* + *L* + *M*) Adjacency Matrix (*AM*) based on the undirected graph *G*_{3} and *SC*; for any two nodes *v*_{i}, *v*_{j} ∈ *V*_{3}, we define *AM*(*i*, *j*) as follows:

$$ AM\left(i,j\right)=\left\{\begin{array}{ll} SC\left({d}_i,{d}_j\right), & \text{if } i\in \left[1,D\right] \text{ and } j\in \left[1,D\right]\\ {} SC\left({d}_i,{l}_j\right), & \text{if } i\in \left[1,D\right] \text{ and } j\in \left[D+1,D+L\right]\\ {} SC\left({d}_i,{m}_j\right), & \text{if } i\in \left[1,D\right] \text{ and } j\in \left[D+L+1,D+L+M\right]\\ {} SC\left({l}_i,{d}_j\right), & \text{if } i\in \left[D+1,D+L\right] \text{ and } j\in \left[1,D\right]\\ {} SC\left({l}_i,{l}_j\right), & \text{if } i\in \left[D+1,D+L\right] \text{ and } j\in \left[D+1,D+L\right]\\ {} SC\left({l}_i,{m}_j\right), & \text{if } i\in \left[D+1,D+L\right] \text{ and } j\in \left[D+L+1,D+L+M\right]\\ {} SC\left({m}_i,{d}_j\right), & \text{if } i\in \left[D+L+1,D+L+M\right] \text{ and } j\in \left[1,D\right]\\ {} SC\left({m}_i,{l}_j\right), & \text{if } i\in \left[D+L+1,D+L+M\right] \text{ and } j\in \left[D+1,D+L\right]\\ {} SC\left({m}_i,{m}_j\right), & \text{if } i\in \left[D+L+1,D+L+M\right] \text{ and } j\in \left[D+L+1,D+L+M\right]\end{array}\right. $$

(19)

where *i*∈[1*, D* + *L + M*] and *j*∈[1*, D + L* + *M*], and to calculate the shortest distance matrix in step 2, we define *AM* (*i, j*) = 1 if *i = j*.
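Given the vertex ordering of *V*_{3} (diseases, then lncRNAs, then miRNAs), *AM* can be assembled as a block matrix from *KAM1* and *KAM2*. The sketch below is one possible way to do this; the function name and the toy matrices are hypothetical.

```python
import numpy as np

def build_adjacency(kam1, kam2):
    """Assemble the (D+L+M) x (D+L+M) adjacency matrix of G3.

    kam1: (D, L) disease-lncRNA matrix; kam2: (M, L) miRNA-lncRNA matrix.
    Vertices are ordered as diseases, then lncRNAs, then miRNAs.
    """
    kam1 = np.asarray(kam1, dtype=int)
    kam2 = np.asarray(kam2, dtype=int)
    D, L = kam1.shape
    M = kam2.shape[0]
    am = np.zeros((D + L + M, D + L + M), dtype=int)
    am[:D, D:D + L] = kam1        # disease-lncRNA edges
    am[D:D + L, :D] = kam1.T      # symmetric counterpart
    am[D + L:, D:D + L] = kam2    # miRNA-lncRNA edges
    am[D:D + L, D + L:] = kam2.T  # symmetric counterpart
    np.fill_diagonal(am, 1)       # AM(i, i) = 1, as required for step 2
    return am

# Hypothetical toy network with D = 2, L = 2 and M = 1.
KAM1 = [[1, 0],
        [0, 1]]
KAM2 = [[1, 1]]
AM = build_adjacency(KAM1, KAM2)
```

The disease-disease, lncRNA-lncRNA, miRNA-miRNA and disease-miRNA blocks contain no off-diagonal ones because *G*_{3} has no edges of those types.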

Step 2 (Construction of the shortest distance matrix based on the adjacency matrix *AM*): First, we set a parameter *b*, a pre-determined positive integer, to control the bandwidth of the distance correlation set. Then, we can obtain *b* matrices *AM*^{1}, *AM*^{2}, ..., *AM*^{b} (the first *b* powers of *AM*) based on the above formula (19), and the Shortest Path Matrix (*SPM*) is calculated as follows:

$$ SPM\left(i,j\right)=\left\{\ \begin{array}{c}1,\kern2.5em if\ AM\left(i,j\right)=1\\ {}k,\kern2.25em otherwise\kern1.25em \end{array}\right. $$

(20)

where *i*∈[1*, D* + M + *L*], *j*∈[1*, D* + M + *L*], *k*∈[2*, b*], and *k* satisfies the following: *AM* ^{k}(*i*, *j*)≠0, while *AM* ^{1}(*i*, *j*) = *AM* ^{2}(*i*, *j*) = … = *AM* ^{k-1}(*i*, *j*) = 0.
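The construction of *SPM* can be sketched with repeated matrix products. The sketch assumes that *AM*^{k} denotes the *k*th matrix power of *AM* and that node pairs not reached within *b* steps keep the value 0 (the text does not state this case explicitly); only the non-zero pattern of each power is tracked, which avoids large intermediate values.

```python
import numpy as np

def shortest_path_matrix(am, b):
    """SPM(i, j) = smallest k in [1, b] with AM^k(i, j) != 0.

    Assumption: pairs with no such k keep the value 0.
    """
    am = np.asarray(am, dtype=np.int64)
    spm = np.zeros_like(am)
    reach = np.eye(len(am), dtype=np.int64)
    for k in range(1, b + 1):
        # Non-zero pattern of AM^k (entries are non-negative, so clipping is safe).
        reach = (reach @ am > 0).astype(np.int64)
        newly_reached = (reach == 1) & (spm == 0)
        spm[newly_reached] = k
    return spm

# Toy adjacency matrix with 2 diseases, 2 lncRNAs and 1 miRNA (diagonal set to 1).
AM = np.array([[1, 0, 1, 0, 0],
               [0, 1, 0, 1, 0],
               [1, 0, 1, 0, 1],
               [0, 1, 0, 1, 1],
               [0, 0, 1, 1, 1]])
SPM = shortest_path_matrix(AM, b=2)
```

In this toy network the two diseases are four steps apart, so their entry remains 0 for *b* = 2 and becomes 4 once *b* ≥ 4.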

Step 3 (Calculation of the distance correlation sets and the distance coefficient of each node pair in *G*_{3}):

For each node *v*_{i} ∈ *V*_{3}, we can obtain the distance correlation set *DCS*(*i*) according to the shortest distance matrix as follows:

$$ DCS(i)=\left\{{v}_j|b\ge SPM\left(i,j\right)>0\right\} $$

(21)

where *DCS*(*i*) of each node contains the node itself and all nodes whose shortest distance from it does not exceed *b*.

For instance, in the disease-miRNA-lncRNA interaction network illustrated in Fig. 7, *DCS* (seed node) is all candidate nodes when *b* is set to 2.

Then, we can calculate the distance coefficient (*DC*) of the node pair (v_{i}, v_{j}) as follows:

$$ P\left(i,j\right)=\left\{\begin{array}{c} SPM{\left(i,j\right)}^{b+1}, if\ i\in DCS(j)\ or\ j\in DCS(i)\\ {}0,\kern3.5em otherwise\end{array}\right. $$

(22)
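Equations (21) and (22) can be evaluated directly from *SPM*, as in the sketch below. It implements Eq. (22) exactly as written (the shortest distance raised to the power *b* + 1); the function name and the toy matrix are hypothetical.

```python
import numpy as np

def distance_coefficient(spm, b):
    """Return (in_dcs, p), where in_dcs[i, j] is True when v_j is in DCS(i)
    and p[i, j] follows Eq. (22) as written."""
    spm = np.asarray(spm)
    in_dcs = (spm > 0) & (spm <= b)  # Eq. (21)
    related = in_dcs | in_dcs.T      # i in DCS(j) or j in DCS(i)
    p = np.where(related, spm.astype(float) ** (b + 1), 0.0)
    return in_dcs, p

# Toy shortest path matrix for b = 2 (0 marks pairs farther apart than b).
SPM = np.array([[1, 0, 1, 0, 2],
                [0, 1, 0, 1, 2],
                [1, 0, 1, 2, 1],
                [0, 1, 2, 1, 1],
                [2, 2, 1, 1, 1]])
in_dcs, P = distance_coefficient(SPM, b=2)
```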

Furthermore, we can construct a Distance Correlation Matrix (*DCM*) based on the disease integrated similarity, the lncRNA integrated similarity, and the miRNA integrated similarity as follows:

$$ DCM\left(i,j\right)=\left\{\begin{array}{ll}P\left(i,j\right)\ast \exp \left( FDD\left(i,j\right)\right), & \text{if } i\in \left[1,D\right] \text{ and } j\in \left[1,D\right]\\ {}P\left(i,j\right)\ast \exp \left( FLL\left(i,j\right)\right), & \text{if } i\in \left[D+1,D+L\right] \text{ and } j\in \left[D+1,D+L\right]\\ {}P\left(i,j\right)\ast \exp \left( FMM\left(i,j\right)\right), & \text{if } i\in \left[D+L+1,D+L+M\right] \text{ and } j\in \left[D+L+1,D+L+M\right]\\ {}P\left(i,j\right)\ast \frac{SPM\left(i,j\right)}{b}, & \text{otherwise}\end{array}\right. $$

(23)

where *i*∈[1, *D + L + M*] and *j*∈[1, *D + L + M*].
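A block-wise sketch of Eq. (23) follows. It assumes that the similarity matrices are indexed relative to their own block, i.e. that *FLL* and *FMM* are looked up with the offsets *D* and *D* + *L* removed; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def distance_correlation_matrix(p, spm, fdd, fll, fmm, b):
    """Combine the distance coefficient P with the integrated similarities.

    Assumption: FDD, FLL and FMM are indexed within their own diagonal block.
    """
    p = np.asarray(p, dtype=float)
    spm = np.asarray(spm, dtype=float)
    fdd, fll, fmm = (np.asarray(x, dtype=float) for x in (fdd, fll, fmm))
    D, L = len(fdd), len(fll)
    dcm = p * spm / b  # the "otherwise" case of Eq. (23)
    dcm[:D, :D] = p[:D, :D] * np.exp(fdd)
    dcm[D:D + L, D:D + L] = p[D:D + L, D:D + L] * np.exp(fll)
    dcm[D + L:, D + L:] = p[D + L:, D + L:] * np.exp(fmm)
    return dcm

# Toy example with one disease, one lncRNA and one miRNA, and b = 2.
SPM = np.array([[1, 1, 2],
                [1, 1, 1],
                [2, 1, 1]])
P = SPM.astype(float) ** 3  # Eq. (22) with b = 2; every pair lies within b
DCM = distance_correlation_matrix(P, SPM, [[1.0]], [[1.0]], [[1.0]], b=2)
```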

Step 4 (Estimation of the association degree between a pair of nodes): Based on formula (23), we can estimate the association degree between v_{i} and v_{j} as follows:

$$ PM\left(i,j\right)=\frac{\sum \limits_{k=1}^{D+L+M} DCM\left(i,k\right)+{\sum}_{k=1}^{D+L+M} DCM\left(k,j\right)}{D+L+M} $$

(24)

Thus, we can obtain the prediction matrix *PM*, where the entry *PM*(*i*, *j*) in row *i* and column *j* represents the predicted association between nodes *v*_{i} and *v*_{j}.
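Equation (24) adds the *i*th row sum and the *j*th column sum of *DCM* and divides by the number of nodes, which can be written compactly with broadcasting. The function name and the toy matrix below are hypothetical.

```python
import numpy as np

def prediction_matrix(dcm):
    """PM(i, j) = (sum_k DCM(i, k) + sum_k DCM(k, j)) / (D + L + M)."""
    dcm = np.asarray(dcm, dtype=float)
    n = dcm.shape[0]
    row_sums = dcm.sum(axis=1)[:, None]  # column vector of row sums
    col_sums = dcm.sum(axis=0)[None, :]  # row vector of column sums
    return (row_sums + col_sums) / n

# Hypothetical 2 x 2 toy matrix, used only to check the arithmetic.
PM = prediction_matrix([[1.0, 2.0],
                        [3.0, 4.0]])
```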

Step 5 (Calculation of the final prediction result matrix between the miRNAs and diseases): Let \( PM=\left[\begin{array}{ccc}{C}_{11}& {C}_{12}& {C}_{13}\\ {}{C}_{21}& {C}_{22}& {C}_{23}\\ {}{C}_{31}& {C}_{32}& {C}_{33}\end{array}\right] \), where *C*_{11} is a *D* × *D* matrix, *C*_{12} is a *D* × *L* matrix, *C*_{13} is a *D* × *M* matrix, *C*_{21} is an *L* × *D* matrix, *C*_{22} is an *L* × *L* matrix, *C*_{23} is an *L* × *M* matrix, *C*_{31} is an *M* × *D* matrix, *C*_{32} is an *M* × *L* matrix and *C*_{33} is an *M* × *M* matrix. Here, *C*_{13} is our predicted result, which provides the association probability between each disease and miRNA. A previous study [27] demonstrated that the Gaussian interaction profile kernel similarity is a high-efficiency tool for optimizing the result of prediction, and therefore, we used the miRNA Gaussian interaction profile kernel similarity and the disease Gaussian interaction profile kernel similarity to optimize the result of the DCSMDA as follows:

$$ FAD= FDD\ast {C}_{13}\ast FMM $$

(25)

where the matrix FAD denotes the relationship between the miRNA-disease pairs.
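The final step can be sketched as a block extraction followed by two matrix products. The sketch assumes that the asterisks in Eq. (25) denote matrix multiplication, which is the reading consistent with the dimensions *D* × *D*, *D* × *M* and *M* × *M*; the function name and the toy inputs are hypothetical.

```python
import numpy as np

def final_scores(pm, fdd, fmm):
    """Extract the disease-miRNA block C13 of PM and smooth it with the
    integrated similarities (Eq. 25, read as matrix products)."""
    pm = np.asarray(pm, dtype=float)
    fdd = np.asarray(fdd, dtype=float)
    fmm = np.asarray(fmm, dtype=float)
    D, M = len(fdd), len(fmm)
    L = pm.shape[0] - D - M
    c13 = pm[:D, D + L:]  # rows: diseases, columns: miRNAs
    return fdd @ c13 @ fmm

# Hypothetical toy inputs with D = 2, L = 1 and M = 1.
PM = np.arange(16, dtype=float).reshape(4, 4)
FDD = [[1.0, 0.5],
       [0.5, 1.0]]
FMM = [[1.0]]
FAD = final_scores(PM, FDD, FMM)
```

Each row of the resulting *D* × *M* matrix corresponds to one disease, so candidate miRNAs for a given disease can be ranked by sorting that row in descending order.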