
Drug–target interaction prediction using unifying of graph regularized nuclear norm with bilinear factorization



Wet-lab experiments for identification of interactions between drugs and target proteins are time-consuming, costly and labor-intensive. The use of computational prediction of drug–target interactions (DTIs), which is one of the significant points in drug discovery, has been considered by many researchers in recent years. It also reduces the search space of interactions by proposing potential interaction candidates.


In this paper, a new approach that unifies matrix factorization and nuclear norm minimization is proposed to find a low-rank interaction matrix. To solve the low-rank matrix approximation, the DTI problem is formulated so that the nuclear norm regularized problem is optimized by a bilinear factorization based on Rank-Restricted Soft Singular Value Decomposition (RRSSVD). In the proposed method, adjacencies between drugs and targets are encoded by graphs. The drug–target interaction matrix, drug–drug similarity, target–target similarity, and a combination of similarities are used as input.


The proposed method is evaluated on four benchmark datasets known as Enzymes (E), Ion channels (ICs), G protein-coupled receptors (GPCRs) and nuclear receptors (NRs) based on AUC, AUPR, and time measure. The results show an improvement in the performance of the proposed method compared to the state-of-the-art techniques.


The study of DTIs has attracted the attention of many researchers in pharmaceutical science in recent years [1,2,3,4]. Many efforts have been made to investigate drug repositioning as well as to discover interactions between new targets and existing drugs. A DTI is the binding of a drug to a target site, which leads to a change in the target's behavior or function. In addition, identifying DTIs helps to minimize the adverse side effects of drugs [1].

Performing wet-lab experiments is a significant challenge in terms of cost, time and effort [5]. For this reason, Computational Prediction (CP) methods have been used in recent years [6]. In addition, there is ample evidence of Disease-Associated Microbes (DAM) as well as Long non-coding RNA (lncRNA)-Disease Associations (LDA) [7, 8]. Confirming these associations with traditional experiments often requires a great deal of material and time, so computational methods are expected to be used to predict them. Many of these algorithms use profile-based methods (for example, NCPLP [7] in DAM and BLM-NPAI [8] in LDA) to predict these associations.

Although many compounds have been synthesized, their target profiles and drug effects are often still unidentified. Besides, many diseases have no cure, and new diseases are identified each year. Researchers have therefore gathered much information about compound properties, features, responses and target proteins. The resulting large datasets are high-dimensional and complex, which indicates the need for efficient and robust CP algorithms for DTI prediction.

Computational methods for DTI prediction have been studied extensively. Generally, they fall into three main categories [1]. The first category, ligand-based approaches, relies on the idea that similar molecules tend to share similar properties and usually bind similar proteins [6, 9, 10]. These approaches predict interactions using the similarity between the ligands of target proteins. Since ligand-based methods do not use protein sequence information for prediction, the novel interactions they find are restricted to links between known ligands and protein families. Moreover, their performance depends heavily on the number of known ligands: if few ligands are known for a candidate protein, performance drops drastically [1, 11].

In the second category, known as docking approaches, the 3D structures of drugs and proteins are used in a simulation to determine whether they interact. The main problem is that the 3D structure of some proteins is not known [12, 13]. The third category is chemogenomic methods, which use information about drugs and targets simultaneously. These methods have been considered by many researchers in recent years. They can exploit a broad range of biological data, such as chemical structure graphs for drugs and genomic sequences for targets, from both the drug side and the target side at once [14]. For this purpose, biological information that is available in public datasets can be used. Chemogenomic methods can be divided into two groups: feature-based and similarity-based methods [15]. Feature-based methods use a supervised machine learning technique: drug–target pairs are represented by feature vectors with class labels that indicate the presence of an interaction (positive instance) or its absence (negative instance). It should be noted that the negative instances are pairs that either do not interact or whose interaction is unknown [16,17,18]. Similarity-based methods use a drug similarity matrix and a target similarity matrix together with the interaction matrix, which represents the interactions between drugs and targets [19,20,21]. These similarities are usually computed from chemical structures for drugs and from protein sequence alignment for targets. Similarity-based methods have several apparent advantages [22]:

  1. Feature-based approaches require a feature extraction or selection process, which is complex and challenging, while similarity-based techniques do not require this process.

  2. Similarity measures, such as chemical structure similarity for drugs and genomic sequence similarity for targets, are already well developed and widely used.

  3. Similarity-based approaches can provide better prediction performance because they are directly related to kernel methods.

  4. Similarity matrices represent the chemical space and the genomic space derived from the relationships between drugs and genes, respectively.

These advantages demonstrate the superiority of similarity-based approaches over other approaches. Adjacency matrices are commonly used to represent drug-drug and target-target similarity.

Available DTI methods can also be categorized into classification, network inference and matrix factorization groups. Classification-based models are divided into the Local Classification Model (LCM) [23, 24] and the Global Classification Model (GCM) [25, 26]. Because it is difficult to identify drugs (resp. targets) that interact with the same target (resp. drug), the LCM is not able to show a link between targets or drugs [27]. The GCM also cannot show the relationship between targets or drugs, due to the complexity of similarity calculations based on the tensor product or on high-dimensional concatenated feature vectors. Overall, these models do not easily capture the underlying structure among drug–target pairs [28].

In recent years, many studies on DTI prediction have been based on deep learning [29,30,31]. A comprehensive deep learning library called DeepPurpose has been introduced for DTI prediction [29]. This library includes implementations of 15 compound and protein encoders and more than 50 neural architectures, along with other features that are useful for DTI prediction.

Convolutional Neural Networks (CNNs) were used in [30] to obtain 1-dimensional representations of protein sequences (amino acid sequences) and of compounds in simplified molecular input line-entry system (SMILES) form. The extracted features were claimed to represent local dependencies or patterns appropriately and to serve as a suitable input for a fully connected neural network (FCNN) acting as the binary classifier. The results show that using CNNs to obtain data representations, as an alternative to traditional descriptors, improves performance in DTI prediction.

A deep learning model based on DeepLSTM was developed to predict DTI in [31]. Position-Specific Scoring Matrices (PSSM) and Legendre Moment (LM) were used to extract the evolutionary features of proteins. The Sparse Principal Component Analysis (SPCA) was then used to compress the features of drugs and proteins in a uniform vector space.

It should be noted that deep learning faces similar problems. Moreover, it requires a large dataset for training, and in these applications obtaining data is expensive and time-consuming.

Interactions between drugs and targets form a relationship that can be represented by a bi-partite network [32], whose information is taken from known drug–target interactions. Network inference methods (e.g., NBI [19]) operate on this bi-partite network and transform DTI prediction into link prediction between graph nodes. NBI uses two-step resource allocation to infer the potential links between nodes. However, because it depends only on the local, first-order topology of nodes, it tends to be biased towards high-degree nodes [26]. In addition, NBI cannot predict an interaction for a drug–target pair that has no known accessible path in the network [32]. The heterogeneous network is a promising model. It is built from a DTI network and two other networks, which are produced from drug similarities and target similarities, respectively [28, 33].

Models based on matrix factorization, such as BMF2K [34], CMF [35] and NRLMF [36], are well suited to capturing structural information in drug–target interactions. In these models, drugs and targets are mapped to a common low-rank feature space according to the drug similarity matrix and the target similarity matrix [28]. A dual-network L2,1-collaborative matrix factorization method has been proposed to predict Drug-Disease Interactions (DDI) [37]. To achieve better results, this method uses Gaussian interaction profile kernels and the L2,1-norm. Moreover, the network similarities of drugs and diseases are combined with their chemical and semantic similarities. To identify potential links in biomedical bi-partite networks, a method called graph regularized generalized matrix factorization (GRGMF) has been proposed [38]. It formulates a matrix factorization model that exploits the latent patterns behind observed links, and the reported results show an improvement for this method.

Factorization approaches can therefore be used to predict DTIs [21]. In general, based on reported experience [6], matrix factorization methods have achieved the best results in DTI prediction. Since DTIs are governed by a few latent factors that characterize drugs and targets, the DTI matrix can be factorized into latent factor matrices for drugs and targets. In other words, the DTI matrix is of low rank and can be recovered using matrix factorization. However, matrix factorization is a bilinear, non-convex problem for which there are no convergence guarantees [11]. Nuclear Norm Minimization (NNM) based methods have been proposed to address this. NNM is usually used as a surrogate for the matrix rank, and it shrinks all Singular Values (SVs) uniformly. Because it treats all SVs equally despite their distinct physical meanings, NNM cannot accurately estimate the matrix rank.

In this paper, a method that unifies matrix factorization and NNM, combined with graph regularization penalties, is proposed; it is described in detail in the following sections. As in the study of Mongia et al. [11], similarity matrices are used in the proposed method.

The strengths of the proposed method are as follows:

  • The nuclear norm is unified with bilinear factorization, based on drug–drug and target–target similarity, so that the advantages of both methods are combined.

  • The Rank-Restricted Soft Singular Value Decomposition method is used to optimize the nuclear norm minimization in the DTI problem.

  • The proposed method achieved the best performance in terms of the AUC and AUPR measures, and its results were suitable on datasets with different characteristics.

  • The time complexity of each iteration of the proposed algorithm is \(O\left( {r^{3} + n^{b} \log \left( n \right)} \right)\). This complexity is polynomial, and the method ran faster than other recent methods.

  • The proposed method can be widely used in other applications of CP, and using it does not require a complex process.

The rest of the paper is organized as follows: In “Methods” section, the proposed method that includes a novel algorithm for DTI is introduced. The experimental setup and results, that include the introduction of datasets, evaluation criteria and comparison of methods, are presented in “Discussion” section. Finally, the conclusion and discussion are presented in “Conclusion” section.


The interactions between drugs and targets are represented by the adjacency matrix X, where rows correspond to drugs and columns correspond to targets. The entries of the matrix represent the interactions. Since not all DTIs are known, this matrix is only partially observed. This is expressed as follows:

$$Y = P\left( X \right) \tag{1}$$

In Eq. (1), P is a sub-sampling operator defined by a binary sampling matrix, in which the value 1 means that there is a known interaction and 0 means otherwise. Y is the available, partially sampled DTI matrix. The goal is to estimate the matrix X given Y and the known P. X is a low-rank matrix that needs to be recovered. Equation (2) can be used for this purpose.
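As a small illustration of Eq. (1), the sketch below treats P as an elementwise binary mask. This is an assumption about how the operator is implemented; the matrices and the helper name `sample_observed` are hypothetical.

```python
import numpy as np

def sample_observed(X, mask):
    """Apply the sub-sampling operator P: keep entries where mask == 1, zero the rest."""
    return X * mask

# Hypothetical full interaction matrix (2 drugs x 3 targets)
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
# Binary sampling matrix: 1 marks a known interaction
mask = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0]])

Y = sample_observed(X, mask)  # the interaction at position (0, 2) is hidden in Y
```

Recovering X then amounts to predicting the entries of X that the mask hides.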

$$\min_{X} \; \operatorname{rank}\left( X \right) \quad \text{such that} \quad Y = P\left( X \right) \tag{2}$$

Low-Rank Matrix Approximation (LRMA) has been used in many practical problems in which the data have low-rank structure, and in recent years it has attracted considerable interest in areas such as computer vision and machine learning [39,40,41,42]. In general, LRMA methods are divided into two categories: low-rank matrix factorization (LRMF) [43,44,45,46] and rank minimization methods [47,48,49]. Given the input matrix Y, the purpose of LRMF is to factorize it into the product of two low-rank matrices, which can be used to reconstruct the low-rank matrix X with high fidelity. A variety of LRMF-based methods have been proposed, such as classical Singular Value Decomposition (SVD) under the \({\mathcal{L}}_{2}\)-norm [50, 51], robust LRMF methods under the \({\mathcal{L}}_{1}\)-norm [52, 53] and other probabilistic methods [54, 55]. The low-rank model for recovering a rank-k matrix Z can be expressed as the minimization in Eq. (3).

$$\min_{Z} \; f\left( Y - Z \right) \quad \text{subject to} \quad \operatorname{rank}\left( Z \right) = k \tag{3}$$

where f(.) denotes a loss function. Because the rank constraint in Eq. (3) is intractable, it has typically been imposed through a factorization \({\text{Z}} = {\text{AB}}^{{\text{T}}}\), as

$$\min_{A,B} \; f\left( Y - AB^{T} \right) \tag{4}$$

It has been proven that when the loss function is the Least Squares (LS) loss, i.e., \({\text{f}}\left( {{\text{Y}} - {\text{AB}}^{{\text{T}}} } \right) = \left\| {{\text{Y}} - {\text{AB}}^{{\text{T}}} } \right\|_{{\text{F}}}^{2}\), Eq. (4) does not have local minima and a closed-form solution can be obtained via the SVD of Y [56]. The disadvantages of this factorization approach are that the LS loss is highly susceptible to outliers and that missing data in Y give rise to local minima. Factorization with missing data is an NP-hard problem [57], while outliers can be addressed with robust loss functions [58, 59].
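For the fully observed least-squares case, the closed-form solution via the SVD can be sketched as follows. The way the singular values are split between the two factors (a square root on each side) is one common convention and is assumed here; the function name is hypothetical.

```python
import numpy as np

def best_rank_k(Y, k):
    """Rank-k least-squares factors of a fully observed Y via truncated SVD."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :k] * np.sqrt(s[:k])      # M x k factor
    B = Vt[:k, :].T * np.sqrt(s[:k])   # N x k factor
    return A, B

rng = np.random.default_rng(0)
Y = rng.standard_normal((6, 4))        # hypothetical fully observed matrix
A, B = best_rank_k(Y, k=2)

# The squared residual equals the energy of the discarded singular values
residual = np.linalg.norm(Y - A @ B.T) ** 2
tail = np.sum(np.linalg.svd(Y, compute_uv=False)[2:] ** 2)
```

This shortcut no longer applies once entries of Y are missing, which is the situation in DTI prediction.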

In DTI, matrix X can be converted into two matrices as follows:

$$X_{M \times N} = A_{M \times k} B_{N \times k}^{T} , \quad k \ll \min \left( {M,N} \right) \tag{5}$$

Here M and N are the numbers of drugs and targets, respectively, and k is the presumed rank of the matrix. For DTI, Eq. (4) is expressed as Eq. (6).

$$\min_{A,B} \; f\left( Y - P\left( AB^{T} \right) \right) \tag{6}$$

As mentioned earlier, the second category of LRMA methods is based on rank minimization. These methods reconstruct the data matrix by imposing an additional rank constraint on the estimated matrix. Direct rank minimization is NP-hard and therefore challenging to solve. To address this, the NNM methodology is used: minimizing the matrix rank is replaced by minimizing the nuclear norm of the estimated matrix, which is a convex relaxation of the rank. The nuclear norm of a matrix X, \(\left\| X \right\|_{*} = \mathop \sum \limits_{i} \sigma_{i}\), is the sum of its SVs, where \(\sigma_{i}\) is the i-th SV of X. NNM attempts to recover the underlying low-rank matrix X from the degraded observation matrix Y by minimizing \(\left\| X \right\|_{*}\). In recent years, NNM-based methods have been used in many applications such as video denoising [60], background extraction [61], data recovery [62] and subspace clustering [63, 64]. Under certain limited conditions, the matrix rank can be recovered with theoretical guarantees. However, NNM treats the various rank components equally, and therefore in some applications it cannot estimate the matrix rank precisely enough. Thus several methods have been proposed to improve NNM performance [11, 65].

For noisy input, the inherent low-rank matrix can be reconstructed with high probability by solving the NNM problem. The Nuclear Norm Proximal (NNP) problem is represented by the following equation:

$$\min_{Z} \; f\left( Y - Z \right) + \lambda \left\| Z \right\|_{*} \tag{7}$$

By using a soft threshold process on the SV of the observation matrix, it can be easily solved in closed form:

$$\hat{X} = US_{{\frac{\lambda }{2}}} \left( {\Sigma } \right)V^{T} \tag{8}$$

In this equation, \({\text{Y}} = {\text{U}}\Sigma {\text{V}}^{{\text{T}}}\) is the SVD of Y, the soft thresholding function on diagonal matrix \({\Sigma }\) with parameter \(\frac{\lambda }{2}\) is indicated by \({\text{S}}_{{\frac{\lambda }{2}}} \left( \Sigma \right)\). For each diagonal element \(\Sigma_{{{\text{ii}}}}\) in \(\Sigma\), there is:

$$S_{{\frac{\lambda }{2}}} \left( \Sigma \right)_{ii} = \max \left( {\Sigma_{ii} - \frac{\lambda }{2},0} \right) \tag{9}$$

λ is a trade-off parameter between the loss function and the low-rank regularization induced by the nuclear norm. These models have extended the use of low-rank priors to many applications in which Z is low rank but its rank is not known a priori [40, 66].

Despite their convexity and the theoretical guidance for choosing λ [40], these models also have several drawbacks.

First, it is unclear how to impose a given rank on Z: adjusting λ so that Z has a predetermined rank [67] usually gives worse results than imposing the rank directly, as in Eq. (4). Additionally, the “kernel trick” cannot be used because the factorization of Z is not available in Eq. (7). Finally, Eq. (7) is a Semidefinite Program (SDP), and off-the-shelf SDP optimizers are suitable only for low-to-medium dimensional problems (i.e., hundreds of variables) and are not amenable to large-scale, high-dimensional datasets.
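The soft-thresholding solution can be sketched in a few lines, assuming the loss f is the squared Frobenius norm (which is what gives the threshold λ/2). The function name and the test matrix are illustrative only.

```python
import numpy as np

def nuclear_norm_prox(Y, lam):
    """Minimizer of ||Y - Z||_F^2 + lam * ||Z||_*: shrink each singular value by lam / 2."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_shrunk = np.maximum(s - lam / 2.0, 0.0)
    return (U * s_shrunk) @ Vt

Y = np.diag([3.0, 1.0, 0.2])       # singular values 3, 1 and 0.2
Z = nuclear_norm_prox(Y, lam=1.0)  # singular values become 2.5, 0.5 and 0
```

The smallest singular value falls below the threshold and is set to zero, so the rank drops from 3 to 2; a larger λ removes more components.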

To deal with these limitations, this paper uses a robust method called Rank-Restricted Soft SVD (RRSSVD), based on the study of Hastie et al. [68], for DTI prediction. In the following, this method is described for this application. Based on [69], the nuclear norm can be expressed as follows:

$$\left\| X \right\|_{*} = \min_{A,B} \; \frac{1}{2}\left( {\left\| A \right\|_{F}^{2} + \left\| B \right\|_{F}^{2} } \right) \quad \text{subject to} \quad X = AB^{T} \tag{10}$$
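This characterization of the nuclear norm can be checked numerically. In the sketch below, the balanced factors built from the SVD attain the nuclear norm, while a rescaled factorization of the same matrix gives a larger value; the random matrix and the scaling constant `c` are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # hypothetical matrix of rank <= 3

U, s, Vt = np.linalg.svd(X, full_matrices=False)
nuclear_norm = s.sum()

# Balanced factors: A = U sqrt(S), B = V sqrt(S), so that A @ B.T == X
A = U * np.sqrt(s)
B = Vt.T * np.sqrt(s)
balanced_value = 0.5 * (np.linalg.norm(A) ** 2 + np.linalg.norm(B) ** 2)

# Another factorization of the same X: (cA)(B / c)^T, which is no longer balanced
c = 2.0
unbalanced_value = 0.5 * (np.linalg.norm(c * A) ** 2 + np.linalg.norm(B / c) ** 2)
```

Here `balanced_value` matches `nuclear_norm`, whereas `unbalanced_value` is strictly larger, which is why penalizing the Frobenius norms of the factors acts as a nuclear norm penalty on their product.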

In this section, the relationship between the factorization and nuclear norm approaches is used, based on the method presented in [67]. The formulation that bridges the gap between the two approaches is given in Eq. (11).

$$\min_{A,B} \; f\left( Y - AB^{T} \right) + \frac{\lambda }{2}\left( {\left\| A \right\|_{F}^{2} + \left\| B \right\|_{F}^{2} } \right) \tag{11}$$

Equation (11) for the DTI problem can be expressed as follows:

$$\min_{A,B} \; \left\| Y - P\left( AB^{T} \right) \right\|_{F}^{2} + \frac{\lambda }{2}\left( {\left\| A \right\|_{F}^{2} + \left\| B \right\|_{F}^{2} } \right) \tag{12}$$

In this paper, Eq. (12) is used to solve Eq. (13).

$$\min_{Z} \; \left\| Y - P\left( Z \right) \right\|_{F}^{2} + \lambda \left\| Z \right\|_{*} \tag{13}$$

To solve Eq. (13), Algorithm 1, which is based on the RRSSVD method, is used for DTI prediction. In this algorithm, Theorems 1 and 2 [68] are applied to DTI.

Theorem 1

Let \(Y_{m \times n}\) be a fully observed matrix and \(0 < r \le \min \left( {m,n} \right)\). For the optimization problem (14),

$$\min_{Z:\,\operatorname{rank}\left( Z \right) \le r} \; F_{\lambda } \left( Z \right): = \frac{1}{2}\left\| {Y - Z} \right\|_{F}^{2} + \lambda \left\| Z \right\|_{*} \tag{14}$$

a solution is given by

$$\tilde{Z} = U_{r} S_{\lambda } \left( {D_{r} } \right)V_{r}^{T} \tag{15}$$

where the rank-r SVD of Y is \({\text{U}}_{{\text{r}}} {\text{D}}_{{\text{r}}} {\text{V}}_{{\text{r}}}^{{\text{T}}}\) and \({\text{S}}_{\lambda } \left( {{\text{D}}_{{\text{r}}} } \right) = {\text{diag}}\left[ {\left( {\sigma_{1} - \lambda } \right)_{ + } , \ldots ,\left( {\sigma_{{\text{r}}} - \lambda } \right)_{ + } } \right]\).

Theorem 2

Let \(Y_{m \times n}\) be a fully observed matrix and \(0 < r \le \min \left( {m,n} \right)\). For the optimization problem (16),

$$\min_{A_{m \times r} ,\,B_{n \times r}} \; \frac{1}{2}\left\| {Y - AB^{T} } \right\|_{F}^{2} + \frac{\lambda }{2}\left( {\left\| A \right\|_{F}^{2} + \left\| B \right\|_{F}^{2} } \right) \tag{16}$$

a solution is given by \(\tilde{A} = U_{r} S_{\lambda } \left( {D_{r} } \right)^{\frac{1}{2}}\) and \(\tilde{B} = V_{r} S_{\lambda } \left( {D_{r} } \right)^{\frac{1}{2}}\), and all solutions satisfy \(\tilde{A}\tilde{B}^{{\text{T}}} = \tilde{Z}\), where \(\tilde{Z}\) is as given in (15).
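The closed forms in Theorems 1 and 2 can be illustrated with a minimal NumPy sketch for a fully observed matrix. The function names, the random test matrix and the values of r and λ are assumptions made for illustration.

```python
import numpy as np

def rank_restricted_soft_svd(Y, r, lam):
    """Factors A = U_r S_lam(D_r)^(1/2) and B = V_r S_lam(D_r)^(1/2) for fully observed Y."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    d = np.maximum(s[:r] - lam, 0.0)   # soft-threshold the r leading singular values
    A = U[:, :r] * np.sqrt(d)
    B = Vt[:r, :].T * np.sqrt(d)
    return A, B

def factor_objective(Y, A, B, lam):
    """Objective of the factorized problem in Theorem 2."""
    fit = 0.5 * np.linalg.norm(Y - A @ B.T) ** 2
    penalty = 0.5 * lam * (np.linalg.norm(A) ** 2 + np.linalg.norm(B) ** 2)
    return fit + penalty

rng = np.random.default_rng(0)
Y = rng.standard_normal((6, 5))        # hypothetical fully observed matrix
A, B = rank_restricted_soft_svd(Y, r=2, lam=0.5)
Z = A @ B.T                            # matches the solution of Theorem 1
best = factor_objective(Y, A, B, lam=0.5)
```

Randomly perturbing `A` and `B` should never give a value of `factor_objective` below `best`, which is a quick sanity check of the closed form.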

Finally, it should be noted that relative change in the Frobenius norm has been used to check convergence. Equation (17) is used to calculate it. This equation is based on a pair of iterates \({ }\left( {{\text{U}},{\text{D}}^{2} ,{\text{V}}} \right){ }\left( {{\text{old}}} \right)\) and \({ }\left( {\tilde{U},\tilde{D}^{2} ,\tilde{V}} \right){ }\left( {{\text{new}}} \right)\).

$$\nabla F = \frac{{\left\| {UD^{2} V^{T} - \tilde{U}\tilde{D}^{2} \tilde{V}^{T} } \right\|_{F}^{2} }}{{\left\| {UD^{2} V^{T} } \right\|_{F}^{2} }} = \frac{{tr\left( {D^{4} } \right) + tr\left( {\tilde{D}^{4} } \right) - 2tr\left( {D^{2} U^{T} \tilde{U}\tilde{D}^{2} \tilde{V}^{T} V} \right)}}{{tr\left( {D^{4} } \right)}} \tag{17}$$
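The trace form of this criterion avoids building the full m × n matrices. The sketch below compares it with the direct computation on randomly generated orthonormal factors; the helper names and the test values are hypothetical.

```python
import numpy as np

def relative_change(U, d2, V, U_new, d2_new, V_new):
    """Trace form of the criterion; d2 and d2_new are the diagonals of D^2 (old and new)."""
    cross = np.trace((d2[:, None] * (U.T @ U_new)) @ (d2_new[:, None] * (V_new.T @ V)))
    return (np.sum(d2 ** 2) + np.sum(d2_new ** 2) - 2.0 * cross) / np.sum(d2 ** 2)

def random_orthonormal(rng, n, r):
    """Hypothetical helper: an n x r matrix with orthonormal columns."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, r)))
    return Q

rng = np.random.default_rng(0)
U, V = random_orthonormal(rng, 6, 2), random_orthonormal(rng, 5, 2)
U_new, V_new = random_orthonormal(rng, 6, 2), random_orthonormal(rng, 5, 2)
d2, d2_new = np.array([3.0, 1.0]), np.array([2.5, 1.2])

fast = relative_change(U, d2, V, U_new, d2_new, V_new)

# Direct computation on the full matrices, for comparison
old, new = (U * d2) @ V.T, (U_new * d2_new) @ V_new.T
direct = np.linalg.norm(old - new) ** 2 / np.linalg.norm(old) ** 2
```

Both values agree, but the trace form only involves r × r products once the small matrices U^T Ũ and Ṽ^T V are available.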

By this algorithm, \(\tilde{Z} = \tilde{A}\tilde{B}^{T}\) is considered as the output.

figure a
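Algorithm 1 itself is only available as a figure, so the following is not a transcription of it. It is a simplified, rank-restricted soft-impute iteration in the spirit of [68], given as a sketch under stated assumptions: the loss carries a factor 1/2 (so λ is scaled differently from Eq. (13)), the function name and default settings are hypothetical, and the data are synthetic.

```python
import numpy as np

def soft_impute(Y, mask, r, lam, n_iter=500, tol=1e-8):
    """Fill missing entries with the current estimate, then apply a rank-r soft SVD."""
    Z = np.zeros_like(Y, dtype=float)
    for _ in range(n_iter):
        filled = mask * Y + (1.0 - mask) * Z      # observed entries from Y, the rest from Z
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        d = np.maximum(s[:r] - lam, 0.0)          # soft-threshold the r leading singular values
        Z_new = (U[:, :r] * d) @ Vt[:r, :]
        # Relative change in the squared Frobenius norm between consecutive iterates
        change = np.linalg.norm(Z_new - Z) ** 2 / max(np.linalg.norm(Z) ** 2, 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z

rng = np.random.default_rng(0)
X_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))  # synthetic rank-2 matrix
mask = (rng.random(X_true.shape) < 0.7).astype(float)                 # about 70% of entries observed
Z_hat = soft_impute(mask * X_true, mask, r=2, lam=0.01)
```

Each iteration costs one SVD of the filled matrix in this sketch; an implementation for large data would use a truncated or alternating scheme instead.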

As mentioned above, the adjacency matrix represents the interaction matrix between drugs and targets. In this matrix, the value is 1 if there is a known interaction between the drug (\(d_{i}\)) and the target (\(t_{j}\)), and 0 otherwise. In addition to the interaction matrix, a drug similarity matrix (\(S_{d}\)) and a target similarity matrix (\(S_{t}\)) are used. \(S_{d}\) indicates the chemical structure similarity of a drug pair; it is computed with SIMCOMP [70] from the number of sub-structures shared between the chemical structures of the two drugs. Similarly, \(S_{t}\) represents the degree of similarity between two proteins, calculated from the amino acid sequences of the target proteins using the normalized Smith-Waterman score [71].

In addition to these similarity matrices, four other types of similarity, Cosine (\(S^{cos}\)), Correlation (\(S^{cor}\)), Hamming (\(S^{ham}\)) and Jaccard (\(S^{jac}\)), are used to predict DTI [11]. This paper therefore uses five similarity matrices on each side: the one introduced above and these four, which are calculated from the drug–target interaction matrix. With these similarity matrices, Eq. (13) is expressed as follows:
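The four profile-based similarities can be sketched as below for the rows of a binary interaction matrix (drug profiles); applying the same function to the transposed matrix gives target similarities. The exact definitions used in [11] are not restated in this paper, so the formulas here, in particular Hamming similarity as one minus the fraction of mismatches, are assumptions, and the function names are hypothetical.

```python
import numpy as np

def _safe_ratio(num, den):
    """Elementwise num / den, returning 0 where den is 0."""
    return np.divide(num, den, out=np.zeros_like(num, dtype=float), where=den > 0)

def profile_similarities(Y):
    """Cosine, correlation, Hamming and Jaccard similarity between rows of a binary matrix."""
    Y = np.asarray(Y, dtype=float)
    inter = Y @ Y.T                                  # shared interactions per pair of rows
    counts = Y.sum(axis=1)                           # interactions per row
    cosine = _safe_ratio(inter, np.outer(np.sqrt(counts), np.sqrt(counts)))
    centered = Y - Y.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1)
    correlation = _safe_ratio(centered @ centered.T, np.outer(norms, norms))
    union = counts[:, None] + counts[None, :] - inter
    jaccard = _safe_ratio(inter, union)
    mismatches = counts[:, None] + counts[None, :] - 2.0 * inter
    hamming = 1.0 - mismatches / Y.shape[1]
    return cosine, correlation, hamming, jaccard

# Hypothetical interaction profiles of three drugs over four targets
Y = np.array([[1, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 1, 1]])
S_cos, S_cor, S_ham, S_jac = profile_similarities(Y)
```

In this toy example the first two drugs share one target, so their Jaccard similarity is 0.5 and their Hamming similarity is 0.75, while the first and third drugs have complementary profiles and a correlation of -1.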

$$\min_{Z} \; \left\| Y - P\left( Z \right) \right\|_{F}^{2} + \lambda \left\| Z \right\|_{*} + \alpha_{1} \text{Tr}\left( {Z^{T} \mathop \sum \limits_{i = 1}^{nsim} L_{d}^{i} Z} \right) + \alpha_{2} \text{Tr}\left( {Z \mathop \sum \limits_{i = 1}^{nsim} L_{t}^{i} Z^{T} } \right) \tag{18}$$

In Eq. (18), \(\alpha_{1} > 0\) and \(\alpha_{2} > 0\) are the balancing parameters, \({\text{Tr}}\left( . \right)\) is the trace of a matrix, and nsim is the number of similarity matrices (as in [11], nsim = 5). \(L_{d}\) and \(L_{t}\) are the graph Laplacians [72] for \(S_{d}\) and \(S_{t}\), computed as \({\text{L}}_{{\text{d}}} = {\text{D}}_{{\text{d}}} - {\text{S}}_{{\text{d}}}\) and \({\text{L}}_{{\text{t}}} = {\text{D}}_{{\text{t}}} - {\text{S}}_{{\text{t}}}\), respectively. \(D_{d}\) and \(D_{t}\) are the degree matrices for drugs and targets, calculated as \({\text{D}}_{{\text{d}}}^{{{\text{ii}}}} = \mathop \sum \limits_{{\text{j}}} {\text{S}}_{{\text{d}}}^{{{\text{ij}}}}\) and \({\text{D}}_{{\text{t}}}^{{{\text{ii}}}} = \mathop \sum \limits_{{\text{j}}} {\text{S}}_{{\text{t}}}^{{{\text{ij}}}}\).
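To make the role of the Laplacian concrete, the sketch below builds L = D - S for a hypothetical symmetric similarity matrix and checks the standard identity that a trace penalty of the form Tr(Zᵀ L Z) equals half the similarity-weighted sum of squared distances between rows of Z. The matrices are random and purely illustrative.

```python
import numpy as np

def graph_laplacian(S):
    """L = D - S, where D is the diagonal matrix of row sums of S."""
    return np.diag(S.sum(axis=1)) - S

rng = np.random.default_rng(0)
S = rng.random((4, 4))
S = (S + S.T) / 2.0                    # hypothetical symmetric similarity of 4 drugs
Z = rng.standard_normal((4, 3))        # hypothetical predicted scores, 4 drugs x 3 targets

L = graph_laplacian(S)
penalty = np.trace(Z.T @ L @ Z)

# Same quantity written as a weighted sum over pairs of rows of Z
pairwise = 0.5 * sum(S[i, j] * np.sum((Z[i] - Z[j]) ** 2)
                     for i in range(4) for j in range(4))
```

The penalty is therefore small when drugs with high similarity receive similar predicted interaction profiles, which is the intuition behind the graph regularization terms.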

As shown in Algorithm 2, a method proposed by Mongia et al. [11] is used to solve this equation. It should be noted that this algorithm uses the RRSVD_DTI method that is proposed for DTI prediction in this paper.

figure b

In Algorithm 2, \(S_{d}^{com} = S_{d} + S_{d}^{cos} + S_{d}^{cor} + S_{d}^{ham} + S_{d}^{jac} = \mathop \sum \limits_{i = 1}^{nsim} S_{d}^{i}\) and \(S_{t}^{com} = S_{t} + S_{t}^{cos} + S_{t}^{cor} + S_{t}^{ham} + S_{t}^{jac} = \mathop \sum \limits_{i = 1}^{nsim} S_{t}^{i}\) represent the combined similarity for drugs and targets, \(D_{d}^{com} = diag\left( {\mathop \sum \limits_{j} S_{d}^{com} } \right)\) and \(D_{t}^{com} = diag\left( {\mathop \sum \limits_{j} S_{t}^{com} } \right)\) represent the combined degree matrices for drugs and targets, and \(L_{d}^{com} = D_{d}^{com} - S_{d}^{com}\) and \(L_{t}^{com} = D_{t}^{com} - S_{t}^{com}\) represent the combined Laplacian matrices for drugs and targets, respectively.
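Because the Laplacian is linear in the similarity matrix, the Laplacian of the combined similarity equals the sum of the individual Laplacians, which is why the combined matrices can stand in for the sums over nsim. A short check with five hypothetical similarity matrices:

```python
import numpy as np

def graph_laplacian(S):
    """L = D - S, where D is the diagonal matrix of row sums of S."""
    return np.diag(S.sum(axis=1)) - S

rng = np.random.default_rng(0)
sims = []
for _ in range(5):                     # nsim = 5 hypothetical similarity matrices
    S = rng.random((4, 4))
    sims.append((S + S.T) / 2.0)

S_com = sum(sims)                      # combined similarity
L_com = graph_laplacian(S_com)         # combined Laplacian D_com - S_com
L_sum = sum(graph_laplacian(S) for S in sims)
```

`L_com` and `L_sum` coincide up to floating-point error, so only one Laplacian per side needs to be formed.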


In this section, the experiments and results based on the proposed method are analyzed separately. In the first step, datasets and how to divide them into training and testing sets are presented. All experiments and extracted parameters are performed separately for each dataset under different validation settings. The following compares the proposed method with other new methods based on AUC, AUPR and time criteria.

Dataset description

Reference [14] collected information on interactions between drugs and target proteins from the public databases KEGG BRITE [73], BRENDA [74], SuperTarget [75] and DrugBank [76]. In this paper, as in [2, 11, 14], four benchmark datasets are used, each corresponding to a different class of target proteins. These benchmarks are simulated from the public databases. The following is a description of these datasets:

  • Enzymes (Es): In this dataset, 445 drugs, 664 targets and 2926 interactions have been extracted.

  • Ion channels (ICs): In this dataset, 201 drugs, 204 targets and 1476 interactions have been extracted.

  • G protein-coupled receptors (GPCRs): In this dataset, 223 drugs, 95 targets and 635 interactions have been extracted.

  • Nuclear receptors (NRs): In this dataset, 54 drugs, 26 targets and 90 interactions have been extracted.

It should be noted that these datasets, which are simulated from public databases, are publicly available.
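The counts listed above imply that the interaction matrices are very sparse. The following snippet computes the fraction of drug–target pairs that are known interactions in each dataset, using only the numbers stated in this section.

```python
# (number of drugs, number of targets, number of known interactions), as listed above
datasets = {
    "Es": (445, 664, 2926),
    "ICs": (201, 204, 1476),
    "GPCRs": (223, 95, 635),
    "NRs": (54, 26, 90),
}

density = {name: known / (drugs * targets)
           for name, (drugs, targets, known) in datasets.items()}

for name, value in density.items():
    print(f"{name}: {value:.2%} of drug-target pairs are known interactions")
```

In every dataset fewer than 7% of the pairs are known interactions, which makes the prediction task strongly imbalanced.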

Experimental setup

In this section, the dataset settings follow recent work on the DTI problem. Three cross-validation settings (CVS), named CVS1, CVS2 and CVS3, are used [6]. In CVS1, the standard evaluation setting, drug–target pairs are randomly selected as the test set. CVS2 and CVS3 evaluate the ability of methods to predict interactions for novel drugs (i.e., drugs for which no interaction information is available) and novel targets, respectively. In CVS2 entire drug profiles, and in CVS3 entire target profiles, are selected as the test set.

CVS1 predicts an unknown pair (di, tj) when at least one DTI is known in the training data for di and for tj. To avoid using pairs that belong to the other scenarios, CV uses only the pairs between drugs that have at least 2 targets and targets that interact with at least 2 drugs. Some of these pairs are selected randomly for testing in each round of CV, and the remaining pairs together with all other entries are used for training.

In contrast, CVS2 and CVS3 predict interactions for new drugs and new targets, respectively, for which no DTIs are observed in the training data.

In CVS2, CV is performed on drugs: the rows corresponding to drugs are randomly blinded for testing and the remaining rows are used for training. In CVS3, CV is performed on targets: the columns (corresponding to targets) are randomly blinded for testing and the remaining columns are used for training.
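The three settings differ only in what is held out: individual pairs, whole rows or whole columns. The sketch below generates boolean test masks for each setting. It is a simplification, since it does not apply the CVS1 restriction to drugs with at least 2 targets and targets with at least 2 drugs, and the function name and seed handling are hypothetical.

```python
import numpy as np

def cv_folds(shape, setting, n_folds=10, seed=0):
    """Yield boolean test masks for CVS1 (pairs), CVS2 (drug rows) or CVS3 (target columns)."""
    n_drugs, n_targets = shape
    sizes = {"CVS1": n_drugs * n_targets, "CVS2": n_drugs, "CVS3": n_targets}
    if setting not in sizes:
        raise ValueError(f"unknown setting: {setting}")
    rng = np.random.default_rng(seed)
    units = rng.permutation(sizes[setting])        # shuffle pairs, rows or columns
    for fold in np.array_split(units, n_folds):
        test = np.zeros(shape, dtype=bool)
        if setting == "CVS1":
            test[np.unravel_index(fold, shape)] = True   # hide individual pairs
        elif setting == "CVS2":
            test[fold, :] = True                         # hide entire drug profiles
        else:
            test[:, fold] = True                         # hide entire target profiles
        yield test

# Example on a matrix with the shape of the NR dataset (54 drugs x 26 targets)
masks = list(cv_folds((54, 26), "CVS2"))
```

For each fold, the entries where the mask is `True` are blinded for testing and the remaining entries are used for training.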

The CV tasks under the three scenarios are illustrated in Fig. 1 [28].

Fig. 1 Presentation of cross-validation schemes for three scenarios. Each column represents a scenario. The row contains the DTI matrices, in which the entries marked with “?” are the pairs of interest to be tested

As in the method of Mongia et al. [11], tenfold cross-validation (CV) is used: the data are divided into ten folds, one of which is selected as the test set while the remaining nine folds form the training set. In the experiments, 5 repetitions of tenfold CV are performed for each method under the three CVSs. For a more accurate evaluation, in each repetition, all CVSs for each dataset are the same as in the studies of Mongia and Ezzat [6, 11].

Evaluation metrics

To evaluate the performance of the proposed method, Area Under the ROC Curve (AUC) and Area Under the Precision-Recall curve (AUPR) criteria based on Mongia and Ezzat study [6, 11] have been used. In the following, the requirements are introduced:

  • AUC is a well-known measure of ranking quality. It is based on the ROC curve, a graphical plot of the true-positive rate of a method as a function of its false-positive rate. AUC is the entire two-dimensional area under the ROC curve. It is also used as a measure of classification performance, aggregated over all decision thresholds. AUC can be interpreted as the probability that the model ranks a random positive example higher than a random negative example.

  • AUPR is the other measure used to evaluate the performance of DTI methods in this paper. It is based on the precision–recall curve, which plots precision (the ratio of true positives among all positive predictions) at each given recall rate. The area under the precision–recall curve penalizes false positives more heavily than AUC, so AUPR offers a quantitative assessment of how well true interactions are separated from true non-interactions among the predicted scores. Because true drug–target interactions are few, AUPR is a more informative measure than AUC of how well true interactions are found among the prediction scores.
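The difference between the two metrics under class imbalance can be illustrated with a small sketch. The helper names are hypothetical, and AUPR is approximated here by average precision (the mean precision at the rank of each true interaction), which is one common estimator of the area under the precision–recall curve and not necessarily the one used in the paper; score ties are not treated specially in `aupr_score`.

```python
def auc_score(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def aupr_score(labels, scores):
    """Average precision: mean precision at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# One true interaction ranked third among 101 candidate pairs
labels = [0, 0, 1] + [0] * 98
scores = [101 - i for i in range(101)]
print(auc_score(labels, scores))   # 0.98
print(aupr_score(labels, scores))  # 0.333...
```

Two false positives ranked above the single true interaction barely move AUC but cut AUPR to one third, which is why AUPR is the stricter measure when interactions are rare.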

Parameter settings

In this paper, the parameters of the proposed method are set by cross-validation on the training set: experiments are run under each cross-validation setting to find the best parameter values for each dataset. The parameters considered are \(P, \lambda , \alpha_{1} , \alpha_{2} , v_{1} , v_{2} ,r\). The search ranges are (0,1) for \(\lambda , \alpha_{1} , \alpha_{2} , v_{1} , v_{2}\) and (1,10) for \(P,r\). All the selected parameters are shown in Table 1.
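A parameter search of this kind is often organized as a grid search. The following is a generic sketch under stated assumptions: `grid_search`, the candidate values and the toy scoring function are all hypothetical, and in practice `score_fn` would train the model on the inner training folds and return, for example, the mean AUPR on the inner validation folds.

```python
from itertools import product

def grid_search(score_fn, grids):
    """Return (best_params, best_score) over the Cartesian product of candidates."""
    names = list(grids)
    best_params, best_score = None, float("-inf")
    for values in product(*(grids[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy stand-in for the inner cross-validation score (illustrative only)
def toy_score(params):
    return -(params["lam"] - 0.1) ** 2 - (params["r"] - 5) ** 2

grids = {"lam": [0.01, 0.1, 0.5], "r": [2, 5, 8]}  # values inside (0,1) and (1,10)
best_params, best_score = grid_search(toy_score, grids)
print(best_params)  # {'lam': 0.1, 'r': 5}
```

With seven parameters the grid grows quickly, so coarse candidate lists are the usual compromise.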

Table 1 Extracted parameters for the proposed method and [11]

The implementations were written in MATLAB and run on a machine with 4 GB of memory and a Core i7 M620 2.6 GHz CPU.

Interaction prediction

Figure 2 shows boxplots of the results obtained under each validation setting. The plots are based on five different runs on the four datasets and report the two criteria, AUC and AUPR. The results show that the proposed method performs consistently across runs, without outliers.

Fig. 2 Boxplot diagram based on the proposed method in 5 times run under different validation settings. In each diagram, the results are reported on four datasets. The left column shows the AUC and the right column shows the AUPR. Each row from top to bottom shows the results under validation setting CVS1, validation setting CVS2 and validation setting CVS3, respectively

Comparison with other methods

To evaluate the proposed method, experiments were designed to compare it with several state-of-the-art methods introduced in recent years. These methods are described below.

The Weighted Graph Regularized Matrix Factorization (WGRMF) method [20] was introduced in 2016. Because the data lie on or close to low-dimensional non-linear manifolds, two matrix factorization methods that use graph regularization are proposed. A preprocessing step is also presented that improves predictions for a “new drug” and a “new target” by introducing intermediate interaction likelihood scores.

In 2019, an improved graph regularized matrix factorization (GRMF) method called L2,1-GRMF [77] was proposed to learn DTI patterns. In this method, WKNKN is used as a preprocessing step to improve prediction accuracy.

The Collaborative Matrix Factorization (CMF) method [35] was introduced in 2013. Its main idea is to use more than one similarity matrix for targets and for drugs. A weight matrix is estimated to select similarities automatically and thereby improve DTI prediction.

Subsequently, a factor model called Multiple Similarities Collaborative Matrix Factorization (MSCMF) was proposed, in which drugs and targets are projected into a common low-rank feature space. The two low-rank matrices and the weights associated with the similarity matrices are estimated by an alternating least squares algorithm.

The Regularized Least Square Weighted Nearest Neighbor profile (RLS-WNN) method [25] was introduced in 2013. It introduces a simple weighted nearest neighbor procedure that is reported to perform well in DTI prediction and, to improve on previous work, combines this procedure with a recent machine learning method.

The Multi Graph Regularized Nuclear Norm Minimization (MGRNNM) method [11] was proposed in 2020. It provides a framework for predicting DTIs from three inputs: the known drug–target interaction network, similarities over drugs and similarities over targets. The method finds a low-rank interaction matrix that is structured by graphs representing the proximities of drugs and targets. To capture these proximities exhaustively, multiple drug-drug similarities and target-target similarities are used as multiple graph Laplacian (over drugs / targets) regularization terms.
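A graph Laplacian regularization term of the kind mentioned above can be sketched as follows. This is a generic illustration that assumes an unnormalized Laplacian \(L = D - W\) built from a single symmetric similarity matrix; the function names are hypothetical, and the exact Laplacian construction in MGRNNM [11] may differ.

```python
import numpy as np

def laplacian(S):
    """Unnormalized graph Laplacian L = D - W of a symmetric similarity matrix S."""
    W = S - np.diag(np.diag(S))  # drop self-similarity
    return np.diag(W.sum(axis=1)) - W

def graph_penalty(X, L):
    """tr(X^T L X), equal to 0.5 * sum_ij W_ij * ||x_i - x_j||^2 over rows of X."""
    return float(np.trace(X.T @ L @ X))

# Three drugs, two targets; drugs 0 and 1 are highly similar (toy values)
S = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
L = laplacian(S)
X_smooth = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # similar drugs share a profile
X_rough = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])   # similar drugs disagree
print(graph_penalty(X_smooth, L))  # approximately 0.6
print(graph_penalty(X_rough, L))   # approximately 2.0
```

The penalty is smaller when similar drugs have similar interaction profiles. The same construction applied to the columns gives the target-side term, and using several similarity matrices yields one such term per graph.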

The methods above were introduced specifically for the DTI prediction task. The following two methods are introduced as baselines.

The Matrix completion (MC) method [78] was introduced in 2011. The paper focuses on solving matrix completion problems and proposes a non-convex optimization problem for this purpose. The method is a variant of convex nuclear-norm minimization, with a fast numerical algorithm to solve it.

The Matrix Completion on Graphs (MCG) method [79] was introduced in 2014. The paper introduces a matrix completion model for real-world applications such as recommender systems. The model uses proximity information: its purpose is to find a low-rank solution that is structured by the proximities of rows and columns, which are encoded by graphs.

The Neighborhood Regularized Logistic Matrix Factorization (NRLMF) method [36] was introduced in 2016. It models the probability that a drug interacts with a target through logistic matrix factorization. NRLMF gives more importance to known drug–target interaction pairs (positive observations) than to unknown pairs (negative observations), because positive observations have already been experimentally confirmed and are therefore usually more reliable. In addition, the local structure of the drug–target interaction data is exploited through neighborhood regularization to achieve better predictive accuracy.

The DDR method was introduced in 2018 [80]. DDR uses multiple similarities between drugs and multiple similarities between target proteins through a heterogeneous graph that contains the known DTIs.

The Triple Matrix Factorization-based model (TMF) [28] was introduced in 2018. This model offers new insight into the mechanisms underlying DTIs by indicating the dominant features. TMF was assessed on four benchmark datasets under different screening scenarios, where it showed considerable superiority.

In this section, the proposed method is compared with current prediction methods, using the AUC and AUPR criteria to evaluate performance. Tables 2, 4 and 6 show the AUPR results under validation settings CVS1, CVS2 and CVS3, respectively, on the four datasets. Tables 3, 5 and 7 show the corresponding AUC results under CVS1, CVS2 and CVS3, respectively. In these tables, the best results are shown in bold. As the tables show, the proposed method performed well on all four datasets and under all three validation settings. It should be noted that the SMGRNNM method is the method presented by Mongia et al. [11], restricted to the standard similarity matrices.

Table 2 AUPR results for interaction prediction under validation setting CVS1
Table 3 AUC results for interaction prediction under validation setting CVS1
Table 4 AUPR results for interaction prediction under validation setting CVS2
Table 5 AUC results for interaction prediction under validation setting CVS2
Table 6 AUPR results for interaction prediction under validation setting CVS3
Table 7 AUC results for interaction prediction under validation setting CVS3

Time complexity

In this section, the time complexity of the proposed algorithm is compared with that of the MGRNNM method [11], one of the best methods presented for DTI prediction. Since Algorithm 2 is an iterative solution, the time complexity is calculated per iteration, as for MGRNNM. In both the proposed method and MGRNNM, two Sylvester equations and one NNM problem are solved in each iteration. According to Kirrinnis [81], the complexity of solving the Sylvester equation is \(O\left( {n^{b} \log \left( n \right)} \right)\), where the parameter b is between 2 and 3. MGRNNM solves the NNM problem with the SVS method, whose complexity is \(O\left( {n^{3} } \right)\) per iteration, so the time complexity of MGRNNM per iteration is \(O\left( {n^{3} + n^{b} \log \left( n \right)} \right)\). The proposed method instead uses the RRSSVD_DTI algorithm to solve the NNM problem. According to Hastie et al. [68], this algorithm requires \(O\left( {r^{3} } \right)\) time per iteration. The time complexity of the proposed algorithm per iteration is therefore \(O\left( {r^{3} + n^{b} \log \left( n \right)} \right)\) and, since \(r \ll {\text{min}}\left( {m,n} \right)\), it is lower than that of MGRNNM.
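To get a feeling for these expressions, the leading-order terms can be evaluated for illustrative sizes. The values n = 500, r = 10 and b = 2.5 below are hypothetical and are not taken from the benchmark datasets, constants hidden by the O-notation are ignored, and the function name is an assumption.

```python
import math

def per_iteration_cost(n, nnm_size, b=2.5):
    """Leading-order count: cubic NNM step plus the Sylvester term n^b * log(n)."""
    return nnm_size ** 3 + n ** b * math.log(n)

n, r = 500, 10                          # illustrative sizes only
cost_mgrnnm = per_iteration_cost(n, n)  # SVS step is cubic in n
cost_proposed = per_iteration_cost(n, r)  # RRSSVD step is cubic in r
print(cost_mgrnnm / cost_proposed)
```

Under these assumptions the cubic term disappears almost entirely from the proposed method, and the remaining cost is dominated by the Sylvester solves. The closer b is to 3, the smaller the relative gain, because the Sylvester term then dominates both methods.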

More details on running time are presented below. The proposed method and MGRNNM are compared on the four datasets in terms of running time. As shown in Fig. 3, the proposed method performed well on all the datasets.

Fig. 3 Comparing the running time of the proposed method with Mongia et al. study [11] in four datasets. In each bar chart, the blue color shows the running time of Mongia et al. study [11] and the orange color indicates the running time of the proposed method. The top row shows the running time in the E and IC datasets, respectively, from left to right. The bottom row shows the running time in the GPCR and NR datasets, respectively, from left to right

It should be noted that in these applications, online web-based detection systems are designed to serve many users simultaneously. In such scenarios, response time can be very significant, and the favorable time complexity of the proposed method indicates that our approach is suitable for use in DTI prediction.


The use of adjacency matrices to represent the interactions between drugs and targets has been considered by many researchers in recent years. In DTI prediction, finding a low-rank interaction matrix is important, and LRMA methods can improve performance on this problem. The results reported in this article show that unifying the matrix factorization and nuclear norm minimization approaches on the basis of similarity matrices is effective for solving the low-rank problem.

AUC, AUPR and running time are commonly used to evaluate the performance of DTI methods. It should be noted that using graphs to express the adjacency of targets and drugs in this application improved the performance of the proposed method. Overall, this paper presents a powerful and fast method for DTI prediction, which shows improved performance on four benchmark datasets.

To evaluate the proposed method, all four datasets were divided according to three cross-validation settings. Results were reported as the mean and variance of AUC and AUPR over 5 runs. The proposed method was compared with several methods [11, 20, 25, 28, 35, 36, 78,79,80]. The results show that the proposed method, which is based on similarity matrices, had the best performance. The time complexity of the proposed method was also lower than that of MGRNNM.


In drug-related processes such as drug discovery, drug side-effect prediction and drug repurposing, the interaction between drugs and targets (proteins) is very important. Drugs act on targets (proteins) by altering the pharmaceutical functions of targets such as enzymes, ion channels, G protein-coupled receptors (GPCRs), and nuclear receptors. DTI analysis requires costly and time-consuming experiments. For this reason, CP-based approaches have been used to narrow down the search space and to reduce the cost and time of experiments.

In this paper, CP based on chemogenomic methods is used for DTI prediction. It was shown that the use of similarity matrices in this application provides the best performance compared to the other methods. We also showed that unifying the graph regularized nuclear norm with bilinear factorization can be very effective in predicting DTIs. The proposed method was compared with several state-of-the-art methods on four datasets under three different cross-validation settings, and the results show that it performs better. In general, the advantages of the proposed method can be summarized as follows:

  • In the NNM approach, a trade-off parameter λ balances the loss function against the low-rank regularization induced by the nuclear norm. These models have extended the use of low-rank priors to many applications in which Z (in Eq. (3)) is low rank but its rank is not known a priori. Despite their convexity and the theoretical guidelines for the choice of λ [9], these models have several problems. First, it is unclear how to impose a certain rank on Z: in many works, adjusting λ so that Z has a predetermined rank usually produces worse results than imposing the rank directly in Eq. (3). Second, the factorization of Z cannot be accessed in Eq. (7), which prevents the use of the “kernel trick”. Third, Eq. (7) is a Semidefinite Program (SDP). Off-the-shelf SDP optimizers scale only to hundreds of variables and are not amenable to the high dimensionality typically found in DTI problems. Although many studies mitigate this issue, they still perform an SVD of Z in each iteration, which makes them unsuitable for handling dense, large-scale datasets. This paper shows that many nuclear norm regularized problems of the form (7) can be optimized with a bilinear factorization \(Z = UV^{T}\) by using the variational definition of the nuclear norm. A unification of the traditional bilinear factorization and nuclear norm approaches under one formulation is proposed for DTI applications. Based on this result, we can analyze the conditions under which both methods are equivalent and offer the best solution when they are not. This article explains how the proposed method can be used in the DTI application. Reference [9] uses a method based on nuclear norm regularization, whereas the optimization equations of the proposed method and their solution are formulated differently.

  • The unification of the nuclear norm with bilinear factorization is built on drug-drug and target-target similarities, so that the advantages of both approaches are combined.

  • The Rank-Restricted Soft Singular Value Decomposition method is used to optimize the nuclear norm minimization in the DTI problem. This method has not been used in this application so far. It was shown that it can perform well on data based on graph similarity.

  • One of the critical criteria in evaluating the performance of these methods is the running time. As the amount of data grows, the use of computational methods will increase; at the same time, more similarity measures and samples with more features can be used. The time complexity of the proposed algorithm is \(O\left( {r^{3} + n^{b} \log \left( n \right)} \right)\) per iteration. This complexity is polynomial and is lower than that of other recent methods.

  • The proposed method achieved the best performance on the AUC and AUPR measures, and the results were also satisfactory on datasets with different characteristics.

  • The proposed method can be widely used in other applications of CP, and using it does not require a complex process.
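The variational definition of the nuclear norm referred to in the first point states that \(\left\| Z \right\|_{*} = \min_{Z = UV^{T}} \tfrac{1}{2}\left( \left\| U \right\|_{F}^{2} + \left\| V \right\|_{F}^{2} \right)\). The sketch below checks this numerically for the balanced factors obtained from an SVD and shows a rank-restricted, soft-thresholded SVD step. It is an illustration only: `soft_svd` is a hypothetical helper that computes a full SVD for clarity, whereas the algorithm of Hastie et al. [68] reaches the same kind of solution by alternating least squares without a full SVD, and RRSSVD_DTI itself is not reproduced here.

```python
import numpy as np

def soft_svd(Z, lam, r):
    """Keep the top r singular values of Z and shrink each by lam (floored at 0)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U[:, :r], np.maximum(s[:r] - lam, 0.0), Vt[:r, :]

rng = np.random.default_rng(0)
Z = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 6))  # rank-3 toy matrix

U, s, Vt = np.linalg.svd(Z, full_matrices=False)
nuclear_norm = s.sum()

# Balanced factors A = U sqrt(S), B = V sqrt(S) satisfy Z = A B^T and attain the minimum
A = U * np.sqrt(s)
B = Vt.T * np.sqrt(s)
bilinear_value = 0.5 * (np.linalg.norm(A, "fro") ** 2 + np.linalg.norm(B, "fro") ** 2)
print(np.isclose(nuclear_norm, bilinear_value))  # True

# Rank-2 approximation with shrinkage 0.5
Ur, sr, Vtr = soft_svd(Z, lam=0.5, r=2)
Z_hat = (Ur * sr) @ Vtr
```

Because the bilinear form only involves the factors, an algorithm can work with matrices that have r columns instead of decomposing the full matrix in every iteration, which is the source of the per-iteration saving discussed in the time complexity section.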

Availability of data and materials

Datasets are simulated from public databases, which are publicly available. The authors declare that they have made the code and data publicly accessible for download.





DTIs: Drug–target interactions

RRSSVD: Rank-Restricted Soft Singular Value Decomposition

ICs: Ion channels

GPCRs: G protein-coupled receptors

NRs: Nuclear receptors

AUC: Area Under the ROC Curve

AUPR: Area Under the Precision-Recall curve

CP: Computational prediction

NNM: Nuclear norm minimization

Singular values

Low rank matrix factorization

SVD: Singular Value Decomposition

Least Squares

Nuclear norm proximal

CVS: Cross-validation settings

WGRMF: Weighted Graph Regularized Matrix Factorization

Matrix Factorization

MSCMF: Multiple Similarities Collaborative Matrix Factorization

RLS-WNN: Regularized Least square nearest neighbor profile

MGRNNM: Multi Graph Regularized Nuclear Norm Minimization

MC: Matrix completion

MCG: Matrix completion on graphs

NRLMF: Neighborhood Regularized Logistic Matrix Factorization


  1. Sachdev K, Gupta MK. A comprehensive review of feature based methods for drug target interaction prediction. J Biomed Inform. 2019;93:103159.

  2. Mongia A, Jain V, Chouzenoux E, Majumdar A. Deep latent factor model for predicting drug target interactions. In: ICASSP 2019–2019 IEEE international conference on acoustics, speech and signal processing (ICASSP), 2019; p. 1254–8.

  3. Sachdev K, Gupta MK. A hybrid ensemble‐based technique for predicting drug–target interactions. Chem Biol Drug Des. 2020.

  4. Deng Y, Xu X, Qiu Y, Xia J, Zhang W, Liu S. A multimodal deep learning framework for predicting drug-drug interaction events. Bioinformatics. 2020.

  5. Fakhraei S, Huang B, Raschid L, Getoor L. Network-based drug–target interaction prediction with probabilistic soft logic. IEEE/ACM Trans Comput Biol Bioinform. 2014;11:775–87.

  6. Ezzat A, Wu M, Li X-L, Kwoh C-K. Computational prediction of drug–target interactions using chemogenomic approaches: an empirical survey. Brief Bioinform. 2019;20:1337–57.

  7. Yin M-M, Liu J-X, Gao Y-L, Kong X-Z, Zheng C-H. NCPLP: a novel approach for predicting microbe-associated diseases with network consistency projection and label propagation. IEEE Trans Cybern. 2020.

  8. Cui Z, Liu J-X, Gao Y-L, Zhu R, Yuan S-S. LncRNA-disease associations prediction using bipartite local model with nearest profile-based association inferring. IEEE J Biomed Health Inform. 2019;24:1519–27.

  9. Keiser MJ, Roth BL, Armbruster BN, Ernsberger P, Irwin JJ, Shoichet BK. Relating protein pharmacology by ligand chemistry. Nat Biotechnol. 2007;25:197–206.

  10. Hendrickson JB. Concepts and applications of molecular similarity. Science. 1991;252:1189–90.

  11. Mongia A, Majumdar A. Drug–target interaction prediction using multi graph regularized nuclear norm minimization. Plos One. 2020;15:e0226484.

  12. Li H, Gao Z, Kang L, Zhang H, Yang K, Yu K, et al. TarFisDock: a web server for identifying drug targets with docking approach. Nucleic Acids Res. 2006;34:W219–24.

  13. Yıldırım MA, Goh K-I, Cusick ME, Barabási A-L, Vidal M. Drug–target network. Nat Biotechnol. 2007;25:1119–26.

  14. Yamanishi Y, Araki M, Gutteridge A, Honda W, Kanehisa M. Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics. 2008;24:i232–40.

  15. Mousavian Z, Masoudi-Nejad A. Drug–target interaction prediction via chemogenomic space: learning-based methods. Expert Opin Drug Metab Toxicol. 2014;10:1273–87.

  16. Ezzat A, Wu M, Li X-L, Kwoh C-K. Drug–target interaction prediction via class imbalance-aware ensemble learning. BMC Bioinform. 2016;17:267–76.

  17. Yu H, Chen J, Xu X, Li Y, Zhao H, Fang Y, et al. A systematic prediction of multiple drug–target interactions from chemical, genomic, and pharmacological data. PloS One. 2012;7:e37608.

  18. He Z, Zhang J, Shi X-H, Hu L-L, Kong X, Cai Y-D, et al. Predicting drug–target interaction networks based on functional groups and biological features. PloS One. 2010;5:e9603.

  19. Cheng F, Liu C, Jiang J, Lu W, Li W, Liu G, et al. Prediction of drug–target interactions and drug repositioning via network-based inference. PLoS Comput Biol. 2012;8:e1002503.

  20. Ezzat A, Zhao P, Wu M, Li X-L, Kwoh C-K. Drug–target interaction prediction with graph regularized matrix factorization. IEEE/ACM Trans Comput Biol Bioinf. 2016;14:646–56.

  21. Thafar MA, Olayan RS, Ashoor H, Albaradei S, Bajic VB, Gao X, et al. DTiGEMS+: drug–target interaction prediction using graph embedding, graph mining, and similarity-based techniques. J Cheminform. 2020;12:1–17.

  22. Ding H, Takigawa I, Mamitsuka H, Zhu S. Similarity-based machine learning methods for predicting drug–target interactions: a brief review. Brief Bioinform. 2014;15:734–47.

  23. Bleakley K, Yamanishi Y. Supervised prediction of drug–target interactions using bipartite local models. Bioinformatics. 2009;25:2397–403.

  24. van Laarhoven T, Nabuurs SB, Marchiori E. Gaussian interaction profile kernels for predicting drug–target interaction. Bioinformatics. 2011;27:3036–43.

  25. Van Laarhoven T, Marchiori E. Predicting drug–target interactions for new drug compounds using a weighted nearest neighbor profile. PloS One. 2013;8:e66952.

  26. Shi J-Y, Liu Z, Yu H, Li Y-J. Predicting drug–target interactions via within-score and between-score. BioMed Res Int. 2015;20:15.

  27. Shi J-Y, Yiu S-M, Li Y, Leung HC, Chin FY. Predicting drug–target interaction for new drugs using enhanced similarity measures and super-target clustering. Methods. 2015;83:98–104.

  28. Shi J-Y, Zhang A-Q, Zhang S-W, Mao K-T, Yiu S-M. A unified solution for different scenarios of predicting drug–target interactions via triple matrix factorization. BMC Syst Biol. 2018;12:45–56.

  29. Huang K, Fu T, Glass LM, Zitnik M, Xiao C, Sun J. DeepPurpose: a deep learning library for drug–target interaction prediction. Bioinformatics. 2020;36:5545–7.

  30. Monteiro NR, Ribeiro B, Arrais J. Drug–target interaction prediction: end-to-end deep learning approach. IEEE/ACM Trans Comput Biol Bioinform. 2020.

  31. Wang Y-B, You Z-H, Yang S, Yi H-C, Chen Z-H, Zheng K. A deep learning-based method for drug–target interaction prediction based on long short-term memory neural network. BMC Med Inform Decis Mak. 2020;20:1–9.

  32. Ay M, Goh K-I, Cusick ME, Barabasi A-L, Vidal M. Drug–target network. Nat Biotechnol. 2007;25:1119–27.

  33. Seal A, Ahn Y-Y, Wild DJ. Optimizing drug–target interaction prediction based on random walk on heterogeneous networks. J Cheminform. 2015;7:1–12.

  34. Gönen M. Predicting drug–target interactions from chemical and genomic kernels using Bayesian matrix factorization. Bioinformatics. 2012;28:2304–10.

  35. Zheng X, Ding H, Mamitsuka H, Zhu S. Collaborative matrix factorization with multiple similarities for predicting drug–target interactions. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, 2013, pp. 1025–1033.

  36. Liu Y, Wu M, Miao C, Zhao P, Li X-L. Neighborhood regularized logistic matrix factorization for drug–target interaction prediction. PLoS Comput Biol. 2016;12:e1004760.

  37. Cui Z, Gao Y-L, Liu J-X, Wang J, Shang J, Dai L-Y. The computational prediction of drug-disease interactions using the dual-network L2,1-CMF method. BMC Bioinform. 2019;20:1–10.

  38. Zhang Z-C, Zhang X-F, Wu M, Ou-Yang L, Zhao X-M, Li X-L. A graph regularized generalized matrix factorization model for predicting links in biomedical bipartite networks. Bioinformatics. 2020;36:3474–81.

  39. Gu S, Xie Q, Meng D, Zuo W, Feng X, Zhang L. Weighted nuclear norm minimization and its applications to low level vision. Int J Comput Vis. 2017;121:183–208.

  40. Candès EJ, Li X, Ma Y, Wright J. Robust principal component analysis? J ACM (JACM). 2011;58:1–37.

  41. Song G-J, Ng MK. Nonnegative low rank matrix approximation for nonnegative matrices. Appl Math Lett. 2020;105:106300.

  42. Xia S, Song J, Chen D, Wang J. Uncertainty quantification for hyperspectral image denoising frameworks based on low-rank matrix approximation. arXiv preprint arXiv:2004.10959, 2020.

  43. Eriksson A, Van Den Hengel A. Efficient computation of robust low-rank matrix approximations in the presence of missing data using the L1 norm. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010; p. 771–8.

  44. Chi Y, Lu YM, Chen Y. Nonconvex optimization meets low-rank matrix factorization: an overview. IEEE Trans Signal Process. 2019;67:5239–69.

  45. Zhou D, Cao Y, Gu Q. Accelerated factored gradient descent for low-rank matrix factorization. In: International conference on artificial intelligence and statistics, 2020, p. 4430–40.

  46. Huang Z, Salama P, Shao W, Zhang J, Huang K. Low-rank reorganization via proportional hazards non-negative matrix factorization unveils survival associated gene clusters, arXiv preprint arXiv:2008.03776, 2020.

  47. Lei B, Cheng N, Frangi AF, Tan E-L, Cao J, Yang P, et al. Self-calibrated brain network estimation and joint non-convex multi-task learning for identification of early Alzheimer’s disease. Med Image Anal. 2020;61:101652.

  48. Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K. Machine learning approaches and databases for prediction of drug–target interaction: a survey paper. Brief Bioinform. 2020.

  49. Brzyski D, Hu X, Goni J, Ances B, Randolph TW, Harezlak J. A sparsity inducing nuclear-norm estimator (SpINNEr) for matrix-variate regression in brain connectivity analysis, arXiv preprint arXiv:2001.11548, 2020.

  50. Srebro N, Jaakkola T. Weighted low-rank approximations. In: Proceedings of the 20th international conference on machine learning (ICML-03), 2003; p. 720–7.

  51. Zhao Z, Wang S, Wong D, Guo Y, Chen X. The sparse and low-rank interpretation of SVD-based denoising for vibration signals. In: 2020 IEEE international instrumentation and measurement technology conference (I2MTC), 2020, pp. 1–6.

  52. Xu S, Zhang C, Zhang J. Adaptive quantile low-rank matrix factorization. Pattern Recognit. 2020; p. 107310.

  53. Zhao Q, Meng D, Xu Z, Zuo W, Yan Y. L1-norm low-rank matrix factorization by variational Bayesian method. IEEE Trans Neural Netw Learn Syst. 2015;26:825–39.

  54. Kong Y, Shao M, Li K, Fu Y. Probabilistic low-rank multitask learning. IEEE Trans Neural Netw Learn Syst. 2017;29:670–80.

  55. Tu W, Liu P, Zhao J, Liu Y, Kong L, Li G, et al. M-estimation in low-rank matrix factorization: a general framework. In: 2019 IEEE international conference on data mining (ICDM), 2019; pp. 568–77.

  56. Baldi P, Hornik K. Neural networks and principal component analysis: learning from examples without local minima. Neural Netw. 1989;2:53–8.

  57. Gillis N, Glineur F. Low-rank matrix approximation with weights or missing data is NP-hard. SIAM J Matrix Anal Appl. 2011;32:1149–65.

  58. Jain P, Oh S. Provable tensor factorization with missing data. In: Advances in Neural Information Processing Systems, 2014, p. 1431–9.

  59. Hladík M, Hartman D, Zamani M. Maximization of a PSD quadratic form and factorization. Optim Lett. 2020; p. 1–14.

  60. Gu S, Zhang L, Zuo W, Feng X. Weighted nuclear norm minimization with application to image denoising. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2014; p. 2862–9.

  61. Yang Y, Yang Z, Li J, Fan L. Foreground-background separation via generalized nuclear norm and structured sparse norm based low-rank and sparse decomposition. IEEE Access. 2020;8:84217–29.

  62. Jiang T-X, Huang T-Z, Zhao X-L, Deng L-J. Multi-dimensional imaging data recovery via minimizing the partial sum of tubal nuclear norm. J Comput Appl Math. 2020;372:112680.

  63. Zhu W, Peng B. Sparse and low-rank regularized deep subspace clustering. Knowl-Based Syst. 2020; p. 106199.

  64. Sun X, Wang Y, Zhang X. Multi-view subspace clustering via non-convex tensor rank minimization. In: 2020 IEEE international conference on multimedia and expo (ICME), 2020, p. 1–6

  65. Li J, Fan W, Li Y, Qian Z. Low-frequency noise suppression in desert seismic data based on an improved weighted nuclear norm minimization algorithm. IEEE Geosci Remote Sens Lett. 2020.

  66. Huang D, Cabral R, De la Torre F. Robust regression. IEEE Trans Pattern Anal Mach Intell. 2015;38:363–75.

  67. Cabral R, De la Torre F, Costeira JP, Bernardino A. Unifying nuclear norm and bilinear factorization approaches for low-rank matrix decomposition. In: Proceedings of the IEEE international conference on computer vision, 2013; p. 2488–95.

  68. Hastie T, Mazumder R, Lee JD, Zadeh R. Matrix completion and low-rank SVD via fast alternating least squares. J Mach Learn Res. 2015;16:3367–402.

  69. Recht B, Fazel M, Parrilo PA. Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 2010;52:471–501.

  70. Hattori M, Tanaka N, Kanehisa M, Goto S. SIMCOMP/SUBCOMP: chemical structure search servers for network analyses. Nucleic Acids Res. 2010;38:W652–6.

  71. Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol. 1981;147:195–7.

  72. May JP. Equivariant homotopy and cohomology theory, vol. 91 of CBMS Regional Conference Series in Mathematics. In: Published for the conference board of the mathematical sciences, Washington, DC, 1996, p. 88

  73. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, et al. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006;34:D354–7.

  74. Schomburg I, Chang A, Ebeling C, Gremse M, Heldt C, Huhn G, et al. BRENDA, the enzyme database: updates and major new developments. Nucleic Acids Res. 2004;32:D431–3.

  75. Günther S, Kuhn M, Dunkel M, Campillos M, Senger C, Petsalaki E, et al. SuperTarget and Matador: resources for exploring drug–target relationships. Nucleic Acids Res. 2007;36:D919–22.

  76. Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D, et al. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res. 2008;36:D901–6.

  77. Cui Z, Gao Y-L, Liu J-X, Dai L-Y, Yuan S-S. L2,1-GRMF: an improved graph regularized matrix factorization method to predict drug–target interactions. BMC Bioinform. 2019;20:1–13.

  78. Majumdar A, Ward RK. Some empirical advances in matrix completion. Signal Process. 2011;91:1334–8.

  79. Kalofolias V, Bresson X, Bronstein M, Vandergheynst P. Matrix completion on graphs. arXiv preprint arXiv:1408.1717, 2014.

  80. Olayan RS, Ashoor H, Bajic VB. DDR: efficient computational method to predict drug–target interactions using graph mining and machine learning approaches. Bioinformatics. 2018;34:1164–73.

  81. Kirrinnis P. Fast algorithms for the Sylvester equation AX − XB^T = C. Theoret Comput Sci. 2001;259:623–38.


Author information

AGS and ZA designed the novel method. AGS and MI implemented the proposed method. AGS, MI and JP interpreted the results of the implementation. AGS and ZA wrote the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Ali Ghanbari Sorkhi or Jamshid Pirgazi.

Ethics declarations

Ethics approval and consent to participate

No ethics approval was required for the study.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Sorkhi, A.G., Abbasi, Z., Mobarakeh, M.I. et al. Drug–target interaction prediction using unifying of graph regularized nuclear norm with bilinear factorization. BMC Bioinformatics 22, 555 (2021).


Keywords

  • Drug–target interaction
  • Computational prediction
  • Low-rank interaction
  • Drug discovery