 Methodology article
 Open Access
DPDDI: a deep predictor for drug-drug interactions
BMC Bioinformatics volume 21, Article number: 419 (2020)
Abstract
Background
Treating complex diseases with multiple drugs is increasingly common. However, drug-drug interactions (DDIs) may cause unanticipated adverse effects and even unknown toxicity. Detecting DDIs in the wet lab is expensive and time-consuming, so computational methods for predicting DDIs are highly desirable. Most existing computational methods predict DDIs by extracting chemical and biological features of drugs from diverse drug-related properties; however, some drug properties are costly to obtain and are unavailable in many cases.
Results
In this work, we present a novel method (namely DPDDI) that predicts DDIs by extracting network structure features of drugs from the DDI network with a graph convolution network (GCN) and using a deep neural network (DNN) model as the predictor. The GCN learns low-dimensional feature representations of drugs by capturing the topological relationships of drugs in the DDI network. The DNN predictor concatenates the latent feature vectors of any two drugs into the feature vector of the corresponding drug pair and is trained to predict potential drug-drug interactions. Experimental results show that the newly proposed DPDDI method outperforms four other state-of-the-art methods; that the GCN-derived latent features contain more DDI information than features derived from chemical, biological or anatomical properties of drugs; and that the concatenation feature aggregation operator is better than two other feature aggregation operators (i.e., inner product and summation). The case study results confirm that DPDDI achieves reasonable performance in predicting new DDIs.
Conclusion
We proposed an effective and robust method, DPDDI, to predict potential DDIs by utilizing DDI network information without considering drug properties (i.e., drug chemical and biological properties). The method should also be useful in other DDI-related scenarios, such as the detection of unexpected side effects and the guidance of drug combination.
Background
By taking advantage of the synergistic effects caused by drug-drug interactions (DDIs), combination treatment with multiple drugs for complex diseases is popular nowadays [1]. However, unexpected DDIs can also trigger side effects, adverse reactions, and even serious toxicity, putting patients in danger [2]. As the need for multi-drug treatments increases, the identification of DDIs becomes more and more urgent. Nevertheless, it is expensive and time-consuming to detect DDIs among a large number of drug pairs, both in vitro and in vivo. To assist the screening of DDIs, computational approaches have been developed to deduce candidate drug-drug interactions.
Existing computational methods can be roughly classified into two categories: text mining-based and machine learning-based methods. Text mining-based methods discover and collect annotated DDIs from the scientific literature, electronic medical records [3, 4], insurance claim databases and the FDA Adverse Event Reporting System [5]. They are quite useful in building DDI-related databases. However, those methods cannot detect unannotated DDIs and cannot give alerts about potential DDIs before a combination treatment is administered [2]. In contrast, machine learning-based methods provide a promising way to identify unannotated potential drug-drug interactions for downstream experimental validation.
Usually, machine learning-based methods consist of a feature extractor and a supervised predictor. The feature extractor represents drugs as feature vectors according to drug properties, such as chemical structure [2, 6,7,8,9,10,11,12,13,14], targets [2, 8,9,10,11], Anatomical Therapeutic Chemical classification (ATC) codes [8,9,10, 12], side effects [8, 9, 11, 13, 14], and medication and/or clinical observations [11].
The supervised predictor is usually implemented by classification algorithms (such as KNN [12], SVM [12], logistic regression [2, 8, 10], decision tree [10] and naïve Bayes [10]), network propagation methods (such as reasoning over drug-drug network structure [6,7,8], label propagation [13], random walk [11, 15] and probabilistic soft logic [9, 10]) or matrix factorization [14]. Usually, the predictor first trains a model with both feature vectors/similarity matrices and annotated DDI labels, then deduces potential DDIs with the well-trained model. Most methods utilize a single predictor [2, 5,6,7,8, 13,14,15,16], while some integrate multiple predictors [10, 12].
In general, the performance of existing approaches relies heavily on the quality of hand-crafted features derived from drug properties. However, some drug properties may not always be available. One common solution is to remove the drugs that lack a certain property, which results in small-scale pruned datasets and is thus not practical in real scenarios [17]. In addition, some hand-crafted drug features may not be precise enough to characterize the properties of drugs, which may jeopardize the construction of a robust and accurate model for link prediction.
As one of the most popular graph embedding methods, the Graph Convolution Network (GCN) provides a promising way to predict DDIs when some properties of drugs are not available. Inspired by traditional convolutional neural networks (CNNs), which operate on regular Euclidean data such as images (2D grids) and text (1D sequences) [18], GCN formulates convolution on irregular graphs in non-Euclidean domains, then aggregates information about each node’s neighborhood to distill the network into dense vector embeddings without requiring manual feature engineering [19]. The dense vector embeddings, also called low-dimensional representations, are learned to preserve the structural relationships between nodes (e.g., drugs) of the network, and thus can be used as features in building machine learning models for various downstream tasks, such as link prediction [17]. Recently, GCN has been applied to drug development and discovery [20], for example in molecular activity prediction [21], drug side effect prediction [22] and drug-target interaction prediction [23].
In this work, we introduce a deep predictor of drug-drug interactions (namely DPDDI), which uses a graph convolution network (GCN) to learn a low-dimensional feature representation of each drug in the DDI network and adopts a deep neural network (DNN) as the predictor. The GCN characterizes drugs in a graph embedding space that captures the topological relationship of each drug to its neighboring drugs. Experimental results demonstrate that DPDDI outperforms other existing state-of-the-art methods in DDI prediction.
Results
In this section, we first describe how the parameters of the DPDDI predictor are set, then compare the performance of DPDDI with four other state-of-the-art methods in a 5-fold cross-validation (5CV) test. We also compare the results of three feature aggregation operators, and discuss the effect of the sampling rate of negative samples and the robustness of DPDDI on datasets of different scales. Finally, we show the effectiveness of DPDDI through a case study.
In statistical prediction, the jackknife test and the q-fold cross-validation (CV) test are often used to examine the effectiveness of a predictor [24]. Of the two, the jackknife test is deemed the least arbitrary, as it always yields a unique result [25]. However, for large-scale databases, the jackknife test is quite time-consuming. To reduce the computational time while still evaluating the performance of a predictor, we adopted the 5-fold cross-validation (5CV) test, as done by most investigators [26,27,28,29]. For the 5CV test, the samples in the DDI dataset are randomly partitioned into 5 subsets of approximately equal size. Each of the 5 subsets is singled out in turn as the testing set; 90% and 10% of the samples in the other 4 subsets are used as the training set and the validation set, respectively. The predictor is constructed on the training set and its parameters are tuned using the validation set. This process is repeated for 5 iterations, each time setting aside a different test subset. To avoid bias arising from the random data split, we perform 10 independent runs of 5CV and use the average of the results to assess the performance of our DPDDI predictor.
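The splitting protocol described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `five_fold_splits`, the seed handling and the toy sample list are assumptions made for the example.

```python
import random

def five_fold_splits(samples, n_folds=5, val_fraction=0.1, seed=0):
    """Yield (train, validation, test) lists, one triple per fold."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    # Partition the shuffled samples into n_folds subsets of near-equal size.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        # Pool the remaining folds, then hold out 10% of them for validation.
        rest = [s for j, fold in enumerate(folds) if j != k for s in fold]
        rng.shuffle(rest)
        n_val = int(round(val_fraction * len(rest)))
        yield rest[n_val:], rest[:n_val], test

# Toy example: 100 sample indices stand in for labeled drug pairs.
splits = list(five_fold_splits(range(100)))
```

Repeating the call with ten different seeds and averaging the resulting metrics would correspond to the ten independent 5CV runs mentioned in the text.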
Parameter setting
We performed a grid search over the parameters, seeking both the minimum value of the loss function and the best accuracy on the training dataset. For both the GCN-based feature extractor and the DNN-based predictor, the learning rate, number of epochs, batch size, dropout rate and number of neurons (dimensions) in the hidden layers need to be tuned.
Specifically, with full-batch training, the GCN-based feature extractor tuned the learning rate (Lrate) from {0.1, 0.01, 0.001, 0.005, 0.0001}, the Epochs from {200, 500, 800, 1000, 1200, 1400, 1600}, the Dropout from {0.01, 0.001, 0.0001}, and the hidden layer dimensions (Hdim) from {[800, 512], [800, 256], [800, 128], [512, 256], [512, 128], [512, 64], [256, 64], [128, 32]}. The DNN-based predictor tuned the learning rate (Lrate) from {0.1, 0.05, 0.01, 0.005}, the Epochs from {20, 40, 60, 80, 100, 140, 160}, the batch size (Bsize) from {10, 20, 40, 50, 60, 80}, the Dropout from {0.01, 0.001, 0.0001} and the hidden layer dimensions (Hdim) from {[128, 32], [128, 64], [64, 32], [128, 64, 32], [128, 32, 16], [64, 32, 16], [128, 64, 32, 16], [64, 32, 16, 4]}. The parameters that led to the best prediction performance are shown in Table 1.
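To give a sense of the size of this search, the sketch below enumerates the GCN grid listed above with `itertools.product`. The `grid_search` helper and the `toy_evaluate` objective are hypothetical stand-ins introduced for illustration; in practice the objective would train the model and return a validation score.

```python
from itertools import product

# Candidate values for the GCN feature extractor, as listed in the text.
GCN_GRID = {
    "lrate": [0.1, 0.01, 0.001, 0.005, 0.0001],
    "epochs": [200, 500, 800, 1000, 1200, 1400, 1600],
    "dropout": [0.01, 0.001, 0.0001],
    "hdim": [(800, 512), (800, 256), (800, 128), (512, 256),
             (512, 128), (512, 64), (256, 64), (128, 32)],
}

def grid_search(grid, evaluate):
    """Return the configuration with the highest score under `evaluate`."""
    names = list(grid)
    best_config, best_score = None, None
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate(config)
        if best_score is None or score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_evaluate(config):
    # Hypothetical objective used only to make the example runnable;
    # a real run would train the GCN and score it on the validation set.
    return -abs(config["lrate"] - 0.001) - abs(config["epochs"] - 800) / 1e4

n_configs = len(list(product(*GCN_GRID.values())))  # 5 * 7 * 3 * 8
best_config, best_score = grid_search(GCN_GRID, toy_evaluate)
```

The GCN grid alone contains 5 × 7 × 3 × 8 = 840 configurations, which is why an exhaustive search of this kind is usually run once per dataset rather than once per fold.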
Comparison with other state-of-the-art methods
We compared DPDDI with four other state-of-the-art methods in the 5CV test: two methods by Vilar et al. (named Vilar 1 and Vilar 2, respectively) [6, 7], a label propagation-based method (named LP) [13] and Zhang’s method (named CE) [11]. Vilar 1 [6] integrates a Tanimoto similarity matrix of molecular structures with the known DDI matrix through a linear matrix transformation to identify potential DDIs. Vilar 2 [7] uses drug interaction profile fingerprints (IPFs) to measure similarity for predicting DDIs. The label propagation method [13] assigns labels from known DDIs to previously unlabeled nodes by computing drug similarity-derived edge weights on the DDI network. Zhang et al. [11] collect a variety of drug-related data (e.g., known drug-drug interactions, drug substructures, proteins, pathways, indications and side effects) to build many base classifiers, then perform the prediction with a classifier ensemble (CE) model.
To ensure a fair comparison, the DB2 dataset from [11] is adopted. In the DB2 dataset, all unlabeled drug pairs are considered negative samples. The comparison results in the 5CV test are shown in Table 2. DPDDI outperforms Vilar 1, Vilar 2 and LP across all the metrics: it achieves improvements of 0.2 ~ 24.9%, 6.6 ~ 64.5%, 2.2 ~ 31.5%, 2.5 ~ 50.1%, 0.6 ~ 22.1% and 8.9 ~ 50.6% over these three methods in terms of AUC, AUPR, Recall, Precision, Accuracy and F_{1}-score, respectively. Although the AUC and ACC of DPDDI are slightly lower than those of Zhang’s method [11], the AUPR and F_{1} of DPDDI are higher. AUPR is often considered a more informative quality measure than AUC, as it penalizes false positive drug-drug interactions much more heavily. F_{1} is the harmonic mean of precision and recall, and so focuses on the correctly predicted drug-drug interaction pairs. ACC reflects not only the proportion of correctly predicted drug-drug interaction pairs, but also the proportion of correctly predicted non-interaction pairs. For the prediction of drug-drug interactions, F_{1} should be a more effective measure than ACC.
In addition, Zhang et al. [11] used 9 drug-related data sources, while DPDDI uses only the known drug-drug interaction data. If more drug-related data sources (e.g., the drug substructure, drug target, drug enzyme, drug transporter, drug pathway, drug indication, drug side effect and drug off side effect data used in [11]) were integrated to construct a drug-drug similarity network and the DPDDI framework were applied to it, DPDDI could be expected to achieve better performance.
Comparison of different feature aggregation operators
After obtaining the latent feature vectors of single drugs in the embedding space with the GCN, we adopt three feature operators (i.e., inner product, summation and concatenation) to aggregate the feature vectors of two drugs into one feature vector representing the drug-drug pair. These aggregated feature vectors are then fed into the DNN model to evaluate their effect on DPDDI on the DB1 dataset in the 5CV test. As shown in Table 3, the concatenation operator achieves the best results and is thus selected in our DPDDI model to aggregate the feature vectors of drugs.
Comparison of the network structure features, chemical features and biological features of drugs
In order to evaluate the effectiveness of the network structure (NS) features, we also considered chemical and biological features derived from three heterogeneous sources: chemical structure (CS), drug-binding proteins (DBP) and Anatomical Therapeutic Chemical classification labels (ATC). Chemical structures of the drugs are characterized by 881-dimensional PubChem fingerprints. The DBP features of drugs are represented by 1121-dimensional binary vectors in which each bit indicates whether the drug binds a specific DBP. The 118-dimensional ATC features of drugs are converted from the 7-bit ATC code via one-hot coding. For each feature type (i.e., network structure, chemical structure, DBP and ATC features), the feature vectors of the two drugs in a pair are concatenated and fed into the DNN model to predict DDIs. The results on the DB1 dataset in the 5CV test are shown in Table 4, from which we can see that the network structure features yield the best performance.
Influence of dataset scale size
In order to verify the robustness of our DPDDI approach, we use three datasets of different sizes (i.e., DB1, DB2 and DB3) to assess the performance of DPDDI in the 5CV test. The DB1 dataset contains 1562 drugs and 180,576 annotated drug-drug interactions. DB2 contains 548 drugs and 48,584 annotated drug-drug interactions. The DB3 dataset contains 1934 drugs and 230,887 annotated drug-drug interactions. As shown in Table 5, although the dataset size has some effect on the performance of DPDDI (i.e., higher performance is achieved on larger datasets), DPDDI obtains reasonable prediction results on the small dataset as well. These results show that our DPDDI approach is relatively robust with respect to dataset size for predicting DDIs.
We also investigate the effects of negative sample size on DPDDI by sampling different unlabeled drug pairs to generate the negative sample sets, which are combined with the known DDI pairs (i.e., positive sample set) to form the DDI training, validation and testing datasets.
From the DB1 dataset, we randomly selected different numbers of unlabeled drug pairs and combined them with the known DDI pairs to construct the datasets DB1:1, DB1:3 and DB1:6, in which the ratios of positive samples (i.e., known DDI pairs) to negative samples (i.e., unlabeled drug pairs) are 1:1, 1:3 and 1:6, respectively. Figure 1 shows the results of DPDDI on the DB1:1, DB1:3 and DB1:6 datasets in the 5CV test. DPDDI achieves the highest values of AUC, AUPR, Precision, Recall, Accuracy and F_{1} on the DB1:1 dataset, indicating that the imbalance between positive and negative samples does affect the performance of DPDDI.
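The construction of such ratio-controlled datasets can be sketched as follows. The helper `sample_negatives` and the toy network of eight drugs are assumptions made for illustration; the sketch simply draws unlabeled pairs uniformly at random, capped at the number of unlabeled pairs available.

```python
import random
from itertools import combinations

def sample_negatives(n_drugs, positive_pairs, ratio, seed=0):
    """Draw `ratio` unlabeled pairs per known DDI pair, uniformly at random."""
    positives = {tuple(sorted(pair)) for pair in positive_pairs}
    # Every unordered drug pair that is not a known DDI counts as unlabeled.
    unlabeled = [pair for pair in combinations(range(n_drugs), 2)
                 if pair not in positives]
    n_negatives = min(len(unlabeled), ratio * len(positives))
    return random.Random(seed).sample(unlabeled, n_negatives)

# Toy network: 8 drugs, 3 known interactions, positive:negative ratio of 1:3.
positives = [(0, 1), (1, 2), (2, 3)]
negatives = sample_negatives(n_drugs=8, positive_pairs=positives, ratio=3)
```

Calling the function with `ratio=1`, `ratio=3` and `ratio=6` would mimic the DB1:1, DB1:3 and DB1:6 settings on a real DDI network.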
Case studies
In this section, we investigate the performance of DPDDI in predicting unobserved DDIs. DB1 contains 180,576 annotated drug-drug interaction pairs among 1562 drugs, and 1,038,565 unlabeled drug pairs which may contain unobserved DDIs. By training DPDDI with the DDI network from the DB1 dataset, possible interactions among drugs are inferred. A higher score for an unobserved drug pair indicates a higher probability that the two drugs interact. Table 6 shows the top 20 drug-drug interactions predicted by DPDDI that are not annotated in the DB1 dataset. By searching for evidence of these newly predicted DDIs in the DrugBank (version 5.0) database and on the Drug Interaction Checker website (Drugs.com), we find that a substantial fraction of the newly predicted DDIs (13 out of 20) are confirmed. For instance, the interaction between “Doxycycline” and “Bleomycin” is described as “Doxycycline may decrease the excretion rate of Bleomycin which could result in a higher serum level”. The case studies demonstrate that DPDDI can effectively detect potential drug-drug interactions. The other 7 of the 20 newly predicted DDIs may be confirmed by later experiments.
In addition, among the top 20 DDIs predicted by DPDDI, we find that “doxycycline” interacts with 8 other drugs, and 5 of these 8 DDI pairs have been confirmed by current experimental evidence. These results suggest that “doxycycline” may be highly active and prone to interacting with other drugs while exerting its efficacy.
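The ranking step behind such a case study can be sketched as follows: score every drug pair, discard the pairs already annotated, and keep the highest-scoring remainder. The function `top_unlabeled_pairs` and the toy scores are hypothetical and serve only to illustrate the procedure.

```python
def top_unlabeled_pairs(scores, known_pairs, k):
    """Return the k highest-scoring drug pairs that are not known DDIs."""
    known = {tuple(sorted(pair)) for pair in known_pairs}
    candidates = [(score, pair) for pair, score in scores.items()
                  if tuple(sorted(pair)) not in known]
    # Sort by predicted interaction probability, highest first.
    candidates.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in candidates[:k]]

# Toy scores for the six pairs among four drugs (made-up values).
scores = {(0, 1): 0.90, (0, 2): 0.80, (1, 2): 0.30,
          (0, 3): 0.70, (1, 3): 0.95, (2, 3): 0.10}
known = [(1, 0), (1, 3)]
top2 = top_unlabeled_pairs(scores, known, k=2)
```

With `k=20` and scores taken from a trained predictor, the returned list would play the role of the candidate list that is then checked against external databases.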
Discussions
One of the key factors in DDI prediction is the features considered. We compared the GCN-derived DDI network structure feature with three other chemical structure and biological features. The results in Table 4 show the superiority of the GCN-derived DDI network structure feature across all the performance metrics. In particular, it achieves > 20% improvement in terms of AUPR, Recall, Precision and F_{1}-score. These results demonstrate that the GCN-derived DDI network structure features contain more discriminative DDI information, and that the GCN can effectively learn a low-dimensional feature representation for each drug in the DDI network, i.e., the low-dimensional representation preserves ample structural information of the DDI network.
In DDI prediction, how best to aggregate the feature vectors of two drugs into one vector representing the drug pair is another key factor. We adopt three feature operators (inner product, summation and concatenation) to aggregate the feature vectors of two drugs. The results in Table 3 show that the concatenation operator achieves the best performance whereas the inner product operator performs worst. Therefore, the concatenation operator was adopted in DPDDI.
In addition, we paid particular attention to how to balance samples in the training phase. Many earlier works in similar areas [22, 30, 31] adopted the same number of negative and positive samples to avoid the computational challenge caused by sample imbalance. Consistent with this, our results in Fig. 1 show that the balanced sampling scheme achieves the best performance in terms of AUPR, Recall, Precision and F_{1}-score. These results indicate that the imbalance between positive and negative samples does influence DPDDI. For a fair comparison with other state-of-the-art methods, the known drug-drug interaction pairs (positive samples) and all unlabeled drug-drug pairs (negative samples) are used to train the prediction model. Considering that more severe sample imbalance can result in higher errors, we also introduce a weight W_{pos} in Eq. (2) for sample balancing.
The comparison experiments (in Tables 2 and 5) demonstrate the superior performance and robustness of DPDDI compared to four other state-of-the-art methods and on three DDI datasets of different scales. Investigation of the top predicted DDIs confirms the competence of DPDDI for predicting new DDIs.
The superior performance of DPDDI can be attributed to the following aspects: i) a GCN model is designed to learn low-dimensional feature representations of drugs and capture the structural information of the DDI network; ii) a DNN model is constructed as the predictor to determine whether an interaction exists between two drugs; iii) the DNN model can learn the nonlinear relationship of drug pairs by mapping them from a high-dimensional space into a lower-dimensional space.
DPDDI is effective in predicting potential interactions between two drugs that exist in the DDI network. If the DDI network does not contain a drug, e.g., a newly invented drug without prior information, DPDDI will fail. In this case, it is possible to construct a drug-drug similarity network by introducing the chemical or biological properties of drugs, and then apply the DPDDI framework to predict novel DDIs.
Conclusions
Aiming at the preliminary screening of DDIs, this work presents a novel prediction method (namely DPDDI) based on a DDI network. DPDDI consists of a feature extractor based on a graph convolution network (GCN) and a predictor based on a deep neural network (DNN). The former characterizes drugs in a graph embedding space, where each drug is represented as a low-dimensional latent feature vector capturing the topological relationship to its neighboring drugs. The latter concatenates the latent feature vectors of any two drugs into one feature vector representing the corresponding drug pair, which is used to train a DNN for predicting potential interactions. The experiments designed for DPDDI lead to several observations: i) the concatenation feature aggregation operator is better than two other feature aggregation operators, i.e., the inner product and the summation; ii) the GCN-derived latent features greatly outperform other features derived from chemical, biological or anatomical properties of drugs; iii) DPDDI is robust to datasets of different scales in drug number, DDI number and network sparsity; iv) the performance of DPDDI is significantly superior to that of four state-of-the-art methods; v) the finding of 13 verified DDIs among the top 20 unobserved candidates in the case studies reveals the capability of DPDDI for predicting new DDIs. To summarize, the proposed DPDDI is an effective approach for predicting DDIs, and should be helpful in other DDI-related scenarios, such as the detection of unexpected side effects and the guidance of drug combination.
Methods
Datasets
We extracted the approved small-molecule drugs and their interaction relationships from DrugBank 4.0 [32] to build the DB1 dataset, which contains 1562 drugs and 180,576 annotated drug-drug interactions. In order to compare with other state-of-the-art methods, a smaller dataset (named DB2) built by Zhang et al. [11] was adopted to evaluate the performance of DPDDI. DB2 contains 548 drugs and 48,584 annotated drug-drug interactions. Moreover, we collected a new and larger dataset from DrugBank 5.0 [33] to build the DB3 dataset, which includes 1934 drugs and 230,887 annotated drug-drug interactions, for assessing the robustness of DPDDI. In DB1, DB2 and DB3, the known drug-drug interaction pairs are used as positive samples to build the positive set, and the other, unlabeled drug-drug pairs are considered negative samples, from which we build the negative set with a random sampling strategy. From the perspective of interactions, these three datasets can be treated as DDI networks. The network characteristics are summarized in Table 7.
In order to compare our network-based features with other drug features derived from diverse drug properties, we also downloaded the drug chemical structures, Anatomical Therapeutic Chemical classification (ATC) codes and drug-binding proteins (DBPs) from DrugBank.
The chemical structure-based feature represents each drug by an 881-dimensional binary vector in which each bit represents a specific substructure according to PubChem fingerprints. ATC codes are released by the World Health Organization [34], and they categorize drug substances at different levels according to the organs they affect, application area, therapeutic properties, and chemical and pharmacological properties. It is generally accepted that compounds with similar physicochemical properties exhibit similar biological activity. As 138 of the 1562 drugs in DB1 have no ATC code, we adopted the codes predicted for them by SPACE [35], which deduces ATC codes from chemical structures. To feed the 7-bit ATC codes into the DNN, we convert them into one-hot codes with 118 bits.
We also used the drug-binding protein (DBP) data collected by [16], including 899 drug targets and 222 non-target proteins. Similarly, each drug is represented as a binary DBP-based feature vector, of which each bit indicates whether the drug binds to a specific protein.
Problem formulation
Our task is to deduce DDI candidates among the unannotated drug-drug pairs based on annotated DDIs in the form of a network. Technically, let G(D, E) be a DDI network, where D = {d_{1}, d_{2}, …, d_{m}} is the set of m approved drugs and E denotes the interactions between them. This network can be represented by an m × m symmetric binary adjacency matrix A_{m × m} = {a_{ij}}, where a_{ij} = 1 indicates an annotated interaction between drug d_{i} and drug d_{j}, and a_{ij} = 0 indicates that no interaction between them is annotated.
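As a small illustration of this formulation, the sketch below builds the symmetric adjacency matrix from a list of annotated pairs and counts the unannotated pairs that remain as prediction candidates. The helper names are assumptions for the example; the DB1 figures are those reported in this article.

```python
def build_adjacency(m, interactions):
    """Build the m x m symmetric binary adjacency matrix of a DDI network."""
    A = [[0] * m for _ in range(m)]
    for i, j in interactions:
        A[i][j] = 1
        A[j][i] = 1  # interactions are undirected, so A is symmetric
    return A

def count_unlabeled_pairs(m, n_interactions):
    """Number of unordered drug pairs without an annotated interaction."""
    return m * (m - 1) // 2 - n_interactions

# Toy network with 4 drugs and 2 annotated interactions.
A = build_adjacency(4, [(0, 1), (1, 2)])

# DB1: 1562 drugs and 180,576 annotated DDIs.
db1_unlabeled = count_unlabeled_pairs(1562, 180576)
```

For DB1 this gives 1562 × 1561 / 2 − 180,576 = 1,038,565 unlabeled pairs, which matches the count quoted in the case study.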
DDI prediction can be solved by a three-step approach. First, the function f_{1}(A) obtains the latent feature vector Z_{i} of each drug in A, where Z_{i} ∈ R^{1 × k} (k ≪ m). Next, the latent vectors (Z_{i} and Z_{j}) of two drugs are aggregated into one feature vector to represent a drug pair. Last, the function f_{2}(Z_{i}, Z_{j}) (Z_{i}, Z_{j} ∈ Z) is used to reconstruct the network \( \hat{A} \). The function f_{1} is referred to as the feature extractor, while the function f_{2} is named the predictor in our model.
In this work, by implementing the solution based on deep learning, we provide a Deep Predictor for Drug-Drug Interactions (named DPDDI). DPDDI mainly consists of the following three phases: i) extract the low-dimensional latent embedding features of drugs from the DDI network by building a GCN model; ii) aggregate the latent feature vectors (i.e., Z_{i} and Z_{j}) of drugs d_{i} and d_{j} to represent the drug pairs; iii) feed the fused feature vectors into a DNN to predict DDIs. The overall framework of DPDDI is illustrated in Fig. 2.
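The loss of DPDDI contains two parts as follows:
\[ L = L_f + L_p \qquad (1) \]
where L_{f} is the loss of its feature extractor, and L_{p} is the loss of its predictor. The first part adopts a binary weighted cross-entropy as follows:
\[ L_f = -\sum_{i,j} \left[ W_{pos}\, p\left(a_{ij}\right) \log q\left(a_{ij}\right) + \left(1 - p\left(a_{ij}\right)\right) \log \left(1 - q\left(a_{ij}\right)\right) \right] \qquad (2) \]
where p(a_{ij}) is the true label of the training interaction a_{ij}, \( q\left({a}_{ij}\right)=\sigma \left({z}_i\bullet {z}_j^T\right) \) is the predicted probability computed from the inner product of the latent vectors of the two nodes generated by the GCN, and W_{pos} is a weight equal to the number of negative samples divided by the number of positive samples. The second part is defined by a binary cross-entropy as follows:
\[ L_p = -\sum_{i,j} \left[ p\left(a_{ij}\right) \log s\left(a_{ij}\right) + \left(1 - p\left(a_{ij}\right)\right) \log \left(1 - s\left(a_{ij}\right)\right) \right] \qquad (3) \]
where s(a_{ij}) is the predicted probability generated by the DNN.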
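One plausible reading of the two loss terms described above is sketched below in plain Python. The summed (rather than averaged) form, the clipping constant `eps` and all function names are assumptions made for the example; the weighting follows the definition of W_{pos} as the number of negative samples over the number of positive samples.

```python
import math

def positive_weight(labels):
    """W_pos: number of negative samples over number of positive samples."""
    n_pos = sum(labels)
    return (len(labels) - n_pos) / n_pos

def weighted_bce(labels, probs, w_pos=1.0, eps=1e-12):
    """Summed binary cross-entropy with the positive term scaled by w_pos."""
    total = 0.0
    for p, q in zip(labels, probs):
        q = min(max(q, eps), 1.0 - eps)  # guard against log(0)
        total -= w_pos * p * math.log(q) + (1 - p) * math.log(1.0 - q)
    return total

labels = [1, 0, 0, 0]                 # one known DDI, three unlabeled pairs
gcn_probs = [0.5, 0.5, 0.5, 0.5]      # stand-in for sigmoid(z_i . z_j)
dnn_probs = [0.5, 0.5, 0.5, 0.5]      # stand-in for the DNN outputs

w_pos = positive_weight(labels)                        # 3.0
loss_f = weighted_bce(labels, gcn_probs, w_pos)        # feature extractor term
loss_p = weighted_bce(labels, dnn_probs)               # predictor term
total_loss = loss_f + loss_p
```

With W_{pos} = 3, the single positive pair contributes as much to the extractor loss as the three negative pairs together, which is the balancing effect the weight is meant to provide.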
Feature extractor
We employ a two-layer graph convolutional network (GCN) autoencoder [36, 37] to obtain embedding representations of drug nodes. Each drug is represented as a latent feature vector, which contains high-dimensional information about its neighborhood in the DDI network, without manual feature engineering. Such node embeddings provide a promising way to represent the relationships between nodes in a complex network.
Technically, the GCN takes the adjacency matrix A as the input and outputs embedding vectors \( \left\{{Z}_i\in {R}^{1\times {H}_p},i=1,2,\dots, m\right\} \) for every drug in the DDI network, where H_{p} is the dimension of the last hidden layer. Following the recommendation in [38], our GCN adopts two layers. Suppose that H^{(0)} is the feature matrix in which each row denotes the input feature vector of a node in the network. If there are no input features, H^{(0)} is simply an identity matrix. Then, the output H^{(1)} of the first hidden layer is defined as:
\[ H^{(1)} = \mathrm{ReLU}\left( \hat{A}\, H^{(0)}\, W^{(0)} \right) \]
where \( \hat{A}={\tilde{D}}^{-\frac{1}{2}}\tilde{A}{\tilde{D}}^{-\frac{1}{2}} \) is the symmetrically normalized adjacency matrix, \( {\tilde{D}}_{ii}={\sum}_j{\tilde{A}}_{ij} \) and \( \tilde{A}=A+{I}_N \) with I_{N} the identity matrix, \( {W}^{(0)}\in {R}^{m\times {H}_1} \) is the weight matrix to be learned, and ReLU is the activation function. Similarly, the output H^{(2)} of the second hidden layer is recursively defined as:
\[ H^{(2)} = \mathrm{ReLU}\left( \hat{A}\, H^{(1)}\, W^{(1)} \right) \]
where \( {W}^{(1)}\in {R}^{H_1\times {H}_2} \). Because our GCN contains only two layers, H^{(2)} is the final embedding matrix Z \( \in {R}^{m\times {H}_2} \).
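A forward pass through such a two-layer GCN can be sketched with NumPy as follows. This is an illustrative sketch, not the authors' implementation: the weights are random rather than trained, the layer sizes are arbitrary, and applying ReLU in the second layer is an assumption drawn from the text saying that the second layer is defined "similarly".

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^(-1/2) (A + I) D~^(-1/2), the symmetric normalization."""
    A_tilde = A + np.eye(A.shape[0])        # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(A, W0, W1):
    """Two-layer GCN with identity input features (no node attributes)."""
    A_hat = normalize_adjacency(A)
    H0 = np.eye(A.shape[0])                  # featureless case: H(0) = I
    H1 = np.maximum(A_hat @ H0 @ W0, 0.0)    # first layer, ReLU
    H2 = np.maximum(A_hat @ H1 @ W1, 0.0)    # second layer (ReLU assumed)
    return H2

# Toy DDI network: a path of four drugs 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
W0 = rng.normal(size=(4, 8))   # m x H1, untrained
W1 = rng.normal(size=(8, 3))   # H1 x H2, untrained
Z = gcn_forward(A, W0, W1)     # one 3-dimensional embedding per drug
```

Each row of `Z` mixes information from the drug itself and from its neighbors up to two hops away, which is what lets the embedding reflect the local topology of the network.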
Feature aggregation for drug pairs
So far, the latent feature vector of each single drug in the embedding space has been obtained. The next task is to obtain feature vectors of drug pairs. Given two drugs d_{i} and d_{j}, and their latent vectors Z_{i} and Z_{j} obtained by the GCN, three feature operators, i.e., inner product, summation and concatenation, are considered to aggregate the latent feature vectors of the two drugs into a single feature vector representing the drug-drug pair. Specifically, we separately adopt the inner product \( F\left({d}_i,{d}_j\right)={Z}_i{Z}_j^T \), summation F(d_{i}, d_{j}) = Z_{i} + Z_{j} and concatenation F(d_{i}, d_{j}) = [Z_{i}, Z_{j}] of the two drug latent vectors Z_{i} and Z_{j} to represent the drug pair (d_{i}, d_{j}).
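The three operators differ mainly in how much of the two embeddings survives aggregation, which a short NumPy sketch makes concrete. The function name `aggregate` and the example vectors are assumptions for illustration.

```python
import numpy as np

def aggregate(z_i, z_j, how):
    """Combine two drug embeddings into one drug-pair feature vector."""
    if how == "inner":
        return np.array([z_i @ z_j])          # a single scalar feature
    if how == "sum":
        return z_i + z_j                      # k features
    if how == "concat":
        return np.concatenate([z_i, z_j])     # 2k features
    raise ValueError(f"unknown operator: {how}")

z_i = np.array([1.0, 2.0, 3.0])
z_j = np.array([4.0, 5.0, 6.0])

pair_inner = aggregate(z_i, z_j, "inner")
pair_sum = aggregate(z_i, z_j, "sum")
pair_concat = aggregate(z_i, z_j, "concat")
```

For k-dimensional embeddings the inner product yields one feature, summation yields k and concatenation yields 2k, so concatenation is the only one of the three that passes both embeddings to the predictor unchanged.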
Predictor
Given the feature vectors of drug-drug pairs, we construct a deep neural network (DNN) as the predictor in DPDDI because of its proven performance in classification. The predictor casts DDI prediction as a binary classification task, which is implemented by a five-layer DNN. The numbers of neurons in the layers of the DNN are 256, 128, 64, 32 and 2, respectively. ReLU is adopted as the activation function in the first four layers, while SoftMax is used as the activation function in the last layer, which outputs how likely drug pairs are to be potential DDIs.
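The layer layout just described can be sketched as a NumPy forward pass. The weights below are randomly initialized and untrained, and the initialization scheme, input dimension and function names are assumptions made for the example; only the layer sizes and activations come from the text.

```python
import numpy as np

LAYER_SIZES = [256, 128, 64, 32, 2]   # neurons per layer, as described

def init_params(input_dim, rng):
    """Create (weight, bias) pairs for each layer (random, untrained)."""
    params, fan_in = [], input_dim
    for size in LAYER_SIZES:
        W = rng.normal(scale=np.sqrt(2.0 / fan_in), size=(fan_in, size))
        params.append((W, np.zeros(size)))
        fan_in = size
    return params

def predict_proba(X, params):
    """ReLU on the first four layers, softmax on the last layer."""
    h = X
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
k = 16                                   # hypothetical embedding dimension
params = init_params(input_dim=2 * k, rng=rng)
X = rng.normal(size=(5, 2 * k))          # five concatenated drug-pair vectors
probs = predict_proba(X, params)         # column 1 can be read as P(DDI)
```

Because the last layer has two neurons with softmax, each row of `probs` sums to one, and either column can be used as the interaction score once the convention is fixed.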
There are two steps in training DPDDI. The first step is to train a GCN to obtain the low-dimensional latent embedding features of drugs. The parameters (i.e., learning rate, epochs, dropout, input-dim, hidden-dim and output-dim) of the GCN architecture are trained with the DDI network data. The second step is to learn the parameters (i.e., learning rate, dropout, epochs, batch-size, input-dim, hidden-dim and output-dim) of the DNN for the final DDI prediction and to fine-tune all the parameters of the DPDDI framework. To explain our DPDDI method in detail, the pseudocode is shown in Table 8.
Evaluation metrics
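The following metrics of accuracy (ACC), Recall, Precision and F_{1}-score are used to measure the performance of DPDDI:
\[ \mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \]
\[ \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \]
where TP and TN are the numbers of correctly predicted DDI pairs and unlabeled drug-drug pairs, respectively; FP and FN are the numbers of incorrectly predicted DDI pairs and unlabeled drug-drug pairs, respectively.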
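These four metrics follow directly from the confusion-matrix counts, as the short sketch below shows. The function name and the toy labels are assumptions for illustration, and the zero-denominator guards are a convenience not discussed in the text.

```python
def classification_metrics(y_true, y_pred):
    """Compute ACC, Recall, Precision and F1 from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    acc = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"ACC": acc, "Recall": recall, "Precision": precision, "F1": f1}

# Toy example: 3 true DDI pairs and 5 unlabeled pairs.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this toy case ACC is 0.75 while Recall, Precision and F1 are all 2/3, illustrating how accuracy is lifted by the correctly predicted negative pairs whereas F1 depends only on how the positive pairs are handled.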
We also used the metrics of AUC and AUPR to measure the performance of DPDDI. AUC is the area under the receiver operating characteristic (ROC) curve, which plots the true-positive rate (i.e., TP/(TP + FN)) versus the false-positive rate (i.e., FP/(FP + TN)) at different cutoffs. AUPR is the area under the precision–recall curve, which plots the ratio of true positives among all positive predictions for each given recall rate.
Availability of data and materials
The datasets generated and analyzed during the current study and the code of DPDDI are openly available at the website of https://github.com/NWPU903PR/DPDDI.
Abbreviations
DDIs: Drug-drug interactions
GCN: Graph convolution network
DNN: Deep neural network
ATC: Anatomical Therapeutic Chemical classification
DBP: Drug-binding protein
AUC: Area Under the receiver operating characteristic Curve
AUPR: Area Under the Precision-Recall curve
ACC: Accuracy
References
 1.
Han K, et al. Synergistic drug combinations for cancer identified in a CRISPR screen for pairwise genetic interactions. Nat Biotechnol. 2017;35(5):463–74.
 2.
Takeda T, et al. Predicting drug–drug interactions through drug structural similarities and interaction networks incorporating pharmacokinetics and pharmacodynamics knowledge. Aust J Chem. 2017;9:16.
 3.
Pathak J, Kiefer RC, Chute CG. Using linked data for mining drugdrug interactions in electronic health records. Stud Health Technol Inform. 2013;192:682–6.
 4.
Duke JD, et al. Literature based drug interaction prediction with clinical assessment using electronic medical records: novel myopathy associated drug interactions. PLoS Comput Biol. 2012;8(8):e1002614.
 5.
Vilar S, Friedman C, Hripcsak G. Detection of drug–drug interactions through data mining studies using clinical sources, scientific literature and social media. Brief Bioinform. 2018;19(5):863–77.
 6.
Vilar S, et al. Drugdrug interaction through molecular structure similarity analysis. Journal of the American Meidical informatics association. J Am Med Inform Assoc. 2012;19(6):1066–74.
 7.
Vilar S, et al. Detection of drugdrug interactions by modeling interaction profile fingerprints. PLoS One. 2013;8(3):e58321.
 8.
Gottlieb A, et al. INDI: a computational framework for inferring drug interactions and their associated recommendations. Mol Syst Biol. 2012;8:592.
 9.
Sridhar D, Fakhraei S, Getoor L. A probabilistic approach for collective similaritybased drugdrug interaction prediction. Bioinformatics. 2016;32(20):3175–82.
 10.
Cheng F, Zhao Z. Machine learningbased prediction of drugdrug interactions by integrating drug phenotypic, therapeutic, chemical, and genomic properties. J Am Med Inform Assoc. 2014;21(e2):e278–86.
 11.
Zhang W, et al. Predicting potential drugdrug interactions by integrating chemical, biological, phenotypic and network data. BMC Bioinformatics. 2017;18(1):18.
 12.
Andrej K, et al. Predicting potential drugdrug interactions on topological and semantic similarity features using statistical learning. PLoS One. 2018;13(5):e0196865.
 13.
Zhang P, et al. Label propagation prediction of drugdrug interactions based on clinical side effects. Sci Rep. 2015;5(1):12339.
 14.
Yu H, Mao KT, Shi JY, et al. Predicting and understanding comprehensive drugdrug interactions via seminonnegative matrix factorization. BMC Syst Biol. 2018;12(1):14.
 15.
Park K, et al. Predicting Pharmacodynamic drugdrug interactions through signaling propagation interference on proteinprotein interaction networks. PLoS One. 2015;10(10):e0140816.
 16.
Shi JY, et al. Detecting drug communities and predicting comprehensive drug–drug interactions via balance regularized seminonnegative matrix factorization. Aust J Chem. 2019;11(1):28.
 17.
Yue X, Wang Z, Huang J, et al. Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics. 2020;36(4):1241–51.
 18.
Zhou J, et al. Graph Neural Networks: A Review of Methods and Applications. arXiv. 2018:1812.08434.
 19.
Wu Z, et al. A Comprehensive Survey on Graph Neural Networks. arXiv. 2020:1901.00596.
 20.
Sun M, et al. Graph convolutional networks for computational drug development and discovery. Brief Bioinform. 2020;21(3):919–35.
 21.
Pham T, Tran T, Venkatesh S. Graph Memory Networks for Molecular Activity Prediction. arXiv. 2018:1801.02622.
 22.
Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics. 2018;34(13):i457–66.
 23.
Gao KY, et al. Interpretable drug target prediction using deep neural representation. In: TwentySeventh International Joint Conference on Artificial Intelligence IJCAI18; 2018. p. 3371–7.
 24.
Chou KC, Zhang CT. Prediction of protein structural classes. Crit Rev Biochem Mol Biol. 2008;30(4):275–349.
 25.
Chou KC. Some remarks on protein attribute prediction and pseudo amino acid composition. J Theor Biol. 2011;273(1):236–47.
 26.
Fan XN, Zhang SW. LPIBLS: predicting lncRNAprotein interactions with a broad learning systembased stacked ensemble classifier. Neurocomputing. 2019;370:88–93.
 27.
Yan XY, Zhang SW. Identifying drugtarget interactions with decision templates. Curr Protein Pept Sc. 2018;19(5):498–506.
 28.
Zhang Y, et al. Predicting drugdrug interactions using multimodal deep autoencoders based network embedding and positiveunlabeled learning. Methods. 2020;179:37–46.
 29.
Zheng Y, Peng H, Zhang X, et al. DDIPULearn: a positiveunlabeled learning method for largescale prediction of drugdrug interactions. BMC Bioinformatics. 2019;20(Suppl 19):661.
 30.
Mikolov T, et al. Distributed representations of words and phrases and their compositionality. Adv Neural Inf Proces Syst. 2013;26:3111–9.
 31.
Trouillon T, et al. Complex Embeddings for Simple Link Prediction. arXiv. 2017:1606.06357.
 32.
Vivian L, et al. DrugBank 4.0: shedding new light on drug metabolism. Nucleic Acids Res. 2013;42(D1):D1091–7.
 33.
Wishart DS, et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2017;46(D1):D1074–82.
 34.
Skrbo A, Begović B, Skrbo S. Classification of drugs using the ATC system (anatomic, therapeutic, chemical classification) and the latest changes. Med Arh. 2004;58(1 Suppl 2):138–41.
 35.
Liu Z, et al. Similaritybased prediction for anatomical therapeutic chemical classification of drugs by integrating multiple data sources. Bioinformatics. 2015;31(11):1788–95.
 36.
Kipf TN, Welling M. Variational Graph AutoEncoders. arXiv. 2016:1611.07308.
 37.
Kipf TN, Welling M. Semisupervised classification with graph convolutional networks. arXiv. 2016:1609.02907.
 38.
Defferrard M, Bresson X, Vandergheynst P. Convolutional neural networks on graphs with fast localized spectral filtering. Adv Neural Inf Proces Syst. 2016;29:3844–52.
Acknowledgments
We thank the anonymous reviewers for their valuable comments on the original manuscript.
Funding
This work was supported by the National Natural Science Foundation of China (No. 61873202, PI: SWZ and No. 61872297, PI: JYS) and the Shaanxi Provincial Key R&D Program, China (No. 2020KW063, PI: JYS). The funding bodies played no role in the design of the study, in the collection, analysis, and interpretation of data, or in writing the manuscript.
Author information
Contributions
YHF collected the dataset, performed the experiments and drafted the manuscript. JYS analyzed the results. JYS and SWZ revised the manuscript and are the corresponding authors. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
No ethics approval was required for the study.
Consent for publication
Not applicable.
Competing interests
None of the authors has any competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Feng, YH., Zhang, SW. & Shi, JY. DPDDI: a deep predictor for drug-drug interactions. BMC Bioinformatics 21, 419 (2020). https://doi.org/10.1186/s1285902003724x
DOI: https://doi.org/10.1186/s1285902003724x
Keywords
 Drug-drug interaction
 DDI prediction
 Graph convolution network (GCN)
 Feature extraction
 Deep neural network