
A neural network-based method for polypharmacy side effects prediction



Polypharmacy is a type of treatment that involves the concurrent use of multiple medications. Drugs may interact when they are used simultaneously, so understanding and mitigating polypharmacy side effects is critical for patient safety and health. Since the known polypharmacy side effects are rare and are not detected in clinical trials, computational methods are being developed to model them.


We propose a neural network-based method for polypharmacy side effects prediction (NNPS) that uses novel feature vectors based on mono side effects and drug–protein interaction information. The proposed method is fast and efficient, which allows the investigation of large numbers of polypharmacy side effects. Our novelty lies in defining new feature vectors for drugs and combining them with a neural network architecture for polypharmacy side effects prediction. We compare NNPS with 5 well-established methods on a benchmark dataset of 964 polypharmacy side effects and show that NNPS achieves better results than all 5 methods in terms of accuracy, complexity, and running time. With 5-fold cross-validation, NNPS outperforms the best of these methods (Decagon) by about 9.2% in Area Under the Receiver-Operating Characteristic, 12.8% in Area Under the Precision–Recall Curve, 8.6% in F-score, 10.3% in Accuracy, and 18.7% in Matthews Correlation Coefficient. In addition, the running time of the Decagon method, which is 15 days for one fold of cross-validation, is reduced to 8 h by the NNPS method.


The performance of NNPS is benchmarked against 5 well-known methods, Decagon, Concatenated drug features, Deep Walk, DEDICOM, and RESCAL, for 964 polypharmacy side effects. We adopt 5-fold cross-validation for 50 iterations and use the average of the results to assess the performance of the NNPS method. The evaluation of NNPS against five well-known methods in terms of accuracy, complexity, and running time shows the performance of the presented method on an essential and challenging problem in pharmacology. Datasets and code for NNPS algorithm are freely accessible at



Drug combination, commonly referred to as polypharmacy, has become a common practice in modern medicine, especially for the elderly and for patients with complex diseases [1,2,3,4,5,6,7,8,9]. While this strategy may treat diseases more effectively, drug-drug interactions (DDIs) can occur unexpectedly [5, 6, 10,11,12,13,14,15,16,17,18]. A DDI is a change in the pharmacologic effect of one drug when it is used with another drug. DDIs are the most common reason for patients to go to emergency units [4, 6, 12, 19,20,21,22] and can be associated with Adverse Drug Reactions (ADRs) (i.e. side effects), including death, which makes them a critical problem for public health [6, 10, 23,24,25,26,27]. Shtar et al. reported that between 3 and 5% of all hospital medication injuries were attributed to DDIs [19]. Although some side effects can be discovered in experiments and clinical trials, these are usually costly and time-consuming [10]. Most of the known polypharmacy side effects are rare and are usually not observed in small clinical trials, so it is difficult to identify them manually [16]. Therefore, computational methods for predicting DDIs are desirable. Methods for the DDI prediction problem fall into two categories. Methods in the first category only determine the presence or absence of an interaction and do not detect the type of side effect. These methods collect the interactions via experiments and clinical studies, medical records, and network modeling based on DDI similarities, side effect similarities, and structure similarities [11, 28,29,30,31,32,33,34,35,36,37,38,39,40,41]. The goal of the second category, on the other hand, is to determine the type of side effect between drugs [16, 42,43,44,45]; these are the methods that help reduce the impact of polypharmacy side effects. Some studies that address this issue are described below. Nickel et al. proposed a relational learning approach named RESCAL, which is based on tensor factorization [42]. DEDICOM was introduced by Papalexakis et al. and, like RESCAL, is based on tensor decomposition [43]. The Deep Walk method is based on a neural embedding approach and uses a logistic regression classifier [44, 45]. The concatenated drug features method uses a gradient boosting trees classifier to predict side effects [16]. Zitnik et al. designed a multi-relational method called Decagon, which is based on a tensor factorization decoder [16]. In this study, we develop a neural network-based method for polypharmacy side effects prediction (NNPS). NNPS combines a neural network model with novel features and achieves better results than the 5 well-known methods in terms of accuracy, complexity, and running time.

In the next section, we describe the required datasets and the details of the NNPS algorithm. In the Results section, the results of the NNPS model are compared with those of the Decagon, Concatenated drug features, Deep Walk, DEDICOM, and RESCAL methods. The conclusion and some possible further work are presented in the Discussion section.



In this section, the mono side effects, the drug–protein interactions (DPIs), and the DDI information are presented in detail. We describe the databases below; a summary is given in Table 1.

Table 1 Databases details

Drug–drug interactions and mono side effects information

Multi-drug treatment is common [1,2,3], and the modification of one drug's effect by another drug, called a DDI, can produce adverse side effects. Knowledge of DDI side effects is therefore a key issue in drug development and disease treatment. The DDI side effects (polypharmacy side effects) are collected from the TWOSIDES database [46]. TWOSIDES provides a reliable and comprehensive database of DDIs and contains 1317 side effects on 645 drugs across 63,473 drug pairs. TWOSIDES is extracted from the Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS). Following the previous study on predicting polypharmacy side effects [16], we consider the 964 polypharmacy side effects that occur in at least 500 DDIs.

The side effects of individual drugs (mono side effects) are obtained from the Side Effect Resource (SIDER) and OFFSIDES databases [46, 47]. The information in the SIDER database is extracted from drug labels and contains 1556 drugs and 5868 side effects compiled from public documents. The information in the OFFSIDES database is observed during clinical trials and contains 1332 drugs and 10,097 off-label side effects. Like TWOSIDES, OFFSIDES was generated from FAERS, which collects reports from doctors, patients, and drug companies. Finally, by taking the union of the SIDER and OFFSIDES databases and eliminating synonymous side effects, 10,184 mono side effects are obtained for the 645 drugs in the TWOSIDES database.

Drug–protein interactions

DPIs are obtained from the Search Tool for Interactions of Chemicals (STITCH) database, which provides relationships between drugs and target proteins [48,49,50,51]. Using the STITCH database, we obtain interactions between 8934 proteins and the 645 drugs in the TWOSIDES database. The number of interactions between these proteins and drugs is 18,690.
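As an illustration of how such interaction data can be arranged for later use, the sketch below builds a binary drug-by-protein matrix from a list of (drug, protein) pairs. The binary encoding, the function name `build_interaction_matrix`, and the toy identifiers are assumptions made for illustration and are not taken from the NNPS code. The last line computes the density implied by the counts above (18,690 interactions among 645 × 8934 possible entries), which is roughly 0.3%.

```python
def build_interaction_matrix(drugs, proteins, interactions):
    """Return a binary matrix (list of rows): one row per drug, one column per protein."""
    drug_index = {d: i for i, d in enumerate(drugs)}
    protein_index = {p: j for j, p in enumerate(proteins)}
    matrix = [[0] * len(proteins) for _ in drugs]
    for drug, protein in interactions:
        # Mark a known interaction with 1; all other entries stay 0.
        matrix[drug_index[drug]][protein_index[protein]] = 1
    return matrix


# Hypothetical toy identifiers, not real STITCH entries.
drugs = ["drug_a", "drug_b", "drug_c"]
proteins = ["prot_1", "prot_2", "prot_3", "prot_4"]
interactions = [("drug_a", "prot_1"), ("drug_a", "prot_3"), ("drug_c", "prot_4")]

dpi = build_interaction_matrix(drugs, proteins, interactions)

# Density of the full matrix implied by the counts reported in the text.
density = 18690 / (645 * 8934)
```

A matrix this sparse is one motivation for applying dimension reduction before training.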

Feature vectors

For each side effect, two types of feature matrices are considered: the mono side effects matrix with dimension \(645 \times 10{,}184\) and the DPIs matrix with dimension \(645 \times 8934\). Because the feature vectors are long and sparse, feature extraction can reduce their size without losing important information. So, Principal Component Analysis (PCA) is applied to the mono side effects and DPIs matrices. The minimum number of principal components is chosen such that 95% of the variance in each matrix is retained. The two reduced feature matrices are denoted by \(F_{1}\) with dimension \(645 \times 503\) and \(F_{2}\) with dimension \(645 \times 22\), respectively. Then, concatenating \(F_{1}\) (blue) and \(F_{2}\) (green) yields the drug feature matrix with dimension \(645 \times 525\) (Fig. 1a). The rows of the resulting drug feature matrix correspond to drug IDs, while the columns hold the features. For a given drug pair \((d_{i}, d_{j})\), the i-th and j-th rows of the drug feature matrix are summed to represent the drug pair, and the result is fed to the neural network (Fig. 1b).
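The feature construction described above can be sketched as follows. This is a minimal illustration that assumes NumPy and computes PCA through a singular value decomposition. The helper names `pca_reduce` and `pair_features` are hypothetical, and small random binary matrices stand in for the real \(645 \times 10{,}184\) and \(645 \times 8934\) inputs, so the resulting dimensions differ from the 503 and 22 reported here.

```python
import numpy as np


def pca_reduce(X, variance_to_keep=0.95):
    """Project X onto the fewest principal components that retain the requested variance."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)              # variance ratio per component
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    k = min(k, len(S))
    return Xc @ Vt[:k].T


def pair_features(features, i, j):
    """Represent the drug pair (d_i, d_j) by the sum of the two drug rows."""
    return features[i] + features[j]


rng = np.random.default_rng(0)
mono = rng.integers(0, 2, size=(20, 50)).astype(float)   # toy mono side effect matrix
dpi = rng.integers(0, 2, size=(20, 30)).astype(float)    # toy drug-protein matrix

F1 = pca_reduce(mono)
F2 = pca_reduce(dpi)
drug_features = np.hstack([F1, F2])               # one row per drug
x_pair = pair_features(drug_features, 2, 5)       # input vector for one drug pair
```

Because the pair vector is a sum, it has the same length as a single drug's feature vector and does not depend on the order of the two drugs.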

Fig. 1

For the i-th side effect, the NNPS architecture is used. a Concatenation of the PCA representation of mono side effects \((F_{1})\) (blue) and the PCA representation of drug–protein interactions \((F_{2})\) (green). b Sum of the i-th and j-th rows in the drug feature matrix for each drug pair \((d_{i}, d_{j})\). c A three-layer neural network that computes the probability \(p_{i}\) and classifies the i-th side effect based on the threshold \(\theta _{i}\)

Training the neural network model

The drug pairs associated with each type of side effects are split into training, validation, and test sets, and 5-fold cross-validation is considered. We use 80 percent of drug pairs for the training set, 10 percent for the validation set, and 10 percent for the test set. The following steps are considered to achieve the best neural network architecture based on training datasets.

  1. The number of hidden layers: \(\lbrace 1,2,3,4,5 \rbrace\)
  2. The number of neurons in hidden layers: \(\lbrace 25,50,100,200,300 \rbrace\)
  3. Activation functions: \(\lbrace\)Rectified Linear Unit (ReLU), hyperbolic tangent (tanh), and sigmoid\(\rbrace\)
  4. The dropout rate: \(\lbrace 0.1,0.3,0.5 \rbrace\)
  5. The learning rate: \(\lbrace 0.01,0.001 \rbrace\)
  6. The momentum: \(\lbrace 0.7,0.9 \rbrace\)
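The search ranges listed above can be enumerated programmatically, as in the sketch below. This is only an illustration: the text does not state that an exhaustive grid search was performed, the dictionary keys are hypothetical names, and using a single neuron count per configuration is a simplification (the selected network uses a different width in each hidden layer).

```python
from itertools import product

# Candidate values taken from the list above; key names are illustrative.
search_space = {
    "hidden_layers": [1, 2, 3, 4, 5],
    "neurons": [25, 50, 100, 200, 300],
    "activation": ["relu", "tanh", "sigmoid"],
    "dropout": [0.1, 0.3, 0.5],
    "learning_rate": [0.01, 0.001],
    "momentum": [0.7, 0.9],
}


def iter_configurations(space):
    """Yield one dict per combination of candidate values."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configurations(search_space))
n_configs = len(configs)  # 5 * 5 * 3 * 3 * 2 * 2 combinations
```

Even under this simplification the full grid has 900 combinations, which suggests why only a subset of configurations is usually trained and compared in practice.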

We trained several networks with two, three, four, and five hidden layers and varying numbers of neurons (300, 200, 100, 50 and 25). The best results for each trained network are included in Table 2. As shown in this table, training a network with three hidden layers improves the results without significantly increasing the training time compared with a network with two hidden layers. The results improve slightly for networks with four or five hidden layers, but the computational time increases significantly. Because the other structures bring a significant increase in computational cost and little benefit in model performance, we chose a network with three hidden layers with 300, 200, and 100 neurons, respectively. This network gave good results in terms of both Area Under the Receiver-Operating Characteristic (AUROC) and Area Under the Precision–Recall Curve (AUPRC), with a computational time of 8 h and 40 min.

Table 2 Results of different neural network architectures

The architecture of neural network

The neural network is a feedforward network with fully connected layers consisting of an input layer, three hidden layers, and an output layer (Fig. 1c). The number of input layer neurons equals the size of the feature vector, 525. The output layer has one neuron that outputs a probability value. For the i-th side effect, we assign class 0 (absence of an interaction) or 1 (presence of an interaction) to the output by using a threshold \(\theta _{i}\) in the range (0, 1). If the probability value is greater than \(\theta _{i}\), the method suggests that the i-th side effect is present for the selected pair of drugs; otherwise, this side effect is not present for the considered pair. For weight initialization, the Glorot normal initializer, also called the Xavier normal initializer, is applied [52]. Based on an investigation of candidate activation functions, we use the ReLU activation function between the layers of the model and a sigmoid activation function for the output layer (Fig. 1c). The model parameters are optimized using the binary cross-entropy loss function and Stochastic Gradient Descent (SGD) [53]. In addition, we trained models with different parameters (see Additional file 1: Table S1). We calculated the loss value of each model, averaged over all 964 side effects (MLoss), for each epoch. Figure 2 shows the results of this investigation. In this work, MLoss is obtained by the following formula:

$$\begin{aligned} MLoss_{i} =\frac{\Sigma _{j=1}^{964}Loss_{side~effect_{j}}}{964} ,\quad for\;epoch\; i=1,\ldots ,50 \end{aligned}$$
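The per-epoch averaging in this formula can be written as a small helper. The sketch below is illustrative: the function name `mean_per_epoch` and the toy values are assumptions, and the same helper applies to any per-side-effect metric recorded at each epoch.

```python
def mean_per_epoch(values_by_side_effect):
    """values_by_side_effect[j][i] is the metric of side effect j at epoch i.

    Returns a list with one mean per epoch, averaged over all side effects.
    """
    n_side_effects = len(values_by_side_effect)
    n_epochs = len(values_by_side_effect[0])
    return [
        sum(row[i] for row in values_by_side_effect) / n_side_effects
        for i in range(n_epochs)
    ]


# Toy example: 2 side effects and 3 epochs instead of 964 and 50.
toy_losses = [
    [1.0, 0.5, 0.25],
    [0.5, 0.25, 0.25],
]
mloss = mean_per_epoch(toy_losses)
```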

Figure 3 plots AUROC against the loss value, which we used when selecting the epoch for the best performing model (NNPS). To do so, we calculated the AUROC and loss of the best performing model for each epoch, averaged them over all 964 side effects (MAUROC and MLoss), and plotted them, where MAUROC is obtained by the following formula:

$$\begin{aligned} MAUROC_{i} =\frac{\Sigma _{j=1}^{964}AUROC_{side~effect_{j}}}{964} ,\quad for\;epoch\; i=1,\ldots ,50 \end{aligned}$$

As shown in this figure, the considered structure works well across the 964 polypharmacy side effects. As a result, we selected epoch 50 based on Figs. 2 and 3 for the best performing model of our neural network.
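The architecture described in this section can be sketched as a forward pass in NumPy. This is an illustration under stated assumptions rather than the authors' implementation: the layer sizes follow the text (525 inputs, hidden layers of 300, 200, and 100 neurons, one output), the Glorot normal initializer is approximated with an untruncated normal distribution, biases start at zero, and dropout and the SGD training loop are omitted. All function names are hypothetical.

```python
import numpy as np

LAYER_SIZES = [525, 300, 200, 100, 1]  # input, three hidden layers, output


def glorot_normal(rng, fan_in, fan_out):
    """Glorot/Xavier normal weights: std = sqrt(2 / (fan_in + fan_out))."""
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.normal(0.0, std, size=(fan_in, fan_out))


def init_params(rng, sizes=LAYER_SIZES):
    return [(glorot_normal(rng, m, n), np.zeros(n)) for m, n in zip(sizes[:-1], sizes[1:])]


def forward(params, x):
    """ReLU on the hidden layers, sigmoid on the single output neuron."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(0.0, h @ W + b)
    W, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(h @ W + b)))


def binary_cross_entropy(p, y, eps=1e-7):
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


rng = np.random.default_rng(0)
params = init_params(rng)
batch = rng.normal(size=(4, 525))   # four toy drug-pair feature vectors
probs = forward(params, batch)      # one probability per drug pair
n_parameters = sum(W.size + b.size for W, b in params)
```

With these layer sizes the network has 238,201 trainable parameters, which is small enough to train one model per side effect.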

Fig. 2

Loss curves of models based on different parameters for 50 epochs over all 964 polypharmacy side effects

Fig. 3

MAUROC and MLoss of NNPS model for 50 epochs over all 964 polypharmacy side effects


Training hyperparameters

According to Fig. 2, the hyperparameters of the best model, which we named NNPS, are tuned with 5-fold cross-validation: 50 epochs and a batch size of 1024, with a dropout rate of 0.1 to prevent over-fitting, a learning rate of 0.01, and a momentum value of 0.9, chosen by trial and error. Because the presence or absence of a polypharmacy side effect is determined by a threshold, a ROC curve is plotted for each side effect, and the threshold \(\theta _{i}\) with the highest F-score value is chosen. The hyperparameter values, the standard deviation, and the average thresholds for the NNPS method are shown in Table 3.
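Choosing a threshold by F-score can be sketched as below. The candidate grid, the function names, and the toy labels and probabilities are assumptions for illustration; the text only states that the threshold with the highest F-score is selected for each side effect.

```python
def f_score(y_true, y_pred):
    """F-score (harmonic mean of precision and recall) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, probs, candidates):
    """Return (threshold, F-score) for the candidate with the highest F-score.

    A pair is predicted positive when its probability is strictly greater
    than the threshold; ties keep the smallest such candidate.
    """
    best_theta, best_score = candidates[0], -1.0
    for theta in candidates:
        preds = [1 if p > theta else 0 for p in probs]
        score = f_score(y_true, preds)
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta, best_score


# Toy labels and predicted probabilities for one side effect.
y_true = [0, 0, 1, 1, 1, 0]
probs = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
candidates = [k / 10 for k in range(1, 10)]
theta, score = best_threshold(y_true, probs, candidates)
```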

Table 3 The selected hyperparameter values

Assessment and comparison

In this section, the performance of NNPS is benchmarked against 5 well-known methods, Decagon, Concatenated drug features, Deep Walk, DEDICOM, and RESCAL, for 964 polypharmacy side effects. We adopt 5-fold cross-validation for 50 iterations and use the average of the results to assess the performance of the NNPS method. The average AUROC and AUPRC values of all methods for the 964 polypharmacy side effects are presented in Table 4. Because only the source code and implementation of Decagon are available, we execute 5-fold cross-validation for 50 iterations for the Decagon method and find that the obtained results are very similar to those reported for Decagon in [16]. In Table 4, we report the average of the results we obtained for the Decagon method and, for the other methods whose source code is not available, the performances reported in Table 2 of [16]. According to Table 4, NNPS achieves improvements of 9.2% and 12.8% in AUROC and AUPRC, respectively, over Decagon, which is the best of the other well-known methods. To compare the results of NNPS more precisely, we compare it with Decagon in more detail and on additional criteria. Figure 4 shows the boxplots of the AUROC and AUPRC criteria for the 964 polypharmacy side effects obtained by the NNPS and Decagon methods. As shown in Fig. 4, the medians of AUROC and AUPRC for NNPS are much higher than those for the Decagon method, and the ranges of variation of AUROC and AUPRC for NNPS are smaller than those for the Decagon method, which is evidence of the good performance of NNPS.
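For readers less familiar with the AUROC criterion used above, it can be read as the probability that a randomly chosen positive drug pair receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition and averages it over folds. The function names and toy scores are assumptions for illustration, and this pairwise form is meant for small examples, not as the evaluation code used in this study.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_over_folds(fold_results):
    """Average a metric over cross-validation folds."""
    return sum(fold_results) / len(fold_results)


# Toy fold: the positive scored 0.35 ranks below the negative scored 0.4.
fold_a = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
# Toy fold with a perfect ranking.
fold_b = auroc([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6])
average_auroc = mean_over_folds([fold_a, fold_b])
```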

Fig. 4

Boxplot of area under the receiver-operating characteristic (AUROC) and area under the precision–recall curve (AUPRC) values of all 964 side effects for NNPS and Decagon methods

Table 4 The average of Area under ROC curve (AUROC), area under precision–recall curve (AUPRC) for 964 polypharmacy side effects prediction

For further evaluation, the threshold that produces the best F-score for each polypharmacy side effect is determined for the NNPS and Decagon methods, and the results of NNPS and Decagon are compared on F-score, Accuracy (ACC), and Matthews Correlation Coefficient (MCC). Table 5 reports True Positive (TP), False Positive (FP), True Negative (TN), False Negative (FN), Precision, Recall, F-score, ACC and MCC of these two methods for all 964 side effects. According to Table 5, NNPS outperforms Decagon by about 8.6%, 10.3%, and 18.1% on the F-score, ACC, and MCC criteria, respectively.
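The criteria named above follow from the four confusion-matrix counts by their standard definitions, as the sketch below shows. The function name and the toy counts are illustrative and are not values from Table 5.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, F-score, accuracy, and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_score = (
        2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the confusion matrix is empty;
    # returning 0.0 in that case is a common convention.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Toy counts for a single side effect.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

Unlike accuracy, MCC uses all four counts in both numerator and denominator, so it stays informative when the positive and negative classes are imbalanced.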

Table 5 The average of the best results for NNPS and Decagon methods for 964 side effects
Fig. 5

Receiver-Operating Characteristic (ROC) curve (part a) and loss curve (part b) of Schizoaffective disorder polypharmacy side effect for 50 epochs

Evaluation of feature selection, aggregation, and train/test set sizes

In this part, to show the significance of the PCA algorithm for dimension reduction, we compare it with the low variance filter and autoencoder techniques as two other feature selection methods within NNPS. We use these two techniques to reduce the mono side effects and drug–protein interaction matrices to 503 and 22 features, respectively. Table 6 presents the results of NNPS with these dimension reduction techniques and shows that the performance of the NNPS method is higher when the PCA technique is used. Also, we adopt two operators (i.e., summation and concatenation) to aggregate the feature vectors of two drugs into one feature vector that represents the drug pair in the neural network architecture. As shown in Table 7, the summation operator achieves better results than concatenating the feature vectors of the two drugs as input to the neural network. We train the NNPS method with two different sizes of train, validation, and test sets and present the results in Table 8. This table shows that the performance of the NNPS method decreases very little when the size of the train set is reduced, which is evidence of the advantage of the method. Finally, we compared the performance of our method with four well-known machine learning algorithms using AUROC and AUPRC. The average results of these methods for all 964 polypharmacy side effects are shown in Table 9. According to the values in Table 9, NNPS has the best performance among all methods.
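The two aggregation operators compared above differ in a structural way that a small example makes concrete. The sketch below uses hypothetical function names and toy three-dimensional vectors; with the 525-dimensional drug features, a summed pair vector keeps 525 entries while a concatenated one has 2 × 525 = 1050.

```python
def aggregate_sum(u, v):
    """Element-wise sum: output has the same length as each input."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return [a + b for a, b in zip(u, v)]


def aggregate_concat(u, v):
    """Concatenation: output length is the sum of the input lengths."""
    return list(u) + list(v)


# Toy feature vectors for two drugs.
d_i = [1.0, 0.0, 2.0]
d_j = [0.5, 1.0, 0.0]

summed = aggregate_sum(d_i, d_j)
concatenated = aggregate_concat(d_i, d_j)
```

Summation gives the same vector for \((d_{i}, d_{j})\) and \((d_{j}, d_{i})\), whereas concatenation depends on the order in which the two drugs are listed.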

Table 6 The results for three dimension reduction techniques for 964 side effects
Table 7 Results of two feature aggregation operators for 964 side effects
Table 8 The results of NNPS method with different size of Training set (Tr set), Validation and Test sets (VT sets) of dataset
Table 9 Results of different machine learning methods

Time complexity

Among the previous methods, only the source code and implementation of Decagon are available, so we can only compare the running time of NNPS with that of the Decagon method. The running time of NNPS is about 8 h (Linux (Ubuntu 16.04), 15 CPUs, Intel Xeon(R) 2.00 GHz) on the DPIs and DDIs datasets, which is noticeably faster than Decagon, which requires 15 days for 5-fold cross-validation on a single GTX1080Ti graphics card. This decreased training time, which stems from the simplicity and efficiency of the model, is one of the main advantages of NNPS, which can further be generalized to other purposes and datasets as well.

Discussion and conclusion

Due to the enormous number of drug combinations, screening all possible pairs for polypharmacy side effects is infeasible in terms of cost and time. On the other hand, understanding the side effects of DDIs is an essential step in drug development and drug co-administration. So, computational methods have been developed for predicting polypharmacy side effects. The most recent approach to this task (the Decagon method) predicts polypharmacy side effects with performance of up to 0.874 and 0.825 in terms of AUROC and AUPRC, respectively. In this study, we consider a neural network architecture with novel feature vectors. In the NNPS method, each drug is represented by a feature vector based on mono side effects and drug–protein interactions, and PCA is used for dimension reduction of the feature vectors to decrease the complexity of the method. For a given drug pair, the corresponding drug feature vectors are summed to train the neural network for predicting polypharmacy side effects. The superior performance of NNPS has two reasons. The first and main reason is the novel feature vectors obtained by the dimension reduction technique. The second reason is the choice of a simple neural network architecture. NNPS achieves excellent accuracy on the polypharmacy side effects prediction task, as shown in Additional file 1 and Table 10. We have provided the 10 best and worst performing polypharmacy side effects based on AUROC and AUPRC for both the NNPS and Decagon methods. The results can be found in Additional file 1: Tables S2–S7. These tables show that the performance of the NNPS method is better than that of the Decagon method. Figure 5 part (a) shows the ROC curve for the Schizoaffective disorder side effect (one of the best performances of NNPS). Part (b) of Fig. 5 illustrates the loss curve of the model over the epochs. Similarly, Fig. 6 parts (a) and (b) show the ROC and loss curves, respectively, for NNPS on the Icterus side effect, one of the worst performances of NNPS. As shown by these figures, NNPS works well for each side effect individually and is acceptable with respect to the loss values at epoch 50. Among the side effects with the best performance in NNPS, five important side effects that can lead to death or serious complications are selected [54,55,56,57,58]. The performance results of the NNPS and Decagon methods and the literature evidence supporting these dangerous side effects are collected in Table 10. According to Table 10, for the dangerous polypharmacy side effects NNPS reaches an AUROC of 1.0, whereas the values for Decagon lie between 0.791 and 0.936. Likewise, NNPS reaches an AUPRC of 1.0, whereas the Decagon values lie between 0.789 and 0.911. The findings in this table show that for dangerous side effects the performance of NNPS is higher than that of Decagon, and that NNPS is an effective approach for predicting polypharmacy side effects, especially for detecting dangerous ones.

Fig. 6

Receiver-Operating Characteristic (ROC) curve (part a) and loss curve (part b) of Icterus polypharmacy side effect for 50 epochs

Table 10 Results of dangerous side effects in NNPS and Decagon on AUROC and AUPRC

In summary, the evaluation of NNPS against five well-known methods in terms of accuracy, complexity, and running time shows the performance of the presented method on an essential and challenging problem in pharmacology.

As for future work, we suggest adding the protein–protein interaction information to the model, as it plays a crucial role in many biological functions and may lead to more accurate results. Another avenue for research is to apply the proposed method to other datasets and compare their findings on the association of diseases and polypharmacy side effects with the current work.

Availability of data and materials

Datasets and code for NNPS algorithm are freely accessible at


  1. 1.

    Masnoon N, Shakib S, Kalisch-Ellett L, Caughey GE. What is polypharmacy? A systematic review of definitions. BMC Geriatr. 2017;17(1):1–10.

    Article  Google Scholar 

  2. 2.

    World Health Organization. Medication safety in polypharmacy. Med Without Harm. 2019;1(1):1–63.

    Google Scholar 

  3. 3.

    Wilson M, Mcintosh J, Codina C, Flemming G, Geitona M, Gillespie U, Harrison C, Illario M, Kinnear M, Fernandez-llimos F, Kempen T, Menditto E, Michael N, Scullin C, Wiese B. Alpana Mair Plus the SIMPATHY consortium. Robert Gordon University Aberdeen (2017)

  4. 4.

    Avery T, Barber N, Ghaleb B, Franklin BD, Armstrong S, Crowe S, Dhillon S, Freyer A, Howard R, Pezzolesi C, Serumaga B, Swanwick G, Olanrenwaju T. Investigating the prevalence and causes of prescribing errors in general practice?: The PRACtICe Study (PRevalence And Causes of prescrIbing errors in general practiCe) A report for the GMC. General Med Counc. 2012;1(May):1–187.

    Google Scholar 

  5. 5.

    Rodrigues MCS, De Oliveira C. Interações medicamentosas e reações adversas a medicamentos em polifarmácia em idosos: Uma revisão integrativa. Rev Lat Am Enfermagem. 2016;24:1–17.

    Article  Google Scholar 

  6. 6.

    Shah BM, Hajjar ER. Polypharmacy, adverse drug reactions, and geriatric syndromes. Clin Geriatr Med. 2012;28(2):173–86.

    Article  PubMed  Google Scholar 

  7. 7.

    Oluwaseun E. William\_P. Acad Div Child Health. 2015;101(4):1–13.

    Google Scholar 

  8. 8.

    Verrotti A, Tambucci R, Di Francesco L, Pavone P, Iapadre G, Altobelli E, Matricardi S, Farello G, Belcastro V. The role of polytherapy in the management of epilepsy: suggestions for rational antiepileptic drug selection. Expert Rev Neurother. 2020;20(2):167–73.

    CAS  Article  PubMed  Google Scholar 

  9. 9.

    Hosseini L, Hajibabaee F, Navab E. Reviewing polypharmacy in elderly. Syst Rev Med Sci. 2020;1(1):17–24.

    Google Scholar 

  10. 10.

    Chen C-m, Kuo L-n, Cheng K-j, Shen W-c, Bai K-j, Wang C-c, Chiang Y-c, Chen H-y. The effect of medication therapy management service combined with a national PharmaCloud system for polypharmacy patients. Comput Methods Programs Biomed. 2016;134(1):109–11.

    Article  Google Scholar 

  11. 11.

    Zhang P, Wang F, Hu J, Sorrentino R. Label propagation prediction of drug–drug interactions based on clinical side effects. Sci Rep. 2015;5(1):1–10.

    Article  Google Scholar 

  12. 12.

    Valenza PL, McGinley TC, Feldman J, Patel P, Cornejo K, Liang N, Anmolsingh R, McNaughton N. Dangers of polypharmacy. Vignettes Patient Saf. 2017;1(1):47–69.

    Article  Google Scholar 

  13. 13.

    Stephen LJ, Brodie MJ. Antiepileptic drug monotherapy versus polytherapy: pursuing seizure freedom and tolerability in adults. Curr Opin Neurol. 2012;25(2):164–72.

    CAS  Article  PubMed  Google Scholar 

  14. 14.

    Andrew T, Milinis K, Baker G, Wieshmann U. Self reported adverse effects of mono and polytherapy for epilepsy. Seizure. 2012;21(8):610–3.

    Article  PubMed  Google Scholar 

  15. 15.

    Aggarwal A, Mehta S, Gupta D, Sheikh S, Pallagatti S, Singh R, Singla I. Clinical & immunological erythematosus patients characteristics in systemic lupus Maryam. J Dent Educ. 2012;76(11):1532–9.

    Article  PubMed  Google Scholar 

  16. 16.

    Zitnik M, Agrawal M, Leskovec J. Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics. 2018;34(13):457–66.

    CAS  Article  Google Scholar 

  17. 17.

    St. Louis E. Truly, “rational” polytherapy: maximizing efficacy and minimizing drug interactions, drug load, and adverse effects. Curr Neuropharmacolo. 2009;7(2):96–105.

  18. 18.

    Holmes LB, Mittendorf R, Shen A, Smith CR, Hernandez-Diaz S. Fetal effects of anticonvulsant polytherapies: different risks from different drug combinations. Arch Neurol. 2011;68(10):1273–9.

    Article  Google Scholar 

  19. 19.

    Shtar G, Rokach L, Shapira B. Detecting drug–drug interactions using artificial neural networks and classic graph similarity measures. PLoS ONE 14(8), 1–25 (2019). arXiv:1903.04571

  20. 20.

    Mekonnen AB, Alhawassi TM, McLachlan AJ, Brien JE. Adverse drug events and medication errors in African hospitals: a systematic review. Drugs Real World Outcomes. 2018;5(1):1–24.

    Article  PubMed  Google Scholar 

  21. 21.

    Alsulami Z, Conroy S, Choonara I. Medication errors in the Middle East countries: a systematic review of the literature. Eur J Clin Pharmacol. 2013;69(4):995–1008.

    Article  PubMed  Google Scholar 

  22. 22.

    Sears K, Scobie A, Mackinnon NJ. Patient-related risk factors for self-reported medication errors in hospital and community settings in 8 countries. Can Pharm J. 2012;145(2):88–93.

    Article  Google Scholar 

  23. 23.

    Lin X, Quan Z, Wang Z-J, Ma T, Zeng X. KGNN: knowledge graph neural network for drug-drug interaction prediction. IJCAI. 2020.

    Article  Google Scholar 

  24. 24.

    Davies EA, O’Mahony MS. Adverse drug reactions in special populations—the elderly. Br J Clin Pharmacol. 2015;80(4):796–807.

  25. 25.

    Molokhia M, Majeed A. Current and future perspectives on the management of polypharmacy. BMC Fam Pract. 2017;18(1):1–9.

    Article  Google Scholar 

  26. 26.

    Hubbard RE, O’Mahony MS, Woodhouse KW. Medication prescribing in frail older people. Eur J Clin Pharmacol. 2013;69(3):319–26.

  27. 27.

    Liu R, AbdulHameed MDM, Kumar K, Yu X, Wallqvist A, Reifman J. Data-driven prediction of adverse drug reactions induced by drug–drug interactions. BMC Pharmacol Toxicol. 2017;18(1):1–18.

    CAS  Article  Google Scholar 

  28. 28.

    Zhang W, Chen Y, Liu F, Luo F, Tian G, Li X. Predicting potential drug–drug interactions by integrating chemical, biological, phenotypic and network data. BMC Bioinform. 2017;18(1):1–12.

    CAS  Article  Google Scholar 

  29. 29.

    Lewis R, Guha R, Korcsmaros T, Bender A. Synergy maps: exploring compound combinations using network-based visualization. J Cheminform. 2015;7(1):1–11.

    CAS  Article  Google Scholar 

  30. 30.

    Percha B, Garten Y, Altman RB. Discovery and explanation of drug–drug interactions via text mining. Pac Symp Biocomput. 2012;1:410–21.

    Google Scholar 

  31. 31.

    Vilar S, Friedman C, Hripcsak G. Detection of drug–drug interactions through data mining studies using clinical sources, scientific literature and social media. Brief Bioinform. 2018;19(5):863–77.

    CAS  Article  PubMed  Google Scholar 

  32. 32.

    Chen D, Zhang H, Lu P, Liu X, Cao H. Synergy evaluation by a pathway–pathway interaction network: a new way to predict drug combination. Mol BioSyst. 2016;12(2):614–23.

    CAS  Article  PubMed  Google Scholar 

  33. 33.

    Huang L, Li F, Sheng J, Xia X, Ma J, Zhan M, Wong STC. DrugComboRanker: drug combination discovery based on target network analysis. Bioinformatics. 2014;30(12):228–36.

    CAS  Article  Google Scholar 

  34. 34.

    Sun Y, Sheng Z, Ma C, Tang K, Zhu R, Wu Z, Shen R, Feng J, Wu D, Huang D, Huang D, Fei J, Liu Q, Cao Z. Combining genomic and network characteristics for extended capability in predicting synergistic drugs for cancer. Nat Commun. 2015;6:1–10.

    CAS  Article  Google Scholar 

  35. 35.

    Takeda T, Hao M, Cheng T, Bryant SH, Wang Y. Predicting drug–drug interactions through drug structural similarities and interaction networks incorporating pharmacokinetics and pharmacodynamics knowledge. J Cheminform. 2017;9(1):1–9.

    CAS  Article  Google Scholar 

  36. 36.

    Gottlieb A, Stein GY, Oron Y, Ruppin E, Sharan R. INDI: a computational framework for inferring drug interactions and their associated recommendations. Mol Syst Biol. 2012;8(592):1–12.

    CAS  Article  Google Scholar 

  37. 37.

    Li X, Xu Y, Cui H, Huang T, Wang D, Lian B, Li W, Qin G, Chen L, Xie LCO. Artif Intell Med. 2017;17(83):35–43.

  38.

    Li J, Zheng S, Chen B, Butte AJ, Swamidass SJ, Lu Z. A survey of current trends in computational drug repositioning. Brief Bioinform. 2016;17(1):2–12.

  39.

    Zitnik M, Zupan B. Data fusion by matrix factorization. IEEE Trans Pattern Anal Mach Intell. 2015;37(1):41–53. arXiv:1307.0803.

  40.

    Ferdousi R, Safdari R, Omidi Y. Computational prediction of drug–drug interactions based on drugs functional similarities. J Biomed Inform. 2017;70:54–64.

  41.

    Vilar S, Harpaz R, Uriarte E, Santana L, Rabadan R, Friedman C. Drug–drug interaction through molecular structure similarity analysis. J Am Med Inform Assoc. 2012;19(6):1066–74.

  42.

    Nickel M, Tresp V, Kriegel HP. A three-way model for collective learning on multi-relational data. In: Proceedings of the 28th international conference on machine learning, ICML 2011, vol. 1, p. 809–16 (2011)

  43.

    Papalexakis EE, Faloutsos C, Sidiropoulos ND. Tensors for data mining and data fusion: Models, applications, and scalable algorithms. ACM Trans Intell Syst Technol. 2016;8(2):1–44.

  44.

    Perozzi B, Al-Rfou R, Skiena S. DeepWalk: Online learning of social representations. In: Proceedings of the ACM SIGKDD international conference on knowledge discovery and data mining, vol. 1, no. 1, p. 701–10, 2014. arXiv:1403.6652.

  45.

    Zong N, Kim H, Ngo V, Harismendy O. Deep mining heterogeneous networks of biomedical linked data to predict novel drug-target associations. Bioinformatics. 2017;33(15):2337–44.

  46.

    Martinez CJ, Torrie JH, Allen ON. Correlation analysis of criteria of symbiotic nitrogen fixation by soybeans (Glycine max Merr.). Zentralblatt fur Bakteriologie, Parasitenkunde, Infektionskrankheiten und Hygiene. Zweite naturwissenschaftliche Abt.: Allgemeine, landwirtschaftliche und technische Mikrobiologie. 1970;124(3):212–6.

  47.

    Kuhn M, Letunic I, Jensen LJ, Bork P. The SIDER database of drugs and side effects. Nucleic Acids Res. 2016;44(D1):1075–9.

  48.

    Menche J, Sharma A, Kitsak M, Ghiassian SD, Vidal M, Loscalzo J, Barabási AL. Uncovering disease–disease relationships through the incomplete interactome. Science. 2015;347(6224):841.

  49.

    Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O’Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M. The BioGRID interaction database: 2015 update. Nucleic Acids Res. 2015;43(D1):470–8.

  50.

    Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, Santos A, Doncheva NT, Roth A, Bork P, Jensen LJ, Von Mering C. The STRING database in 2017: quality-controlled protein–protein association networks, made broadly accessible. Nucleic Acids Res. 2017;45(D1):362–8.

  51.

    Rolland T, Taşan M, Charloteaux B, Pevzner SJ, Zhong Q, Sahni N, Yi S, Lemmens I, Fontanillo C, Mosca R, Kamburov A, Ghiassian SD, Yang X, Ghamsari L, Balcha D, Begg BE, Braun P, Brehme M, Broly MP, Carvunis AR, Convery-Zupan D, Corominas R, Coulombe-Huntington J, Dann E, Dreze M, Dricot A, Fan C, Franzosa E, Gebreab F, Gutierrez BJ, Hardy MF, Jin M, Kang S, Kiros R, Lin GN, Luck K, Macwilliams A, Menche J, Murray RR, Palagi A, Poulin MM, Rambout X, Rasla J, Reichert P, Romero V, Ruyssinck E, Sahalie JM, Scholz A, Shah AA, Sharma A, Shen Y, Spirohn K, Tam S, Tejeda AO, Trigg SA, Twizere JC, Vega K, Walsh J, Cusick ME, Xia Y, Barabási AL, Iakoucheva LM, Aloy P, De Las Rivas J, Tavernier J, Calderwood MA, Hill DE, Hao T, Roth FP, Vidal M. A proteome-scale map of the human interactome network. Cell. 2014;159(5):1212–26.

  52.

    Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. J Mach Learn Res. 2010;9(1):249–56.

  53.

    Bottou L. Stochastic gradient descent tricks. Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and Lecture notes in bioinformatics). 2012;7700:421–36.

  54.

    Serban B, Panti Z, Nica M, Pleniceanu M, Popa M, Ene R, Cîrstoiu C. Statistically based survival rate estimation in patients with soft tissue tumors. Rom J Orthop Surg Traumatol. 2019;1(2):84–9.

  55.

    Arbyn M, Weiderpass E, Bruni L, de Sanjosé S, Saraiya M, Ferlay J, Bray F. Estimates of incidence and mortality of cervical cancer in 2018: a worldwide analysis. Lancet Glob Health. 2020;8(2):191–203.

  56.

    Januszewicz A, Guzik T, Prejbisz A, Mikołajczyk T, Osmenda G, Januszewicz W. 158_Prejbisz_ONLINE. PALSKIE 126Janusze(1), 86–93 (2016)

  57.

    Atci IB, Yilmaz H, Yaman M, Baran O, Türk O, Solmaz B, Kocaman Ü, Ozdemir NG, Demirel N, Kocak A. Incidence, hospital costs and in-hospital mortality rates of surgically treated patients with traumatic cranial epidural hematoma. Rom Neurosurg. 2018;32(1):110–5.

  58.

    Evans EC, Matteson KA, Orejuela FJ, Alperin M, Balk EM, El-Nashar S, Gleason JL, Grimes C, Jeppson P, Mathews C, Wheeler TL, Murphy M. Salpingo-oophorectomy at the time of benign hysterectomy: a systematic review. Obstet Gynecol. 2016;128(3):476–85.



Changiz Eslahchi and others would like to thank the School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM). The support of the Computing Center of IPM in performing parallel computing is gratefully acknowledged.


No funding to declare.

Author information




CHE and RA developed the methods. RM performed the computational and statistical analysis. CHE, RM, and RA designed the paper, and RM wrote it. CHE and RA contributed to writing and editing the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Rosa Aghdam or Changiz Eslahchi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Different hyperparameter values for the 964 side effects of each model, and the results for the 10 best- and 10 worst-performing polypharmacy side effects in NNPS and Decagon in terms of AUROC and AUPRC. Bold numbers show the best performance for each criterion.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit the Creative Commons website. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Masumshah, R., Aghdam, R. & Eslahchi, C. A neural network-based method for polypharmacy side effects prediction. BMC Bioinformatics 22, 385 (2021).


  • Polypharmacy side effects prediction
  • Neural network
  • Drug–protein interactions
  • Drug–drug interactions