 Research
 Open access
GACNNMDA: a computational model for predicting potential human microbe-drug associations based on graph attention network and CNN-based classifier
BMC Bioinformatics volume 24, Article number: 35 (2023)
Abstract
As new drug targets, human microbes have been shown to be closely related to human health. Effective computational methods for inferring potential microbe-drug associations can provide a useful complement to conventional experimental methods and facilitate drug research and development. However, predicting potential interactions for new microbes or new drugs remains challenging, since the number of known microbe-drug associations is currently very limited. In this manuscript, we first constructed two heterogeneous microbe-drug networks based on multiple similarity measures for microbes and drugs, together with known microbe-drug associations or known microbe-disease-drug associations, respectively. We then established two feature matrices for microbes and drugs by concatenating various attributes of microbes and drugs. Next, taking these two feature matrices and the two heterogeneous microbe-drug networks as inputs of a two-layer graph attention network, we obtained low-dimensional feature representations for microbes and drugs separately. Finally, by integrating the low-dimensional feature representations with the two feature matrices to form the inputs of a convolutional neural network, we designed a novel computational model named GACNNMDA to predict possible scores of microbe-drug pairs. Experimental results show that the predictive performance of GACNNMDA is superior to that of existing advanced methods. Furthermore, case studies on well-known microbes and drugs also demonstrate the effectiveness of GACNNMDA. Source code and supplementary materials are available at: https://github.com/tyqGitHub/TYQ/tree/master/GACNNMDA
Background
Research shows that microorganisms play an integral and often unique role in human beings [1]. The microbiota and its metabolites are essential to the regulation of host metabolism and immunity [2]. Microbes have a great impact on human health in many ways, including resistance to the invasion of opportunistic pathogens [3], promotion of sugar metabolism and synthesis of the vitamins needed to boost T-cell responses [4]. In recent years, different aspects of the microbiome and its potential role in human health, including early life and specific diseases, have been widely reported. For instance, Sprockett et al. explored how priority effects might influence microbial communities in the gastrointestinal tract during early childhood [5]. Ximenez et al. discussed the development of the microbiota during early life, from pregnancy to delivery to the first years after birth [6]. In addition, it has been demonstrated that the intestinal microbiota plays a key role in cardiometabolic disorders, inflammatory bowel diseases, neuropsychiatric diseases and cancer [7,8,9,10,11,12]. Moreover, bacteria and viruses have been shown to cause infectious diseases such as COVID-19 [13].
Meanwhile, studies show that when drugs are used to treat diseases, not only can the administration of drugs affect the microbiome, but microbial metabolism can also significantly affect the clinical response to drugs [14, 15]. For example, penicillin is an important and highly effective antibiotic that has been used to treat pneumonia, meningitis, endocarditis, diphtheria, anthrax and so on. However, the widespread use of antibiotics has led to the development of resistance in human microbes such as Staphylococcus aureus and Escherichia coli. As a result, there is an urgent need to uncover potential associations between microbes and drugs for drug development. Considering that traditional biological experiments are expensive and time-consuming, it is meaningful to develop computational models to infer possible associations between microbes and drugs, because these models can be used to guide the design of wet-lab experiments efficiently.
With the development of bioinformatics technologies, several well-known public microbe-drug association databases such as MDAD [16], aBiofilm [17] and Drugvirus [18] have been constructed in recent years. Based on these databases, researchers around the world have proposed a large number of prediction methods for identifying latent associations between microbe-drug pairs. For example, by introducing the KATZ metric to detect possible associations between microbe-drug pairs, Zhu et al. designed a prediction model named HMDAKATZ [19]. By integrating the metapath2vec scheme with a bipartite network recommendation algorithm, Long et al. proposed a computational approach called HNERMDA to infer microbe-drug associations [20]. Additionally, in 2021, Zhu et al. introduced a prediction method based on Laplacian regularized least squares called LRLSMDA, which can discover latent associations between microbe-drug pairs effectively [21]. In [22], by combining the graph convolutional network (GCN) with the conditional random field (CRF), Long et al. devised a computational model named GCNMDA to predict possible microbe-drug associations. In [23], Long et al. constructed a framework of graph attention networks called EGATMDA for latent microbe-drug association prediction. Furthermore, in 2022, Deng et al. designed a multimodal variational graph embedding model named Graph2MDA for the prediction of possible microbe-drug associations [24].
Inspired by the above methods, in this manuscript we propose a novel computational model called GACNNMDA, which combines the graph attention network (GAT) with a convolutional neural network (CNN)-based classifier to discover potential microbe-drug associations. In GACNNMDA, we first constructed two heterogeneous microbe-drug networks by combining multiple similarity measures for microbes and drugs with known microbe-drug associations or known microbe-disease-drug associations, respectively. Then, by leveraging multiple types of microbe and drug features, we established two feature matrices for microbes and drugs. Next, after inputting these two feature matrices and the two heterogeneous microbe-drug networks into a two-layer GAT, we obtained low-dimensional feature representations for microbes and drugs, respectively. Finally, we designed a CNN-based classifier to predict possible scores of microbe-drug pairs, whose inputs are formed by integrating the low-dimensional feature representations with the two feature matrices. Moreover, to verify the predictive performance of GACNNMDA, we performed intensive comparison experiments and case studies. Experimental results demonstrated that GACNNMDA outperformed existing representative competing methods and can achieve satisfactory performance in latent microbe-drug association prediction.
Data sources
First, we downloaded known microbe-drug associations from the MDAD database (http://www.chengroup.cumt.edu.cn/MDAD/), which includes 2470 clinically or experimentally verified microbe-drug associations between 1373 drugs and 173 microbes.
Second, we downloaded known associations among microbes, drugs and diseases from the dataset collected by Wang et al. [25], which consists of 70,315 known drug-disease associations and 15,633 known microbe-disease associations. After removing associations involving diseases that have no known association with any drug or microbe included in MDAD, we obtained 1121 different drug-disease associations between 233 drugs and 109 diseases, and 402 different microbe-disease associations between 73 microbes and 109 diseases.
Finally, from the dataset constructed by Deng et al. [24], we collected 5586 known drug-drug interactions covering 1228 drugs in MDAD, and 138 microbe-microbe interactions covering 123 microbes in MDAD. Details of these data are shown in Table 1.
For convenience, the newly downloaded datasets of diseases, drugs, microbes, drug-disease associations, drug-drug interactions, microbe-drug associations, microbe-disease associations and microbe-microbe interactions are provided in Additional files 1–8, respectively.
Methods
As shown in Fig. 1, GACNNMDA mainly consists of three parts:
Part 1: Using multiple similarity measures, two heterogeneous networks HN_{1} and HN_{2} are constructed based on the downloaded known microbe-drug associations, drug-drug interactions and microbe-microbe interactions.
Part 2: Two feature matrices are first obtained for microbes and drugs by leveraging various attributes of microbes and drugs. Then, taking these two feature matrices and the two heterogeneous networks as inputs, a two-layer graph attention network is designed to learn low-dimensional feature representations for microbes and drugs.
Part 3: A CNN-based classifier is introduced to calculate possible scores of drug-microbe associations. Its inputs are formed by integrating the newly learned low-dimensional feature representations with the two feature matrices.
Construction of two heterogeneous networks
For convenience, let n_{r} and n_{m} represent the numbers of newly downloaded drugs and microbes, respectively. First, based on the newly downloaded known microbe-drug associations, we can obtain a microbe-drug adjacency matrix \(A^{1} \in R^{{n_{r} *n_{m} }}\) as follows: for any given drug r_{i} and microbe m_{j}, if there is a known association between them, then \(A^{1} \left( {i,j} \right) = 1\); otherwise \(A^{1} \left( {i,j} \right) = 0\).
Second, based on the newly downloaded known microbe-drug, microbe-disease and drug-disease associations, we can obtain another microbe-drug adjacency matrix \(A^{2} \in R^{{n_{r} *n_{m} }}\) as follows: for any given drug r_{i} and microbe m_{j}, if there exists a disease d_{k} with a known association to both r_{i} and m_{j}, then \(A^{2} \left( {i,j} \right) = 1\); otherwise \(A^{2} \left( {i,j} \right) = A^{1} \left( {i,j} \right)\).
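The two definitions above can be illustrated with a minimal sketch. It assumes the associations are given as lists of index pairs; the function and argument names are hypothetical and are not taken from the authors' code.

```python
def build_adjacency(n_r, n_m, drug_microbe, drug_disease, microbe_disease):
    """Build A1 (direct associations) and A2 (A1 plus disease-mediated links).

    drug_microbe:    iterable of (drug index, microbe index) pairs
    drug_disease:    iterable of (drug index, disease index) pairs
    microbe_disease: iterable of (microbe index, disease index) pairs
    (hypothetical input layout, chosen for illustration only)
    """
    # A1(i, j) = 1 iff drug i and microbe j have a known association.
    a1 = [[0] * n_m for _ in range(n_r)]
    for i, j in drug_microbe:
        a1[i][j] = 1

    # Group drugs and microbes by the disease they are associated with.
    drugs_of = {}
    for i, k in drug_disease:
        drugs_of.setdefault(k, set()).add(i)
    microbes_of = {}
    for j, k in microbe_disease:
        microbes_of.setdefault(k, set()).add(j)

    # A2 starts as a copy of A1; a shared disease sets the entry to 1.
    a2 = [row[:] for row in a1]
    for k, drugs in drugs_of.items():
        for i in drugs:
            for j in microbes_of.get(k, ()):
                a2[i][j] = 1
    return a1, a2


a1, a2 = build_adjacency(
    n_r=2, n_m=3,
    drug_microbe=[(0, 0)],
    drug_disease=[(1, 0)],     # drug 1 is associated with disease 0
    microbe_disease=[(2, 0)],  # microbe 2 is associated with disease 0
)
print(a1)  # [[1, 0, 0], [0, 0, 0]]
print(a2)  # [[1, 0, 0], [0, 0, 1]]
```

In this toy example, drug 1 and microbe 2 share disease 0, so the pair gains an entry in A2 even though it has no direct association in A1.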
Finally, based on the above matrices \({A}^{1}\) and \({A}^{2}\), we can construct two heterogeneous networks HN_{1} and HN_{2} respectively according to the methods proposed in the following “Calculation of the Gaussian interaction profile (GIP) kernel similarity for microbes and drugs” to “Integrated similarity” sections.
Calculation of the Gaussian interaction profile (GIP) kernel similarity for microbes and drugs
Let \(A^{v} \left( {r_{i} } \right)\) and \(A^{v} \left( {m_{j} } \right)\) denote the \(i\)th row and the \(j\)th column of \(A^{v}\) (v = 1,2) respectively, and let \(\left\| \bullet \right\|\) represent the Frobenius norm. Then, for any two given drugs \(r_{i}\) and \(r_{j}\), we can calculate the GIP kernel similarity between them as follows:
According to the above equations, we can obtain a GIP kernel similarity matrix \(S_{rg}^{v} \in R^{{n_{r} {*}n_{r} }}\) for drugs.
Similarly, for any two given microbes \(m_{i}\) and \(m_{j}\), we can calculate the GIP kernel similarity between them as follows:
According to the above equations, we can obtain a GIP kernel similarity matrix \(S_{mg}^{v} \in R^{{n_{m} *n_{m} }}\) for microbes.
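As a minimal sketch of the GIP kernel computation, the code below assumes the form commonly used in the literature, exp(-γ‖a_i − a_j‖²), with the bandwidth γ set to the inverse of the mean squared profile norm. That bandwidth convention and the function names are assumptions made for illustration.

```python
import math


def gip_kernel(profiles):
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    n = len(profiles)
    # Bandwidth: inverse of the mean squared norm of the profiles
    # (assumed convention; guard against an all-zero matrix).
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = 1.0 / mean_sq_norm if mean_sq_norm > 0 else 1.0
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


# Toy adjacency matrix: 3 drugs (rows) x 3 microbes (columns).
A = [[1, 0, 1],
     [1, 0, 0],
     [0, 1, 0]]
S_rg = gip_kernel(A)             # drug-drug similarity from the rows
S_mg = gip_kernel(transpose(A))  # microbe-microbe similarity from the columns
```

Drug profiles are the rows of the adjacency matrix and microbe profiles are its columns, so the same function serves both cases after a transpose.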
Calculation of the Hamming interaction profile (HIP) similarity for microbes and drugs
The HIP similarity is based on the assumption that two nodes have lower similarity when their interaction profiles differ more. Let \(\left| \bullet \right|\) denote the number of elements in a profile; then for any two given drugs r_{i} and r_{j}, we can calculate the HIP similarity between them as follows:
where \(\left| {A\left( {r_{i} } \right) \ne A\left( {r_{j} } \right)} \right|\) denotes the number of different elements between the profiles \(A\left( {r_{i} } \right)\) and \(A\left( {r_{j} } \right)\).
Similarly, for any two given microbes m_{i} and m_{j}, we can calculate the HIP similarity between them as follows:
where \(\left| {A\left( {m_{i} } \right) \ne A\left( {m_{j} } \right)} \right|\) denotes the number of different elements between the profiles \(A\left( {m_{i} } \right)\) and \(A\left( {m_{j} } \right)\).
According to the above equations, we can obtain two HIP similarity matrices \(S_{rh}^{v} \in R^{{n_{r} *n_{r} }}\) and \(S_{mh}^{v} \in R^{{n_{m} *n_{m} }}\), respectively.
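A minimal sketch of the HIP similarity follows. It assumes the similarity is one minus the fraction of positions at which two profiles differ, which is consistent with the quantities defined above; the exact normalization is an assumption, and the function name is hypothetical.

```python
def hip_similarity(profiles):
    """Hamming interaction profile similarity between the rows of `profiles`."""
    n = len(profiles)
    length = len(profiles[0])  # number of elements in a profile
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Count positions where the two profiles disagree.
            differing = sum(1 for a, b in zip(profiles[i], profiles[j]) if a != b)
            sim[i][j] = 1.0 - differing / length
    return sim


# Toy profiles: 3 drugs, each described by 4 microbe associations.
profiles = [[1, 0, 1, 0],
            [1, 1, 1, 0],
            [0, 1, 0, 1]]
S_rh = hip_similarity(profiles)
print(S_rh[0])  # [1.0, 0.75, 0.0]
```

Identical profiles score 1, and profiles that disagree at every position score 0.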
Integrated similarity
Based on \(S_{rg}^{v}\), \(S_{rh}^{v}\) and the newly downloaded known drug-drug interactions, for any two given drugs r_{i} and r_{j}, we can calculate an integrated similarity between them as follows:
In the same way, based on \(S_{mg}^{v}\), \(S_{mh}^{v}\) and the newly downloaded known microbe-microbe interactions, for any two given microbes m_{i} and m_{j}, we can calculate an integrated similarity between them as follows:
Hence, based on above newly obtained matrices, we can obtain two new matrices \(H^{1} \in R^{{\left( {n_{r} + n_{m} } \right)*\left( {n_{r} + n_{m} } \right)}}\) and \(H^{2} \in R^{{\left( {n_{r} + n_{m} } \right)*\left( {n_{r} + n_{m} } \right)}}\) as follows:
Obviously, according to above two matrices \(H^{1}\) and \(H^{2}\), we can easily construct two heterogeneous networks HN_{1} and HN_{2} respectively.
Low dimensional feature representations learning for microbes and drugs based on the graph attention network
Construction of two feature matrices
In this section, for any two given drugs r_{i} and r_{j}, we first adopted SIMCOMP2 [26] to calculate the structural similarity between them, yielding a drug structural similarity matrix \(S_{rc}\). Likewise, for any two given microbes m_{i} and m_{j}, we adopted the method proposed by Kamneva et al. [27] to calculate the functional similarity between them, yielding a microbe functional similarity matrix \(S_{mf}\).
Moreover, we performed a random walk with restart (RWR) on \(S_{r}^{v}\) and \(S_{m}^{v}\) to obtain the topological attributes \(S_{rr}^{v} , S_{mm}^{v}\) of drugs and microbes, respectively, where the RWR was defined as follows:
Here, \(p_{i}^{l}\) denotes the probabilities that node \(i\) reaches other nodes at time step \(l\), M is the transition probability matrix, and \(\varepsilon_{i} \in R^{1*n}\) represents the initial probability vector of node \(i\).
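The random walk with restart can be sketched as below, assuming the usual update p^{l+1} = (1 − α) p^{l} M + α ε_{i} with a row-normalized similarity matrix as M. The restart probability α (`restart`), its default value and the stopping tolerance are hypothetical choices made for illustration.

```python
def row_normalize(similarity):
    """Turn a non-negative similarity matrix into a row-stochastic matrix."""
    out = []
    for row in similarity:
        total = sum(row)
        out.append([x / total if total > 0 else 0.0 for x in row])
    return out


def random_walk_with_restart(similarity, restart=0.1, tol=1e-10, max_iter=1000):
    """Return one stationary probability vector per start node."""
    m = row_normalize(similarity)
    n = len(m)
    result = []
    for i in range(n):
        eps = [1.0 if k == i else 0.0 for k in range(n)]  # start at node i
        p = eps[:]
        for _ in range(max_iter):
            # p_next = (1 - restart) * p @ M + restart * eps
            nxt = [
                (1 - restart) * sum(p[a] * m[a][b] for a in range(n))
                + restart * eps[b]
                for b in range(n)
            ]
            delta = sum(abs(x - y) for x, y in zip(nxt, p))
            p = nxt
            if delta < tol:
                break
        result.append(p)
    return result


S = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.2],
     [0.0, 0.2, 1.0]]
topology = random_walk_with_restart(S)
```

Each row of `topology` is a probability distribution over nodes and can serve as a topological attribute vector for the corresponding start node.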
Different from the usual weighted addition of various attribute vectors of nodes to form the feature matrix, we spliced various attributes together to retain more original features. The feature matrices \(X^{v} \in R^{{\left( {n_{r} + n_{m} } \right)*k_{1} }}\) for two heterogeneous networks were defined as follows:
where \(k_{1}\) denotes the dimension of the feature matrices \(X^{v}\).
The structure of the graph attention network
Encoder: Firstly, for any given node \(i\) in \(H^{v} \left( {v = 1,2} \right)\), the coefficient of similarity between it and its neighbors would be calculated as follows:
Here, \(X^{v} \left( i \right)\) denotes the \(i\)th row of \(X^{v}\) and \(a\) represents a feature mapping operation. \(W^{v}\) is a trainable weight matrix, \(\Phi_{i}^{v}\) is the set of neighbor nodes of node \(i\) in \(H^{v}\), and \(\mu\) is a hyperparameter.
Subsequently, the attention score \(\lambda_{ij}\) between node \(i\) and node \(j\) would be calculated based on \(e_{ij}\) according to the following formula:
Finally, the features would be weighted and summed according to the calculated attention score to obtain the new feature representation of node \(i\) as follows:
After obtaining new feature representations of all nodes in \(H^{v}\), we can construct a feature representation matrix \(Y^{v} = \left[ {\begin{array}{*{20}c} {R_{r}^{v} } \\ {R_{m}^{v} } \\ \end{array} } \right] \in R^{{\left( {n_{r} + n_{m} } \right)*k_{2} }}\), where \(k_{2}\) denotes the dimension of the feature representation matrix \(Y^{v}\).
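The three encoder steps (similarity coefficients, attention scores, weighted sum) can be sketched with a single-head layer in the style of the standard graph attention network. This is an illustration under assumptions: the coefficient is taken as LeakyReLU(a · [Wx_i ‖ Wx_j]), the LeakyReLU slope stands in for the hyperparameter \(\mu\), and the final nonlinearity is omitted. The authors' exact formulation may differ.

```python
import math


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wk * xk for wk, xk in zip(row, x)) for row in w]


def leaky_relu(z, slope):
    return z if z > 0 else slope * z


def gat_layer(features, neighbors, w, a, slope=0.2):
    """One single-head attention layer.

    features:  list of input vectors, one per node
    neighbors: neighbors[i] is a non-empty list of neighbor indices of node i
    w:         weight matrix (out_dim x in_dim)
    a:         attention vector of length 2 * out_dim
    """
    h = [matvec(w, x) for x in features]
    out = []
    for i, nbrs in enumerate(neighbors):
        # Step 1: unnormalized coefficients e_ij for each neighbor j.
        e = [
            leaky_relu(sum(ak * v for ak, v in zip(a, h[i] + h[j])), slope)
            for j in nbrs
        ]
        # Step 2: softmax over the neighborhood gives attention scores.
        e_max = max(e)
        exp_e = [math.exp(v - e_max) for v in e]
        total = sum(exp_e)
        scores = [v / total for v in exp_e]
        # Step 3: attention-weighted sum of the transformed neighbor features.
        dim = len(h[i])
        out.append(
            [sum(s * h[j][d] for s, j in zip(scores, nbrs)) for d in range(dim)]
        )
    return out


features = [[1.0, 0.0], [0.0, 1.0]]
neighbors = [[0, 1], [0, 1]]      # each node attends to itself and the other
w = [[1.0, 0.0], [0.0, 1.0]]      # identity weights for the toy example
a = [0.0, 0.0, 0.0, 0.0]          # zero attention vector -> uniform scores
Y = gat_layer(features, neighbors, w, a)
print(Y)  # [[0.5, 0.5], [0.5, 0.5]]
```

With a zero attention vector every neighbor receives the same score, so each output is the plain average of the neighbor features; trained values of `w` and `a` would weight neighbors unequally.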
Decoder: The decoder runs an inner product based on newly learned feature representation matrix \({Y}^{v}\) as follows:
Optimization
Considering the reconstructed matrix should be as similar as possible to the original matrix, we adopted the MSE loss function to compute the mean of the sum of squares of the differences between \(Y^{v\prime }\) and \(H^{v}\) as follows:
where \(Y^{v\prime } \left( i \right)\) and \(H^{v} \left( i \right)\) denote the \(i\)th row of \(Y^{v\prime }\) and \(H^{v}\) respectively. During training, we used the Adam optimizer to optimize the loss function.
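A minimal sketch of the decoder and the reconstruction loss is given below. It assumes a plain inner product \(Y Y^{T}\) without an output activation and averages the squared error over all matrix entries; both choices are assumptions, and the function names are hypothetical.

```python
def inner_product_decoder(y):
    """Reconstruct a node-by-node matrix as Y @ Y^T."""
    return [[sum(a * b for a, b in zip(yi, yj)) for yj in y] for yi in y]


def mse_loss(pred, target):
    """Mean squared error over all matrix entries."""
    count = len(pred) * len(pred[0])
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ) / count


Y = [[1.0, 0.0],
     [0.0, 1.0]]
H = [[1.0, 1.0],
     [1.0, 1.0]]
Y_rec = inner_product_decoder(Y)
print(Y_rec)               # [[1.0, 0.0], [0.0, 1.0]]
print(mse_loss(Y_rec, H))  # 0.5
```

In training, this loss would be minimized with respect to the encoder parameters, for example with the Adam optimizer mentioned above.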
Construction of the CNNbased classifier
In this section, we treated microbe-drug association prediction as a binary classification problem and designed a classifier based on a convolutional neural network to calculate possible scores of potential drug-microbe associations. For the input of the classifier, we first constructed two new feature matrices \(N_{r}^{v}\) and \(N_{m}^{v}\) for drugs and microbes, respectively, as follows:
Then, let \(k_{3}\) denote the dimension of the new feature matrices. For any given drug \(r_{i}\) and microbe \(m_{j}\), the feature matrix \(F^{v} \left( {i,j} \right) = \left[ {\begin{array}{*{20}c} {N_{r}^{v} \left( i \right)} \\ {N_{m}^{v} \left( j \right)} \\ \end{array} } \right] \in R^{{2*k_{3} }}\) is fed into the classifier to calculate the score between \(r_{i}\) and \(m_{j}\). Here, \(N_{r}^{v} \left( i \right)\) and \(N_{m}^{v} \left( j \right)\) denote the \(i\)th row of \(N_{r}^{v}\) and the \(j\)th row of \(N_{m}^{v}\), respectively.
In the convolutional layer, we adopted zero padding to enlarge the edges and set the size of the convolution kernel to 3 × 3. The convolutional operation in the \(i\)th layer was defined as follows:
where \(\otimes\) represents the operation of convolution, \(G_{i}\) is the weight matrix, and \(b_{i}\) is the offset vector. It is worth mentioning that we added BatchNorm2d [28] before the ReLU activation to normalize the data and enhance performance stability.
After passing through two convolutional layers, the input is flattened into a vector. A fully connected layer and a softmax layer are then used to obtain scores for the two association categories, and we adopt the score of the second category as the predicted score of a potential microbe-drug association in GACNNMDA. Based on \(H^{1}\) and \(H^{2}\), we can thus obtain two score matrices \(Score^{1}\) and \(Score^{2}\), respectively. Hence, a final score matrix \(Score \in R^{{n_{r} *n_{m} }}\) can be calculated as follows:
Moreover, in the classifier, we used the cross-entropy as the loss function and the Adam optimizer to minimize it. Here, the loss function \(L^{v}\) (v = 1, 2) was defined as follows:
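The effect of zero padding with a 3 × 3 kernel on a \(2 \times k_{3}\) input can be seen in the small sketch below. It implements a single-channel convolution (as cross-correlation, the convention used by deep learning frameworks) followed by ReLU. Batch normalization is omitted, and the kernel values are arbitrary illustrative numbers, not learned weights.

```python
def conv3x3_same(x, kernel, bias=0.0):
    """3x3 convolution with zero padding, so the output keeps the input shape."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = bias
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    # Positions outside the input act as zeros (zero padding).
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[dr + 1][dc + 1] * x[rr][cc]
            out[r][c] = max(0.0, total)  # ReLU
    return out


# Toy pair input F(i, j): row 0 is a drug feature vector, row 1 a microbe one.
F = [[1.0, 1.0, 1.0, 1.0],
     [1.0, 1.0, 1.0, 1.0]]
ones_kernel = [[1.0] * 3 for _ in range(3)]
feature_map = conv3x3_same(F, ones_kernel)
print(feature_map)  # [[4.0, 6.0, 6.0, 4.0], [4.0, 6.0, 6.0, 4.0]]
```

Because the input has only two rows, every 3 × 3 window spans both the drug row and the microbe row, so each output value mixes features from both members of the pair.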
where \(a_{ij}^{v}\) and \(s_{ij}^{v}\) represent the \(ij\)th entry of \(A^{v}\) and \(Score^{v}\) respectively.
Results
Comparison with state-of-the-art methods
Considering that few computational methods and codes are available for microbe-drug association prediction, we compared GACNNMDA with four existing microbe-drug association prediction methods, namely HMDAKATZ [19], GCNMDA [22], EGATMDA [23] and Graph2MDA [24], and two methods for link prediction problems in bioinformatics, namely LAGCN [29] and NTSHMDA [30]. Among them, LAGCN [29] is a method based on a graph convolutional network with an attention mechanism, designed to infer unknown drug-disease associations, and NTSHMDA [30] is a model based on random walk with restart for predicting microbe-disease associations.
During the experiments, we used the original parameters for all competing methods and ran them on the well-known public database MDAD for a fair comparison. In addition, we adopted five-fold cross validation (CV) to evaluate these methods: 20% of the known associations and 20% of the unknown associations were randomly selected as the testing set, and the remaining 80% of the known and unknown associations were used as the training set [31]. We selected the AUC, AUPR, accuracy and F1-score as performance evaluation metrics. Experimental results are shown in Table 2. Because the code released by Deng et al. [24] is incomplete, we directly quoted the results reported for Graph2MDA. The ROC and PR curves are drawn in Figs. 2 and 3, respectively, and the evaluation metrics are calculated as follows:
Here, TP and TN represent the numbers of positive and negative samples predicted correctly, respectively. FN and FP denote the numbers of positive and negative samples that are incorrectly identified, separately.
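For reference, the threshold-based metrics can be computed from the four counts as in the sketch below, which assumes the standard definitions of accuracy, precision, recall and F1-score. The counts in the example are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only.
metrics = classification_metrics(tp=8, tn=86, fp=4, fn=2)
print(round(metrics["accuracy"], 4))  # 0.94
print(round(metrics["f1"], 4))        # 0.7273
```

When negatives vastly outnumber positives, as with sparse association matrices, accuracy can be high even if the F1-score is moderate, which is why the two are reported together.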
As shown in Table 2, GACNNMDA achieves the highest AUC value of 0.9777 ± 0.0109, which is 2.57% higher than the second highest AUC value of 0.9585 ± 0.0053 obtained by EGATMDA. For accuracy and F1-score, GACNNMDA also achieves the highest values of 0.9945 and 0.7091, respectively. Although GACNNMDA outperforms only half of the competing methods in terms of AUPR, these results suggest that GACNNMDA is an effective tool for potential microbe-drug association prediction.
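The five-fold cross validation protocol described in this section can be sketched as follows, assuming that known and unknown drug-microbe pairs are shuffled separately and that each fold takes one fifth of each group as its testing set. The function name, the seed and the toy data are hypothetical.

```python
import random


def five_fold_splits(n_r, n_m, known, seed=0, k=5):
    """Yield (train, test) lists of (drug, microbe) pairs for k-fold CV."""
    known = set(known)
    positives = sorted(known)
    negatives = [
        (i, j) for i in range(n_r) for j in range(n_m) if (i, j) not in known
    ]
    rng = random.Random(seed)
    rng.shuffle(positives)
    rng.shuffle(negatives)
    for fold in range(k):
        # Every k-th pair of each group goes to the testing set of this fold.
        test = positives[fold::k] + negatives[fold::k]
        test_set = set(test)
        train = [p for p in positives + negatives if p not in test_set]
        yield train, test


known_pairs = [(0, 0), (1, 1), (2, 2), (3, 3), (0, 4)]
folds = list(five_fold_splits(n_r=4, n_m=5, known=known_pairs))
for train, test in folds:
    print(len(train), len(test))  # 16 4 for every fold
```

Splitting the two groups separately keeps the ratio of known to unknown pairs the same in every fold.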
Hyperparameter sensitivity analysis
GACNNMDA has several hyperparameters, including the learning rate of the GAT, the dropout rate of the GAT and the learning rate of the CNN. In this section, we therefore performed five-fold CV on the MDAD dataset 10 times and used the average AUC value to tune these parameters.
For convenience, let lr1, dp and lr2 denote the learning rate of GAT, the dropout of GAT and the learning rate of CNN respectively. During the tuning process, we first tested the values of lr1 in the range of {0.0001, 0.001, 0.01, 0.05, 0.1} and illustrated experimental results in Fig. 4a. As shown in Fig. 4a, GACNNMDA achieved the best performance when lr1 was set to 0.001. And then, we limited the values of dp in the range of {0.2, 0.4, 0.5, 0.7} and illustrated experimental results in Fig. 4b. From observing Fig. 4b, it is easy to see that the most suitable value of dp is 0.4. Finally, we restricted the values of lr2 in {0.0001, 0.001, 0.01, 0.05, 0.1} and showed experimental results in Fig. 4c. As illustrated in Fig. 4c, when lr2 was set to 0.001, the performance of GACNNMDA would be the best.
Case studies
To further demonstrate the prediction performance of GACNNMDA, we performed case studies on two popular drugs and two microbes. In these case studies, the top 20 microbes or drugs inferred by GACNNMDA based on the MDAD database were first selected for investigation, and we then searched the literature published in PubMed to verify whether these predicted candidates had been reported in existing references.
The first drug chosen for the case studies is ciprofloxacin, a fluorinated quinolone antibiotic that a large number of studies have shown to be associated with a wide range of human microbes [32]. For instance, Paul et al. found that amphotericin B and 5% ciprofloxacin can effectively hinder the growth of Pseudomonas aeruginosa and Candida albicans [33]. Staphylococcus aureus, Staphylococcus epidermidis, Bacillus subtilis, Escherichia coli and Mycobacterium tuberculosis are susceptible to ciprofloxacin [34]. The second drug is moxifloxacin, a fluoroquinolone antibiotic [35] that has been shown to be associated with antibiotic-resistant bacteria (ARB) [36] and Listeria monocytogenes [37]. The top 20 predicted ciprofloxacin-associated and moxifloxacin-associated microbes are listed in Tables 3 and 4, respectively. As Tables 3 and 4 show, 18 and 17 of the top 20 predicted microbes, respectively, have been validated by the existing literature.
The first microbe chosen for the case studies is HIV-1 (human immunodeficiency virus type 1), which is the cause of acquired immunodeficiency syndrome (AIDS). Many drugs are associated with HIV-1. For example, Viani et al. found that long-term zalcitabine is useful for treating HIV-1 phenotypes in children [38]. Chong et al. showed that a combination of delavirdine, zidovudine and didanosine can inhibit the growth of HIV-1 [39]. The second microbe is Mycobacterium tuberculosis, which is the cause of pulmonary tuberculosis [40]. The top 20 predicted HIV-1-associated and Mycobacterium tuberculosis-associated drugs are listed in Tables 5 and 6, respectively. As Tables 5 and 6 show, 18 and 15 of the top 20 predicted drugs, respectively, have been verified by the existing literature. Hence, we conclude that GACNNMDA can achieve satisfactory prediction performance in case studies of both microbes and drugs.
Conclusion and discussion
In this paper, we presented a novel computational method named GACNNMDA, an integrated framework of a GAT-based autoencoder and a CNN-based classifier, for the prediction of potential microbe-drug associations. The main contributions of our model include the following three points.

1. We introduced known microbe-disease-drug associations into the predictive model, which compensates to some extent for the sparsity of known microbe-drug associations.
2. For the inputs of the GAT and the CNN, we concatenated multiple attributes of microbes and drugs to form two feature matrices, which retain more of the original features of microbes and drugs. Hence, more useful information can be learned by the GAT and the CNN.
3. Compared with existing state-of-the-art methods for predicting potential microbe-drug associations, our model can achieve better performance.
However, there is still room to improve our prediction model. In the future, we can leverage more biological information, such as microbe sequences [24] and side-effect-based drug similarity [41]. Additionally, for the attributes of microbes and drugs used in GACNNMDA, we can assess their importance to make better use of each kind of attribute and further improve the performance of our model. Finally, we can design a new activation function to improve the training speed of the GAT and the CNN, as in Li et al. [42].
Data availability
The data and code can be found online at: https://github.com/tyqGitHub/TYQ/tree/master/GACNNMDA.
References
1. Huttenhower C, Gevers D, Knight R, et al. Structure, function and diversity of the healthy human microbiome. Nature. 2012;486(7402):207–14.
2. Ventura M, O’Flaherty S, Claesson MJ, et al. Genome-scale analyses of health-promoting bacteria: probiogenomics. Nat Rev Microbiol. 2009;7(1):61–71.
3. Sommer F, Backhed F. The gut microbiota masters of host development and physiology. Nat Rev Microbiol. 2013;11(4):227–38.
4. Kau AL, Ahern PP, Griffin NW, et al. Human nutrition, the gut microbiome and the immune system. Nature. 2011;474(7351):327–36.
5. Sprockett D, Fukami T, et al. Role of priority effects in the early-life assembly of the gut microbiota. Nat Rev Gastroenterol Hepatol. 2018;15(4):197–205.
6. Ximenez C, Torres J. Development of microbiota in infants and its role in maturation of gut mucosa and immune system. Arch Med Res. 2017;48(8):666–80.
7. Tilg HA, et al. The intestinal microbiota in colorectal cancer. Cancer Cell. 2018;33(6):954–64.
8. Cani PD, et al. Novel insight into the role of microbiota in colorectal surgery. Gut J Br Soc Gastroenterol. 2017;66(4):738–49.
9. Routy B, Gopalakrishnan V, et al. The gut microbiota influences anticancer immunosurveillance and general health. Nat Rev Clin Oncol. 2018;15(6):382–96.
10. Shanahan F, Sinderen DV, O’Toole PW, et al. Feeding the microbiota: transducer of nutrient signals for the host. Gut. 2017;66(9):1709–17.
11. Cremonesi E, Governa V, Garzon JFG, Mele V, Amicarella F. Gut microbiota modulate T cell trafficking into human colorectal cancer. Gut J Br Soc Gastroenterol. 2018;67(11):1984–94.
12. Ogino S, Nowak JA, Hamada T, et al. Integrative analysis of exogenous, endogenous, tumour and immune factors for precision medicine. Gut. 2018;67(6):1168–80.
13. Xiang YT, Li W, Zhang Q, et al. Timely research papers about COVID-19 in China. Lancet. 2020;395(10225):684–5.
14. McCoubrey LE, Gaisford S, Orlu M, et al. Predicting drug-microbiome interactions with machine learning. Biotechnol Adv. 2022;54:107797.
15. Zimmermann M, Zimmermann-Kogadeeva M, Wegmann R, et al. Mapping human microbiome drug metabolism by gut bacteria and their genes. Nature. 2019;570(7762):462–7.
16. Sun YZ, Zhang DH, et al. MDAD: a special resource for microbe-drug associations. Front Cell Infect Microbiol. 2018.
17. Rajput A, Thakur A, Sharma S, et al. aBiofilm: a resource of anti-biofilm agents and their potential implications in targeting antibiotic drug resistance. Nucleic Acids Res. 2018;46(D1):D894–900.
18. Pia A, Ai A, Hl A, et al. Discovery and development of safe-in-man broad-spectrum antiviral agents. Int J Infect Dis. 2020;93:268–76.
19. Zhu L, Duan G, Yan C, et al. Prediction of microbe-drug associations based on KATZ measure. In 2019 IEEE international conference on bioinformatics and biomedicine (BIBM) 2019. pp. 183–187.
20. Long Y, Luo J. Association mining to identify microbe drug interactions based on heterogeneous network embedding representation. IEEE J Biomed Health Inform. 2021;25(1):266–75.
21. Zhu L, Wang J, Li G, et al. Predicting microbe-drug association based on similarity and semi-supervised learning. Am J Biochem Biotechnol. 2021;17(1):50–8.
22. Long Y, Wu M, Keong KC, et al. Predicting human microbe–drug associations via graph convolutional network with conditional random field. Bioinformatics. 2020;36(19):4918–27.
23. Long Y, Wu M, Liu Y, et al. Ensembling graph attention networks for human microbe–drug association prediction. Bioinformatics. 2020;36(Supplement 2):i779–86.
24. Deng L, Huang Y, Liu X, et al. Graph2MDA: a multimodal variational graph embedding model for predicting microbe–drug associations. Bioinformatics. 2022;38(4):1118–25.
25. Wang L, Tan Y, Yang X, et al. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinf. 2022;23(3):bbac080.
26. Hattori M, Tanaka N, Kanehisa M, et al. SIMCOMP/SUBCOMP: chemical structure search servers for network analyses. Nucleic Acids Res. 2010;38(2):W652–6.
27. Kamneva OK. Genome composition and phylogeny of microbes predict their co-occurrence in the environment. PLoS Comput Biol. 2017;13(2):e1005366.
28. Ioffe S, Szegedy C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In Proceedings of the 32nd international conference on machine learning, vol 37; 2015. pp. 448–456.
29. Yu Z, Huang F, Zhao X, et al. Predicting drug–disease associations through layer attention graph convolutional network. Brief Bioinf. 2020;22(4):bbaa243.
30. Luo J, Long Y. NTSHMDA: prediction of human microbe-disease association based on random walk by integrating network topological similarity. IEEE ACM Trans Comput Biol Bioinf. 2020;17(4):1341–51.
31. Cai L, Lu C, Xu J, et al. Drug repositioning based on the heterogeneous information fusion graph convolutional network. Brief Bioinf. 2021;22(6):bbab319.
32. Campoli-Richards DM, Monk JP, Price A, et al. Ciprofloxacin. Drugs. 1988;35(4):373–447.
33. Paul D, Saha S, Singh N, et al. Successful control of a co-infection caused by Candida albicans and Pseudomonas aeruginosa in Keratitis. Infect Disord Drug Targets. 2021;21(2):284–8.
34. Castro W, Navarro M, Biot C. Medicinal potential of ciprofloxacin and its derivatives. Future Med Chem. 2013;5(1):81–96.
35. Balfour JAB, et al. Moxifloxacin. Drugs. 1999;57(3):363–73.
36. Loyola-Rodriguez JP, Ponce-Diaz ME, Loyola-Leyva A, et al. Determination and identification of antibiotic-resistant oral streptococci isolated from active dental infections in adults. Acta Odontol Scand. 2018;76(4):229–35.
37. Tahoun ABMB, Abou Elez RMM, Abdelfatah EN, et al. Listeria monocytogenes in raw milk, milking equipment and dairy workers: Molecular characterization and antimicrobial resistance patterns. J Glob Antimicrobial Res. 2017;10:264–70.
38. Viani RM, Smith IL, Spector SA. Human immunodeficiency virus type 1 phenotypes in children with advanced disease treated with long-term zalcitabine. J Infect Dis. 1998;177(3):565–70.
39. Chong KT, Pagano PJ. Inhibition of human immunodeficiency virus type 1 infection in vitro by combination of delavirdine, zidovudine and didanosine. Antiviral Res. 1997;34(1):51–63.
40. Koch A, Mizrahi V. Mycobacterium tuberculosis. Trends Microbiol. 2018;26(6):555–6.
41. Kuhn M, Campillos M, Letunic I, et al. A side effect resource to capture phenotypic effects of drugs. Mol Syst Biol. 2010;6(1):343.
42. Li H, Wang Y, Zhang Z, et al. Identifying microbe-disease association based on a novel back-propagation neural network model. IEEE ACM Trans Comput Biol Bioinf. 2021;18(6):2502–13.
Acknowledgements
Not applicable
Funding
This work was partly sponsored by the National Natural Science Foundation of China (No. 62272064) and the Key project of Changsha Science and technology Plan (No. KQ2203001).
Author information
Contributions
LW supervised the study. QM and YQT designed the model and conducted the experiments. YQT and QM wrote this paper. LW provided suggestions and revised the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethical approval and consent to participate
Not applicable.
Consent to publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1: The newly downloaded dataset of diseases.
Additional file 2: The newly downloaded dataset of drugs.
Additional file 3: The newly downloaded dataset of microbes.
Additional file 4: The newly downloaded dataset of drug-disease associations.
Additional file 5: The newly downloaded dataset of drug-drug interactions.
Additional file 6: The newly downloaded dataset of microbe-drug associations.
Additional file 7: The newly downloaded dataset of microbe-disease associations.
Additional file 8: The newly downloaded dataset of microbe-microbe interactions.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Ma, Q., Tan, Y. & Wang, L. GACNNMDA: a computational model for predicting potential human microbe-drug associations based on graph attention network and CNN-based classifier. BMC Bioinformatics 24, 35 (2023). https://doi.org/10.1186/s12859-023-05158-7
DOI: https://doi.org/10.1186/s12859-023-05158-7