- Research article
- Open Access
- Published:
Antimicrobial peptide identification using multi-scale convolutional network
BMC Bioinformatics volume 20, Article number: 730 (2019)
Abstract
Background
Antibiotic resistance has become an increasingly serious problem in the past decades. As an alternative choice, antimicrobial peptides (AMPs) have attracted lots of attention. To identify new AMPs, machine learning methods have been commonly used. More recently, some deep learning methods have also been applied to this problem.
Results
In this paper, we designed a deep learning model to identify AMP sequences. We employed the embedding layer and the multi-scale convolutional network in our model. The multi-scale convolutional network, which contains multiple convolutional layers of varying filter lengths, could utilize all latent features captured by the multiple convolutional layers. To further improve the performance, we also incorporated additional information into the designed model and proposed a fusion model. Results showed that our model outperforms the state-of-the-art models on two AMP datasets and the Antimicrobial Peptide Database (APD)3 benchmark dataset. The fusion model also outperforms the state-of-the-art model on an anti-inflammatory peptides (AIPs) dataset at the accuracy.
Conclusions
Multi-scale convolutional network is a novel addition to existing deep neural network (DNN) models. The proposed DNN model and the modified fusion model outperform the state-of-the-art models for new AMP discovery. The source code and data are available at https://github.com/zhanglabNKU/APIN.
Introduction
In recent years, antimicrobial peptides (AMPs) have attracted lots of attention due to the well-known antibiotic resistance problem. AMPs are polypeptides shorter than 100 amino acids, which are an important part of host defense systems of animals and plants [1]. AMPs have antimicrobial activity under specific circumstances since the difference between microbial and host cells in biochemical and biophysical provides a basis for selective toxicity of AMPs [2]. AMPs exhibit many advantages including fast killing, low toxicity, and broad range of activity [3]. Besides, AMPs show a lower likelihood for antimicrobial resistance compared to many antibiotics [4]. Due to the advantages of AMPs, they have been a popular research area of bioinformatics.
To identify AMPs, many computational tools are proposed such as CAMP [5], CAMPR3 [6], ADAM [7], AMPer [8], AntiBP [9], AntiBP2 [10], AVPpred [11], iAMP-2 L [12], EFC-FCBF [13], classAMP [14] and web-based antimicrobial peptide prediction tools [15]. Many of these tools applied various machine learning methods. For example, support vector machine (SVM), random forest (RF), and artificial neural network (ANN) were employed in CAMP. To apply machine learning methods, feature engineering is a necessary step. The most popular features for AMPs are amino acid composition. For example, AntiBP employed basic amino acid counts over the full peptide as the features. The pseudo-amino acid composition (PseAAC) method is also applied in some methods [16].
For machine learning methods, feature construction of protein sequences relies heavily on domain knowledges. To avoid the complexity of feature engineering and remove the burden of feature construction, many deep learning models have been applied to various problems in bioinformatics [17] such as protein structure prediction [18, 19], protein classification [20], biomedical imaging recognition [21, 22]. To apply deep learning to the problem of AMP identification, a deep neural network (DNN) model was proposed [23]. This model employed a convolutional layer [24] and a recurrent layer, which can capture latent features of protein sequences, so it was shown to outperform the state-of-the-art models in AMP identification. Although this model is great, there is still room for improvement. For example, a long short-term memory (LSTM) layer [25] was employed due to its ability to recognize and forget gap-separated patterns in this model. However, this architecture of DNN model is usually applied in natural language processing (NLP) [26, 27], and is not appropriate for AMP identification in our experiments which is listed in Table 3 for comparison of modified models.
In this paper, we have designed a multi-scale convolutional network which contains multiple convolutional layers of different filter lengths, and proposed a DNN model based on the multi-scale convolutional network to improve the performance of AMP identification. In the proposed model, we have employed an embedding layer and a multi-scale convolutional network. The embedding layer can capture semantic information of amino acids by converting each of them into a numerical vector. The distance between vectors can represent the relation between the corresponding amino acids. Many word embedding models, such as word2vector [28] and gloves [29], are widely used in text recognition tasks. The choice of a multi-scale convolutional network is due to its ability to capture latent features of motifs. Since a multi-scale convolutional network contains multiple convolutional layers, it can make use of all latent features captured by their convolutional layers. Because of the ability of the multi-scale convolutional network to capture multi-scale motifs, the proposed model outperforms the state-of-the-art DNN model [23] in AMP identification. To further improve the performance, we also incorporated additional information into the proposed model and proposed a fusion model.
Results
Dataset
We adopt four datasets in this paper. The first dataset we used is made by Veltri et al. (2018) [23], containing 1778 AMPs constructed from the APD vr.3 database [30] and 1778 non-AMPs constructed from UniProt [31]. The dataset is split by Veltri et al. (2018) [23] into a training set, a tuning set and a test set and the number of AMP sequences are 712, 354, and 712 respectively. More detailed information of this dataset can be found in Veltri et al. (2018) [23]. In the rest of the paper, this dataset is named DAMP dataset. The second dataset is taken from AntiBP2 [10], which has 1998 peptide sequences. AMPs have ∼75% overlap with DAMP dataset and non-AMPs have no overlap with it. The third dataset is an anti-inflammatory peptide (AIP) dataset, which is from AIPpred [32]. This dataset contains 1258 AIPs and 1887 non-AIPs in training set, 420 AIPs and 629 non-AIPs in test set. The last dataset is from the paper [15], which is composed of 10,278 sequences. Table 1 summarizes the four datasets.
Setup and runtime performance
The proposed DNN model is constructed using Keras [33], a Python neural network library, with a CPU-based TensorFlow back-end [34]. The weights in our model of 11 are initialized with the default value of Keras. The optimizer is RMSProp whose learning rate is set to 0.0002, and the loss function is ‘binary_crossentropy’. Besides, the batch size is set to 32. Experiments are conducted on a computer with Intel Xeon E3-1226v3 CPU and the RAM of this computer is 8GB. The training of each epoch takes about 56 s and the prediction of a peptide sequence takes 6 ms on average.
Model tuning
First, we want to know how the model performs with only one convolutional layer. We replaced the multi-scale convolutional network with the single convolutional layer. The performance of the modified model with different filter size is shown in Fig. 1. As shown in this figure, the accuracy (ACC) [35] of the modified model is under 89% when this model only contains one convolutional layer whose filter length is short. As the filter length increases, the ACC also increases very fast. The performance of the length between 6 and 20 is similar as shown in Fig. 1. The results of this experiment show that any single convolutional layer whose filter length is shorter than 7 could not capture enough information of a peptide sequence in AMP identification, and the convolutional layers with filter lengths longer than 7 have similar performance in this problem.
Then we want to find the best parameter N in our multi-scale model. Figure 2 shows the performance of the proposed model with different parameter N. As shown in Fig. 2, when N is small, the performance of this multi-scale model is similar to the model with one convolutional layer. Conversely, when N gets larger, the multi-scale model performs better. When N = 14, ACC score is the highest with low fluctuation. We finally choose N = 14 in the proposed model.
Comparison with current main methods
To evaluate the proposed multi-scale DNN model, this model is compared with the state-of-the-art models including the traditional machine learning models and the existing DNN model. Table 2 shows comparison results of the state-of-the-art model. The results show that the proposed model outperforms the existing DNN in all evaluation metrics except sensitivity (SENS). To be specific, the accuracy of the proposed model is about 92.4%, which is 1.3% higher than the existing DNN model, and the specificity (SPEC) is about 94%, which is 1.51% higher than the existing DNN model. Although the highest SENS is achieved by the RF model, the performance of the proposed model is better than the performance of the existing DNN model. The fusion model which makes use of amino acid composition (AAC) [32] and dipeptide composition (DPC) [32] further improves the performance. ACC of the fusion model reaches 92.55%.
Modification comparison
We modified the propose model and conducted a modification comparison by replacing or removing some components in the proposed model in order to find out the vital elements of the success of the proposed model and discover the best architecture of DNN model in AMP identification.
To be specific, we have tested the models in which we replaced the embedding layer with one-hot encoding, or replaced multi-scale convolutional network with simple convolutional layer or replaced the pooling1 layers with LSTM layers. Besides, we also have tested models without pooling2 layer or with additional fully connected (FC) layers. The results of modification comparison are shown in Table 3. From the results, we find that the multi-convolutional network is the most important part in our model, and the ACC performance of the model without this component drops to 90.44%. Also, the embedding layer is significant in our model. When we run the model without embedding layer, the ACC performance drops to 91.43%. Additionally, using LSTM to replace pooling1 doesn’t improve the performance of AMP identification and increases runtime. This result implies that LSTM is not a good choice for AMP identification in the proposed model. We also tested a model in which we replaced the pooling1 layers with Gated Recurrent Unit (GRU) layers and its accuracy is 91.43%. Because the structure of GRU is similar to LSTM, the result doesn’t change obviously compared to replacing pooling1 layers with LSTM layers. In addition, the results also show that additional fully connected layer or removing pooling2 would not improve the performance.
We also analyzed the training time of each modified model. The results are shown in Table 4. The results show that replacing the embedding layer or multi-scale convolutional network reduces the training time but the accuracy decreases. Adding LSTM into the proposed model not only increases the training time but also decreases the accuracy. Besides, adding FC layers or removing pooling2 doesn’t apparently affect runtime.
Model performance on other datasets
To find out how the proposed model performs on other datasets, we applied our model to AntiBP2 dataset, AIP dataset and the APD3 benchmark dataset from paper [15].
We used 10-fold cross validation test on AntiBP2 dataset to compare the proposed model with state-of-the-art models. Table 5 shows that the proposed DNN also outperforms other state-of-the-art models on AntiBP2 dataset. The accuracy of this dataset is 93.38%.
We compared the proposed model with the existing DNN [23] and the AIPpred model which is state-of-the-art on AIP dataset. The result is shown in Table 6. From this table, we can see that the accuracy of the proposed model on this dataset is 73.02% (0.38% lower than AIPpred). However, the proposed model performs much better than the existing DNN [23]. When using AAC, DPC and some other features, the proposed fusion model achieves a better performance than AIPpred (ACC is 0.44% higher than AIPpred). This experiment implies that the proposed model has a good applicability and could also be applied to problems of other peptide sequence identification.
We also tested these methods on the APD3 benchmark dataset. The prediction result is shown in Table 7. The performance metrics indicate that our proposed method and proposed fusion method perform better than other methods. Besides, we used DeLong’s test to get differences between our two proposed methods and other methods with the area under receiver-operating curve (auROC) analysis. The result is shown in Table 8. It also shows that our two proposed methods over-perform other methods.
Discussion
We have designed a multi-scale convolutional DNN model to identify AMP sequences. In terms of accuracy, it overperforms other methods on three datasets. Although the proposed model and the proposed fusion model have no obvious advantage over AIPpred, the former models use less information from sequences and they’re easily to use. The propose model takes a little longer time than some modified model but the runtime is acceptable and the prediction accuracy has significant improvements.
Conclusion
To identify AMPs, we have proposed a DNN model based on the multi-scale convolutional layers. The proposed DNN model mainly employs the embedding layer and the multi-scale convolutional network. Through the embedding layer, each amino acid in a peptide sequence is converted into an embedding vector. The multi-scale convolutional network can capture the local features, and its max pooling layers and convolutional layers of different filter lengths can help with the feature selection. This model focusing on the local context could improve the performance of AMP identification. Furthermore, we have incorporated additional information into the proposed model and developed a fusion model. Compared with the state-of-the-art models, our proposed model achieved better performance. Through the model modification comparisons, we found that the model without multi-scale convolutional network achieved the worst results, which means the multi-scale convolutional network is the most important part in our model. We also applied the proposed model and proposed fusion model to other datasets including an AMP dataset and an AIP dataset and the APD3 benchmark dataset. The results show that the fusion model could achieve a better performance and our proposed model is applicable for other peptide identification.
Methods
Structure of our proposed DNN
First, we tested and analyzed the state-of-the-art DNN model which contains a LSTM layer. The LSTM layer applied to AMP identification focuses on the whole sequence without caring about short motifs. However, it is believed that proteins with similar functions may share some short motifs [32]. This means that we can predict AMPs based on these motifs shared with known AMPs.
With this mind, we designed a multi-scale convolutional network, and then proposed a new DNN model based on this network. The proposed DNN model mainly employs a multi-scale convolutional network containing many convolutional layers of different filter lengths. Since each convolutional layer can capture motifs of a fixed length, convolutional layers of different filter lengths can detect motifs of different lengths. The structure of our proposed model is shown in Fig. 3, which shows that the proposed model mainly contains an Embedding module, a Convolutional module, a Pooling module and a Fully Connection module. In the proposed model, we used dropout and set the parameter 0.2 to prevent overfitting.
The structure of the proposed model. The proposed model mainly uses embedding layer and convolutional layers. All sequences are encoded into numerical vectors of length 200 and are fed into the embedding layer. Each embedding vector dimension is 128. Then the outputs of embedding layer are fed into N convolutional layers. Each convolutional layer uses 64 filter kernels. These outputs are connected to feed into a max pooling layer and outputs of the pooling layers are concatenated to fed into another max pooling layer. Finally the output will be fed into a fully connection layer and passed through a sigmoid function. The final output is in range [0,1] as the prediction of the input sequence
As shown in Fig. 3, the sequence data has to be converted to be fed into the model. A peptide sequence is converted into a numerical vector of length 200, which is larger than the length of the longest sequence. We assigned an integer within 20 to each one of the 20 basic amino acids. The sequence shorter than 200 will be padded with the number 0 to obtain a fixed vector length 200. The padded 0 s will be ignored by the model during later data processing. Then the encoded data will be fed into the embedding layer that can convert the data with discrete representation into a word vector of a fixed size. That they have a dense representation and can represent an abstract symbol (e.g. a word or an amino acid) with a fixed vector can help reduce dimension. Besides, the distance between two word vectors can represent the relation between two symbols. Compared to the one-hot encoding, the word vector is more compact. As a result, the embedding layer will output a sequence matrix given an amino acid sequence. The matrix has a fixed-dimension of 128 × 200 in our model. The embedding layer will be trained with the whole model.
In the Convolutional module, we employed a multi-scale convolutional network containing N convolutional layers of different filter lengths. A filter will be activated when a matching motif is detected. An amino acid sequence embedding presentation is given as
where vi(∈R128) is the embedding vector of i-th amino acid. To extract local contexts, the output of each convolutional layer is as
where δ(∗) means a non-linear activation function which is Rectified Linear Unit (ReLU) [36] in our model, w(f) and b(f) are weight and bias of f-th filter, and xi is i-th part which is to be convolved. xi is as [vi, vi + 1, …, vi + l] where l is the filter length of this convolutional layer. The Convolutional module takes the most important part in recognizing the AMPs by the short motifs which the convolutional layers can detect. A difference between convolutional layers in the multi-scale convolutional network is the filter lengths. Due to the filters of different lengths, each of the convolutional layers screen motifs of its length and then the results of all convolutional layers are different. To be specific, the filter lengths of all N convolutional layers are 2, 4, 6, ..., 2 N.
Each convolutional layer’s output is fed into a max pooling layer. The pooling layer helps reduce over-fitting. Besides, the max pooling is similar as feature selection, which selects the feature with max value. Next, to make use of motifs of different size, all pooling layers’ outputs are concatenated. In other words, the results of all different convolutional layers are concatenated. Then the concatenated layer’s output is fed into another max pooling layer. Finally, the output of pooling layer is fed into a fully connected layer to get the final prediction. The final dense layer uses a sigmoid function and its output is in the range [0,1]. The final output greater than 0.5 means the input sequence is an AMP, otherwise, a non-AMP.
As described above, recurrent neural network (RNN) or LSTM were not used in the proposed model. In our experiments, adding LSTM or RNN did not improve the performance of the proposed model significantly. The results of experiments are discussed in Results section. The features of motifs which convolutional layers detect are used for our identification of new AMPs.
Model tuning and metrics
We evaluate our proposed model based on sensitivity (SENS), specificity (SPEC), precision (PREC), balanced accuracy (BalACC), accuracy (ACC) [35] and Matthew’s Correlation Coefficient (MCC) [37]. All of them are based on the number of true positive (TP), true negative (TN), false positive (FP), false negative (FN). They are defined as
Besides, we also make use of auROC [38]. The receiver operating curve (ROC) can represent the performance of a model by showing the TP rate as a function of FP rate. As the discrimination threshold changes, the TP rate and FP rate change. The auROC is the area under the ROC, which is in range [0.5,1]. 0.5 means random guess, while 1 means that the prediction is always correct.
To reflect different filter lengths bring about different prediction results, a 10-fold cross validation based on a single convolutional layer was conducted. Besides, to find out the best parameter N which is the number of convolutional layers in the multiscale convolutional network, we conducted a 10-fold cross validation to evaluate the parameter N. In this procedure, we merged the training set and tuning set and only took ACC into consideration to choose N. After N was chosen, we merged the training set and tuning set as a new training set to train the proposed model and then evaluated the proposed model and compared it with the state-of-the-art models based on the prediction results of the test set.
Fusion model
To further improve the performance of the proposed model, redundant information [39] of a peptide sequence is incorporated into the proposed model via a hybrid approach. We combined the proposed model with a fully connected network into a fusion model to capture multi-type features. Besides peptide sequences, amino acid composition (AAC) [32] and dipeptide composition (DPC) [32] are used in this fusion model. AAC is a vector which represents the fractions of 20 amino acid in its peptide sequence. It is defined as
DPC is a vector which represents the ratio of 400 possible dipeptides in a given sequence. It is calculated as
DPC has a fixed length of 400 which represents the 400 possible dipeptides.
Figure 4 shows the structure of the fusion model. There are two parts in this model. One is the proposed DNN model and another one is an additional fully connected network. The DPC and AAC are concatenated into a vector which has a length of 420. Then this vector is fed into a dense layer with 64 units and each unit use a sigmoid function. The output of this layer with the output of pooling layer in proposed model are concatenated. The concatenated vector is fed into a final dense layer with 1 unit. The final dense layer uses a sigmoid function and its output is in the range [0,1]. We only make use of DPC and AAC in this model, which are easy to obtain, and thus this model also can be applied to any sequence dataset.
The structure of the proposed fusion model. There are two parts in the fusion model. The proposed structure is on the left. An additional fully connected network is on the right and this part make use of the DPC and AAC of peptide sequences. This network incorporates redundant information into the proposed model
Availability of data and materials
The AMP dataset described in Dataset part could be downloaded from http://www.dveltri.com/ascan/v2/ascan.html. The AntiBP2 dataset could be downloaded from http://crdd.osdd.net/raghava/antibp2/. The AIP dataset could be downloaded from http://www.thegleelab.org/AIPpred/. The APD3 dataset could be downloaded from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5860510/bin/btx081_supp.zip. The source code is available at https://github.com/zhanglabNKU/APIN.
Abbreviations
- AAC:
-
Amino acid composition
- ACC:
-
Accuracy
- AIPs:
-
Anti-inflammatory peptides
- AMPs:
-
Antimicrobial peptides
- ANN:
-
Artificial neural network
- APD:
-
The Antimicrobial Peptide Database
- auROC:
-
The area under the ROC curve
- BalACC:
-
Balanced accuracy
- DNN:
-
Deep neural network
- DPC:
-
Dipeptide composition
- FC:
-
Fully connected
- FN:
-
False negative
- FP:
-
False positive
- GRU:
-
Gated recurrent unit
- LSTM:
-
Long short-term memory
- MCC:
-
Matthew’s correlation coefficient
- NLP:
-
Natural language processing
- PseAAC:
-
Pseudo-amino acid composition
- ReLU:
-
Rectified linear unit
- RF:
-
Random forest
- RNN:
-
Recurrent neural network
- ROC:
-
Receiver-operating curve
- SENS:
-
Sensitivity
- SPEC:
-
Specificity
- SVM:
-
Support vector machine
- TN:
-
True negative
- TP:
-
True positive
References
Gallo RL, Huttner KM. Antimicrobial peptides: an emerging concept in cutaneous biology. J Invest Dermatol. 1998;111(5):739–43. https://doi.org/10.1046/j.1523-1747.1998.00361.x.
Ganz T. Defensins: antimicrobial peptides of innate immunity. Nat Rev Immunol. 2003;3(9):710–20. https://doi.org/10.1038/nri1180.
Fjell CD, Jenssen H, Hilpert K, Cheung WA, Panté N, Hancock REW, Cherkasov A. Identification of novel antibacterial peptides by Chemoinformatics and machine learning. J Med Chem. 2009;52(7):2006–15. https://doi.org/10.1021/jm8015365.
Zelezetsky I, Pontillo A, Puzzi L, Antcheva N, Segat L, Pacor S, Crovella S, Tossi A. Evolution of the primate cathelicidin. Correlation between structural variations and antimicrobial activity. J Biol Chem. 2006;281(29):19861–71. https://doi.org/10.1074/jbc.M511108200.
Thomas S, Karnik S, Barai RS, Jayaraman VK, Idicula-Thomas S. CAMP: a useful resource for research on antimicrobial peptides. Nucleic Acids Res. 2010;38(Database issue):D774–80. https://doi.org/10.1093/nar/gkp1021.
Waghu FH, Barai RS, Gurung P, Idicula-Thomas S. CAMPR3: a database on sequences, structures and signatures of antimicrobial peptides. Nucleic Acids Res. 2016;44(D1):D1094–7. https://doi.org/10.1093/nar/gkv1051.
Lee HT, Lee CC, Yang JR, Lai JZ, Chang KY. A large-scale structural classification of antimicrobial peptides. Biomed Res Int. 2015;2015:475062. https://doi.org/10.1155/2015/475062.
Fjell CD, Hancock REW, Cherkasov A. AMPer: a database and an automated discovery tool for antimicrobial peptides. Bioinformatics. 2007;23(9):1148–55. https://doi.org/10.1093/bioinformatics/btm068.
Lata S, Sharma BK, Raghava GP. Analysis and prediction of antibacterial peptides. BMC Bioinformatics. 2007;8:263. https://doi.org/10.1186/1471-2105-8-263.
Lata S, Mishra NK, Raghava GP. AntiBP2: improved version of antibacterial peptide prediction. BMC Bioinformatics. 2010;11(Suppl 1):S19. https://doi.org/10.1186/1471-2105-11-S1-S19.
Thakur N, Qureshi A, Kumar M. AVPpred: collection and prediction of highly effective antiviral peptides. Nucleic Acids Res. 2012;40(Web Server issue):W199–204. https://doi.org/10.1093/nar/gks450.
Xiao X, Wang P, Lin WZ, Jia JH, Chou KC. iAMP-2L: a two-level multi-label classifier for identifying antimicrobial peptides and their functional types. Anal Biochem. 2013;436(2):168–77. https://doi.org/10.1016/j.ab.2013.01.019.
Veltri D, Kamath U, Shehu A. Improving recognition of antimicrobial peptides and target selectivity through machine learning and genetic programming. IEEE/ACM Trans Comput Biol Bioinform. 2017;14(2):300–13. https://doi.org/10.1109/TCBB.2015.2462364.
Joseph S, Karnik S, Nilawe P, Jayaraman VK, Idicula-Thomas S. ClassAMP: a prediction tool for classification of antimicrobial peptides. IEEE/ACM Trans Comput Biol Bioinformatics. 2012;9(5):1535–8. https://doi.org/10.1109/tcbb.2012.89.
Gabere MN, Noble WS. Empirical comparison of web-based antimicrobial peptide prediction tools. Bioinformatics (Oxford, England). 2017;33(13):1921–9. https://doi.org/10.1093/bioinformatics/btx081.
Meher PK, Sahu TK, Saini V, Rao AR. Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou's general PseAAC. Sci Rep. 2017;7:42362. https://doi.org/10.1038/srep42362.
Wang W, Gao X. Deep learning in bioinformatics. Methods. 2019;166:1–3. https://doi.org/10.1016/j.ymeth.2019.06.006.
Heffernan R, Paliwal K, Lyons J, Dehzangi A, Sharma A, Wang J, Sattar A, Yang Y, Zhou Y. Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning. Sci Rep. 2015;5:11476. https://doi.org/10.1038/srep11476.
Lyons J, Dehzangi A, Heffernan R, Sharma A, Paliwal K, Sattar A, Zhou Y, Yang Y. Predicting backbone Cα angles and dihedrals from protein sequences by stacked sparse auto-encoder deep neural network. J Comput Chem. 2014;35(28):2040–6. https://doi.org/10.1002/jcc.23718.
Asgari E, Mofrad MR. Continuous distributed representation of biological sequences for deep proteomics and genomics. PLoS One. 2015;10(11):e0141287. https://doi.org/10.1371/journal.pone.0141287.
Chen CL, Mahjoubfar A, Tai LC, Blaby IK, Huang A, Niazi KR, Jalali B. Deep learning in label-free cell classification. Sci Rep. 2016;6:21471. https://doi.org/10.1038/srep21471.
Xu J, Xiang L, Liu Q, Gilmore H, Wu J, Tang J, Madabhushi A. Stacked sparse autoencoder (SSAE) for nuclei detection on breast Cancer histopathology images. IEEE Trans Med Imaging. 2016;35(1):119–30. https://doi.org/10.1109/TMI.2015.2458702.
Veltri D, Kamath U, Shehu A. Deep learning improves antimicrobial peptide recognition. Bioinformatics. 2018;34(16):2740–7. https://doi.org/10.1093/bioinformatics/bty179.
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–44. https://doi.org/10.1038/nature14539.
Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80. https://doi.org/10.1162/neco.1997.9.8.1735.
Palangi H, Deng L, Shen Y, Gao J, He X, Chen J, Song X, Ward R. Deep sentence embedding using long short-term memory networks: analysis and application to information retrieval. IEEE/ACM Trans Audio Speech Lang Proc. 2016;24(4):694–707. https://doi.org/10.1109/taslp.2016.2520371.
Sundermeyer M, Ney H, Schluter R. From feedforward to recurrent LSTM neural networks for language modeling. Audio Speech Lang Process IEEE/ACM Trans on. 2015;23:517–29. https://doi.org/10.1109/TASLP.2015.2400218.
Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. In: arXiv e-prints; 2013.
Pennington J, Socher R, Manning C. Glove: Global Vectors for Word Representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP); 2014.
Wang G, Li X, Wang Z. APD3: the antimicrobial peptide database as a tool for research and education. Nucleic Acids Res. 2016;44(D1):D1087–93. https://doi.org/10.1093/nar/gkv1278.
Magrane M, UniProt C. UniProt Knowledgebase: a hub of integrated protein data. Database (Oxford). 2011;2011:bar009. https://doi.org/10.1093/database/bar009.
Manavalan B, Shin TH, Kim MO, Lee G. AIPpred: sequence-based prediction of anti-inflammatory peptides using random Forest. Front Pharmacol. 2018;9:276. https://doi.org/10.3389/fphar.2018.00276.
Chollet, F. Keras: The python deep learning library. In: Astrophysics Source Code Library; 2018.
Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, et al. TensorFlow: a system for large-scale machine learning. In: arXiv e-prints; 2016.
Powers DMW. Evaluation: from precision, recall and f-measure to roc., informedness, markedness & correlation. J Mach Learn Technol. 2011;2(1):37–63.
Nair V, Hinton GE. Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th International Conference on International Conference on Machine Learning; Haifa, Israel, vol. 3104425: Omnipress; 2010. p. 807–14.
Boughorbel S, Jarray F, El-Anbari M. Optimal classifier for imbalanced data using Matthews correlation coefficient metric. PLoS One. 2017;12(6):e0177678. https://doi.org/10.1371/journal.pone.0177678.
Brzezinski D, Stefanowski J. Prequential AUC: properties of the area under the ROC curve for data streams with concept drift. Knowl Inf Syst. 2017;52(2):531–62. https://doi.org/10.1007/s10115-017-1022-8.
Liu Q, Xia F, Yin Q, Jiang R. Chromatin accessibility prediction via a hybrid deep convolutional neural network. Bioinformatics. 2018;34(5):732–8. https://doi.org/10.1093/bioinformatics/btx679.
Acknowledgements
The authors would like to thank the editor and the anonymous reviewers for their comments and suggestions, which helped improve the manuscript greatly. This work was supported by computational facilities of College of Artificial Intelligence in Nankai University.
Funding
This research was funded by the National Natural Science Foundation of China grant No. 61973174. The funding body played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
Author information
Authors and Affiliations
Contributions
HZ conceived the research. XS, JX, HZ, YY and XQ designed the research. XS and JX implemented the research. XS, HZ, YY wrote the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Su, X., Xu, J., Yin, Y. et al. Antimicrobial peptide identification using multi-scale convolutional network. BMC Bioinformatics 20, 730 (2019). https://doi.org/10.1186/s12859-019-3327-y
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s12859-019-3327-y
Keywords
- Multi-scale convolutional network
- Antimicrobial peptide
- Deep learning
- Fusion model