Fig. 2 | BMC Bioinformatics

Fig. 2

From: Combining genetic algorithm with machine learning strategies for designing potent antimicrobial peptides

Codon representation increases scoring variation in the early generations of algorithm runs. Blue lines show data with the codon representation; orange lines show data without it. Solid lines are the mean of 6 repeated genetic algorithm runs over 100 generations, and dotted lines are the 95% confidence interval (CI) computed across the repeated runs using the Student's t statistic. a The number of predicted antibacterial sequences decreases when the codon representation is used. b Although the initial mean fitness is worse with the codon representation, the mean fitness of both methods converges because each pool is filtered to its top-scoring sequences. c The initial standard deviation of fitness scores is higher with the codon representation, but the same score filtering brings it into line with the non-codon representation. d The maximum fitness follows a similar trend to the mean fitness: the best-scoring sequences from either method converge to similar scores. The increased scoring variation likely comes from wider coverage of the parameter space, which is an advantage of the codon representation for screening peptides.
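The solid and dotted lines described in the caption (a per-generation mean across repeated runs, with a 95% CI based on the Student's t distribution) can be sketched as below. This is a minimal illustration and not the authors' code: the function name `mean_ci_per_generation`, the lookup table `T_CRIT_95`, and the toy input are assumptions introduced here, and the critical values are standard two-sided 95% table values rounded to three decimals.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t distribution, keyed by
# degrees of freedom (standard table values, rounded to 3 decimals).
# Hypothetical helper table; only df 1..10 are covered in this sketch.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def mean_ci_per_generation(runs):
    """Return a list of (mean, lower, upper) tuples, one per generation.

    `runs` is a list of equal-length sequences, one per repeated run,
    where runs[r][g] is the metric (for example mean fitness) of run r
    at generation g.
    """
    n = len(runs)
    if n < 2:
        raise ValueError("at least two repeated runs are needed for a CI")
    if n - 1 not in T_CRIT_95:
        raise ValueError("no tabulated t value for this number of runs")
    t_crit = T_CRIT_95[n - 1]  # n runs give n - 1 degrees of freedom

    summary = []
    for values in zip(*runs):  # values of all runs at one generation
        mean = statistics.mean(values)
        # Standard error of the mean, from the sample standard deviation.
        sem = statistics.stdev(values) / math.sqrt(n)
        half_width = t_crit * sem
        summary.append((mean, mean - half_width, mean + half_width))
    return summary


# Toy data: 6 repeated runs, 2 generations (made-up numbers).
toy_runs = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0],
            [4.0, 5.0], [5.0, 5.0], [6.0, 5.0]]
toy_summary = mean_ci_per_generation(toy_runs)
```

With 6 repeated runs there are 5 degrees of freedom, so the half-width of each band is about 2.571 standard errors. For the figure's setting, each inner list would hold 100 values, one per generation.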
