Figure 4From: Improving de novo sequence assembly using machine learning and comparative genomics for overlap correctionAssembly of S. aureus strain JH1. Statistics from overlaps derived from the JH9 training reads were used to train a J48 Weka model. Overlaps from the JH1 test data were classified based on this model and any overlaps predicted to be false were removed. The remaining overlaps were used in assembly of JH1. The N50 contig length of the final assembly as well as the percentage of the reference JH1 genome matched by the contigs are plotted. Within parenthesis, the percent cutoff for strains to be analyzed with the comparative score is shown.Back to article page