Fig. 8
From: Decontaminating eukaryotic genome assemblies with machine learning

GC content and average per-base sequencing coverage for the simulated datasets contaminated with microbial DNA. Training datasets are shown on the left and bagging decision tree predictions are shown on the right for a-b) A. thaliana; c-d) C. elegans; e-f) D. melanogaster; and g-h) T. rubripes. The microbial genomes were GC-rich relative to the target organisms, and a simple decision tree based on GC content and sequencing coverage predicted scaffold origin with low error for each dataset.
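
The idea in the caption, classifying scaffolds as target or contaminant from GC content and coverage with bagged decision trees, can be illustrated with a minimal sketch. This is not the authors' implementation: it uses depth-1 trees (decision stumps) in plain Python rather than full decision trees, and the toy data, class means, and all function names (`fit_stump`, `fit_bagged`, `predict_bagged`) are assumptions made up for illustration.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a depth-1 tree: the single threshold split with the fewest training errors.

    rows is a list of ((gc_content, coverage), label) pairs.
    Returns (feature_index, threshold, left_label, right_label).
    """
    best = None
    for feat in (0, 1):  # 0 = GC content, 1 = coverage
        values = sorted({x[feat] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for x, y in rows if x[feat] <= thr]
            right = [y for x, y in rows if x[feat] > thr]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, feat, thr, l_lab, r_lab)
    if best is None:  # no split possible: always predict the majority label
        maj = Counter(y for _, y in rows).most_common(1)[0][0]
        return (0, float("inf"), maj, maj)
    return best[1:]

def predict_stump(stump, x):
    feat, thr, left_label, right_label = stump
    return left_label if x[feat] <= thr else right_label

def fit_bagged(rows, n_trees=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]

def predict_bagged(stumps, x):
    """Majority vote over the individual stump predictions."""
    votes = Counter(predict_stump(s, x) for s in stumps)
    return votes.most_common(1)[0][0]

# Toy scaffolds (made-up values): the target genome has lower GC and higher
# coverage, the microbial contaminant is GC-rich with lower coverage.
rng = random.Random(42)
train = (
    [((rng.gauss(0.36, 0.03), rng.gauss(50, 8)), "target") for _ in range(100)]
    + [((rng.gauss(0.65, 0.03), rng.gauss(10, 3)), "contaminant") for _ in range(100)]
)
model = fit_bagged(train)
print(predict_bagged(model, (0.35, 48.0)))
print(predict_bagged(model, (0.66, 9.0)))
```

Because the two toy classes are well separated in GC content, a single threshold already splits them cleanly, which mirrors the caption's point that a simple tree suffices when the contaminants are GC-rich relative to the target organism.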