Table 3 Analysis of contigs removed from the scaffolding process

From: SOPRA: Scaffolding algorithm for paired reads via statistical optimization

 

| | E. coli, V-SOPRA | E. coli, S-SOPRA | P. syringae, V-SOPRA | P. syringae, S-SOPRA |
|---|---|---|---|---|
| Total number of removed contigs | 106 | 338 | 61 | 189 |
| Total genomic length of removed contigs (% of total assembly) | 192 kb (4.1%) | 313 kb (6.7%) | 77 kb (1.3%) | 272 kb (4.5%) |
| Number of problematic contigs | 58 | 128 | 60 | 164 |
| Total genomic length of problematic contigs (% of total assembly) | 130 kb (2.8%) | 184 kb (3.9%) | 76 kb (1.2%) | 233 kb (3.8%) |

  1. Problematic contigs are contigs that are chimeric, belong to repeats, or do not match the reference genome. Genomic length means that, for repeats, the contig length is multiplied by the corresponding copy number.
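
The genomic-length convention in the footnote can be illustrated with a minimal sketch. The `Contig` class, the `genomic_length` helper, and the contig names, lengths, and copy numbers below are all hypothetical and chosen only for illustration; they are not taken from the paper or from the SOPRA code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Contig:
    """Hypothetical record for one contig (illustration only)."""
    name: str
    length_bp: int        # contig length in base pairs
    copy_number: int = 1  # copies in the genome; 1 for unique sequence


def genomic_length(contigs: Iterable[Contig]) -> int:
    """Sum contig lengths, weighting each by its copy number.

    This follows the footnote's convention: a repeat contig counts
    once per copy, a unique contig counts once.
    """
    return sum(c.length_bp * c.copy_number for c in contigs)


# Made-up example values, not data from the table above.
removed = [
    Contig("contig_a", 1200),                 # unique sequence
    Contig("contig_b", 800, copy_number=3),   # repeat with 3 copies
    Contig("contig_c", 500, copy_number=2),   # repeat with 2 copies
]

# 1200 + 800*3 + 500*2 = 4600
print(genomic_length(removed))
```

Under this convention, a short repeat with a high copy number can contribute more genomic length than a longer unique contig, which is one reason the table reports contig counts and genomic lengths separately.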