Table 1 Real data sets

From: Efficient alignment of pyrosequencing reads for re-sequencing applications

| Reference genome | Source | Reference genome size | SRA accession and species | Total reads | Average read length (bp) |
|---|---|---|---|---|---|
| S. pneumoniae | ATCC 700669 [GenBank: FM211187] | ≈2.2 Mbp | SRR001327 S. pneumoniae CDC1873-00; SRR001328 S. pneumoniae SP195; SRR001329 S. pneumoniae CDC0288-04 | 646,724 | 253 |
| E. coli | O127:H6 E2348/69 [GenBank: FM180568.1] | ≈4.96 Mbp | SRR000868 E. coli K-12; SRR000870 E. coli K-12; SRR031369 E. coli ETEC WS3080A; SRR031370 E. coli ETEC TW03576 | 588,397 | 263 |
| P. falciparum | 3D7 PlasmoDB rel. 7.0 | ≈23.3 Mbp | SRR006911 P. falciparum 3D7; SRR006912 P. falciparum 3D7; SRR006913 P. falciparum 3D7; SRR006914 P. falciparum 3D7; SRR006915 P. falciparum 3D7 | 203,196 | 223 |
| C. elegans | WormDB rel. WS210 | ≈103 Mbp | SRR022943 C. elegans Lynch MA41 mutation-accumulation line derived from N2 | 3,214,353 | 103 |
| D. pseudoobscura | FlyBase rel. 2.14 | ≈150 Mbp | SRR003807 D. pseudoobscura Flagstaff 1993; SRR014458 D. pseudoobscura bogotana ER (white); SRR014459 D. pseudoobscura bogotana ER (white); SRR014460 D. miranda strain Mather 1993 | 834,659 | 239 |
| H. sapiens Chr. 15 | ENSEMBL ver. GRCh37 | ≈100 Mbp | SRR014420 Human individual NA15510; SRR014421 Human individual NA15510; SRR014422 Human individual NA15510; SRR014423 Human individual NA15510; SRR014424 Human individual NA15510; SRR014425 Human individual NA15510 | 3,204 | 212 |

  1. Biological data sets used for the evaluation of the algorithms. The read data sets were downloaded from the Sequence Read Archive (SRA) public repository.
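
As a rough reading aid for the table, the sketch below estimates an upper bound on mean sequencing depth for each data set as total reads × average read length ÷ reference genome size. This calculation is not taken from the article. It assumes that the "Total reads" figure pools all listed SRA runs for a reference and that every read aligns, so real depth would be lower. The names `DATASETS` and `approx_depth` are illustrative only.

```python
# Values transcribed from Table 1: (genome size in bp, total reads, average read length in bp).
# Genome sizes are the approximate figures given in the table.
DATASETS = {
    "S. pneumoniae": (2.2e6, 646_724, 253),
    "E. coli": (4.96e6, 588_397, 263),
    "P. falciparum": (23.3e6, 203_196, 223),
    "C. elegans": (103e6, 3_214_353, 103),
    "D. pseudoobscura": (150e6, 834_659, 239),
    "H. sapiens Chr. 15": (100e6, 3_204, 212),
}


def approx_depth(genome_size_bp, total_reads, avg_read_len):
    """Upper-bound mean depth: total sequenced bases divided by genome size.

    Assumption (not from the article): all reads map to the reference.
    """
    return total_reads * avg_read_len / genome_size_bp


if __name__ == "__main__":
    for name, (size, reads, length) in DATASETS.items():
        print(f"{name:20s} {approx_depth(size, reads, length):10.4f}x")
```

Under these assumptions the S. pneumoniae set comes out at roughly 74x; the other rows can be read off the printed output in the same way.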