Table 1 Real datasets used for the evaluation of the error correction tools

From: Illumina error correction near highly repetitive DNA regions improves de novo genome assembly

| Abbr. | Organism | Reference ID | Reference size | Platform | Insert size mean (bp) | Insert size STD (bp) | Cov. | Number of reads | Mean read length | Ref. | Dataset ID |
|---|---|---|---|---|---|---|---|---|---|---|---|
| R1 | Homo sapiens chr. 21 | HG19 | 40 Mbp | Illumina | 312 | 14 | 33 × | 13 486 136 | 100 bp | [10, 22] | Ill. Data library |
| R2 | Homo sapiens chr. 14 | HG14 | 104 Mbp | Illumina | 158 | 17 | 35 × | 36 504 800 | 101 bp | [12] | GAGE |
| R3 | Caenorhabditis elegans | WS222 | 97 Mbp | Illumina | 173 | 16 | 58 × | 57 721 732 | 101 bp | [5, 22, 37] | SRR543736 |
| R4 | Drosophila melanogaster | Release 5 | 116 Mbp | Illumina | 281 | 92 | 52 × | 63 014 762 | 100 bp | [5, 22, 37] | SRR823377 |
| R5 | Drosophila melanogaster | Release 5 | 116 Mbp | Illumina | 598 | 39 | 64 × | 75 938 276 | 101 bp | [5, 37] | SRR988075 |
| R6 | Arabidopsis thaliana | TAIR10 | 116 Mbp | Illumina | 477 | 18 | 72 × | 93 429 346 | 90 bp | [38] | SRR1174256 |
| P1 | Drosophila melanogaster | Release 5 | 116 Mbp | PacBio | n/a | n/a | 10 × | 169 923 | 7374 bp | [39] | SRR1204466 |
| P2 | Arabidopsis thaliana | TAIR10 | 116 Mbp | PacBio | n/a | n/a | 13 × | 187 292 | 8298 bp | [39] | SRR1284707 |
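
The coverage column can be sanity-checked against the other columns with the usual estimate: coverage ≈ (number of reads × mean read length) / reference size. The sketch below is only an illustration under that assumption; the table does not state how its coverage values were computed, and the reference sizes are rounded, so the estimates land near the reported values rather than exactly on them. The names `DATASETS` and `estimated_coverage` are hypothetical helpers, not part of the original work.

```python
# Values copied from Table 1:
# (reference size in Mbp, number of reads, mean read length in bp, reported coverage)
DATASETS = {
    "R1": (40, 13_486_136, 100, 33),
    "R2": (104, 36_504_800, 101, 35),
    "R3": (97, 57_721_732, 101, 58),
    "R4": (116, 63_014_762, 100, 52),
    "R5": (116, 75_938_276, 101, 64),
    "R6": (116, 93_429_346, 90, 72),
    "P1": (116, 169_923, 7374, 10),
    "P2": (116, 187_292, 8298, 13),
}


def estimated_coverage(num_reads, mean_read_length_bp, reference_size_mbp):
    """Estimate sequencing depth as total sequenced bases / reference length.

    Assumption: this is the standard back-of-the-envelope formula, not
    necessarily the exact procedure used to produce the table.
    """
    total_bases = num_reads * mean_read_length_bp
    return total_bases / (reference_size_mbp * 1_000_000)


if __name__ == "__main__":
    for name, (size_mbp, reads, read_len, reported) in DATASETS.items():
        est = estimated_coverage(reads, read_len, size_mbp)
        print(f"{name}: estimated {est:.1f}x, reported {reported}x")
```

Small gaps between the estimated and reported values are expected, because the reference sizes in the table are rounded to whole Mbp.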