Table 1 List of simulated datasets (S1–S4) and real-world datasets (R1–R3)

From: CARE 2.0: reducing false-positive sequencing error corrections using machine learning

Name  Organism        Coverage  Reads
S1    C. elegans      30x       30.1M
S2    D. melanogaster 30x       36.0M
S3    Hum. Chr. 14    30x       26.5M
S4    Human           30x       914.7M
R1    C. elegans      58x       57.7M
R2    D. melanogaster 64x       75.9M
R3    Hum. Chr. 14    35x       36.5M

  1. Simulated reads have length 100. Real reads have length 101. Accession numbers for R1 and R2 are SRR543736 and SRR988075, respectively. R3 is taken from GAGE (Genome Assembly Gold-standard Evaluations).
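
The coverage, read count, and read length listed for each dataset are tied together by the usual definition of sequencing depth, coverage = (number of reads × read length) / genome size. As a rough sanity check, the sketch below rearranges that relation to estimate the genome size implied by each row. It assumes this standard definition applies to the table; the function name `implied_genome_size` and the `DATASETS` list are illustrative and not part of the source. Because the table values are rounded, the results are approximate.

```python
def implied_genome_size(num_reads: float, read_length: int, coverage: float) -> float:
    """Genome size (in bases) implied by coverage = num_reads * read_length / genome_size."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return num_reads * read_length / coverage


# (name, number of reads, read length, coverage) copied from Table 1 and its footnote.
# Simulated reads (S1-S4) have length 100; real reads (R1-R3) have length 101.
DATASETS = [
    ("S1", 30.1e6, 100, 30),
    ("S2", 36.0e6, 100, 30),
    ("S3", 26.5e6, 100, 30),
    ("S4", 914.7e6, 100, 30),
    ("R1", 57.7e6, 101, 58),
    ("R2", 75.9e6, 101, 64),
    ("R3", 36.5e6, 101, 35),
]

if __name__ == "__main__":
    for name, reads, length, cov in DATASETS:
        size_mb = implied_genome_size(reads, length, cov) / 1e6
        print(f"{name}: ~{size_mb:.1f} Mb implied genome size")
```

The implied sizes depend on how coverage was rounded and on which reference length was used (for example, with or without unresolved bases), so they should be read as order-of-magnitude checks rather than exact genome lengths.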