Table 1 Metrics of repeats on five datasets

From: RepAHR: an improved approach for de novo repeat identification by assembly of the high-frequency reads

| Species | Method | Num | Size (kbp) | Max/min | N50 (bp) | N90 (bp) |
|---|---|---|---|---|---|---|
| Drosophila melanogaster | RepARK CLC | 818 | 518 | 6833/200 | 1040 | 255 |
| | RepARK Velvet | 4561 | 873 | 7587/57 | 285 | 87 |
| | REPdenovo | 52 | 61 | 8339/102 | 2843 | 397 |
| | RepAHR | 2647 | 1350 | 12,787/56 | 1350 | 98 |
| Saccharomyces cerevisiae | RepARK CLC | 545 | 291 | 8271/200 | 626 | 266 |
| | RepARK Velvet | 1457 | 394 | 9129/57 | 423 | 111 |
| | REPdenovo | 3 | 0.72 | 265/213 | 258 | 213 |
| | RepAHR | 392 | 219 | 9523/128 | 2089 | 183 |
| Acromyrmex echinatior | RepARK CLC | 485 | 249 | 8272/200 | 659 | 240 |
| | RepARK Velvet | 3931 | 559 | 5547/57 | 160 | 68 |
| | REPdenovo | 249 | 99 | 2143/100 | 597 | 182 |
| | RepAHR | 2699 | 514 | 10,701/88 | 285 | 86 |
| Homo sapiens chr14 | RepARK CLC | 105 | 29 | 594/201 | 273 | 216 |
| | RepARK Velvet | 846 | 106 | 574/57 | 140 | 80 |
| | REPdenovo | 14 | 9 | 5545/101 | 5545 | 211 |
| | RepAHR | 1738 | 219 | 2177/45 | 213 | 61 |
| Mus musculus | RepARK CLC | 3839 | 1835 | 17,062/200 | 565 | 236 |
| | RepARK Velvet | 47,232 | 2302 | 16,526/57 | 129 | 57 |
| | REPdenovo | 9376 | 12,652 | 14,827/100 | 3129 | 848 |
| | RepAHR | 77,891 | 19,201 | 28,893/150 | 503 | 222 |

  1. Num is the number of repeats. Size is the total length of all repeats. Max and Min are the lengths of the longest and shortest segments among the repeats, respectively. N50 (or N90) is the largest length L such that the segments of length at least L together cover at least 50% (or 90%) of the total length of the assembly.
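
The metric definitions in the footnote can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the table. The function names `nxx` and `repeat_metrics` are hypothetical, the input lengths are made up, and the size conversion assumes 1 kbp = 1000 bp, which the source does not state.

```python
from typing import Dict, List


def nxx(lengths: List[int], percent: int) -> int:
    """Return the largest length L such that sequences of length >= L
    together cover at least `percent` % of the total length.

    percent=50 gives N50, percent=90 gives N90.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    covered = 0
    # Walk from the longest sequence down, accumulating covered length.
    for length in sorted(lengths, reverse=True):
        covered += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if covered * 100 >= total * percent:
            return length
    return min(lengths)  # only reached if percent > 100


def repeat_metrics(lengths: List[int]) -> Dict[str, float]:
    """Summarise a set of repeat lengths (in bp) with the table's metrics."""
    return {
        "num": len(lengths),
        "size_kbp": sum(lengths) / 1000,  # assumption: 1 kbp = 1000 bp
        "max": max(lengths),
        "min": min(lengths),
        "n50": nxx(lengths, 50),
        "n90": nxx(lengths, 90),
    }


if __name__ == "__main__":
    example_lengths = [100, 200, 300, 400, 500]  # hypothetical lengths in bp
    print(repeat_metrics(example_lengths))
```

For the hypothetical input, the total is 1500 bp. The two longest sequences (500 + 400 = 900 bp) already cover half of it, so N50 is 400. Reaching 90% (1350 bp) requires the four longest sequences, so N90 is 200.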