Table 3 The output statistics and derived statistics from running five pairs of NGS datasets through the SA_Run2Run workflow.

From: SeqAssist: a novel toolkit for preliminary analysis of next-generation sequencing data

| NGS datasets | Ecoli_I4M_R1 (Run1) | Ecoli_I4M_R1 (Run2) | Ecoli_I4M_R2 (Run1) | Ecoli_I4M_R2 (Run2) | Ecoli_454_500K (Run1) | Ecoli_454_500K (Run2) | ECT_R1 (Run1) | ECT_rerun_R1 (Run2) | ECT_R2 (Run1) | ECT_rerun_R2 (Run2) |
|---|---|---|---|---|---|---|---|---|---|---|
| Output statistics | | | | | | | | | | |
| Total number of raw reads in the run | 2,000,000 | 2,000,000 | 2,000,000 | 2,000,000 | 250,000 | 250,000 | 7,575,822 | 7,064,035 | 7,575,822 | 7,064,035 |
| Total number of cleaned reads in the run | 1,968,732 | 1,997,550 | 1,999,692 | 1,999,283 | 231,123 | 231,245 | 7,538,930 | 7,046,481 | 7,542,743 | 7,046,396 |
| Number of unique reads in the run (after removing identical redundant reads) | 1,487,552 | 1,482,834 | 1,450,704 | 1,405,779 | 224,217 | 224,537 | 7,114,791 | 6,702,601 | 7,242,351 | 6,839,442 |
| Number of unique reads in the run (after removing identical & inclusive redundant reads) | 1,487,552 | 1,482,834 | 1,450,704 | 1,405,779 | 221,379 | 221,622 | 6,885,175 | 6,407,251 | 6,945,743 | 6,546,440 |
| Total number of overlapping reads in the run | 712,022 | 730,780 | 810,225 | 834,999 | 18,304 | 18,321 | 950,696 | 941,648 | 807,790 | 770,963 |
| Number of unique overlapping reads in the run | 360,898 | 360,898 | 403,927 | 403,927 | 15,786 | 15,786 | 621,978 | 617,458 | 537,199 | 530,419 |
| Number of unique overlapping reads from both runs | 360,898 | | 403,927 | | 15,786 | | 625,044 | | 538,561 | |
| Derived statistics | | | | | | | | | | |
| File size after preprocessing | 266 MB | 266 MB | 266 MB | 266 MB | 216 MB | 216 MB | 2.6 GB | 2.4 GB | 2.5 GB | 2.4 GB |
| Number of redundant cleaned reads in the run | 481,180 | 514,716 | 548,988 | 593,504 | 9,744 | 9,623 | 653,755 | 639,230 | 597,000 | 499,956 |
| Redundancy rate within the run | 32.3% | 34.7% | 37.8% | 42.2% | 4.4% | 4.3% | 9.5% | 10.0% | 8.6% | 7.6% |
| Total number of non-overlapping reads in the run | 1,256,710 | 1,266,770 | 1,189,467 | 1,164,284 | 212,819 | 212,924 | 6,588,234 | 6,104,833 | 6,734,953 | 6,275,433 |
| Number of unique non-overlapping reads in the run | 1,126,654 | 1,121,936 | 1,046,777 | 1,001,852 | 205,593 | 205,836 | 6,263,197 | 5,789,793 | 6,408,544 | 6,016,021 |
| Number of redundant non-overlapping reads in the run | 130,056 | 144,834 | 142,690 | 162,432 | 7,226 | 7,088 | 325,037 | 315,040 | 326,409 | 259,412 |
| Redundancy of non-overlapping reads in the run | 11.5% | 12.9% | 13.6% | 16.2% | 3.5% | 3.4% | 5.2% | 5.4% | 5.1% | 4.3% |
| Number of redundant overlapping reads in the run | 351,124 | 369,882 | 406,298 | 431,072 | 2,518 | 2,535 | 328,718 | 324,190 | 270,591 | 240,544 |
| Redundancy of overlapping reads in the run | 97.3% | 102.5% | 100.6% | 106.7% | 16.0% | 16.1% | 52.9% | 52.5% | 50.4% | 45.3% |
| Total overlapping reads/total cleaned reads (each run) | 36.2% | 36.6% | 40.5% | 41.8% | 7.9% | 7.9% | 12.6% | 13.4% | 10.7% | 10.9% |
| Total overlapping reads/total cleaned reads (both runs) | 36.4% | | 41.1% | | 7.9% | | 13.0% | | 10.8% | |
| Total runtime (min) | 13.3 | | 13.0 | | 8.2 | | 106.3 | | 110.8 | |

Rows with one value per pair of runs (unique overlapping reads from both runs, the both-runs overlap ratio, and total runtime) show that value in the Run1 column of the pair.

  1. Overlapping reads are defined as those found in both runs. Unique overlapping reads from both runs are those left after removing redundant reads (identical or inclusive) from the combined overlapping reads from both runs.
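
The derived statistics appear to follow from the output statistics by simple arithmetic. The Python sketch below reproduces the Ecoli_I4M_R1 columns under formulas inferred from the table values, not taken from the SeqAssist source; `RunStats`, `derive`, and `pct` are hypothetical names introduced for illustration. It assumes that "unique reads" means the count after removing identical and inclusive redundant reads, and that each redundancy rate divides redundant reads by unique reads, which would explain why some rates exceed 100%.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Output statistics for one run (hypothetical container)."""
    cleaned: int             # total cleaned reads in the run
    unique: int              # unique reads (identical & inclusive duplicates removed)
    overlapping: int         # total overlapping reads in the run
    unique_overlapping: int  # unique overlapping reads in the run


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the table."""
    return round(100 * numerator / denominator, 1)


def derive(run: RunStats) -> dict:
    """Derive per-run statistics; formulas are inferred from the table values."""
    redundant = run.cleaned - run.unique
    non_overlapping = run.cleaned - run.overlapping
    unique_non_overlapping = run.unique - run.unique_overlapping
    redundant_non_overlapping = non_overlapping - unique_non_overlapping
    redundant_overlapping = run.overlapping - run.unique_overlapping
    return {
        "redundant": redundant,
        "redundancy_rate": pct(redundant, run.unique),
        "non_overlapping": non_overlapping,
        "unique_non_overlapping": unique_non_overlapping,
        "redundant_non_overlapping": redundant_non_overlapping,
        "non_overlapping_redundancy": pct(redundant_non_overlapping,
                                          unique_non_overlapping),
        "redundant_overlapping": redundant_overlapping,
        "overlapping_redundancy": pct(redundant_overlapping,
                                      run.unique_overlapping),
        "overlap_share": pct(run.overlapping, run.cleaned),
    }


# Ecoli_I4M_R1, Run1 and Run2 (values copied from the output statistics rows)
run1 = RunStats(cleaned=1_968_732, unique=1_487_552,
                overlapping=712_022, unique_overlapping=360_898)
run2 = RunStats(cleaned=1_997_550, unique=1_482_834,
                overlapping=730_780, unique_overlapping=360_898)

for name, run in (("Run1", run1), ("Run2", run2)):
    print(name, derive(run))

# Overlap ratio across both runs of the pair
both_runs_share = pct(run1.overlapping + run2.overlapping,
                      run1.cleaned + run2.cleaned)
print("Both runs:", both_runs_share)  # 36.4
```

For this pair the sketch gives 481,180 redundant cleaned reads and a 32.3% redundancy rate for Run1, and a 102.5% redundancy of overlapping reads for Run2, matching the corresponding table entries.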