
Table 1 Genome projects with full genomes from Refseq and pyrosequencing reads from Short Read Archive

From: Artificial and natural duplicates in pyrosequencing reads of metagenomic data

| IDᵃ | SRA Studyᵃ | SRA Runᵃ | Platform | Genome | Genome size (Mbp) | GC (%) | Number of reads | Read Densityᵇ | % of total duplicates | % of natural duplicates (σᵈ) |
|---|---|---|---|---|---|---|---|---|---|---|
| 20067 | SRP000091 | SRR000351 | GS_20 | NC_010741 | 1.13946 | 52 | 529181 | 0.4644 | 13.585 | 5.032 (0.030) |
| 20739 | SRP000868 | SRR017616 | GS_FLX | NC_013170 | 1.61780 | 50 | 513712 | 0.3175 | 17.751 | 4.999 (0.022) |
| 29525 | SRP000571 | SRR013433 | GS_FLX | NC_013124 | 2.15816 | 68 | 570098 | 0.2641 | 9.938 | 4.464 (0.034) |
| 19265 | SRP000036 | SRR000223 | GS_20 | NC_010085 | 1.64526 | 34 | 429372 | 0.2609 | 12.120 | 4.418 (0.026) |
| 19981 | SRP000204 | SRR001584 | GS_20 | NC_010830 | 1.88436 | 35 | 399515 | 0.2120 | 26.027 | 4.293 (0.030) |
| 20655 | SRP000207 | SRR001568 | GS_20 | NC_012803 | 2.50109 | 72 | 528437 | 0.2112 | 9.734 | 4.099 (0.032) |
| 20833 | SRP000867 | SRR017612 | GS_FLXᵉ | NC_013174 | 2.74965 | 58 | 574027 | 0.2087 | 16.266 | 3.745 (0.027) |
| 18819 | SRP000035 | SRR000219 | GS_20 | NC_009637 | 1.77269 | 33 | 332809 | 0.1877 | 15.145 | 3.664 (0.024) |
| 29443 | SRP000895 | SRR017790 | GS_FLX | NC_013166 | 2.85207 | 43 | 529344 | 0.1855 | 16.017 | 3.044 (0.025) |
| 29419 | SRP000560 | SRR013388 | GS_FLX | NC_012785 | 2.30212 | 41 | 416146 | 0.1807 | 9.537 | 3.025 (0.031) |
| 19543 | SRP000205 | SRR001565 | GS_20 | NC_010483 | 1.87769 | 46 | 321938 | 0.1714 | 25.373 | 2.886 (0.038) |
| 29381 | SRP000558 | SRR013382 | GS_FLX | NC_011832 | 2.92292 | 55 | 461295 | 0.1578 | 8.526 | 2.911 (0.024) |
| 29403 | SRP000584 | SRR013477 | GS_FLX | NC_013162 | 2.61292 | 39 | 400460 | 0.1532 | 11.796 | 2.549 (0.021) |
| 29177 | SRP000442 | SRR007446 | GS_FLX | NC_011901 | 3.46455 | 65 | 438386 | 0.1265 | 14.140 | 2.239 (0.023) |
| 29493 | SRP000569 | SRR013431 | GS_FLX | NC_011883 | 2.87344 | 58 | 362855 | 0.1262 | 11.171 | 2.204 (0.023) |
| 29175 | SRP000928 | SRR018125 | GS_FLX | NC_011661 | 1.85556 | 33 | 225795 | 0.1216 | 6.209 | 2.151 (0.021) |
| 27731 | SRP000397 | SRR006411 | GS_FLX | NC_011769 | 4.04030 | 67 | 488823 | 0.1209 | 18.793 | 2.073 (0.020) |
| 31289 | SRP000919 | SRR018042 | GS_FLX | NC_012917 | 4.86291 | 51 | 517593 | 0.1064 | 3.211 | 1.948 (0.037) |
| 20635 | SRP000049 | SRR000266 | GS_20 | NC_011666 | 4.30543 | 63 | 401125 | 0.0931 | 10.364 | 1.943 (0.022) |
| 31295 | SRP000921 | SRR018051 | GS_FLX | NC_012912 | 4.81385 | 54 | 441287 | 0.0916 | 6.938 | 1.941 (0.019) |
| 29527 | SRP000893 | SRR017783 | GS_FLX | NC_013173 | 3.94266 | 58 | 352814 | 0.0894 | 8.356 | 1.926 (0.022) |
| 20039 | SRP000209 | SRR001574 | GS_FLXᶠ | NC_010524 | 4.90940 | 68 | 422674 | 0.0860 | 8.566 | 1.785 (0.023) |
| 19701 | SRP000046 | SRR000255 | GS_20 | NC_010644 | 1.64356 | 39 | 136514 | 0.0830 | 5.922 | 1.744 (0.022) |
| 19743 | SRP000045 | SRR000254 | GS_20 | NC_011145 | 5.06163 | 74 | 409136 | 0.0808 | 7.464 | 1.739 (0.019) |
| 20095 | SRP000054 | SRR000278 | GS_20 | NC_011891 | 5.02933 | 74 | 404796 | 0.0804 | 8.363 | 1.515 (0.022) |
| 30681 | SRP000922 | SRR018054 | GS_FLX | NC_012947 | 4.57094 | 50 | 367491 | 0.0803 | 11.324 | 1.449 (0.018) |
| 21119 | SRP000208 | SRR001573 | GS_FLXᶠ | NC_012032 | 5.26895 | 56 | 392222 | 0.0744 | 10.026 | 1.321 (0.025) |
| 18637 | SRP000034 | SRR000215 | GS_20 | NC_010172 | 5.47115 | 68 | 395973 | 0.0723 | 12.998 | 1.306 (0.016) |
| 20167 | SRP000053 | SRR000277 | GS_20 | NC_011004 | 5.74404 | 64 | 413261 | 0.0719 | 14.572 | 1.248 (0.022) |
| 19989 | SRP000211 | SRR001579 | GS_20 | NC_010571 | 5.95761 | 65 | 378824 | 0.0635 | 5.484 | 1.145 (0.023) |
| 19449 | SRP000043 | SRR000248 | GS_20 | NC_011768 | 6.51707 | 54 | 395672 | 0.0607 | 16.631 | 1.108 (0.018) |
| 33873 | SRP000554 | SRR013372 | GS_FLX | NC_012691 | 3.47129 | 49 | 191873 | 0.0552 | 43.680 | 1.001 (0.025) |
| 27951 | SRP000587 | SRR013487 | GS_FLX | NC_013132 | 9.12735 | 45 | 496792 | 0.0544 | 15.017 | 0.283 (0.033) |
| 20827 | SRP000582 | SRR013470 | GS_FLX | NC_012669 | 4.66918 | 73 | 246279 | 0.0527 | 4.140 | 0.295 (0.023) |
| 33069 | SRP000920 | SRR018045 | GS_FLX | NC_012880 | 4.67945 | 55 | 226208 | 0.0483 | 13.585 | 5.032 (0.030) |
| 19705 | SRP000576 | SRR013446 | GS_FLX | NC_013093 | 8.24814 | 73 | 381851 | 0.0462 | 17.751 | 4.999 (0.022) |
| 29975 | SRP000443 | SRR013137 | GS_FLX | NC_011992 | 3.79657 | 66 | 161655 | 0.0425 | 9.938 | 4.464 (0.034) |
| 17265 | SRP000067 | SRR000311 | GS_20 | NC_008369 | 1.89572 | 32 | 28221 | 0.0148 | 12.120 | 4.418 (0.026) |
| 20729 | SRP000267 | SRR004103 | GS_FLX | NC_012918 | 4.74581 | 60 | 22822 | 0.0048 | 26.027 | 4.293 (0.030) |

  a. Project IDs, SRA study accessions, and SRA run accessions are from NCBI Short Read Archive at http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi.
  b. Read Density is the number of reads divided by the genome length.
  d. σ is the standard deviation, which is based on the results of 100 simulations (see the "Duplicated reads of genomic datasets" section).
  e. The platform provided by SRA is GS_FLX, and the read length (~400 bp) suggests GS_FLX Titanium.
  f. The platform provided by SRA is GS_20, but the read length (~200 bp) suggests GS_FLX.
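
The Read Density definition in the footnotes can be checked against the table values with a short calculation. The sketch below assumes the genome length is taken in base pairs (genome size in Mbp multiplied by 10^6), which reproduces the tabulated densities; the function name `read_density` is hypothetical and not from the article.

```python
def read_density(num_reads: int, genome_size_mbp: float) -> float:
    """Reads per base pair: number of reads divided by genome length.

    Assumption: genome length is measured in bp, i.e. size in Mbp * 1e6.
    """
    genome_length_bp = genome_size_mbp * 1_000_000
    return num_reads / genome_length_bp


# First row of Table 1: 529181 reads on a 1.13946 Mbp genome.
density = read_density(529181, 1.13946)
print(round(density, 4))  # 0.4644, matching the Read Density column
```

The same call with the last row's values (22822 reads, 4.74581 Mbp) gives roughly 0.0048, again in line with the table.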