Table 4 Missing data fraction generated by each GBS pipeline

From: A comparison of genotyping-by-sequencing analysis methods on low-coverage crop datasets shows advantages of a new workflow, GB-eaSy

 

| Population | Measure | TASSEL | IGST | Fast-GBS | Stacks | GB-eaSy |
|---|---|---|---|---|---|---|
| Population 1 | Missing data per line | 84.5% | 85.4% | 85.0% | 89.7% | 83.4% |
| | SNPs in 25% of lines | 6812 | 12,334 | 18,731 | 3576 | 23,633 |
| | SNPs in 50% of lines | 1237 | 1714 | 2984 | 202 | 3558 |
| | SNPs in 75% of lines | 736 | 112 | 382 | 31 | 407 |
| | SNPs in 90% of lines | 335 | 25 | 75 | 2 | 119 |
| Population 2 | Missing data per line | 59.4% | 70.8% | 70.0% | 66.1% | 71.5% |
| | SNPs in 25% of lines | 65,119 | 68,805 | 122,801 | 142,154 | 120,437 |
| | SNPs in 50% of lines | 35,107 | 39,055 | 76,485 | 52,991 | 76,717 |
| | SNPs in 75% of lines | 2185 | 1548 | 4418 | 372 | 4880 |
| | SNPs in 90% of lines | 973 | 26 | 219 | 21 | 187 |
| Population 3 | Missing data per line | 62.4% | 69.3% | 68.4% | 67.2% | 69.6% |
| | SNPs in 25% of lines | 54,960 | 65,695 | 88,904 | 69,300 | 88,025 |
| | SNPs in 50% of lines | 18,859 | 22,369 | 32,077 | 19,756 | 32,698 |
| | SNPs in 75% of lines | 6196 | 7813 | 12,204 | 4539 | 13,005 |
| | SNPs in 90% of lines | 775 | 479 | 934 | 98 | 1352 |

The average percentage of missing data per line is shown, along with the number of SNPs detected in each stated proportion of lines within each population.
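
To make the two kinds of measure in the table concrete, here is a minimal Python sketch of how they could be computed from a genotype matrix. It is not the code used by any of the pipelines compared here. The function names, the `None`-for-missing encoding, and the toy data are all hypothetical, and reading "SNPs in X% of lines" as "called in at least X% of lines" is an assumption.

```python
from typing import Dict, Optional, Sequence

# Assumption: a genotype call is a string such as "A/T", and None marks a missing call.
Genotype = Optional[str]


def mean_missing_per_line(calls: Sequence[Sequence[Genotype]]) -> float:
    """Average, across lines, of the fraction of SNPs with no call.

    calls[i][j] is the call for SNP i in line (sample) j.
    """
    n_snps = len(calls)
    n_lines = len(calls[0])
    per_line = [
        sum(1 for snp in calls if snp[j] is None) / n_snps
        for j in range(n_lines)
    ]
    return sum(per_line) / n_lines


def snps_at_call_rate(
    calls: Sequence[Sequence[Genotype]],
    thresholds: Sequence[float] = (0.25, 0.50, 0.75, 0.90),
) -> Dict[float, int]:
    """Count SNPs called in at least each threshold fraction of lines."""
    counts = {t: 0 for t in thresholds}
    for snp in calls:
        call_rate = sum(1 for g in snp if g is not None) / len(snp)
        for t in thresholds:
            if call_rate >= t:
                counts[t] += 1
    return counts


# Toy data (made up for illustration): 4 SNPs (rows) by 4 lines (columns).
calls = [
    ["A/A", "A/T", None, "T/T"],   # called in 3 of 4 lines
    [None, None, None, "C/C"],     # called in 1 of 4 lines
    ["G/G", "G/G", "G/G", "G/G"],  # called in all lines
    [None, "A/A", None, None],     # called in 1 of 4 lines
]

missing = mean_missing_per_line(calls)
by_threshold = snps_at_call_rate(calls)
print(f"Mean missing data per line: {missing:.1%}")
for t, n in by_threshold.items():
    print(f"SNPs in {t:.0%} of lines: {n}")
```

Under the "at least" reading, the counts can only fall as the threshold rises, which matches the pattern in every column of the table.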