Table 2. Performance of GBS-SNP-CROP under three different sampling strategies for building the Mock Reference: using all 48 individuals in the population (MR48), using only the 5 individuals with the highest number of parsed reads (MR05), and using only the single most read-abundant genotype (MR01)

From: GBS-SNP-CROP: a reference-optional pipeline for SNP discovery and plant germplasm characterization using variable length, paired-end genotyping-by-sequencing data

| Pipeline | Total number of centroids used to build the Mock Reference (a) | Total number of paired-end reads used for SNP calling (b) | Number of SNPs called (c) | Avg. depth (d) | Hetero (%) (e) | Homo (%) (f) | Missing data (%) (g) | Time (hrs:mins) (h) |
|---|---|---|---|---|---|---|---|---|
| GBS-SNP-CROP-MR48 | 1,276,734 | 92,667,123 | 14,712 | 70.74 | 32.47 | 59.31 | 8.20 | 14:30 |
| GBS-SNP-CROP-MR05 | 500,795 | 132,920,383 | 20,226 | 71.02 | 34.50 | 57.18 | 8.31 | 12:06 |
| GBS-SNP-CROP-MR01 | 229,549 | 154,506,669 | 21,318 | 69.34 | 34.51 | 56.85 | 8.29 | 11:03 |

  a. Total number of non-redundant consensus sequences (centroids) identified via clustering to represent the GBS fragment space. This is also the number of FASTA entries in the “MockRef_Clusters.fasta” file.
  b. Number of reads retained by the pipeline after mapping procedures and thus used for SNP calling.
  c. Total number of SNPs called, given all SNP calling filters and genotyping criteria described in the text.
  d. Average read depth for all SNPs across the entire population.
  e. Percentage of heterozygous genotype calls.
  f. Percentage of homozygous genotype calls.
  g. Percentage of missing cells (i.e. no genotype call for a given SNP*accession combination) in the final SNP genotype matrix.
  h. Total computation time required for all pipeline analyses when executed on a Unix workstation with 16 GB RAM and a 2.6 GHz Dual Intel processor.
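
As an illustration of how the heterozygous, homozygous, and missing-data percentages in the table relate to a SNP genotype matrix, the following is a minimal Python sketch, not the pipeline's own code. The cell encoding (`"A/G"`-style calls, with `"-"` for a missing call), the function names, and the toy matrix are all assumptions made for this example. Because the three percentages in each row of the table sum to approximately 100, the sketch assumes they share one denominator: all SNP*accession cells.

```python
from collections import Counter

MISSING = "-"  # hypothetical missing-data symbol, assumed for this sketch


def classify_call(call):
    """Return 'missing', 'homo', or 'hetero' for one genotype cell."""
    if call == MISSING:
        return "missing"
    allele_1, allele_2 = call.split("/")
    return "homo" if allele_1 == allele_2 else "hetero"


def summarize_matrix(matrix):
    """Percentage of each call class over all SNP x accession cells."""
    counts = Counter(classify_call(cell) for row in matrix for cell in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("genotype matrix is empty")
    return {
        name: 100.0 * counts[name] / total
        for name in ("hetero", "homo", "missing")
    }


# Toy matrix (invented data): 2 SNPs (rows) x 4 accessions (columns).
toy_matrix = [
    ["A/A", "A/G", "G/G", "-"],
    ["C/T", "C/C", "C/C", "C/C"],
]

summary = summarize_matrix(toy_matrix)
print(summary)  # {'hetero': 25.0, 'homo': 62.5, 'missing': 12.5}
```

In the toy matrix, 2 of the 8 cells are heterozygous, 5 are homozygous, and 1 is missing, which gives the percentages shown in the final comment.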