Table 2. Summary of the number of proteins detected with at least 10 or at least 6 PSMs in the SIHUMIx proteogenomic dataset, using the 6-frame translation of the genomes

From: A workflow to identify novel proteins based on the direct mapping of peptide-spectrum-matches to genomic locations

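At least 10 PSMs per candidate

| Species | Nov | Hyp | Known | Detected (%) |
|---|---|---|---|---|
| B. theta. | 37 | 1975 | 248 | 45.9 |
| B. producta | 52 | 1138 | 132 | 23.2 |
| E. coli | 26 | 150 | 988 | 26.8 |
| E. ramosum | 10 | 355 | 53 | 13.7 |
| B. longum | 16 | 128 | 0 | 7.4 |
| A. caccae | 17 | 549 | 100 | 19.3 |
| L. plantarum | 31 | 83 | 28 | 3.7 |
| C. butyricum | 14 | 135 | 32 | 4.1 |

At least 6 PSMs per candidate

| Species | Nov | Hyp | Known | Detected (%) |
|---|---|---|---|---|
| B. theta. | 72 | 2118 | 256 | 49.0 |
| B. producta | 103 | 1289 | 143 | 26.1 |
| E. coli | 65 | 182 | 1127 | 30.9 |
| E. ramosum | 30 | 431 | 65 | 16.7 |
| B. longum | 42 | 170 | 0 | 9.8 |
| A. caccae | 39 | 632 | 119 | 22.3 |
| L. plantarum | 48 | 116 | 36 | 5.1 |
| C. butyricum | 26 | 176 | 39 | 5.3 |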
  1. Novel (nov) proteins are not contained within the annotation; hypothetical (hyp) proteins are annotated but tagged with low confidence (see the “Methods” section for details); known refers to all proteins whose available annotation carries a higher level of confidence. The eight species are ordered by decreasing abundance. The last column gives the percentage of the annotated proteins that were detected.
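The last column can be read as a simple ratio. The sketch below is one plausible reading, not the authors' stated computation: it assumes the percentage counts detected hypothetical plus known proteins over the total number of annotated proteins for a species, and that novel proteins are excluded because they are absent from the annotation. The function name `detected_percent` and the parameter `annotated_total` are hypothetical, and the table does not report the annotated totals, so the example call uses made-up numbers.

```python
def detected_percent(hyp: int, known: int, annotated_total: int) -> float:
    """Percentage of annotated proteins that were detected.

    Assumption: detected annotated proteins = hypothetical + known.
    Novel proteins are left out because they are not in the annotation.
    """
    if annotated_total <= 0:
        raise ValueError("annotated_total must be positive")
    return round(100.0 * (hyp + known) / annotated_total, 1)


# Hypothetical numbers for illustration only; not taken from Table 2.
print(detected_percent(hyp=150, known=50, annotated_total=1000))
```

Under this reading, lowering the PSM threshold from 10 to 6 can only raise the percentage, because the annotated total stays fixed while more annotated proteins pass the cutoff.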