
Table 2. Summary of the number of proteins detected with at least 10 or at least 6 PSMs in the SIHUMIx proteogenomic dataset, using the six-frame translation of the genomes

From: A workflow to identify novel proteins based on the direct mapping of peptide-spectrum-matches to genomic locations

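At least 10 PSMs per candidate

| Species | Nov | Hyp | Known | % |
| --- | --- | --- | --- | --- |
| B. theta. | 37 | 1975 | 248 | 45.9 |
| B. producta | 52 | 1138 | 132 | 23.2 |
| E. coli | 26 | 150 | 988 | 26.8 |
| E. ramosum | 10 | 355 | 53 | 13.7 |
| B. longum | 16 | 128 | 0 | 7.4 |
| A. caccae | 17 | 549 | 100 | 19.3 |
| L. plantarum | 31 | 83 | 28 | 3.7 |
| C. butyricum | 14 | 135 | 32 | 4.1 |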

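At least 6 PSMs per candidate

| Species | Nov | Hyp | Known | % |
| --- | --- | --- | --- | --- |
| B. theta. | 72 | 2118 | 256 | 49.0 |
| B. producta | 103 | 1289 | 143 | 26.1 |
| E. coli | 65 | 182 | 1127 | 30.9 |
| E. ramosum | 30 | 431 | 65 | 16.7 |
| B. longum | 42 | 170 | 0 | 9.8 |
| A. caccae | 39 | 632 | 119 | 22.3 |
| L. plantarum | 48 | 116 | 36 | 5.1 |
| C. butyricum | 26 | 176 | 39 | 5.3 |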

  1. Novel (Nov) proteins are not contained in the annotation. Hypothetical (Hyp) proteins are annotated but tagged with low confidence (see the “Methods” section for details). Known proteins are all those whose annotation carries a higher level of confidence. The eight species are ordered by decreasing abundance. The last column (%) gives the percentage of annotated proteins that were detected.