Computational approaches to protein inference in shotgun proteomics

Li, Yong Fuga; Radivojac, Predrag

doi:10.1186/1471-2105-13-S16-S4

BMC Bioinformatics

Table 1 Summary of notation and abbreviations used throughout this paper.


Notation	Description
	Set of all fragmentation spectra output by the mass spectrometer
	Set of spectra identified for peptide j
s	A single fragmentation spectrum,
$P_{i}$ or $i$	Protein $i$
$p_{j}$ or $j$	Peptide $j$
$p_{ij}$	Peptide $j$ derived from protein $i$; used to explicitly indicate the parent protein for peptide $j$
	Protein database, a set of proteins used for peptide and protein identification
	Peptide database, the set of all (tryptic) peptides derived from the protein database
	Set of peptides derived from protein $P_{i}$
$t_{j}$	Indicator variable, set to 1 if peptide $p_{j}$ is confidently identified
	Set of peptides that are confidently identified
$x_{j}$	Indicator variable, set to 1 if peptide $p_{j}$ is present in the sample
$y_{i}$	Indicator variable, set to 1 if protein $P_{i}$ is present in the sample
$x = (x_{1}, \ldots, x_{j}, \ldots)$	Indicator vector representing all peptides in the peptide database
$y = (y_{1}, \ldots, y_{i}, \ldots)$	Indicator vector representing all proteins in the protein database
$N(i)$	Set of peptides mapped to protein $P_{i}$
$N(j)$	Set of proteins that contain peptide $p_{j}$
$x_{N(i)}$	Indicator vector representing peptides in $N(i)$
	Peptide identification probability, the probability that peptide j is present in the sample given the spectra identified for peptide j
$P(x_{j} = 1 \mid s)$	The probability that the PSM is correct when peptide $j$ is the top-scoring match of spectrum $s$
	Protein posterior probabilities, the probability that protein i is present in the sample given all spectra
$d_{ij}(q)$	Detectability of peptide $p_{ij}$ at some specified quantity $q$; effective detectability
	Detectability of peptide $p_{ij}$ at standard quantity $q^{0}$; standard detectability
$d_{ij}$	Detectability of peptide $p_{ij}$; effective detectability
$\mathrm{NSP}_{ij}$	The estimated number of (identified) sibling peptides of peptide $p_{ij}$, used by ProteinProphet to adjust the peptide identification probability
PSM	Peptide-spectrum match; when it is clear from the context, we use PSM to also refer to the top-scoring PSM per spectrum
FDR	False discovery rate; the fraction of incorrect peptide identifications in the set of confidently identified peptides, or the fraction of incorrect protein identifications in a given list output by a protein inference algorithm. FDR should be distinguished from the false positive rate (FPR), the fraction of all peptides (proteins) from the database that are not present in the sample but are predicted to be present (at a particular threshold).
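
The distinction between FDR and FPR drawn in the last row can be made concrete with a small sketch. The function name `fdr_and_fpr`, its arguments, and the toy protein identifiers below are illustrative assumptions, not part of the paper. The sketch also assumes the true sample content is known, which holds only for benchmark mixtures.

```python
def fdr_and_fpr(database, present, reported):
    """Return (FDR, FPR) for a set of reported identifications.

    database: all candidate items (peptides or proteins) in the database
    present:  items truly present in the sample (known only in a benchmark)
    reported: items the algorithm reports at some threshold
    """
    database, present, reported = set(database), set(present), set(reported)
    false_pos = reported - present   # reported but not in the sample
    absent = database - present      # every database entry not in the sample
    # FDR: share of the reported list that is wrong.
    fdr = len(false_pos) / len(reported) if reported else 0.0
    # FPR: share of the absent database entries that were reported anyway.
    fpr = len(false_pos) / len(absent) if absent else 0.0
    return fdr, fpr


# Toy example with hypothetical protein identifiers.
database = {f"P{i}" for i in range(1, 11)}  # 10 proteins in the database
present = {"P1", "P2", "P3", "P4"}          # 4 truly in the sample
reported = {"P1", "P2", "P3", "P5"}         # 4 reported, 1 of them incorrect

fdr, fpr = fdr_and_fpr(database, present, reported)
print(f"FDR = {fdr:.3f}, FPR = {fpr:.3f}")
```

Here one of four reported proteins is incorrect, so the FDR is 1/4. The FPR is 1/6, because the denominator is the six database proteins absent from the sample rather than the reported list. With realistic databases the absent set is very large, so the FPR can be tiny even when the FDR is substantial.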


ISSN: 1471-2105
