From: Computational approaches to protein inference in shotgun proteomics
Notation | Description |
---|---|
| Set of all fragmentation spectra output by the mass spectrometer |
| Set of spectra identified for peptide j |
$s$ | A single fragmentation spectrum, an element of the set of all fragmentation spectra |
$P_i$ or $i$ | Protein i |
$p_j$ or $j$ | Peptide j |
$p_{ij}$ | Peptide j derived from protein i; used to explicitly indicate the parent protein of peptide j |
| Protein database, a set of proteins used for peptide and protein identification |
| Peptide database, the set of all (tryptic) peptides derived from the protein database |
| Set of peptides derived from protein i |
| Indicator variable, set to 1 if peptide $p_j$ is confidently identified |
| Set of peptides that are confidently identified |
$x_j$ | Indicator variable, set to 1 if peptide $p_j$ is present in the sample |
$y_i$ | Indicator variable, set to 1 if protein $P_i$ is present in the sample |
$x = (x_1, \ldots, x_j, \ldots)$ | Indicator vector representing all peptides in the peptide database |
$y = (y_1, \ldots, y_i, \ldots)$ | Indicator vector representing all proteins in the protein database |
$N(i)$ | Set of peptides mapped to protein $P_i$ |
$N(j)$ | Set of proteins that contain peptide $p_j$ |
$x_{N(i)}$ | Indicator vector representing the peptides in $N(i)$ |
| Peptide identification probability, the probability that peptide j is present in the sample given the spectra identified for peptide j |
$P(x_j = 1 \mid s)$ | The probability that the PSM is correct when peptide j is the top-scoring match of spectrum $s$ |
| Protein posterior probabilities, the probability that protein i is present in the sample given all spectra |
$d_{ij}(q)$ | Detectability of peptide $p_{ij}$ at some specified quantity $q$; effective detectability |
| Detectability of peptide $p_{ij}$ at standard quantity $q_0$; standard detectability |
$d_{ij}$ | Detectability of peptide $p_{ij}$; effective detectability |
$\mathrm{NSP}_{ij}$ | The estimated number of (identified) sibling peptides of peptide $p_{ij}$, used by ProteinProphet to adjust the peptide identification probability |
PSM | Peptide-spectrum match; when it is clear from the context, we use PSM to also refer to the top-scoring PSM per spectrum |
FDR | False discovery rate; the fraction of incorrect peptide identifications in the set of confidently identified peptides, or the fraction of incorrect protein identifications in a given list output by a protein inference algorithm. FDR should be distinguished from the false positive rate (FPR), the fraction of all peptides (proteins) from the database that are not present in the sample but are predicted to be present (at a particular threshold). |
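
To make the neighbor sets N(i) and N(j) and the FDR/FPR distinction concrete, the following is a minimal Python sketch rather than code from the article. The function names `build_neighbors` and `fdr_and_fpr`, the toy identifiers, and the ground-truth set `present` are all hypothetical. FPR is computed under the usual reading of the definition above: false positives divided by the number of database entries that are absent from the sample.

```python
from collections import defaultdict


def build_neighbors(protein_to_peptides):
    """Build N(i) (peptides of each protein) and N(j) (proteins containing each peptide).

    `protein_to_peptides` is a hypothetical mapping from protein id to an
    iterable of peptide ids, e.g. the result of an in-silico digest.
    """
    n_i = {prot: set(peps) for prot, peps in protein_to_peptides.items()}
    n_j = defaultdict(set)
    for prot, peps in n_i.items():
        for pep in peps:
            n_j[pep].add(prot)
    return n_i, dict(n_j)


def fdr_and_fpr(reported, present, database):
    """Return (FDR, FPR) for a reported list, given a known ground truth.

    FDR = false positives / all reported entries.
    FPR = false positives / all database entries absent from the sample
          (assumed reading of the definition in the table).
    """
    reported, present, database = set(reported), set(present), set(database)
    false_positives = reported - present
    absent = database - present
    fdr = len(false_positives) / len(reported) if reported else 0.0
    fpr = len(false_positives) / len(absent) if absent else 0.0
    return fdr, fpr


# Toy example: peptide "p2" is shared by two proteins (a degenerate peptide).
n_i, n_j = build_neighbors({"P1": ["p1", "p2"], "P2": ["p2", "p3"]})

# Six proteins in the database, two truly present, two reported (one wrong).
database = {"P1", "P2", "P3", "P4", "P5", "P6"}
fdr, fpr = fdr_and_fpr(reported={"P1", "P3"}, present={"P1", "P2"}, database=database)
```

In this toy case the single false positive is half of the reported list (FDR = 0.5) but only a quarter of the four absent proteins (FPR = 0.25). The two rates generally differ because they share a numerator but not a denominator.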