From: Computational approaches to protein inference in shotgun proteomics
Methods | ProteinProphet | MSBayesPro | Fido | MIPGEM |
---|---|---|---|---|
Underlying graph structure | Bipartite graph with identified peptides and matching proteins¹ | Bayesian network with all peptides from proteins with at least one identified peptide | Bayesian network with identified peptides and matching proteins | k-partite graph with identified peptides, matching proteins and (optionally) matching gene models² |
Inference algorithm | EM-like (expectation maximization) | 1) Exact³; 2) Memorizing-Gibbs sampling | 1) Exact³; 2) Pruning approximation | 1) Exact³; 2) Direct sampling |
Input | Probabilities for peptides with user-defined cutoff for p (often p > 0.05 is used) | Likelihood ratios for peptides with p > 0.05 and peptide detectabilities | Likelihood ratios for peptides with p > 0.05 | Probabilities for peptides with user-defined cutoff for p (often p > 0.05 is used; 0.9 for best performance) |
Output | 1) Protein probabilities; 2) Protein group probabilities; 3) NSP-adjusted peptide probabilities | 1) MAP solution, protein abundances and probabilities; 2) Protein group probabilities; 3) Posterior peptide probabilities | 1) Protein probabilities; 2) Protein group probabilities | 1) Protein probabilities; 2) Gene model probabilities |
Protein prior estimation | No protein priors | Direct frequency estimation based on protein posterior probabilities in one run of MSBayesPro | Grid search optimizing cross-validation performance over multiple runs of Fido with different priors | Grid search optimizing model likelihood over multiple runs of MIPGEM with different priors |
Peptide probability adjustment by | NSP from a parent protein | Protein-quantity-adjusted peptide detectability | Two detectability-like parameters α, β | Treating peptide identifications as random variables |
Protein grouping | Yes | No (indistinguishable proteins are resolved) | Yes | No (indistinguishable proteins are not resolved) |
Peptide charge | Considered | Ignored | Considered | Considered |
Novel aspects | 1) First probabilistic protein inference algorithm; 2) Efficient EM algorithm | 1) A Bayesian network; 2) Resolves indistinguishable proteins using unidentified peptides and peptide detectability; 3) Modified Gibbs sampling | 1) Using a noise model to remedy inaccurate peptide probabilities; 2) Pruning algorithm, efficient inference | Gene model probabilities⁴ |
Availability | http://tools.proteomecenter.org | http://darwin.informatics.indiana.edu/yonli/ | http://noble.gs.washington.edu/proj/fido | - |
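
Two rows of the table, "Underlying graph structure" and "Protein grouping", can be illustrated with a small sketch. The code below is not the implementation of any of the four tools. It assumes a toy peptide-to-protein mapping (the names `protA`, `pep1`, and so on are hypothetical), groups proteins that match exactly the same identified peptides, and scores each protein with a deliberately naive rule, 1 - ∏(1 - p_i), that treats peptide identifications as independent and leaves out NSP adjustment, detectability, priors, and shared-peptide weighting.

```python
from collections import defaultdict
from math import prod


def group_indistinguishable(protein_to_peptides):
    """Group proteins that match exactly the same set of identified peptides."""
    groups = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        groups[frozenset(peptides)].append(protein)
    # Sort members and groups so the result is deterministic.
    return sorted(sorted(members) for members in groups.values())


def naive_protein_probability(peptide_probs, cutoff=0.05):
    """Return 1 - prod(1 - p_i) over peptides with p_i > cutoff.

    Simplifying assumption: peptides are independent evidence and each
    peptide counts fully toward every protein it matches.
    """
    kept = [p for p in peptide_probs if p > cutoff]
    return 1.0 - prod(1.0 - p for p in kept)


# Hypothetical toy data: one side of the bipartite graph is proteins,
# the other is identified peptides with their probabilities.
protein_to_peptides = {
    "protA": {"pep1", "pep2"},
    "protB": {"pep1", "pep2"},  # same peptide set as protA
    "protC": {"pep3"},
}
peptide_prob = {"pep1": 0.9, "pep2": 0.5, "pep3": 0.04}

groups = group_indistinguishable(protein_to_peptides)
protein_prob = {
    protein: naive_protein_probability([peptide_prob[pep] for pep in peptides])
    for protein, peptides in protein_to_peptides.items()
}
```

In this toy input, `protA` and `protB` land in one group because no identified peptide separates them, and `protC` scores zero because its only peptide falls below the 0.05 cutoff listed in the "Input" row.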