Table 1 An overview of the proposed data analysis workflow.

From: Statistical protein quantification and significance analysis in label-free LC-MS experiments with complex designs

| Steps | Detailed tasks | Comments |
| --- | --- | --- |
| Statement of the problem | Specify comparisons of interest | Express comparisons as statistical hypotheses |
| | Define scope of biological replication | Restricted scope suitable for screening; expanded scope required for validation |
| Exploratory data analysis | Detect mis-identified features | Remove obvious outliers |
| | Detect features with missing values | Choose imputation strategy |
| Model-based analysis | Fit linear mixed model per protein | Reduced scope of biological replication = fixed subjects; expanded scope = random subjects |
| | Check QQ plots for normality | If deviations, conclusions are approximate only |
| | Check residual plots for equal variance | If deviations, use iterative least squares |
| | Test comparisons of interest | Adjust p-values per comparison to control FDR |
| | Quantify protein abundance in conditions or samples of interest | Use as input with downstream clustering or classification |
| Design follow-up experiments | Evaluate power and sample size | Find minimal sample size for a fold change |
| | | Find minimal fold change for a sample size |
  1. Supplementary Table 2 shows MSstats commands for each step.
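
The comment "Adjust p-values per comparison to control FDR" can be illustrated with a short sketch. The table does not name a specific procedure, so the Benjamini-Hochberg step-up adjustment is assumed here as one common choice. The function name `bh_adjust` and the example p-values are hypothetical and are not taken from MSstats or from the paper.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH is used as the FDR-controlling procedure; the source
    table only says "adjust p-values per comparison to control FDR".
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for ONE comparison, one entry per protein.
# Each comparison of interest would be adjusted separately.
raw = {"protA": 0.01, "protB": 0.04, "protC": 0.03, "protD": 0.005}
adj = dict(zip(raw, bh_adjust(list(raw.values()))))
significant = [name for name, q in adj.items() if q <= 0.05]
```

Adjusting within each comparison, rather than pooling all comparisons, follows the wording of the table: the set of tests being corrected is the set of proteins tested for one comparison.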