
Table 1 An overview of the proposed data analysis workflow.

From: Statistical protein quantification and significance analysis in label-free LC-MS experiments with complex designs

| Steps | Detailed tasks | Comments |
|---|---|---|
| Statement of the problem | Specify comparisons of interest | Express comparisons as statistical hypotheses |
| | Define scope of biological replication | Restricted scope is suitable for screening; expanded scope is required for validation |
| Exploratory data analysis | Detect mis-identified features | Remove obvious outliers |
| | Detect features with missing values | Choose imputation strategy |
| Model-based analysis | Fit linear mixed model per protein | Restricted scope of biological replication = fixed subjects; expanded scope = random subjects |
| | Check QQ plots for normality | If deviations are present, conclusions are only approximate |
| | Check residual plots for equal variance | If deviations are present, use iterative least squares |
| | Test comparisons of interest | Adjust p-values per comparison to control FDR |
| | Quantify protein abundance in conditions or samples of interest | Use as input to downstream clustering or classification |
| Design follow-up experiments | Evaluate power and sample size | Find minimal sample size for a fold change |
| | | Find minimal fold change for a sample size |

  1. Supplementary Table 2 shows MSstats commands for each step.
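
The "adjust p-values per comparison to control FDR" step can be illustrated with the Benjamini-Hochberg step-up procedure. The table does not name a specific FDR method, so the choice of Benjamini-Hochberg is an assumption here, and the function name `bh_adjust` and the p-values are hypothetical. This is a plain-Python sketch of the idea, not the MSstats implementation.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values: one per protein, all for the same comparison.
pvals = [0.001, 0.04, 0.03, 0.5]
qvals = bh_adjust(pvals)
# Proteins called significant at an assumed FDR threshold of 0.05.
significant = [q <= 0.05 for q in qvals]
```

The adjustment is applied separately to the set of p-values belonging to each comparison, which is what "per comparison" in the table refers to.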
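
The two design questions in the last step (minimal sample size for a fold change, and minimal fold change for a sample size) are inverses of each other. The sketch below shows this under strong simplifying assumptions that are not taken from the table: a two-group comparison, a single error standard deviation `sigma` on the log2 intensity scale, a normal approximation, and no FDR adjustment of the significance level. The function names and the numeric inputs are hypothetical, and the calculation used in the article may differ.

```python
import math
from statistics import NormalDist


def _z_total(alpha, power):
    # Sum of the two-sided critical value and the quantile for the power.
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def min_sample_size(fold_change, sigma, alpha=0.05, power=0.8):
    """Smallest number of replicates per group that detects fold_change."""
    delta = math.log2(fold_change)  # effect size on the log2 scale
    if delta == 0:
        raise ValueError("fold_change must differ from 1")
    z = _z_total(alpha, power)
    return math.ceil(2 * (sigma * z / delta) ** 2)


def min_fold_change(n, sigma, alpha=0.05, power=0.8):
    """Smallest fold change (> 1) detectable with n replicates per group."""
    z = _z_total(alpha, power)
    delta = z * sigma * math.sqrt(2 / n)
    return 2 ** delta


# Hypothetical inputs: detect a 2-fold change when sigma = 0.5 (log2 scale).
n_needed = min_sample_size(fold_change=2.0, sigma=0.5)
fc_detectable = min_fold_change(n=n_needed, sigma=0.5)
```

With the sample size returned by `min_sample_size`, `min_fold_change` gives a detectable fold change no larger than the one requested, which is a quick consistency check on the two calculations.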