Robust multi-group gene set analysis with few replicates

BMC Bioinformatics

Table 2 Overview of the results

Evaluation approaches	Evaluated methods	Principles	Data	Results
Data splitting (cross-validation type)	Permutation methods	Gene expression data were split into two parts: reference and test. The methods were evaluated on their ability to replicate results from the reference dataset using the test dataset.	1. Human primary cell data. 2. Breast cancer data.	Perm1 performed worse than the other permutation methods in both datasets.
Data splitting (cross-validation type)	mGSZm and seven other gene set analysis methods shown in Table 1.	Same as above	1. Human primary cell data. 2. Breast cancer data.	mGSZm ranked the largest number of reference gene sets among the top 50 gene sets from the test data in both datasets.
Detection of tissue-specific gene sets in tissue gene expression data	mGSZm and seven other methods shown in Table 1.	The method that ranked the largest number of tissue-specific gene sets in the top 50 gene sets list was considered the best.	Mouse tissue gene expression data.	mGSZm ranked the largest number of tissue-specific gene sets in the list of top 50 gene sets.
Type 1 error test	mGSZm and seven other methods shown in Table 1.	An ideal method yields a uniform distribution of gene set score p-values when applied to null gene expression data.	Breast cancer data.	mGSZm showed a slightly left-skewed null p-value distribution. Similar results were obtained with the other methods.
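
The two evaluation principles in Table 2 that reduce to simple arithmetic, the top-50 replication count and the uniformity of null p-values, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the use of p-values as the ranking score (smaller is better), and the Kolmogorov-Smirnov distance as the uniformity measure are all assumptions made for illustration; how the reference gene sets are selected from the reference split is left to the caller.

```python
from typing import Dict, Iterable, List, Sequence


def top_k(scores: Dict[str, float], k: int = 50) -> List[str]:
    """Return the k gene sets with the smallest p-values (ties broken by name)."""
    ranked = sorted(scores.items(), key=lambda item: (item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def replication_count(reference_sets: Iterable[str],
                      test_scores: Dict[str, float],
                      k: int = 50) -> int:
    """Count how many reference gene sets appear in the top-k list of the test data.

    Assumption: reference_sets were chosen beforehand from the reference split.
    """
    return len(set(reference_sets) & set(top_k(test_scores, k)))


def ks_uniform_statistic(p_values: Sequence[float]) -> float:
    """Kolmogorov-Smirnov distance between the empirical CDF and Uniform(0, 1).

    Values near 0 suggest the null p-values are close to uniform.
    """
    xs = sorted(p_values)
    n = len(xs)
    if n == 0:
        raise ValueError("p_values must not be empty")
    # Compare the uniform CDF (x itself) with the empirical CDF just
    # after and just before each observed p-value.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


if __name__ == "__main__":
    # Toy values only; these are not results from the paper.
    test_scores = {"setA": 0.02, "setB": 0.60, "setC": 0.01, "setD": 0.90}
    print(replication_count(["setA", "setB"], test_scores, k=2))
    print(ks_uniform_statistic([0.125, 0.375, 0.625, 0.875]))
```

Under this sketch, a method would be preferred in the data-splitting and tissue-specificity evaluations when its replication count is higher, and in the type 1 error test when its distance from the uniform distribution is smaller.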

ISSN: 1471-2105