- Poster presentation
- Open Access
Dunn Index Bootstrap (DIBS): A procedure to empirically select a cluster analysis method that identifies biologically and clinically relevant molecular disease subgroups
BMC Bioinformatics volume 16, Article number: P12 (2015)
Background
Cluster analysis is widely used in cancer research to discover molecular subgroups that inform subsequent laboratory investigations and define risk classification criteria for subsequent clinical trials. However, for any data set, the number of candidate cluster analysis methods (CCAMs) is very large, because the analyst must choose the feature selection criterion, the number of selected features, the number of clusters to define, and so on. Frequently, a specific CCAM is chosen without quantifying the validity of its results in terms of the reproducibility or distinctiveness of the reported subgroups.
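To illustrate how quickly the number of CCAMs grows, the sketch below enumerates every combination of a few analysis choices. The option lists are hypothetical and chosen only for illustration; they are not the grid used in this work.

```python
from itertools import product

# Hypothetical option lists (assumptions for illustration only).
feature_criteria = ["variance", "median_absolute_deviation", "interquartile_range"]
n_features = [100, 500, 1000, 2000]
n_clusters = [2, 3, 4, 5, 6]
algorithms = ["kmeans", "hierarchical_average", "hierarchical_ward"]

# Each tuple is one candidate cluster analysis method (CCAM).
ccams = list(product(feature_criteria, n_features, n_clusters, algorithms))
print(len(ccams))  # 3 * 4 * 5 * 3 = 180
```

Even this small grid yields 180 candidates, and each additional choice multiplies the total.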
Materials and methods
Here, we propose the Dunn Index Bootstrap (DIBS) procedure to quantify the reproducibility and distinctiveness of the subgroups defined by many CCAMs. DIBS applies each CCAM to the observed data and to many bootstrap data sets obtained by resampling subjects. The bootstrap results are then used to compute, for each CCAM, metrics of the reproducibility and distinctiveness of the subgroups it defines.
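The general idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the exact metrics, so the sketch uses the mean bootstrap Dunn index as a distinctiveness summary and the mean Rand index between bootstrap and observed-data labels as a reproducibility summary. The toy k-means function, the simulated data, and all function names are hypothetical.

```python
import math
import random
from itertools import combinations

def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest within-cluster diameter."""
    min_between, max_within = math.inf, 0.0
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if labels[i] == labels[j]:
            max_within = max(max_within, d)
        else:
            min_between = min(min_between, d)
    return min_between / max_within if max_within > 0 else math.inf

def rand_index(a, b):
    """Fraction of subject pairs on which two labelings agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    return sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs) / len(pairs)

def kmeans(points, k, n_iter=20):
    """Toy k-means with deterministic initial centers; stands in for one CCAM."""
    ordered = sorted(points)
    centers = [ordered[(2 * c + 1) * len(ordered) // (2 * k)] for c in range(k)]
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def dibs_sketch(points, ccam, n_boot=50, seed=0):
    """Return (mean bootstrap Dunn index, mean Rand index vs. observed labels)."""
    rng = random.Random(seed)
    observed = ccam(points)
    dunn, rand = [], []
    for _ in range(n_boot):
        idx = [rng.randrange(len(points)) for _ in points]  # resample subjects
        sample = [points[i] for i in idx]
        boot_labels = ccam(sample)
        dunn.append(dunn_index(sample, boot_labels))
        rand.append(rand_index(boot_labels, [observed[i] for i in idx]))
    return sum(dunn) / n_boot, sum(rand) / n_boot

# Simulated data: two well-separated groups of 15 subjects, two features each.
data_rng = random.Random(1)
toy = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(15)]
toy += [(data_rng.gauss(10, 1), data_rng.gauss(10, 1)) for _ in range(15)]

# Compare three hypothetical CCAMs that differ only in the number of clusters.
results = {k: dibs_sketch(toy, lambda pts, k=k: kmeans(pts, k)) for k in (2, 3, 4)}
for k, (mean_dunn, mean_rand) in results.items():
    print(f"k={k}: mean Dunn={mean_dunn:.2f}, mean Rand={mean_rand:.2f}")
```

On data like this, the CCAM whose number of clusters matches the true grouping would be expected to score highest on both summaries, which is the kind of ranking DIBS is meant to provide across many CCAMs.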
Results
DIBS was used to characterize the performance of each of 4,032 CCAMs in the analysis of one RNA-seq data set, two microarray gene expression data sets, and one methylation array data set from three different cancers. In each example, DIBS identified specific CCAMs that defined subgroups of well-established biological and clinical relevance.
Rights and permissions
Open Access  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Pawlikowska, I., Liu, Z., Shi, L. et al. Dunn Index Bootstrap (DIBS): A procedure to empirically select a cluster analysis method that identifies biologically and clinically relevant molecular disease subgroups. BMC Bioinformatics 16 (Suppl 15), P12 (2015). https://doi.org/10.1186/1471-2105-16-S15-P12
DOI: https://doi.org/10.1186/1471-2105-16-S15-P12
Keywords
- Feature Selection
- Risk Classification
- Combinatorial Library
- Microarray Gene Expression
- Microarray Gene