Dunn Index Bootstrap (DIBS): A procedure to empirically select a cluster analysis method that identifies biologically and clinically relevant molecular disease subgroups

Pawlikowska, Iwona; Liu, Zhifa; Shi, Lei; Lin, Tong; Gruber, Tanja; Robinson, Giles; Onar-Thomas, Arzu; Pounds, Stan

doi:10.1186/1471-2105-16-S15-P12

Volume 16 Supplement 15

Proceedings of the 14th Annual UT-KBRIN Bioinformatics Summit 2015

Poster presentation
Open access
Published: 23 October 2015

Iwona Pawlikowska¹,³,
Zhifa Liu¹,
Lei Shi¹,
Tong Lin¹,
Tanja Gruber²,
Giles Robinson²,
Arzu Onar-Thomas¹ &
Stan Pounds¹

BMC Bioinformatics volume 16, Article number: P12 (2015)

Background

Cluster analysis is widely used in cancer research to discover molecular subgroups that inform subsequent laboratory investigations and define risk classification criteria for subsequent clinical trials. However, for any data set, the number of candidate cluster analysis methods (CCAMs) is very large because of the many choices for feature selection criteria, number of selected features, number of clusters to define, etc. Frequently, a specific CCAM is chosen without quantifying the validity of its results in terms of the reproducibility or distinctiveness of the reported subgroups.

Materials and methods

Here, we propose the Dunn Index Bootstrap (DIBS) procedure to quantify the reproducibility and distinctiveness of subgroups defined by many CCAMs. DIBS applies each CCAM to the observed data and to many bootstrap data sets obtained by resampling subjects. For each CCAM, the bootstrap results are used to compute metrics of the reproducibility and distinctiveness of the subgroups it defines.
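
The abstract does not specify the clustering algorithms, the exact metrics, or how they are combined, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a CCAM consists of keeping the most variable features and running a small k-means, uses the Dunn index (smallest between-cluster distance divided by largest within-cluster diameter) as the distinctiveness metric, and uses the Rand index between observed and bootstrap partitions as a stand-in reproducibility metric. All function names (`apply_ccam`, `dibs_sketch`, and so on) and the simulated data are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np

def pairwise_dist(X):
    # Euclidean distance matrix between rows (subjects) of X.
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def dunn_index(X, labels):
    # Smallest between-cluster distance / largest within-cluster diameter.
    D = pairwise_dist(X)
    same = labels[:, None] == labels[None, :]
    if same.all():
        return 0.0  # a single cluster has no separation to measure
    min_between = D[~same].min()
    max_within = D[same].max()
    return min_between / max_within if max_within > 0 else np.inf

def kmeans(X, k, n_iter=50):
    # Tiny k-means with deterministic farthest-point initialisation (assumed choice).
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def apply_ccam(X, n_features, k):
    # Hypothetical CCAM: keep the n_features most variable features, then cluster.
    top = np.argsort(X.var(axis=0))[::-1][:n_features]
    return kmeans(X[:, top], k), top

def rand_index(a, b):
    # Fraction of subject pairs on which two partitions agree.
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)
    return np.mean(same_a[iu] == same_b[iu])

def dibs_sketch(X, n_features, k, n_boot=50, seed=0):
    # Apply one CCAM to the observed data and to bootstrap resamples of subjects.
    rng = np.random.default_rng(seed)
    obs_labels, _ = apply_ccam(X, n_features, k)
    dunn, agree = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, X.shape[0], size=X.shape[0])  # resample subjects
        Xb = X[idx]
        boot_labels, top = apply_ccam(Xb, n_features, k)
        dunn.append(dunn_index(Xb[:, top], boot_labels))           # distinctiveness
        agree.append(rand_index(obs_labels[idx], boot_labels))     # reproducibility
    return {"mean_dunn": float(np.mean(dunn)), "mean_rand": float(np.mean(agree))}

# Simulated data: 40 subjects, 10 features, two groups separated in features 0 and 1.
rng = np.random.default_rng(1)
group = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 10))
X[:, :2] += 12.0 * group[:, None]

# Compare two candidate methods that differ only in the number of clusters.
results = {k: dibs_sketch(X, n_features=2, k=k) for k in (2, 3)}
for k, summary in results.items():
    print(k, summary)
```

In this toy setting, the candidate with two clusters should score a higher mean bootstrap Dunn index than the candidate with three, which illustrates how such summaries could be used to rank CCAMs. A real analysis would loop over a much larger grid of feature selection criteria, feature counts, and cluster numbers.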

Results

DIBS was used to characterize the performance of each of 4,032 CCAMs in the analysis of one RNA-seq, two microarray gene expression, and one methylation array data set from three different cancers. In each example, DIBS identified specific CCAMs that defined subgroups of well-established biological and clinical relevance.

Author information

Authors and Affiliations

Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
Iwona Pawlikowska, Zhifa Liu, Lei Shi, Tong Lin, Arzu Onar-Thomas & Stan Pounds
Department of Oncology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
Tanja Gruber & Giles Robinson
Institute of Mathematics, University of Silesia, Katowice, 2469011, Poland
Iwona Pawlikowska

Corresponding author

Correspondence to Stan Pounds.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.

The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Pawlikowska, I., Liu, Z., Shi, L. et al. Dunn Index Bootstrap (DIBS): A procedure to empirically select a cluster analysis method that identifies biologically and clinically relevant molecular disease subgroups. BMC Bioinformatics 16 (Suppl 15), P12 (2015). https://doi.org/10.1186/1471-2105-16-S15-P12

