BicPAMS: software for biological data analysis with pattern-based biclustering

Henriques, Rui; Ferreira, Francisco L.; Madeira, Sara C.

doi:10.1186/s12859-017-1493-3

BMC Bioinformatics

Table 4 Default and dynamic/data-driven parameterizations of BicPAMS


	Parameter	Value	Notes
Major parameters	P3 Coherency assumption	Constant assumption	The default assumption considers a (possibly noise-tolerant) constant pattern on a subset of rows/columns/nodes. It provides an adequate degree of flexibility (superior to biclusters with differential/dense values or constant values overall) and is well suited for initial analyses.
	P4 Coherency strength	\(|\mathcal{L}|=5\) or \(\delta=\bar{A}/5\)	Adequate sensitivity to different levels of expression (the symbol sets {-2,-1}, {0} and {1,2} correspond to down-regulation, preserved expression and up-regulation) or association strength. Multiple symbols can be assigned to a single real-valued element to guarantee robustness to noise.
	P5 Quality	80%	Guarantees an adequate tolerance to noise, allowing biclusters to have up to 20% of noisy values.
	P15 Pattern representation	Closed	Closed pattern representations enable the discovery of maximal biclusters (biclusters that cannot be extended without removing rows or columns).
	P16 Orientation	Patterns on rows	In accordance with Def. 2. Considering expression data where rows correspond to genes, a bicluster with coherency across rows is defined by a group of genes with the same pattern along a subset of conditions. When rows correspond to conditions, a less-trivial bicluster is given by a group of genes with preserved expression spanning a subset of conditions.
Mapping options	P6 Normalization	Row	Normalization of values per biological entity or sample.
	P7 Discretization	Gaussian	Cut-off points of a learned Gaussian curve to minimize imbalanced distributions of items.
	P8 Noise handler	None	By default multi-item assignments are deactivated for an easy interpretation of results. Nevertheless, we suggest the selection of multi-item assignments to guarantee a heightened robustness to discretization drawbacks and noise.
	P9 Symmetries	Dynamic	Symmetries are dynamically selected if the inputted data has negative values. This option can be deactivated to force the biclustering task to not distinguish positive from negative values.
	P10 Missings handler	Remove	Remove is suggested since the Quality parameter (P5) is already in place to accommodate missing values within biclusters. Nevertheless, the Replace option is suggested for data with a considerable amount of missing values.
	P11 Remove uninformative elements	None	By default, no items are removed. Alternative options should be only selected in the presence of knowledge regarding uninformative elements, such as non-differential expression or loose interactions.
Mining options	P12 Stopping criteria	50 biclusters	A minimum number of 50 biclusters (before postprocessing) is suggested by default since the combination of this option with the quality and dissimilarity criteria leads to a compact set of dissimilar biclusters. This number (as well as the number of iterations) can be increased to guarantee more complete solutions for complex or large datasets.
	P13 Min. #columns	4	Although maximal biclusters have at least 4 columns by default, this number should be increased for datasets where biclusters have a significantly higher number of columns.
	P14 #Iterations	2	Guarantees the removal of small and highly coherent regions in the dataset (after the 1st iteration) to enable the discovery of less-trivial biclusters. This number can be increased to promote a more even distribution of biclusters across the regions of the inputted data.
	P17 Pattern miner	Dynamic	From empirical evidence, CharmDiff is suggested for closed patterns, CharmMFI for maximal patterns, and F2G for simple patterns. When order-preserving coherency is inputted, IndexSpan is suggested by default.
	P18 Scalability	Dynamic	Option activated in the presence of very large datasets (>20 million elements under a constant assumption and >1 million elements for the remaining coherency assumptions).
Closing	P19 Merging	Heuristic	Guarantees an efficient yet quasi-exact postprocessing.
	P20 Filtering	40% dissimilar elements	Guarantees an adequate level of dissimilarity. Biclusters sharing more than 60% of their elements with a larger bicluster are removed.
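
The P20 filtering rule can be illustrated with a minimal sketch. This is not BicPAMS code: the `Bicluster` type, the function names and the greedy largest-first strategy are assumptions made for illustration, and the "elements" of a bicluster are taken here to be its (row, column) cells. A bicluster is dropped when more than 60% of its cells are shared with a larger bicluster that has already been retained.

```python
from typing import FrozenSet, List, NamedTuple


class Bicluster(NamedTuple):
    """Hypothetical container: a bicluster as a set of rows and a set of columns."""
    rows: FrozenSet[int]
    cols: FrozenSet[int]

    @property
    def size(self) -> int:
        # Number of (row, column) cells covered by the bicluster.
        return len(self.rows) * len(self.cols)


def shared_fraction(small: Bicluster, large: Bicluster) -> float:
    """Fraction of the cells of `small` that also belong to `large`."""
    shared = len(small.rows & large.rows) * len(small.cols & large.cols)
    return shared / small.size if small.size else 0.0


def filter_biclusters(biclusters: List[Bicluster],
                      max_shared: float = 0.60) -> List[Bicluster]:
    """Greedy filter (an assumed strategy): visit biclusters from largest to
    smallest and keep one only if it shares at most `max_shared` of its cells
    with every bicluster kept so far. Ties in size are resolved by input order."""
    kept: List[Bicluster] = []
    for candidate in sorted(biclusters, key=lambda b: b.size, reverse=True):
        if all(shared_fraction(candidate, k) <= max_shared for k in kept):
            kept.append(candidate)
    return kept


a = Bicluster(frozenset({0, 1, 2, 3}), frozenset({0, 1, 2, 3}))  # 16 cells
b = Bicluster(frozenset({0, 1, 2}), frozenset({0, 1, 2}))        # fully inside a
c = Bicluster(frozenset({3, 4, 5}), frozenset({0, 1, 2, 3}))     # shares 4 of 12 cells with a
result = filter_biclusters([b, c, a])
```

In this toy input, `b` lies entirely within `a` and is filtered out, while `c` shares only a third of its cells with `a` and is retained. Raising `max_shared` yields a larger, more redundant solution; lowering it enforces stronger dissimilarity.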

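The combination of row normalization (P6), Gaussian discretization (P7) and an alphabet of five symbols (P4) can be sketched as follows. The table does not specify how BicPAMS places the cut-off points, so this example assumes one plausible reading: a normal distribution is fitted to each row and the cut-offs are its equal-probability quantiles, which keeps the symbol counts balanced. The function names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev
from typing import List, Sequence


def gaussian_cutoffs(values: Sequence[float], n_symbols: int = 5) -> List[float]:
    """Equal-probability cut-off points of a normal curve fitted to `values`
    (assumed interpretation of the Gaussian discretization option)."""
    dist = NormalDist(mean(values), pstdev(values))
    return [dist.inv_cdf(k / n_symbols) for k in range(1, n_symbols)]


def discretize_row(values: Sequence[float], n_symbols: int = 5) -> List[int]:
    """Map one row to integer symbols centred on 0, e.g. {-2,-1,0,1,2} for
    five symbols. `n_symbols` is assumed to be odd so that 0 is the middle bin."""
    if pstdev(values) == 0:
        # A constant row carries no variation: assign the neutral symbol.
        return [0] * len(values)
    cuts = gaussian_cutoffs(values, n_symbols)
    offset = n_symbols // 2
    return [bisect_right(cuts, v) - offset for v in values]


symbols = discretize_row([-2.0, -1.0, 0.0, 1.0, 2.0])
flat = discretize_row([3.0, 3.0, 3.0])
```

Because the cut-offs are computed per row, the fitting step also plays the role of row-wise normalization in this sketch. Under the default symbol semantics, negative symbols would be read as down-regulation, 0 as preserved expression and positive symbols as up-regulation.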

ISSN: 1471-2105
