Table 4 Default and dynamic/data-driven parameterizations of BicPAMS

From: BicPAMS: software for biological data analysis with pattern-based biclustering

| Group | Parameter | Value | Notes |
| --- | --- | --- | --- |
| Major parameters | P3 Coherency assumption | Constant assumption | The default assumption considers a (possibly noise-tolerant) constant pattern on a subset of rows/columns/nodes, providing an adequate degree of flexibility (superior to biclusters with differential/dense values or constant values overall) that is well suited for initial analyses. |
| | P4 Coherency strength | \(\lvert\mathcal{L}\rvert = 5\) or \(\delta = \bar{A}/5\) | Adequate sensitivity to different levels of expression (the {-2,-1}, {0} and {1,2} sets of symbols correspond to down-regulation, preserved expression and up-regulation) or association strength. Multiple symbols can be assigned to a single real-valued element to guarantee robustness to noise. |
| | P5 Quality | 80% | Guarantees an adequate tolerance to noise, allowing biclusters to have up to 20% of noisy values. |
| | P15 Pattern representation | Closed | Closed pattern representations enable the discovery of maximal biclusters (biclusters that cannot be extended without removing rows or columns). |
| | P16 Orientation | Patterns on rows | In accordance with Def. 2. Considering expression data where rows correspond to genes, a bicluster with coherency across rows is defined by a group of genes with the same pattern along a subset of conditions. When rows correspond to conditions, a less-trivial bicluster is given by a group of genes with preserved expression spanning a subset of conditions. |
| Mapping options | P6 Normalization | Row | Normalization of values per biological entity or sample. |
| | P7 Discretization | Gaussian | Cut-off points of a learned Gaussian curve to minimize imbalanced distributions of items. |
| | P8 Noise handler | None | By default, multi-item assignments are deactivated for an easy interpretation of results. Nevertheless, we suggest selecting multi-item assignments to guarantee a heightened robustness to discretization drawbacks and noise. |
| | P9 Symmetries | Dynamic | Symmetries are dynamically selected if the input data has negative values. This option can be deactivated to force the biclustering task not to distinguish positive from negative values. |
| | P10 Missings handler | Remove | Remove is suggested since Quality (P5) is already in place to accommodate missing values within biclusters. Nevertheless, the Replace option is suggested for data with a considerable amount of missing values. |
| | P11 Remove uninformative elements | None | By default, no items are removed. Alternative options should only be selected in the presence of knowledge regarding uninformative elements, such as non-differential expression or loose interactions. |
| Mining options | P12 Stopping criteria | 50 biclusters | A minimum number of 50 biclusters (before postprocessing) is suggested by default since the combination of this option with the quality and dissimilarity criteria leads to a compact set of dissimilar biclusters. This number (as well as the number of iterations) can be increased to guarantee more complete solutions for complex or large datasets. |
| | P13 Min. columns | 4 | Although maximal biclusters have at least 4 columns by default, this number should be increased for datasets where biclusters have a significantly higher number of columns. |
| | P14 Iterations | 2 | Guarantees the removal of small and highly coherent regions in the dataset (after the 1st iteration) to enable the discovery of less-trivial biclusters. This number can be increased to promote a more even distribution of biclusters across the regions of the input data. |
| | P17 Pattern miner | Dynamic | From empirical evidence, CharmDiff is suggested for closed patterns, CharmMFI for maximal patterns, and F2G for simple patterns. When order-preserving coherency is selected, IndexSpan is suggested by default. |
| | P18 Scalability | Dynamic | Option activated in the presence of very large datasets (>20 million elements under a constant assumption and >1 million elements for the remaining coherency assumptions). |
| Closing | P19 Merging | Heuristic | Guarantees an efficient yet quasi-exact postprocessing. |
| | P20 Filtering | 40% dissimilar elements | Guarantees an adequate level of dissimilarity. Biclusters sharing more than 60% of their elements with a larger bicluster are removed. |
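
To make three of the defaults above more concrete, the following is a minimal Python sketch of row-wise Gaussian discretization into 5 symbols (P4, P6, P7), the 80% quality check (P5), and the 60% overlap filter (P20). It is not BicPAMS code: all function names are hypothetical, and several details are assumptions not stated in the table. In particular, row normalization is taken to be a z-score, the Gaussian cut-off points are taken to be those that split a standard normal into equiprobable bins, quality is taken to be the fraction of cells matching a constant pattern, and a bicluster is only compared against larger biclusters that were themselves kept.

```python
from statistics import NormalDist, mean, pstdev


def gaussian_cutoffs(n_symbols):
    """Z-score cut-offs splitting a standard normal into n equiprobable bins (assumption)."""
    nd = NormalDist()
    return [nd.inv_cdf(k / n_symbols) for k in range(1, n_symbols)]


def discretize_row(values, n_symbols=5):
    """Normalize one row (z-score, assumed) and map it to symbols such as -2..2."""
    mu, sigma = mean(values), pstdev(values)
    cuts = gaussian_cutoffs(n_symbols)
    offset = n_symbols // 2  # centers the alphabet on 0 when n_symbols is odd
    symbols = []
    for v in values:
        z = (v - mu) / sigma if sigma > 0 else 0.0
        bin_index = sum(z > c for c in cuts)  # number of cut-offs below z
        symbols.append(bin_index - offset)
    return symbols


def meets_quality(symbol_rows, pattern, min_quality=0.8):
    """True if at least min_quality of the cells agree with the row pattern (P5, assumed reading)."""
    total = len(symbol_rows) * len(pattern)
    matches = sum(s == p for row in symbol_rows for s, p in zip(row, pattern))
    return total > 0 and matches / total >= min_quality


def filter_biclusters(biclusters, max_overlap=0.6):
    """Drop biclusters sharing more than max_overlap of their cells with a larger kept one (P20).

    Each bicluster is a (rows, cols) pair of sets; its elements are the row/column cells.
    """
    def cells(bicluster):
        rows, cols = bicluster
        return {(r, c) for r in rows for c in cols}

    # Largest first, so every bicluster is only compared against larger (or equal) ones.
    ordered = sorted(biclusters, key=lambda b: len(b[0]) * len(b[1]), reverse=True)
    kept = []
    for b in ordered:
        own = cells(b)
        if not own:
            continue  # ignore empty biclusters
        if all(len(own & cells(k)) / len(own) <= max_overlap for k in kept):
            kept.append(b)
    return kept


if __name__ == "__main__":
    print(discretize_row([1, 2, 3, 4, 5]))
    a = ({1, 2, 3, 4}, {1, 2, 3, 4})
    b = ({1, 2, 3}, {1, 2, 3, 4})        # fully contained in a
    c = ({3, 4, 5, 6}, {3, 4, 5, 6})     # shares 4 of 16 cells with a
    print(len(filter_biclusters([a, b, c])))
```

In this sketch, `b` is discarded because all of its cells also belong to the larger bicluster `a`, while `c` is kept because only 25% of its cells overlap with `a`. Raising `max_overlap` yields a less aggressive filter, and lowering `min_quality` tolerates more noisy values per bicluster.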