BicSPAM: flexible biclustering using sequential patterns
 Rui Henriques^{1} and
 Sara C Madeira^{1}
https://doi.org/10.1186/1471-2105-15-130
© Henriques and Madeira; licensee BioMed Central Ltd. 2014
Received: 3 October 2013
Accepted: 7 April 2014
Published: 6 May 2014
Abstract
Background
Biclustering is a critical task for biomedical applications. Order-preserving biclusters, submatrices where the values of rows induce the same linear ordering across columns, capture local regularities with constant, shifting, scaling and sequential assumptions. Additionally, biclustering approaches relying on pattern mining deliver exhaustive solutions with an arbitrary number and positioning of biclusters. However, existing order-preserving approaches suffer from robustness, scalability and/or flexibility issues. Additionally, they are not able to discover biclusters with symmetries and parameterizable levels of noise.
Results
We propose new biclustering algorithms to perform flexible, exhaustive and noise-tolerant biclustering based on sequential patterns (BicSPAM). Strategies are proposed to allow for symmetries and to seize efficiency gains from item-indexable properties and/or from partitioning methods with conservative distance guarantees. Results show BicSPAM's ability to capture symmetries, handle planted noise, and scale in terms of memory and time. BicSPAM also achieves the best match scores for the recovery of hidden biclusters in synthetic datasets with varying noise distributions and levels of missing values. Finally, results on gene expression data lead to complete solutions, delivering new biclusters corresponding to putative modules with heightened biological relevance.
Conclusions
BicSPAM provides an exhaustive way to discover flexible structures of order-preserving biclusters. To the best of our knowledge, BicSPAM is the first attempt to deal with order-preserving biclusters that allow for symmetries and that are robust to varying levels of noise.
Background
Biclustering tasks over real-valued matrices aim to discover submatrices (biclusters) where a subset of rows exhibits a correlated pattern over a subset of columns. However, existing approaches impose the selection of specific patterns of correlation, which often leads to incomplete solutions. A simple yet powerful direction to accommodate more flexible patterns, order-preserving patterns, was introduced by Ben-Dor et al. [1]. A bicluster is order-preserving if there is a permutation of its columns under which the sequence of values in every row is strictly increasing. These biclusters capture shifting and scaling patterns of gene expression and are, additionally, critical to detect other meaningful profiles, such as the progression of a disease or a cellular response in distinct stages. Order-preserving biclustering can be applied to study gene expression (GE) data [2], genomic structural variations [3], biological networks [4], translational data [5, 6], chemical data [7], nutritional data [8], among others [9, 10]. For example, subsets of genes that preserve the variation of expression levels for a subset of the conditions (either time points, methods, stimuli, environmental contexts, tissues, organs or individuals) can disclose functional modules of interest.
Despite the relevance of the pioneering approach to find order-preserving biclusters (OPSM) [1] and of its extensions [11, 12], this first class of greedy approaches suffers from two major drawbacks: 1) it delivers approximate solutions without optimality guarantees; and 2) it places restrictive constraints on the structure of the biclustering solutions (e.g. the non-overlapping assumption). A second class of exhaustive approaches, μClustering (also known as OP-Clustering) [7, 13], delivers solutions that overcome the flexibility issues of previous approaches. Still, their adoption presents three challenges: 1) efficiency strongly deteriorates for matrices with more than 50 rows; 2) noisy values lead to the partition of large biclusters into multiple smaller biclusters, since these approaches search for perfect orderings; and 3) the use of non-condensed pattern representations leads to large biclustering solutions.
Additionally, the existing order-preserving approaches impose a monotonic ordering of values that does not allow for symmetries [1, 7]. However, in biological domains, such as transcriptional activity analysis, regulatory and co-regulatory mechanisms are strongly correlated and, consequently, an increase in expression for some genes is sometimes accompanied by a decrease in expression for other genes.
This work introduces a new set of order-preserving biclustering approaches, referred to as BicSPAM (Biclustering based on Sequential PAttern Mining), with principles to surpass the limitations of existing alternatives. BicSPAM promotes flexible, noise-tolerant and yet scalable searches based on sequential patterns. BicSPAM contributions are threefold:

[Flexibility] Discovery of order-preserving biclusters with multiple levels of expression and symmetries. Delivery of flexible structures of biclusters that allow for an arbitrary number and positioning of biclusters (to tackle the restrictive assumptions of greedy approaches);

[Robustness] Strategies for the discovery of biclusters with varying quality. Noise relaxations are made available to guarantee noise-tolerant solutions (to avoid the homogeneity restrictions imposed by existing exhaustive approaches), followed by filtering criteria to guarantee statistical significance of the discovered biclusters (to avoid the bias of greedy approaches);

[Efficiency] Scalable searches (to surpass efficiency limits of existing exhaustive approaches) based on new mining methods that seize efficiency gains from item-indexable properties of the biclustering task and from data partitioning principles.
Two additional contributions are provided: 1) parameterizable selection of the degree of co-occurrences versus precedence relations observed in order-preserving biclusters; and 2) strategies to handle missing values according to a parameterizable expectation of their appearance in biclustering solutions. Finally, BicSPAM integrates all the introduced principles into a coherent model that provides a consistent basis for the further development and extension of order-preserving biclustering approaches.
Experimental results on both synthetic and real datasets demonstrate the superior flexibility, robustness and effectiveness of BicSPAM. We also show the biological relevance of discovering order-preserving biclusters with symmetries.
The paper is organized as follows. The remainder of this section provides background on order-preserving biclustering and biclustering based on pattern mining. The Methods section introduces BicSPAM. The Results and discussion section validates the performance of BicSPAM against synthetic and real datasets. Finally, the contributions and implications of this work are synthesized.
Order-preserving biclustering
Definition 1.
Given a matrix, A = (X,Y), with a set of rows X = {x_{1},..,x_{n}}, a set of columns Y = {y_{1},..,y_{m}}, and elements $a_{ij}\in \mathbb{R}$ relating row i and column j:

a bicluster B = (I,J) is an r × s submatrix of A, where I = (i_{1},..,i_{r}) ⊂ X is a subset of rows and J = (j_{1},..,j_{s}) ⊂ Y is a subset of columns;

the biclustering task is to identify a set of biclusters $\mathcal{B}=\{B_{1},..,B_{p}\}$ such that each bicluster B_{k} = (I_{k},J_{k}) satisfies specific criteria of homogeneity, where I_{k} ⊂ X, J_{k} ⊂ Y and $k\in \mathbb{N}$.
Biclustering approaches are driven by homogeneity criteria through the use of merit functions [2]. Merit functions guarantee intra-bicluster homogeneity, the overall homogeneity of the output set of biclusters (inter-bicluster homogeneity), or both. Following the taxonomy proposed by Madeira and Oliveira [2], the existing biclustering approaches can be grouped according to their search paradigm, which determines how merit functions are applied^{a}. The merit function is thus a simple way to define the type and quality of biclusters and to affect the structure of biclusters. The bicluster type defines the allowed pattern profiles and their orientation, the solution structure constrains the number, size and positioning of biclusters, and, finally, the quality determines the allowed noise within a particular bicluster or a set of biclusters. Biclusters can follow constant, additive, multiplicative or plaid pattern assumptions, either across rows or columns [1, 2, 8]. Multiple biclustering structures have also been proposed [2], with some approaches constraining them to exhaustive, exclusive or non-overlapping structures, and a few others allowing a more flexible scheme with arbitrarily positioned overlapping biclusters.
Order-preserving biclusters were originally proposed for finding genes co-expressed within a temporal progression, such as co-expression at particular stages of a disease or drug response [1]. However, their range of applications is equally attractive for matrices where time is absent. For example, detecting relative changes in the expression of genes across conditions can be indicative of functional regulatory behavior and, additionally, removes the need to rely on the exact expression values, which are usually susceptible to noise.
Definition 2.
A bicluster following an order-preserving model is (I,J) where J is a set of s columns respecting a linear ordering π, and I is the set of supporting rows where the s corresponding values are ordered according to the permutation π.
There are two major types of approaches for order-preserving biclusters: greedy and exhaustive^{b}. Exhaustive approaches aim to identify the largest submatrices where the set of rows are the maximum sets that support a linear order of values across the set of columns [7]. In contrast, greedy approaches rely on a merit function to guide the composition of incrementally larger/smaller biclusters. The merit function used by the original greedy order-preserving approach, OPSM [1], is based on the upper-bound probability that a random data matrix contains a bicluster with more rows supporting it. Multiple extensions have been proposed over OPSM, including: the OPSM-RM method [11] to discover order-preserving biclusters from multiple matrices obtained from replicated experiments; the POPSM method [12] to model uncertain data with continuous distributions based on a probabilistic extent to which a row belongs to a bicluster; and the MinOPSM method [14] that implements a variant of the order-preserving task.
The evaluation of order-preserving solutions does not significantly differ from the evaluation of traditional biclustering solutions. When the hidden biclusters are known, relative non-intersecting area (RNIA) [15], match scores [3, 16] and clustering metrics (e.g. entropy, recall and precision) have been adopted. RNIA [15] measures the overlap area between the hidden and found biclusters. Clustering error (CE) [17] extends this score to distinguish whether several or exactly one of the found biclusters cover a hidden bicluster. Match scores (MS) [16] assess the similarity of solutions based on the Jaccard index. To make MS sensitive to the number of biclusters in both sets, a consensus can be introduced by computing similarities between the Munkres pairs of biclusters [3].
In the absence of hidden biclusters, merit functions can be adopted as long as they are not biased towards the merit functions used within the approaches under comparison. Complementarily, statistical evaluation has been proposed based on biclusters’ expected probability of occurrence [18, 19] or based on their enrichment p-values against real datasets [20–22].
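A Jaccard-based match score can be sketched as follows. This is only an illustrative variant: it assumes biclusters are given as (rows, columns) pairs and compares them at the level of matrix cells, whereas the cited works define their own specific formulations. The function names `jaccard`, `cells` and `match_score` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cells(bicluster):
    """Set of (row, column) cells covered by a bicluster (rows, cols)."""
    rows, cols = bicluster
    return {(i, j) for i in rows for j in cols}


def match_score(found, hidden):
    """Average, over the found biclusters, of the best Jaccard match
    against any hidden bicluster (assumed cell-level variant)."""
    if not found:
        return 0.0
    best = [
        max((jaccard(cells(f), cells(h)) for h in hidden), default=0.0)
        for f in found
    ]
    return sum(best) / len(best)


hidden = [({0, 1}, {0, 1})]
print(match_score([({0, 1}, {0, 1})], hidden))  # perfect recovery
print(match_score([({0, 1}, {0})], hidden))     # half of the cells recovered
```

Note that this score is asymmetric: swapping `found` and `hidden` measures how well the hidden biclusters are recovered rather than how relevant the found ones are.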
Sequential pattern mining
Let an item be an element from an ordered set $\mathcal{L}$. An itemset p is a set of non-repeated items, $p\subseteq \mathcal{L}$. A sequence s is an ordered set of itemsets. A sequence database is a set of sequences D={s_{1},..,s_{n}}.
A sequence a = <a_{1}…a_{n}> is a subsequence of b = <b_{1}…b_{m}> (a⊆b) if $\exists_{1\le i_{1}<..<i_{n}\le m}: a_{1}\subseteq b_{i_{1}},..,a_{n}\subseteq b_{i_{n}}$. A sequence is maximal with respect to a set of sequences if it is not contained in any of them. For example, s_{1} = <{a},{b e}> = a(b e) is contained in s_{2} = (a d)c(b c e) and is maximal w.r.t. D = {a e,(a b)e}.
Definition 3.
The coverage Φ_{s} of a sequence s w.r.t. a sequence database D is the set of all sequences in D for which s is a subsequence: Φ_{s}={s^{′}∈D∣s⊆s^{′}}. The support of a sequence s in D, denoted sup_{s}, can either be absolute, being its coverage size ∣Φ_{s}∣, or relative, given by ∣Φ_{s}∣/∣D∣.
To illustrate these concepts, consider the following sequence database D={s_{1}=(b c)a(a b c)d, s_{2}=c a d(a c d), s_{3}=a(a c)c}. For this database, we have $|\mathcal{L}|$ = ∣{a,b,c,d}∣ = 4, Φ_{a(a c)}={s_{1},s_{2},s_{3}}, and sup_{a(a c)}=3.
Definition 4.
Given a set of sequences D and some user-specified minimum support threshold θ, a sequence s is frequent when contained in at least θ sequences of D. The sequential pattern mining (SPM) problem consists of computing the set of frequent sequences, {s∣sup_{s}≥θ}.
The set of maximal frequent sequences for the illustrative sequence database, D = {(b c)a(a b c)d, c a d(a c d), a(a c)c}, under the support threshold θ=3 is {a(a c),c c}. Existing SPM methods rely on (anti-)monotonic properties to efficiently find sequential patterns.
Consider two sequences s and s^{′}, where s^{′}⊆s, and a predicate M. M is monotonic when M(s^{′})⇒M(s) and M is anti-monotonic when ¬M(s^{′})⇒¬M(s). SPM approaches usually rely on the anti-monotonicity of support: the support of s is bounded from above by the support of s^{′} and, if s^{′} is not frequent, then s is not frequent.
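The containment and support notions above can be checked on the illustrative three-sequence database with a minimal sketch. Sequences are represented here as lists of frozensets; the helper `S` and the function names are hypothetical conveniences, not part of any SPM library.

```python
def S(*itemsets):
    """Build a sequence from strings, one string per itemset: S("bc", "a")."""
    return [frozenset(x) for x in itemsets]


def is_subsequence(a, b):
    """True if every itemset of `a` is contained, in order, in a distinct
    itemset of `b` (greedy leftmost matching)."""
    pos = 0
    for itemset in a:
        while pos < len(b) and not itemset <= b[pos]:
            pos += 1
        if pos == len(b):
            return False
        pos += 1  # the next itemset of `a` must match strictly later
    return True


def coverage(s, database):
    """Sequences of the database that contain s."""
    return [t for t in database if is_subsequence(s, t)]


def support(s, database, relative=False):
    """Absolute (default) or relative support of s."""
    count = len(coverage(s, database))
    return count / len(database) if relative else count


D = [S("bc", "a", "abc", "d"), S("c", "a", "d", "acd"), S("a", "ac", "c")]
print(support(S("a", "ac"), D))  # support of a(a c)
print(support(S("c", "c"), D))   # support of c c
print(support(S("c", "a"), D))   # c a is not supported by a(a c)c
```

Both a(a c) and c c are supported by all three sequences, which is why they are frequent under θ=3.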
Definition 5.
Given a sequence database and a minimum support threshold θ:

a frequent sequence s is a sequence with $|\Phi_{s}| \ge \theta$;

a closed frequent sequence is a frequent sequence that is not a subsequence of any sequence with the same support ($\forall_{s^{\prime}\supset s}\ |\Phi_{s^{\prime}}| < |\Phi_{s}|$);

a maximal frequent sequence is a frequent sequence whose supersequences are all infrequent, $\forall_{s^{\prime}\supset s}\ |\Phi_{s^{\prime}}| < \theta$.
A sequence s is maximal if it is frequent and all of its proper supersequences s^{′} (s⊂s^{′}) are infrequent, while it is closed if it is frequent and there exists no proper supersequence with the same support. Given the sequence database D = {(b c)a(a b c)d, (a c), c a d(a c d), a(a c)c}, support θ=3 and constraint ∣s∣≥2, there are 2 maximal patterns ({a(a c),c c}), 3 closed patterns ({a(a c),(a c),c c}) and 5 simple patterns ({a(a c),a a,a c,(a c),c c}).
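The difference between the three representations can be reproduced with a small sketch over the five simple patterns listed above and the four-sequence database. It only filters a given list of frequent patterns and does not mine them; the helper names are hypothetical.

```python
def S(*itemsets):
    """Build a sequence from strings, one string per itemset."""
    return [frozenset(x) for x in itemsets]


def is_subsequence(a, b):
    """Greedy leftmost check that sequence a is contained in sequence b."""
    pos = 0
    for itemset in a:
        while pos < len(b) and not itemset <= b[pos]:
            pos += 1
        if pos == len(b):
            return False
        pos += 1
    return True


D = [S("bc", "a", "abc", "d"), S("ac"), S("c", "a", "d", "acd"), S("a", "ac", "c")]

# The five simple patterns of the example (at least 2 items, theta = 3).
patterns = {
    "a(ac)": S("a", "ac"),
    "aa": S("a", "a"),
    "ac": S("a", "c"),
    "(ac)": S("ac"),
    "cc": S("c", "c"),
}
sup = {name: sum(is_subsequence(p, t) for t in D) for name, p in patterns.items()}


def proper_supers(name):
    """Names of the other listed patterns that contain the given pattern."""
    p = patterns[name]
    return [o for o, q in patterns.items() if o != name and is_subsequence(p, q)]


# Closed: no listed supersequence has the same support.
closed = {n for n in patterns if all(sup[o] < sup[n] for o in proper_supers(n))}
# Maximal: no listed (hence frequent) supersequence at all.
maximal = {n for n in patterns if not proper_supers(n)}

print(sup)
print(sorted(closed), sorted(maximal))
```

The pattern (a c) stays closed because it is supported by four sequences, one more than its supersequence a(a c), but it is not maximal.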
Pattern-based biclustering
Pattern-based biclustering approaches rely on pattern mining methods and, therefore, use support, potentially combined with confidence-correlation metrics, as the merit means to produce biclusters. There are two major paradigms for pattern-based biclustering.
Another option is to rely on frequent itemset mining [22–26]. Although these approaches only target biclusters with constant patterns, their analysis is critical as they provide key principles for flexible exhaustive searches. BiModule [27] allows for a parameterized multi-value itemization of the input matrix. DeBi [22] and Bellay et al. [28] introduce key post-processing principles to adjust biclusters in order to guarantee heightened statistical significance. GenMiner [23] includes external knowledge within the input matrix to derive biclusters from association rules.
Methods
BicSPAM behavior section covers the fundamental options and structure of BicSPAM. The core contributions of BicSPAM are, then, conveyed in the following sections. Scalability, Flexibility and Quality sections provide critical principles and extensions to BicSPAM. Finally, Default and dynamic BicSPAM parameterizations section offers an integrated view of BicSPAM options and a method for their initialization based on data properties.
BicSPAM behavior
Understandably, optimal and flexible solutions where the number and positioning of biclusters are not previously fixed require efficient search methods. SPM methods have been tuned during the last two decades according to scalability principles [29]. In this context, the composition of order-preserving biclusters from sequential patterns is a product of three steps (Figure 2). The columns of an input matrix are reordered according to their values, an SPM method is applied, and the output biclusters are mapped from the found frequent subsequences. Note that when two columns have equal values, they are seen as co-occurrences, while when their values differ they are treated as precedences. Consider the illustrative row x_{2} = {y_{1} = 0,y_{2} = 2,y_{3} = 0} in Figure 2: y_{1} and y_{3} co-occur, while y_{1} precedes y_{2}. In this context, biclusters are derived from sequential patterns as follows:
Definition 6.
Given a matrix A and a minimum support threshold θ, a set of order-preserving biclusters ∪_{k}B_{k}, where B_{k}=(I_{k},J_{k}), can be derived from the set of frequent sequences ∪_{k}s^{k} by: 1) mapping $(I_{k},J_{k})=(\Phi_{s^{k}},\{s_{i}^{k}\mid i=1..|s^{k}|\})$ to compose order-preserving biclusters on rows, or by 2) mapping $(I_{k},J_{k})=(\{s_{i}^{k}\mid i=1..|s^{k}|\},\Phi_{s^{k}})$ from A^{T} to compose order-preserving biclusters on columns.
The support threshold defines the minimum number of rows in the bicluster. In the context of GE analysis, a low support is critical since significant co-expression patterns can occur for small groups of genes and/or conditions. Additionally, biclusters with a number of columns below a parameterizable threshold can be filtered by pruning subsequences with a number of items below that threshold. Finally, biclustering can either rely on the SPM methods as is or target more dedicated searches by adapting the SPM support (merit function) and using it within the Apriori-based SPM framework. Existing support extensions include the measures of Pandey et al. [24], Gowtham et al. [26], Huang et al. [30], and Steinbach et al. [31]. However, these metrics do not capture ordering relations and their definition needs to be (anti-)monotonic.
When the original numeric values are ordered without any form of discretization, the biclusters delivered by SPM-based methods are perfect biclusters, that is, they do not allow ordering mismatches. If discretization is applied with an ordinal alphabet, the number of co-occurrences per sequence increases. In this case, the output biclusters are not perfect but are naturally more robust to noise. The number of items in the considered alphabet can be used to control the level of noise tolerance. However, discretization comes with the drawback of potentially assigning two elements with similar values to different items. We refer to this drawback as the items-boundary problem.
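The first mapping of Definition 6 (biclusters on rows) can be sketched as follows. Each row becomes a sequence of column indexes sorted by value, with equal values grouped into one itemset; a given sequential pattern is then mapped back to its supporting rows and its columns. The matrix `A` below is a made-up example (only its second row comes from the text), the pattern is supplied by hand rather than mined, and all function names are hypothetical.

```python
from itertools import groupby


def row_to_sequence(row):
    """Sort column indexes by value; columns with equal values co-occur."""
    cols = sorted(range(len(row)), key=lambda j: row[j])
    return [frozenset(g) for _, g in groupby(cols, key=lambda j: row[j])]


def is_subsequence(a, b):
    """Greedy leftmost check that sequence a is contained in sequence b."""
    pos = 0
    for itemset in a:
        while pos < len(b) and not itemset <= b[pos]:
            pos += 1
        if pos == len(b):
            return False
        pos += 1
    return True


def bicluster_from_pattern(pattern, sequences):
    """(I, J): supporting rows and the pattern's columns in pattern order."""
    rows = [i for i, s in enumerate(sequences) if is_subsequence(pattern, s)]
    cols = [j for itemset in pattern for j in sorted(itemset)]
    return rows, cols


# Hypothetical 3x3 matrix; the second row is x2 = {y1=0, y2=2, y3=0}.
A = [[1, 3, 2],
     [0, 2, 0],
     [4, 9, 5]]
sequences = [row_to_sequence(r) for r in A]
print(sequences[1])  # y1 and y3 co-occur and precede y2

pattern = [frozenset({0}), frozenset({2}), frozenset({1})]  # y1 < y3 < y2
print(bicluster_from_pattern(pattern, sequences))
```

In a full pipeline the pattern would come from an SPM method run over `sequences` with the support threshold θ; here it is fixed to keep the sketch short.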
In particular, the chosen SPM method and target pattern representations affect the performance and output of the biclustering task. Contrasting with existing approaches, BicSPAM makes available alternatives for both variables aiming at an optimized behavior:

SPM Methods: Current SPM methods can be classified into three main categories: Apriori-based, pattern-growth, and early-pruning [32]. Methods based on pattern-growth structures and early-pruning principles offer the best performance for the majority of biological data settings.

Complementary to these search alternatives, both horizontal and vertical projections of the database are possible. Vertical projections for the SPM task are only competitive with the alternatives for very flattened matrices (m≫n). When targeting GE matrices, the methods that rely on vertical data formats should only be considered for the discovery of biclusters with order-preserving values on the rows (instead of columns). BicSPAM uses SPADE [33] (hybrid method) for vertical data settings (m≫n) and PrefixSpan [34] (pattern-growth method) for the remaining settings.

Pattern Representation: The use of simple, closed or maximal patterns largely impacts the properties of the biclustering solution, as illustrated in Figure 4. Efficiency gains can be seized when targeting condensed representations. Maximal sequential patterns lead to biclusters with a maximized number of columns. However, since both vertical and smaller biclusters are lost, maximal-based biclusters lead to incomplete solutions. The alternative is to use all sequential patterns, as in μCluster [7]. This solution leads to a high number of potentially redundant biclusters (those contained in another bicluster), which can degrade the performance of the mining and closing steps. Finally, closed sequential patterns allow for overlapping biclusters only if a reduction in the number of columns of a specific bicluster results in a higher number of rows. They are the target representation to obtain maximal biclusters, that is, biclusters that cannot be extended without removing rows or columns. BicSPAM makes available CloSpan [35] and BIDEPlus [36] to mine condensed sequential patterns. In contrast with existing approaches, closed sequential patterns (maximal biclusters) are the default option in BicSPAM.
Scalability
Existing SPM methods are prepared to deal with sequences with an arbitrary repetition of items per sequence. However, order-preserving biclustering is derived from a more restricted form of sequences, item-indexable sequences, which do not allow item repetitions [13]. Additionally, a common input for the biclustering task is the minimum number of columns per bicluster, that is, the minimum number of items of the output sequential patterns. Although existing SPM methods can be applied in this context, they are inefficient at delivering large patterns due to the combinatorial explosion of sequential patterns under low support thresholds [13]. To avoid this, we propose two strategies to improve the scalability of BicSPAM. First, we extend the IndexSpan algorithm [37] to discover sequential patterns with heightened efficiency from item-indexable sequences. Second, we propose the selection of specific mapping and closing options that foster the scalability of BicSPAM for large datasets.
Seizing item-indexable properties
IndexSpan [37], an extension of PrefixSpan [34], was previously proposed by the authors to seize efficiency gains from item-indexable databases (sequences without repeated items), while guaranteeing a narrow search space and efficient support counting. This method contrasts with the μClusters method [7, 13], which relies on a breadth-first search with high memory complexity Θ(n×m^{2}) that does not scale for medium-to-large datasets (even in the presence of pruning techniques). IndexSpan considers the following three structural adaptations over the PrefixSpan algorithm. First, IndexSpan relies on an indexable compacted version of the original sequence database. Second, it uses faster and memory-efficient database projections, the most expensive step of PrefixSpan. Since the indexes of the items per sequence are known, the IndexSpan projected database only maintains a list with the identifiers of the active sequences and of the prefix. To know whether a sequence still supports a prefix when an item is added to it, one only needs to compare the index of the new item against the index of the previous item, as well as their lexical order when the index is the same. Finally, the minimum number of items per sequential pattern, δ, is used to prune the search as early as possible. If the number of items of the current prefix plus the items of a postfix is less than δ, then the sequence identifier related to the postfix can be removed from the projected database, since all the resulting patterns will have a number of items below the inputted threshold.
Two critical extensions over IndexSpan are implemented in BicSPAM. First, the discovered closed frequent sequences are represented within a compact tree structure, where the supporting transactions are annotated using principles proposed for full-pattern discovery [38]. Second, parameters from closing options are pushed to the mining step. For example, overlapping criteria for merging biclusters can be efficiently checked based on the properties of the tree, which significantly reduces the complexity associated with computing similarities between all pairs of biclusters.
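The two checks described above (index comparison and δ-based pruning) can be illustrated with a minimal sketch. It is not the IndexSpan implementation: it only assumes that each item-indexable sequence is stored as a list of itemsets of column indexes, and the names `build_index`, `supports_extension` and `can_reach_min_length` are hypothetical.

```python
def build_index(sequence):
    """Map each item to the position of its itemset. This is well defined
    because item-indexable sequences have no repeated items."""
    return {item: pos for pos, itemset in enumerate(sequence) for item in itemset}


def supports_extension(index, last_item, new_item, same_itemset):
    """Does a sequence supporting a prefix ending in last_item still support
    the prefix extended with new_item?"""
    if last_item not in index or new_item not in index:
        return False
    if same_itemset:
        # Co-occurrence: same position, lexical order avoids duplicates.
        return index[new_item] == index[last_item] and new_item > last_item
    # Precedence: the new item must appear in a strictly later itemset.
    return index[new_item] > index[last_item]


def can_reach_min_length(sequence, index, last_item, prefix_len, delta):
    """Early pruning: can a prefix of prefix_len items ending in last_item
    still grow to at least delta items within this sequence?"""
    pos = index[last_item]
    remaining = sum(1 for it in sequence[pos] if it > last_item)
    remaining += sum(len(itemset) for itemset in sequence[pos + 1:])
    return prefix_len + remaining >= delta


seq = [{0, 2}, {1}]  # (y1 y3) y2, using 0-based column indexes
idx = build_index(seq)
print(supports_extension(idx, 0, 1, same_itemset=False))  # y1 then y2
print(supports_extension(idx, 1, 0, same_itemset=False))  # y2 then y1
print(can_reach_min_length(seq, idx, 0, prefix_len=1, delta=3))
```

Each check is a constant-time dictionary lookup per sequence, which is the intuition behind the cheaper database projections described in the text.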
BicSPAM uses IndexSpan as the default SPM method due to its superior performance (against μClusters and traditional SPM methods) achieved by efficiency gains from fast database projections, minimalist data structures, and early pruning, merging and filter techniques.
Further efficiency options
The use of real values or of a high number of items to define the orderings is an efficient option to find order-preserving biclusters, as it guarantees a high number of precedences among column indexes (and a low number of co-occurrences), leading to smaller sequential patterns. In contrast, discretization with a low number of items is critical to guarantee more noise-tolerant solutions, but it degrades efficiency. This is due to the exponential increase of frequent sequential patterns in number or size. To create a compromise between noise and efficiency, BicSPAM allows an arbitrary number of items and provides a medium-to-high number of items as the default option (∣Σ∣≈m/5).
In this context, extending and merging of biclusters discovered using a high number of items can be applied to guarantee efficiency while preserving the quality of solutions. A second strategy is to increase the minimum support threshold (under a relaxed discretization more robust to noise) to promote a heightened SPM efficiency, and to later apply filters that remove rows and columns from biclusters in order to intensify their homogeneity. BicSPAM makes available extension, merging and filtering methods.
Finally, many of the principles proposed in the last decade to guarantee the scalability of SPM methods can be easily applied with IndexSpan. These principles include: data partitioning principles (inter- and intra-sequence), principles for the application of SPM methods in distributed settings, and the delivery of approximated sequential patterns (discovered under specific performance guarantees) [29, 32].
Flexibility
BicSPAM relies on flexible searches (no need to fix the number of biclusters a priori), delivers flexible structures of biclusters and allows for a flexible parameterization of its behavior (if a user opts not to use the dynamically learned parameters from data). In order to further guarantee the flexibility of the target BicSPAM approaches, we: 1) extend the default order-preserving biclusters to allow for symmetric values, and 2) define strategies to compose different structures of biclusters.
Order-preserving biclusters with symmetries
In GE analysis, allowing symmetries is required to combine regulatory and co-regulatory expression levels within a bicluster [24]. Two rows from a bicluster may have similar ordered levels of activity differing in sign. To our knowledge, this is the first attempt to combine symmetries with order-preserving models.
Definition 7.
A bicluster with symmetries is (I,J) with either symmetries on rows, $\hat{a}_{ij}=c_{i}\times a_{ij}$, or on columns, $\hat{a}_{ij}=c_{j}\times a_{ij}$, where c_{i}, c_{j} ∈ {1,−1} are the symmetry factors for each row or column of the bicluster and $a_{ij}\in \mathbb{R}$.
For the purpose of finding biclusters with symmetries, the normalization should satisfy the zero-mean criterion. Additionally, if the number of considered items for discretization is odd, there is one item that is its own symmetric, which must be specially handled.
Although the alignment of signs can be applied for every column y_{ j }, additional efficiency can be achieved by stopping the search when all the sign combinations have been achieved. Nevertheless, the worst case requires the application of a pattern miner m times. Note that filtering is a critical postprocessing step to remove potential duplicates resulting from the repetition of coincident alignments.
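Definition 7 can be illustrated with a small sketch of sign alignment on a reference column y_{j}. The alignment rule used here (flip every row whose value in column j is negative) is an assumption made for illustration, one plausible reading of the procedure, and the names `align_signs` and `column_order` are hypothetical. Values are assumed to be zero-mean normalized.

```python
def align_signs(matrix, j):
    """Choose c_i in {1, -1} per row so that the value in column j becomes
    non-negative, and return the factors with the aligned matrix."""
    factors = [-1 if row[j] < 0 else 1 for row in matrix]
    aligned = [[c * v for v in row] for c, row in zip(factors, matrix)]
    return factors, aligned


def column_order(row):
    """Column indexes sorted by increasing value (ties broken by index)."""
    return sorted(range(len(row)), key=lambda k: row[k])


# Two rows with the same profile but opposite sign.
A = [[1.0, -0.5, 2.0],
     [-1.0, 0.5, -2.0]]

print(column_order(A[0]), column_order(A[1]))  # orderings differ
factors, aligned = align_signs(A, j=0)
print(factors)
print(column_order(aligned[0]), column_order(aligned[1]))  # orderings agree
```

After alignment, both rows induce the same column ordering and can therefore support the same sequential pattern; repeating the alignment for other reference columns may yield duplicates, hence the filtering step mentioned above.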
Flexible biclustering structures
Pattern-based biclustering approaches produce highly flexible structures of biclusters. A pattern-based structure of biclusters allows overlaps and is non-exhaustive and non-exclusive. Additionally, the application of closing options over these structures allows the composition of structures with different properties, such as structures without overlapping areas. Shaping biclustering structures has been poorly addressed in the literature, and rather seen as the by-product of a target biclustering method [2].
Extension and merging of biclusters can be adopted to produce exhaustive structures (either overall, across rows or across columns). Filtering of exhaustive structures can be used to compose exclusive structures (either overall, across rows or across columns). BicSPAM makes available these closing techniques, that can be used to shape solutions with arbitrarily positioned biclusters. The composition of alternative structures in BicSPAM can be performed with sharp usability since there is no need to change the core mapping and mining steps.
Quality
BicSPAM approaches are extended in this section regarding their robustness. Multiple mapping and closing options are proposed to handle missing values and deal with varying levels of noise.
Handling varying levels of noise
Handling missing values
Robustness recurring to mapping options
BicSPAM allows for the application of normalization and discretization methods on the rows, columns or overall matrix. Each context leads to different biclusters and is, respectively, suited to find patterns on bicluster’s columns, rows or on both dimensions. Normalization options are used to scale and enhance differences on the values, which are critical when mining order-preserving regularities. Marcilio et al. [42] compare three normalization procedures for GE data: z-score, scaling and rank-based procedures. Additional normalization criteria have been reported [43, 44]. BicSPAM requires zero-mean normalization, thus allowing for symmetries and providing a simple setting for the application of multiple probabilistic distributions. When assuming the presence of missing and outlier elements, a masking bitmap can be adopted for their exclusion [27].
Robustness recurring to closing options

Merging Options [28, 47]. Merging methods allow for the delivery of noise-tolerant biclusters, thus recovering rows and columns lost due to the items-boundary problem or to missing/noisy values. An effective criterion to guide the merging is the overlapping area (as a percentage of the smaller bicluster), the default option in BicSPAM, or alternatively the resulting homogeneity of the bicluster after the merging.

Filtering Options [22, 27]. BicSPAM allows filtering at two levels: 1) at the bicluster level and 2) at the row-column level. The first type of filtering removes biclusters that are duplicated or contained in larger biclusters; BicSPAM follows the BiModule [27] heuristics to perform it efficiently. The second type of filtering can be adopted to exclude rows or columns from a particular bicluster in order to intensify its homogeneity. This is usually the case when a low number of items is considered, leading to highly noise-tolerant biclusters. For this purpose, BicSPAM offers three strategies: 1) use of statistical tests on each row and column, 2) reliance on existing greedy-iterative approaches and maximization of their merit functions, and 3) discovery of sequential patterns under more restrictive conditions (such as higher support and confidence thresholds).

Extension Options [22, 28]. Similarly to the filtering options at the row-column level, BicSPAM implements three non-exclusive strategies to extend biclusters in ways that the resulting solution still satisfies some predefined homogeneity. The first strategy relies on greedy methods and on their merit functions for further extensions. The second strategy consists of using statistical tests to include rows or columns in each bicluster. Finally, BicSPAM provides a third, novel strategy based on the merging of sequential patterns discovered under more relaxed support thresholds.
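The overlap-based merging criterion and the bicluster-level filtering described above can be sketched as follows. This is a simplified illustration, not the BicSPAM or BiModule implementation: biclusters are assumed to be (rows, columns) pairs of sets, the function names are hypothetical, and the containment check is a plain quadratic scan rather than the efficient heuristics of [27].

```python
def overlap_ratio(a, b):
    """Shared area of two biclusters as a fraction of the smaller one."""
    rows_a, cols_a = a
    rows_b, cols_b = b
    shared = len(rows_a & rows_b) * len(cols_a & cols_b)
    smaller = min(len(rows_a) * len(cols_a), len(rows_b) * len(cols_b))
    return shared / smaller if smaller else 0.0


def merge_biclusters(biclusters, min_overlap=0.8):
    """Repeatedly merge any pair whose overlap ratio reaches min_overlap."""
    merged = [(set(rows), set(cols)) for rows, cols in biclusters]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if overlap_ratio(merged[i], merged[j]) >= min_overlap:
                    # The merged bicluster spans the union of rows and columns.
                    merged[i] = (merged[i][0] | merged[j][0],
                                 merged[i][1] | merged[j][1])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged


def remove_contained(biclusters):
    """Drop duplicates and biclusters fully contained in another bicluster."""
    unique = []
    for rows, cols in biclusters:
        candidate = (frozenset(rows), frozenset(cols))
        if candidate not in unique:
            unique.append(candidate)
    kept = []
    for i, (rows, cols) in enumerate(unique):
        contained = any(
            i != j and rows <= other_rows and cols <= other_cols
            for j, (other_rows, other_cols) in enumerate(unique)
        )
        if not contained:
            kept.append((rows, cols))
    return kept
```

With `min_overlap=1.0` only biclusters where one fully covers the other are merged, so lowering the threshold is what allows rows and columns lost to noise to be recovered.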
Default and dynamic BicSPAM parameterizations
BicSPAM parameters with impact on the solution quality and efficiency are:

Mapping step parameters, including: the number of items (allowed noise), the normalization and discretization methods, and the (optional) methods to handle missing and noisy values;

Mining step parameters, including: the inputted minimum number of rows and columns; the SPM method and its scalability extensions; and the chosen pattern representations;

Closing step parameters, including the criteria to merge, filter and extend biclusters.
BicSPAM makes available default parameterizations (data-independent setting) and dynamic parameterizations (data-dependent setting). Default parameterizations include: zero-mean row-oriented normalization, overall Gaussian discretization with $\frac{m}{4}$ items (for an adequate trade-off of precedences vs. co-occurrences), and the use of row-based IndexSpan with closed sequential patterns, noise relaxation (allocation of 2 items for values $c \in [a,b]$ with $\frac{\min(b-c,\,c-a)}{b-a}<10\%$), removal of missing values, and a merging procedure with 80% overlapping. For the default setting, BicSPAM iteratively decreases the support threshold by 10% (starting with θ=50%) until the output solution contains 50 non-similar biclusters or covers 10% of the elements in the input matrix.
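The noise-relaxation rule in the default parameterization (two items for values lying within 10% of an interval boundary) can be illustrated with the sketch below. It assumes that the discretization cut points are already available as a sorted list of finite bounds; the function name `items_for_value` and the handling of the outermost intervals (no neighbour is added beyond the first or last item) are choices made for this example only.

```python
from bisect import bisect_right


def items_for_value(value, bounds, tolerance=0.10):
    """Return the item(s) assigned to a value.

    bounds: sorted cut points b_0 < b_1 < ... < b_k delimiting k items,
            where item i covers the interval [b_i, b_(i+1)].
    A value c in [a, b] with min(b - c, c - a) / (b - a) < tolerance also
    receives the adjacent item on the side of the nearest boundary.
    """
    k = len(bounds) - 1
    # Locate the interval holding the value, clamped to the valid item range.
    idx = min(max(bisect_right(bounds, value) - 1, 0), k - 1)
    a, b = bounds[idx], bounds[idx + 1]
    items = [idx]
    if (value - a) / (b - a) < tolerance and idx > 0:
        items.insert(0, idx - 1)   # close to the lower boundary
    elif (b - value) / (b - a) < tolerance and idx < k - 1:
        items.append(idx + 1)      # close to the upper boundary
    return items
```

For example, with bounds `[-2, -1, 0, 1, 2]` a value of 0.5 maps to a single item, whereas 0.05 sits within 10% of the boundary at 0 and is therefore assigned both neighbouring items.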
The dynamic parameterizations adopt identical mining options but differ in the following aspects. Different distributions underlying the input matrix are tested to select the normalization and discretization procedure. If the range of values per row/column cannot be clustered with low error (within-cluster sum of squares), extension and filtering options (at the column/row level) are adopted to foster the robustness of BicSPAM. Moderate and relaxed missing handlers are selected if the input matrix has, respectively, over 2% and over 5% of missing elements. Vertical searches using the SPADE SPM method [33] are selected when m > 10n. Data partitioning principles to foster scalability are made available when the following condition is not satisfied: (n<20000∧m<100)∨(n<4000∧m<200).
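The data-dependent rules listed above can be summarized in a small decision function. This is only a sketch of the stated thresholds: it assumes that n denotes the number of rows and m the number of columns, and the function name, option labels and returned dictionary are hypothetical rather than BicSPAM's actual interface.

```python
def choose_dynamic_options(n_rows, n_cols, missing_fraction):
    """Pick mining options from matrix size and the fraction of missing values."""
    # Missing-value handler: relaxed above 5%, moderate above 2%, removal otherwise.
    if missing_fraction > 0.05:
        missing_handler = "relaxed"
    elif missing_fraction > 0.02:
        missing_handler = "moderate"
    else:
        missing_handler = "remove"

    # Vertical search (SPADE) when m > 10n; IndexSpan (the default) otherwise.
    spm_method = "SPADE" if n_cols > 10 * n_rows else "IndexSpan"

    # Partitioning is offered when the size condition from the text does not hold.
    within_limits = (n_rows < 20000 and n_cols < 100) or (n_rows < 4000 and n_cols < 200)

    return {
        "missing_handler": missing_handler,
        "spm_method": spm_method,
        "partition_data": not within_limits,
    }
```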
These parameterizations provide a robust and user-friendly environment to use BicSPAM, while expert users can still explore alternative behavior to obtain exploratory solutions with varying quality.
Results and discussion
This section synthesizes the results from experimentally assessing the performance of BicSPAM. Results show that the proposed approaches are computationally efficient, flexible and robust to varying input settings. The methods were implemented in Java (JVM version 1.6.0-24). The experiments were performed using an Intel Core i5 2.30 GHz with 6 GB of RAM.
The experimental results are collected and analyzed in three steps. First, the impact of alternative BicSPAM parameterizations is analyzed in depth for synthetic datasets with varying size, noise and sparsity. Second, the performance of BicSPAM is assessed against existing alternatives. Finally, the significance of BicSPAM results in biological contexts is assessed.
Results in synthetic data
Properties of the generated dataset settings
Matrix size (# rows × # columns)  100 × 30  500 × 50  1000 × 75  2000 × 100
Number of hidden biclusters  2  3  5  8
Number of rows per bicluster  [10,20]  [40,70]  [100,150]  [200,300]
Number of columns per bicluster  [5,7]  [6,8]  [7,9]  [8,10]
Relative area of biclusters  6.0%  3.9%  4.8%  4.5%
A second set of datasets was generated to study the efficiency limits of BicSPAM by fixing the number of rows (∣X∣=20000) and varying the number of columns (50 ≤∣Y∣≤ 200). Background values were generated as in the first set of datasets, and 2 biclusters were planted to occupy 5% of the total area.
Comparison of biclustering approaches:
Four state-of-the-art biclustering approaches were selected: two approaches able to deliver order-preserving biclusters, OPSM [1] and OP-Clustering [7], and two approaches able to discover biclusters under constant, additive and multiplicative models, FABIA with sparse prior [3] and ISA [48]. We used the following software: the BicAT software [49] to run the OPSM and ISA approaches, and the R package fabia [3]. The default number of iterations for the OPSM method was varied from 10 to 200 iterations. BicSPAM was used with: 1) the default parameterization, 2) the default parameterization but with sequential patterns gathered from multiple levels of expression (∣Σ∣∈{4,7,10}), and 3) the dynamic data-based parameterization. The support threshold for both the BicSPAM and OP-Clustering approaches was incrementally decreased by 10% and the search was stopped when the output solution had over 50 (maximal) biclusters. We applied FABIA with default parameterizations. The specified number of biclusters for both FABIA and ISA (number of starting points) was the number of hidden biclusters plus 10%: $|\mathcal{H}| \times 1.1$.
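The stopping rule for the support threshold can be sketched as a simple loop. The miner is passed in as a hypothetical callable `mine(support)` that returns the list of biclusters found at that support; the starting value of 50%, the reading of "decreased 10%" as a relative reduction (θ ← 0.9·θ), and the lower bound `min_support` are assumptions made for this illustration.

```python
def mine_until_enough(mine, start_support=0.5, decrease=0.10,
                      target=50, min_support=0.01):
    """Lower the support threshold until more than `target` biclusters are found.

    mine: hypothetical callable taking a support threshold in (0, 1] and
          returning a list of biclusters.
    Returns the last support threshold used and the biclusters found with it.
    """
    support = start_support
    biclusters = mine(support)
    # Keep relaxing the threshold while the solution is too small and the
    # next threshold would still be above the safety floor.
    while len(biclusters) <= target and support * (1 - decrease) >= min_support:
        support *= 1 - decrease
        biclusters = mine(support)
    return support, biclusters
```

An absolute step (for example, 10 percentage points per iteration) would be an equally plausible reading of the text and only requires replacing the update of `support`.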
Efficiency limits:
Degree of co-occurrences:
Mining methods:
Pattern representations:
Missing values:
Closing options:
The impact of merging biclusters assuming a 5% level of planted noise is illustrated in Figure 16 (left). The baseline case is when the required overlapping area for merging equals 100% (no merging effect, since we are targeting biclusters derived from closed patterns). When relaxing the overlapping criterion, the $\mathit{MS}(\mathcal{H},\mathcal{B})$ levels (and also the $\mathit{MS}(\mathcal{B},\mathcal{H})$ levels) increase, as the merging step allows for the recovery of missing columns and rows belonging to planted biclusters. However, this improvement is only observable down to a certain threshold (near 70% for this setting). A correct identification of the optimum threshold can lead to significant gains (near 20 pp for this experimental setting).
The adoption of filtering at the row/column level also enhances the ability to recover the planted biclusters. The impact of removing rows and columns that do not satisfy an inputted homogeneity threshold is illustrated in Figure 16 (middle). Filtering is relevant to correct errors related to non-planted co-occurrences when considering restrictive discretizations. Similarly to the merging option, an increase in the match scores is observed from the baseline case (a homogeneity degree of 0%) up to 75% (given by 1 − MSR). Above this upper threshold the match scores decrease, since the homogeneity criterion becomes too restrictive and leads to the removal of rows and columns from planted biclusters due to a misinterpretation of their natural levels of noise.
Finally, the impact of different extension strategies is illustrated in Figure 16 (right). When increasing the planted noise, the presence of the extension options is critical to maintain attractive levels of accuracy. Both the inclusion of new rows and columns through statistical analyses and the lowering of the support of SPM methods followed by merging of the resulting biclusters are able to maintain match score levels above 90% (30 pp higher than the baseline case).
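For reference, a homogeneity degree of the form one minus the mean squared residue (MSR) can be computed as sketched below. The residue definition used here is the standard one for the MSR merit function; how BicSPAM scales values so that the result falls in [0, 1] is not stated in this section, so the sketch assumes the submatrix has already been scaled appropriately. The function names are hypothetical.

```python
def mean_squared_residue(submatrix):
    """Mean squared residue of a bicluster given as a list of equal-length rows."""
    n_rows, n_cols = len(submatrix), len(submatrix[0])
    row_means = [sum(row) / n_cols for row in submatrix]
    col_means = [sum(row[j] for row in submatrix) / n_rows for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i, row in enumerate(submatrix):
        for j, value in enumerate(row):
            # Residue: deviation from the additive row/column model.
            residue = value - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


def homogeneity(submatrix):
    """Homogeneity degree as 1 - MSR (assumes values scaled so that MSR <= 1)."""
    return 1.0 - mean_squared_residue(submatrix)
```

A row or column would then be a candidate for removal when dropping it raises the homogeneity of the remaining submatrix above the chosen threshold.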
Symmetries:
Results in real data
To assess the relevance of BicSPAM results over biomedical contexts, we selected four distinct datasets: dlbcl (180 columns/conditions, 660 rows/genes) [52], yeast (18 columns, 2884 rows) [53], colon cancer (62 columns, 2000 rows) [54] and leukemia (38 columns, 7129 rows) [55]. These datasets have been previously used by biclustering approaches with flexible coherency criteria [1, 3, 13].
Biological relevance:
To assess the biological relevance of the discovered order-preserving biclusters, their statistical relevance was obtained from Gene Ontology (GO) annotations using GOToolBox [56]. To perform the analysis for functional enrichment, we computed the p-values using the hypergeometric distribution to assess the over-representation of a specific GO term. In order to consider a bicluster to be highly significant, we require its genes to show significant enrichment in one or more of the “biological process” ontology terms by having a Bonferroni-corrected p-value below 0.01. A bicluster is considered significant if at least one of the GO terms is significantly enriched by having a p-value below 0.05.
In particular, the average number of significant biclusters increases to over 80 biclusters, with a larger number of elements on average, when considering symmetries. This is a critical observation, since it means that there are groups of genes with biological relevance that can only be discovered through biclustering under a flexible order-preserving setting when symmetries are considered.
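The enrichment test described above can be reproduced in outline with the standard library. This is a generic sketch of a one-sided hypergeometric test with Bonferroni correction, not the GOToolBox procedure itself; the function names and the decision to apply both thresholds to corrected p-values are assumptions made for this example.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """P(X >= hits) for X ~ Hypergeometric.

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    sample:     number of genes in the bicluster
    hits:       bicluster genes annotated with the GO term
    """
    upper = min(annotated, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(population, sample)


def bonferroni(pvalues):
    """Bonferroni correction: multiply by the number of tests, capped at 1."""
    n_tests = len(pvalues)
    return [min(1.0, p * n_tests) for p in pvalues]


def classify_bicluster(corrected_pvalues):
    """Label a bicluster from the corrected p-values of its tested GO terms."""
    best = min(corrected_pvalues)
    if best < 0.01:
        return "highly significant"
    if best < 0.05:
        return "significant"
    return "not significant"
```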
Illustrative biclusters passing the GO term-enrichment test at 1% and 5% significance levels after Bonferroni correction
Dataset  #Genes  #Conds  #Preced.  #Items  Notes  #p-values <0.01  #p-values [0.01,0.05]  Best p-value
Dlbcl  179  6  4  20  No closing options  5  2  3.12E-4
Dlbcl  207  9  5  25  Merging allowed  6  1  2.33E-5
Yeast  167  5  3  10  No closing options  11  3  2.12E-4
Yeast  240  8  4  15  Extensions allowed  10  1  7.13E-7
Colon  769  6  4  25  Merging allowed  12  2  6.08E-8
Leukemia  1645  6  3  20  Extensions allowed  9  2  3.47E-9
Conclusions
Pattern-based approaches for order-preserving biclustering are proposed with the goal of performing efficient exhaustive searches under flexible conditions. Results support their ability to find highly flexible and robust solutions over matrices with sizes up to 20000 rows and 200 columns. Results in both synthetic and real data show that BicSPAM can surpass the drawbacks identified for existing order-preserving approaches, offering more relaxed scalability boundaries, flexible expression profiles, and superior robustness to noise and missing values.
BicSPAM makes available dynamically parameterizable options dependent on the input data context. BicSPAM allows:

different SPM methods, pattern representations (such as simple, condensed and approximate), and dynamic optimizations to seize the specificities of the input datasets;

multiple options to deal with noise and missing values according to different relaxation levels;

an arbitrary number of items and different discretization options (including strategies to deal with the items-boundary problem) with heightened influence on the solution;

multiple ways to deal with the composition of flexible structures and with the numerosity of biclusters through extension, merging and filtering steps, without the need to adapt the core task.
Furthermore, this work introduces the notion of order-preserving biclusters with symmetries and proposes an efficient method for their effective discovery. Results reveal that allowing symmetries is critical to simultaneously capture activation and regulatory mechanisms within a biological process.
As future work, we expect to adapt the mining step to search for lengthy sequential patterns by merging smaller sequential patterns discovered under greater support thresholds according to colossal pattern mining principles [47]. This direction also promotes the scalability of BicSPAM. Finally, we expect to integrate contributions from constraint-based pattern mining in BicSPAM to support knowledge-guided biclustering in biological contexts.
Software availability
The datasets used and the BicSPAM executables are available at http://web.ist.utl.pt/rmch/software/bicspam.
Endnotes
^{a} Greedy iterative searches rely on the selection, addition and removal of rows and columns until the merit function is maximized locally [1, 57, 58]. Exhaustive searches use merit functions to guide the space exploration [18, 59]. Approaches that combine clusters from both dimensions use similarity metrics (the merit functions) for the clustering and joining stages [60, 61]. Divide-and-conquer searches exploit the matrix recursively using a global merit function [62]. Stochastic approaches assume that biclusters follow multivariate distributions [3, 8, 63] and learn their parameters by maximizing a likelihood (merit) function.
^{b} Existing order-preserving search paradigms also vary with regards to the number of output biclusters – either parameterized (existing greedy approaches) or undefined (existing exhaustive approaches) – and to the number of search iterations – either one bicluster at a time (existing greedy approaches) or all biclusters at a time (existing exhaustive approaches).
^{c} $\mathit{MS}(\mathcal{H},\mathcal{B})$ reveals how the hidden biclusters were covered by the nearest found biclusters. Since there is at least one found bicluster with a direct correspondence to each hidden bicluster, BicSPAM has $\mathit{MS}(\mathcal{H},\mathcal{B})$ levels generally higher than $\mathit{MS}(\mathcal{B},\mathcal{H})$.
Declarations
Acknowledgements
This work was supported by FCT under the projects PTDC/EIAEIA/111239/2009 (NEUROCLINOMICS) and PEstOE/EEI/LA0021/2013, and the PhD grant SFRH/BD/75924/2011.
References
1. Ben-Dor A, Chor B, Karp R, Yakhini Z: Discovering local structure in gene expression data: the order-preserving submatrix problem. RECOMB. 2002, New York: ACM, 49-57.
2. Madeira SC, Oliveira AL: Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans Comput Biol Bioinformatics. 2004, 1: 24-45.
3. Hochreiter S, Bodenhofer U, Heusel M, Mayr A, Mitterecker A, Kasim A, Khamiakova T, Van Sanden S, Lin D, Talloen W, Bijnens L, Göhlmann HWH, Shkedy Z, Clevert DA: FABIA: factor analysis for bicluster acquisition. Bioinformatics. 2010, 26 (12): 1520-1527.
4. Bebek G, Yang J: PathFinder: mining signal transduction pathway segments from protein-protein interaction networks. BMC Bioinformatics. 2007, 8: 335.
5. Ding C, Zhang Y, Li T, Holbrook SR: Biclustering protein complex interactions with a biclique finding algorithm. ICDM. 2006, Washington, DC: IEEE Computer Society, 178-187.
6. Choi H, Kim S, Gingras AC, Nesvizhskii AI: Analysis of protein complexes through model-based biclustering of label-free quantitative AP-MS data. Mol Syst Biol. 2010, 6: 385.
7. Liu J, Wang W: OP-Cluster: clustering by tendency in high dimensional space. ICDM. 2003, Washington, DC: IEEE Computer Society, 187-187.
8. Lazzeroni L, Owen A: Plaid models for gene expression data. Statistica Sinica. 2002, 12: 61-86.
9. Charrad M, Ahmed MB: Simultaneous clustering: a survey. 2011.
10. Sim K, Gopalkrishnan V, Zimek A, Cong G: A survey on enhanced subspace clustering. Data Min Knowl Discov. 2013, 26 (2): 332-397.
11. Yip K, Kao B, Zhu X, Chui CK, Lee SD, Cheung D: Mining order-preserving submatrices from data with repeated measurements. IEEE Trans Knowl Data Eng. 2013, 25 (7): 1587-1600.
12. Fang Q, Ng W, Feng J, Li Y: Mining order-preserving submatrices from probabilistic matrices. ACM Trans Database Syst. 2014, 39: 6:1-6:43.
13. Liu J, Yang J, Wang W: Biclustering in gene expression data by tendency. Computational Systems Bioinformatics Conference. 2004, Stanford, CA, USA: IEEE Computer Society, 182-193.
14. Hochbaum DS, Levin A: Approximation algorithms for a minimization variant of the order-preserving submatrices and for biclustering problems. ACM Trans Algorithms. 2013, 9 (2): 19:1-19:12.
15. Bozdağ D, Kumar AS, Catalyurek UV: Comparative analysis of biclustering algorithms. 2010.
16. Prelić A, Bleuler S, Zimmermann P, Wille A, Bühlmann P, Gruissem W, Hennig L, Thiele L, Zitzler E: A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics. 2006, 22 (9): 1122-1129.
17. Patrikainen A, Meila M: Comparing subspace clusterings. IEEE TKDE. 2006, 18 (7): 902-916.
18. Tanay A, Sharan R, Shamir R: Discovering statistically significant biclusters in gene expression data. Bioinformatics. 2002, 18: 136-144.
19. Madeira S, Teixeira MNPC, Sá-Correia I, Oliveira A: Identification of regulatory modules in time series gene expression data using a linear time biclustering algorithm. IEEE/ACM Trans Comput Biol Bioinform. 2010, 1: 153-165.
20. Berriz GF, King OD, Bryant B, Sander C, Roth FP: Characterizing gene sets with FuncAssociate. Bioinformatics. 2003, 19: 2502-2504.
21. Young SS: Resampling-based Multiple Testing: Examples and Methods for p-value Adjustment. 1993, Hoboken, NJ, USA: John Wiley & Sons.
22. Serin A, Vingron M: DeBi: discovering differentially expressed biclusters using a frequent itemset approach. Algorithm Mol Biol. 2011, 6: 1-12.
23. Martinez R, Pasquier C, Pasquier N: GenMiner: mining informative association rules from genomic data. BIBM. 2007, Washington, DC: IEEE Computer Society, 15-22.
24. Pandey G, Atluri G, Steinbach M, Myers CL, Kumar V: An association analysis approach to biclustering. KDD. 2009, New York: ACM, 677-686.
25. Okada Y, Okubo K, Horton P, Fujibuchi W: Exhaustive search method of gene expression modules and its application to human tissue data. IAENG IJ Comput Sci. 2007, 34: 119-126.
26. Atluri G, Bellay J, Pandey G, Myers C, Kumar V: Discovering coherent value bicliques in genetic interaction data. Proc. of 9th IW on Data Mining in Bioinformatics (BIOKDD), KDD. 2000, Washington, DC, USA: ACM digital library.
27. Okada Y, Fujibuchi W, Horton P: A biclustering method for gene expression module discovery using closed itemset enumeration algorithm. IPSJ Trans Bioinformatics. 2007, 48 (SIG5): 39-48.
28. Bellay J, Atluri G, Sing TL, Toufighi K, Costanzo M, Ribeiro PS, Pandey G, Baller J, VanderSluis B, Michaut M, Han S, Kim P, Brown GW, Andrews BJ, Boone C, Kumar V, Myers CL: Putting genetic interactions in context through a global modular decomposition. Genome Res. 2011, 21 (8): 1375-1387.
29. Han J, Cheng H, Xin D, Yan X: Frequent pattern mining: current status and future directions. Data Min Knowl Discov. 2007, 15: 55-86.
30. Huang Y, Xiong H, Wu W, Sung SY: Mining quantitative maximal hyperclique patterns: a summary of results. PAKDD. 2006, Berlin, Heidelberg: Springer-Verlag, 552-556.
31. Steinbach M, Tan PN, Xiong H, Kumar V: Generalizing the notion of support. KDD. 2004, New York: ACM, 689-694.
32. Mabroukeh NR, Ezeife CI: A taxonomy of sequential pattern mining algorithms. ACM Comput Surv. 2010, 43: 3:1-3:41.
33. Zaki MJ: SPADE: an efficient algorithm for mining frequent sequences. Mach Learn. 2001, 42 (1–2): 31-60.
34. Pei J, Han J, Mortazavi-Asl B, Wang J, Pinto H, Chen Q, Dayal U, Hsu MC: Mining sequential patterns by pattern-growth: the PrefixSpan approach. IEEE Trans Knowl Data Eng. 2004, 16 (11): 1424-1440.
35. Yan X, Han J, Afshar R: CloSpan: mining closed sequential patterns in large datasets. Proc. of SIAM IC on Data Mining (SDM). 2003, San Francisco, CA, USA: SIAM, 166-177.
36. Wang J, Han J: BIDE: efficient mining of frequent closed sequences. IEEE Computer Society. 2004, Washington, 79-79.
37. Henriques R, Antunes C, Madeira SC: Methods for the efficient discovery of large item-indexable sequential patterns. Lect Notes Artif Intell. 2014, 8399: 94-108.
38. Henriques R, Madeira SC, Antunes C: F2g: efficient discovery of full-patterns. ECML/PKDD IW on New Frontiers in Mining Complex Patterns. 2013, Prague, Czech Republic: Springer-Verlag.
39. Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, Altman RB: Missing value estimation methods for DNA microarrays. Bioinformatics. 2001, 17 (6): 520-525.
40. Donders AR, van der Heijden GJ, Stijnen T, Moons KG: Review: a gentle introduction to imputation of missing values. J Clin Epidemiol. 2006, 59 (10): 1087-1091.
41. Hellem T, Dysvik B, Jonassen I: LSimpute: accurate estimation of missing values in microarray data with least squares methods. Nucleic Acids Res. 2004, 32 (3): 34.
42. de Souto M, de Araujo D, Costa I, Soares R, Ludermir T, Schliep A: Comparative study on normalization procedures for cluster analysis of gene expression datasets. IEEE Int. Joint Conf. in Neural Networks. 2008, Hong Kong, China: IEEE, 2792-2798.
43. Mahfouz MA, Ismail MA: BIDENS: iterative density based biclustering algorithm with application to gene expression analysis. World Academy Sci Eng Technol. 2009, 3 (1): 331-337.
44. Calders T, Goethals B, Jaroszewicz S: Mining rank-correlated sets of numerical attributes. KDD. 2006, New York: ACM, 96-105.
45. Carmona-Saez P, Chagoyen M, Rodriguez A, Trelles O, Carazo J, Pascual-Montano A: Integrated analysis of gene expression by association rules discovery. BMC Bioinformatics. 2006, 7: 1-16.
46. Creighton C, Hanash S: Mining gene expression databases for association rules. Bioinformatics. 2003, 19: 79-86.
47. Zhu F, Yan X, Han J, Yu P, Cheng H: Mining colossal frequent patterns by core pattern fusion. ICDE. 2007, Istanbul, Turkey: IEEE, 706-715.
48. Ihmels J, Bergmann S, Barkai N: Defining transcription modules using large-scale gene expression data. Bioinformatics. 2004, 20 (13): 1993-2003.
49. Barkow S, Bleuler S, Prelić A, Zimmermann P, Zitzler E: BicAT: a biclustering analysis toolbox. Bioinformatics. 2006, 22 (10): 1282-1283.
50. Toivonen H: Sampling large databases for association rules. Proceedings of the 22th International Conference on Very Large Data Bases, VLDB '96. 1996, San Francisco: Morgan Kaufmann Publishers Inc., 134-145.
51. Fournier-Viger P, Gomariz A, Soltani A, Lam H, Gueniche T: SPMF: Open-Source Data Mining Platform. 2014, [http://www.philippefournierviger.com/spmf/].
52. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Moore T, Hudson JJ, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisenburger DD, Armitage JO, Warnke R, Levy R, Wilson W, Grever MR, Byrd JC, Botstein D, Brown PO, et al: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000, 403 (6769): 503-511.
53. Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM: Systematic determination of genetic network architecture. Nat Genet. 1999, 22 (3): 281-285.
54. Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci. 1999, 96 (12): 6745-6750.
55. Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science. 1999, 286 (5439): 531-537.
56. Martin D, Brun C, Remy E, Mouren P, Thieffry D, Jacq B: GOToolBox: functional analysis of gene datasets based on gene ontology. Genome Biol. 2004, 5 (12): R101.
57. Yang J, Wang W, Wang H, Yu P: Delta-clusters: capturing subspace correlation in a large data set. ICDE. 2002, San Jose, California: IEEE Computer Science, 517-528.
58. Califano A, Stolovitzky G, Tu Y: Analysis of gene expression microarrays for phenotype classification. Proc. IC Intelligent Systems for Molecular Biology. 2000, San Diego, CA, USA: AAAI Press, 75-85.
59. Wang H, Wang W, Yang J, Yu PS: Clustering by pattern similarity in large data sets. SIGMOD. 2002, New York: ACM, 394-405.
60. Getz G, Levine E, Domany E: Coupled two-way clustering analysis of gene microarray data. Proc Natl Acad Sci. 2000, 97 (22): 12079-12084.
61. Tang C, Zhang L, Ramanathan M, Zhang A: Interrelated two-way clustering: an unsupervised approach for gene expression data analysis. BIBE. 2001, Washington: IEEE Computer Society, 41-41.
62. Hartigan JA: Direct clustering of a data matrix. J Am Stat Assoc. 1972, 67 (337): 123-129.
63. Sheng Q, Moreau Y, Moor BD: Biclustering microarray data by Gibbs sampling. ECCB. Volume 19. 2003, Paris, France: Citeseer, 196-205.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.