From: G-bic: generating synthetic benchmarks for biclustering
Group | Parameter | Gene expression | Recommendation systems | Text mining | Clinical data | Spatio-temporal data |
---|---|---|---|---|---|---|
Dataset properties | Number of rows | 10000 | 200000 | 30000 | 50000 | 30 |
Dataset properties | Number of columns | 100 | 30000 | 20000 | 8000 | 150 |
Dataset properties | Heterogeneous? | No | No | No | Yes | No |
Dataset properties | Properties | Background with 10% noise | Background with 95.5% missing values | Background with 99.8% missing values | Background with 99.8% missing values | Background with 50% noise and 20% errors |
Bicluster properties | Number of biclusters | 500 | 3000 | 70 | 30 | 20 |
Bicluster properties | Rows structure | U(80,400) | U(30,70) | U(1000,10000) | U(20,100) | U(2,4) |
Bicluster properties | Columns structure | U(20,40) | U(3,7) | U(600,6000) | U(5,15) | U(7,10) |
Bicluster properties | Contiguity | No | No | No | No | Yes |
Bicluster properties | Biclustering patterns | Additive and Order Preserving | Order Preserving | Constant and Order Preserving | Order Preserving | Additive and Multiplicative |
Bicluster properties | Overlapping | 10% of biclusters with additive overlap | None | None | None | None |
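
To make the row and column structure entries concrete, the following is a minimal Python sketch of how one column of the table could be encoded and used to draw bicluster index sets. It is not G-Bic's own API: `BenchmarkConfig`, `SPATIO_TEMPORAL`, and `sample_bicluster_indices` are hypothetical names introduced here for illustration. The sketch assumes that U(a,b) denotes a discrete uniform distribution inclusive of both bounds and that contiguity applies to columns; it ignores patterns, noise, missing values, and overlap.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkConfig:
    """Hypothetical container for one column of the table above."""
    n_rows: int
    n_cols: int
    n_biclusters: int
    bic_rows: tuple[int, int]  # (min, max) of U(min, max) for bicluster rows
    bic_cols: tuple[int, int]  # (min, max) of U(min, max) for bicluster columns
    contiguous_cols: bool      # assumption: contiguity refers to columns


# Values copied from the "Spatio-temporal data" column of the table.
SPATIO_TEMPORAL = BenchmarkConfig(
    n_rows=30,
    n_cols=150,
    n_biclusters=20,
    bic_rows=(2, 4),
    bic_cols=(7, 10),
    contiguous_cols=True,
)


def sample_bicluster_indices(cfg, rng):
    """Draw a (rows, cols) pair of index lists for each bicluster."""
    biclusters = []
    for _ in range(cfg.n_biclusters):
        # randint is inclusive on both ends (assumed reading of U(a, b)).
        n_r = rng.randint(*cfg.bic_rows)
        n_c = rng.randint(*cfg.bic_cols)
        rows = sorted(rng.sample(range(cfg.n_rows), n_r))
        if cfg.contiguous_cols:
            # Pick a start so the whole run of columns fits in the dataset.
            start = rng.randint(0, cfg.n_cols - n_c)
            cols = list(range(start, start + n_c))
        else:
            cols = sorted(rng.sample(range(cfg.n_cols), n_c))
        biclusters.append((rows, cols))
    return biclusters


biclusters = sample_bicluster_indices(SPATIO_TEMPORAL, random.Random(0))
```

Each returned pair only fixes where a bicluster sits in the matrix; filling those cells with an additive or multiplicative pattern and adding the background noise would be separate steps.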