From: Impact of missing data imputation methods on gene expression clustering and classification
Dataset | Tissue | No. classes | Size of classes | No. samples | No. genes (original data) | % MV (original data) | % Genes with MV (original data) | No. genes (MV filtering) | % MV (MV filtering) | % Genes with MV (MV filtering) |
---|---|---|---|---|---|---|---|---|---|---|
alizadeh-2000-v1 | Blood | 2 | 21, 21 | 42 | 4022 | 3.25 | 49.30 | 3678 | 2.15 | 44.56 |
alizadeh-2000-v2 | Blood | 3 | 42, 9, 11 | 62 | 4022 | 4.59 | 66.93 | 3369 | 2.75 | 60.52 |
alizadeh-2000-v3 | Blood | 4 | 21, 21, 9, 11 | 62 | 4022 | 4.59 | 66.93 | 3369 | 2.75 | 60.52 |
bredel-2005 | Brain | 3 | 31, 14, 5 | 50 | 41472 | 7.57 | 43.06 | 19200 | 3.25 | 30.56 |
chen-2002 | Liver | 2 | 104, 75 | 179 | 24192 | 6.04 | 88.46 | 22336 | 2.18 | 85.46 |
garber-2001 | Lung | 4 | 17, 40, 4, 5 | 66 | 24192 | 3.87 | 67.81 | 36663 | 2.23 | 65.14 |
lapointe-2004-v1 | Prostate | 3 | 11, 39, 19 | 69 | 42640 | 4.56 | 73.57 | 35265 | 2.10 | 69.26 |
lapointe-2004-v2 | Prostate | 4 | 11, 39, 19, 41 | 110 | 42640 | 4.93 | 67.16 | 36663 | 2.23 | 60.29 |
liang-2005 | Brain | 3 | 28, 6, 3 | 37 | 42640 | 4.56 | 73.57 | 22923 | 0.82 | 23.16 |
risinger-2003 | Endometrium | 4 | 13, 3, 19, 7 | 42 | 24192 | 7.97 | 74.33 | 8366 | 0.76 | 20.76 |
tomlins-2006 | Prostate | 5 | 27, 20, 32, 13, 12 | 104 | 8872 | 4.46 | 89.34 | 9936 | 3.27 | 80.94 |
tomlins-2006-v2 | Prostate | 4 | 27, 20, 32, 13 | 92 | 20001 | 4.04 | 84.23 | 10048 | 3.34 | 79.72 |
Mean | | | | | 23575 | 5.04 | 70.39 | 17651 | 2.32 | 56.74 |
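
As a reading aid for the two missing-value (MV) columns, the following minimal sketch shows one way such percentages could be computed from an expression matrix. It assumes, and the table itself does not state this, that "% MV" is the share of missing entries in the whole genes-by-samples matrix and that "% Genes with MV" is the share of genes (rows) with at least one missing entry. The function name `missing_value_summary` and the toy matrix are hypothetical and are not taken from the article.

```python
import math


def missing_value_summary(matrix):
    """Return (% MV, % genes with MV) for a genes x samples matrix.

    Assumption: missing values are encoded as float('nan'), rows are genes
    and columns are samples.
    """
    n_entries = sum(len(row) for row in matrix)
    # Count every missing entry in the matrix.
    n_missing = sum(math.isnan(v) for row in matrix for v in row)
    # Count rows (genes) that contain at least one missing entry.
    n_genes_with_mv = sum(any(math.isnan(v) for v in row) for row in matrix)
    pct_mv = 100.0 * n_missing / n_entries
    pct_genes_with_mv = 100.0 * n_genes_with_mv / len(matrix)
    return pct_mv, pct_genes_with_mv


# Toy example: 4 genes x 3 samples, NaN marks a missing measurement.
nan = float("nan")
toy = [
    [1.0, nan, 0.5],
    [0.2, 0.3, 0.1],
    [nan, nan, 0.9],
    [0.7, 0.8, 0.4],
]
pct_mv, pct_genes_with_mv = missing_value_summary(toy)
print(f"% MV = {pct_mv:.2f}, % genes with MV = {pct_genes_with_mv:.2f}")
```

In the toy matrix, 3 of 12 entries are missing and 2 of 4 genes contain a missing entry, which illustrates why the gene-level percentage can be much larger than the entry-level one, as in the table.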