Feature selection for high-dimensional temporal data

Tsagris, Michail; Lagani, Vincenzo; Tsamardinos, Ioannis

doi:10.1186/s12859-018-2023-7

BMC Bioinformatics

Table 2 Cross-validated, TT-corrected performances of SES and LASSO-type methods on the four scenarios


Temporal-longitudinal scenario
Dataset	MSPE (SESglmm)	MSPE (glmmLasso)	Selected vars (SESglmm)	Selected vars (glmmLasso)
GDS5088	0.081 (0.026)	0.160 (0.042)	5.25 (0.85)	5.15 (8.65)
GDS4395	0.104 (0.041)	0.640 (0.568)	5.37 (0.56)	12.35 (13.61)
GDS4822	0.115 (0.484)	0.765 (0.436)	4.75 (0.85)	3.16 (5.77)
GDS3326	0.135 (0.021)	0.234 (0.139)	5.42 (0.78)	2.42 (7.45)
GDS3181	0.971 (0.484)	0.684 (0.257)	4.17 (0.87)	0.35 (2.15)
GDS4258	0.234 (0.096)	9.882 (4.518)	3.83 (0.51)	1.48 (4.06)
GDS3432	0.357 (0.017)	2.283 (1.572)	1.67 (3.51)	0.08 (0.55)
GDS3915	0.059 (0.002)	0.150 (0.055)	5.12 (0.80)	1.66 (4.62)
Av. diff.	-1.59 ^b		1.12 ^b

Temporal-distinct scenario
Dataset	MSPE (SES)	MSPE (LASSO)	Selected vars (SES)	Selected vars (LASSO)
GDS3859	0.068 (0.006)	0.019 (0.002)	3.5 (0.51)	11.81 (4.66)
GDS972	0.022 (0.000)	0.001 (0.000)	5.83 (0.92)	22.2 (9.85)
GDS947	0.056 (0.000)	0.054 (0.026)	5.92 (0.65)	12.40 (5.40)
GDS964	0.033 (0.000)	0.003 (0.000)	5.73 (0.69)	25.69 (11.86)
GDS2688	0.184 (0.006)	0.005 (0.001)	5.79 (1.06)	20.64 (10.93)
GDS2135	0.053 (0.002)	0.014 (0.003)	3.80 (0.76)	10 (5.72)
Av. diff.	0.053 ^a		-12.03 ^a

Static-distinct scenario
Dataset	PCC (SES)	PCC (LASSO)	Selected vars (SES)	Selected vars (LASSO)
GDS4319	0.873 (0.000)	0.995 (0.000)	2.1 (0.31)	8 (0.00)
GDS3924	0.729 (0.000)	0.528 (0.104)	2.75 (0.44)	53.56 (28.55)
GDS3184	0.556 (0.067)	0.578 (0.111)	3.00 (0.00)	10.62 (5.16)
GDS3145	0.953 (0.000)	0.594 (0.125)	1.5 (0.88)	0.6 (0.55)
GDS2882	0.800 (0.000)	0.750 (0.000)	1.5 (0.88)	0.25 (0.50)
GDS2851	0.722 (0.000)	0.694 (0.000)	2.25 (0.44)	0.75 (0.50)
GDS1784	0.861 (0.000)	0.694 (0.000)	1.75 (0.85)	0.5 (0.58)
GDS2456	1.000 (0.000)	0.739 (0.000)	1.2 (0.41)	0.44 (0.53)
Av. diff.	0.115 ^b		-6.52 ^b

Static-longitudinal scenario
Dataset	PCC (SES)	PCC (GLASSO)	Selected vars (SES)	Selected vars (GLASSO)
GDS4146	1.000 (0.000)	0.858 (0.142)	1.00 (0.00)	0.42 (1.38)
GDS4518	0.750 (0.000)	0.417 (0.333)	1.75 (0.44)	3.04 (2.15)
GDS4820	0.500 (0.000)	0.667 (0.167)	2.00 (0.00)	5.14 (3.19)
GDS1840	0.625 (0.000)	0.500 (0.250)	1.5 (0.51)	2.67 (2.03)
Av. diff.	0.108		-1.23

For each dataset, performances are reported as average (st.d.). Zero standard deviations are caused by numerical rounding. For the Temporal-longitudinal and Temporal-distinct scenarios, performance is reported as Mean Squared Prediction Error (MSPE; lower values indicate better performance) together with the number of selected variables, while for the other two scenarios the Percentage of Correct Classification (PCC; higher values indicate better performance) is used instead of MSPE. Bold numbers in the original table indicate better performance; average differences over all datasets are reported for each scenario. Symbols ^a and ^b denote average differences that are statistically significant at the 0.01 and 0.05 levels, respectively. In terms of predictive performance, SES is always on par with or better than the LASSO-type algorithms in all scenarios except for the Temporal-distinct scenario.
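The two metrics and the "Av. diff." rows can be illustrated with a minimal Python sketch. It rests on two assumptions that the caption does not state explicitly: the average difference is taken as the mean over datasets of (SES value minus LASSO value), and PCC is expressed as a proportion in [0, 1], as the table values suggest. The function and variable names are illustrative only and are not taken from the paper's code. With the Temporal-distinct values from the table, this sign convention reproduces the reported 0.053 and -12.03.

```python
from statistics import mean

def mspe(y_true, y_pred):
    """Mean squared prediction error (lower is better)."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))

def pcc(y_true, y_pred):
    """Proportion of correctly classified samples (higher is better)."""
    return mean(1.0 if t == p else 0.0 for t, p in zip(y_true, y_pred))

# (SES, LASSO) averages copied from the Temporal-distinct rows of Table 2.
mspe_table = {
    "GDS3859": (0.068, 0.019),
    "GDS972": (0.022, 0.001),
    "GDS947": (0.056, 0.054),
    "GDS964": (0.033, 0.003),
    "GDS2688": (0.184, 0.005),
    "GDS2135": (0.053, 0.014),
}
selected_table = {
    "GDS3859": (3.5, 11.81),
    "GDS972": (5.83, 22.2),
    "GDS947": (5.92, 12.40),
    "GDS964": (5.73, 25.69),
    "GDS2688": (5.79, 20.64),
    "GDS2135": (3.80, 10.0),
}

def average_difference(pairs):
    """Mean of (SES - LASSO) over datasets; the sign convention is an assumption."""
    return mean(ses - lasso for ses, lasso in pairs.values())

avg_mspe_diff = average_difference(mspe_table)
avg_vars_diff = average_difference(selected_table)

print(f"{avg_mspe_diff:.3f}")  # 0.053
print(f"{avg_vars_diff:.2f}")  # -12.03
```

Under this convention a positive MSPE difference means SES predicted worse on average, and a negative difference in selected variables means SES chose fewer variables, which matches the caption's remark about the Temporal-distinct scenario.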


ISSN: 1471-2105
