
Table 1. Description of the datasets: size of the dataset (n), number of variables (p), number of minority class samples (n_min), and number of majority class samples (n_maj)

From: Joint use of over- and under-sampling techniques and cross-validation for the development and assessment of prediction models

| Name | n | p | n_min | n_maj | n_min (%) | Name minority |
|---|---|---|---|---|---|---|
| Indian | 768 | 8 | 268 | 500 | 34.9 | Positive |
| Parkinson | 195 | 22 | 48 | 147 | 24.6 | Healthy |
| Hepatitis | 155 | 19 | 32 | 123 | 20.6 | Dead |
| Abalone | 4,177 | 8 | 1,307 | 2,870 | 31.3 | Female |
| Letter | 17,307 | 16 | 689 | 16,618 | 3.4 | A |
| Lung | 32 | 56 | 9 | 23 | 28.1 | 1 |
| Tae | 151 | 5 | 49 | 102 | 32.4 | Low |
| Breast | 106 | 9 | 22 | 84 | 20.8 | Adi |
| Sonar | 208 | 60 | 97 | 111 | 46.6 | Rock |
| Ozone | 2,536 | 72 | 73 | 2,463 | 2.9 | Ozone day |
| Sotiriou:er | 99 | 7,650 | 34 | 65 | 34.3 | ER- |
| Sotiriou:grade | 99 | 7,650 | 45 | 54 | 45.5 | Grade 3 |
| Ivshina:er | 245 | 22,283 | 34 | 211 | 13.9 | ER- |
| Ivshina:grade | 245 | 22,283 | 55 | 234 | 22.4 | Grade 3 |
| Wang:er | 286 | 22,283 | 77 | 209 | 26.9 | ER- |
| Wang:relapse | 286 | 22,283 | 107 | 179 | 37.4 | Relapse |
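
The n min (%) column appears to be the minority class count expressed as a percentage of n, rounded to one decimal place. A minimal sketch of that arithmetic follows, assuming this reading of the column; the three rows are copied from Table 1, and the helper name `minority_percent` is illustrative, not taken from the article.

```python
# Rows copied from Table 1: (name, n, n_min, n_maj).
rows = [
    ("Indian", 768, 268, 500),
    ("Parkinson", 195, 48, 147),
    ("Ozone", 2536, 73, 2463),
]


def minority_percent(n_min, n):
    """Minority class share of the dataset, in percent, to one decimal.

    Assumption: the table's percentage column is 100 * n_min / n.
    """
    return round(100.0 * n_min / n, 1)


# Map each dataset name to its computed minority percentage.
percents = {name: minority_percent(n_min, n) for name, n, n_min, _ in rows}

for name, value in percents.items():
    print(f"{name}: {value}%")
```

For these three rows the computed values agree with the percentages listed in the table.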