| | GDFS | CFS | rpart | Predominant glycans (GDFS method) |
|---|---|---|---|---|
| Peak 1 | ✗ | ✗ | ✗ | |
| Peak 2 | ✗ | ✗ | ✗ | |
| Peak 3 | ✗ | ✗ | ✗ | |
| Peak 4 | ✗ | ✗ | ✗ | |
| Peak 5 | ✗ | ✗ | ✗ | |
| Peak 6* | ✗ | ✗ | ✗ | FA2[3]G1, FA2[6]BG1 |
| Peak 7 | ✗ | ✗ | ✗ | |
| Peak 8 | ✗ | ✗ | ✗ | |
| Peak 9 | ✗ | ✗ | ✗ | |
| Peak 10 | ✓ | ✗ | ✗ | FA2G2, FA2[6]G1S1, FA2[6]BG1S1 |
| Peak 11 | ✗ | ✗ | ✗ | |
| Peak 12 | ✗ | ✗ | ✗ | |
| Peak 13** | ✗ | ✗ | ✗ | A2BG2S1 |
| Peaks 14 - 24 | ✗ | ✗ | ✗ | |

- Features selected from the prostate cancer dataset (prostate cancer vs. BPH cases) by the proposed GDFS method (GDFS), correlation-based feature selection (CFS), and recursive partitioning (rpart). Features that were selected in 90% or more of the cross-validation models are marked with ✓. Also listed are the predominant glycan structures corresponding to each selected peak. The detailed N-glycan composition of human serum was described in Royle et al. [9], and peak 10 was also assigned in Saldova et al. [24]. *Peak 6 was the feature most commonly identified by the rpart method, although it was selected less than 90% of the time. **Peak 13 was selected more than 60% of the time by the GDFS method.
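
The marking rule in the caption (a feature is flagged when it appears in at least 90% of the cross-validation models) can be illustrated with a minimal sketch. The function names `selection_frequency` and `stable_features`, the feature labels `A`, `B`, `C`, and the toy input are all hypothetical; they are not the study's code or data, and the sketch assumes only that each cross-validation model yields a set of selected features.

```python
from collections import Counter


def selection_frequency(selected_per_model):
    """Return, for each feature, the fraction of models that selected it."""
    n_models = len(selected_per_model)
    # set() guards against a feature being listed twice within one model.
    counts = Counter(f for selected in selected_per_model for f in set(selected))
    return {feature: count / n_models for feature, count in counts.items()}


def stable_features(selected_per_model, threshold=0.9):
    """Return the features selected in at least `threshold` of the models."""
    freq = selection_frequency(selected_per_model)
    return sorted(f for f, p in freq.items() if p >= threshold)


# Hypothetical toy input: feature sets chosen in 10 cross-validation models.
toy_models = [{"A", "B"}] * 6 + [{"A"}] * 3 + [{"A", "C"}]

freq = selection_frequency(toy_models)
stable = stable_features(toy_models, threshold=0.9)
print(freq)
print(stable)
```

In this toy input, only the feature chosen by every model passes the 90% threshold. A feature chosen by 6 of 10 models would be reported separately, in the same way the caption footnotes peaks that fall below the cut-off.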