Table 1 A mixture model with a reference-based algorithm for feature extraction/component selection

From: A mixture model with a reference-based automatic selection of components for disease classification from protein and/or gene expression levels

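Inputs. $\{\mathbf{x}_n \in \mathbb{R}^K,\ y_n \in \{1, -1\}\}_{n=1}^{N}$: samples and sample labels, where $K$ is the number of feature points (m/z ratios or genes).

   $\mathbf{x}_{\mathrm{control}} \in \mathbb{R}^K$ and $\mathbf{x}_{\mathrm{disease}} \in \mathbb{R}^K$, representing the control and disease (case) groups of samples.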

Nested two-fold cross-validation. Parameters: single component point (SCP) selection threshold in radian equivalents of $\Delta\theta \in \{1^\circ, 3^\circ, 5^\circ\}$; regularization constant $\lambda \in \{10^{-2}\lambda_{\max}, 10^{-4}\lambda_{\max}, 10^{-6}\lambda_{\max}\}$; number of components $M \in \{2, 3, 4, 5\}$; parameters of the selected classifier.

   Component selection from mixture samples.

1. For each $\mathbf{x}_n \in \mathbb{R}^K$, $n = 1, \dots, N$, form the linear mixture models (LMMs) (2a) and (2b).

2. For LMMs (2a)/(2b), select a set of single component points (SCPs) for a given $\Delta\theta$.

3. On the sets of SCPs, use hierarchical clustering (other clustering methods can also be used) to estimate the mixing matrices $\mathbf{A}_{\mathrm{control}}$ and $\mathbf{A}_{\mathrm{disease}}$ for a given $M$.

4. Estimate the source matrices $\mathbf{S}_{\mathrm{control}}$ and $\mathbf{S}_{\mathrm{disease}}$ by solving (3a) and (3b), respectively, for a given regularization parameter $\lambda$.

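5. Use the minimal and maximal mixing angles estimated from the mixing matrices $\mathbf{A}_{\mathrm{control}}$ and $\mathbf{A}_{\mathrm{disease}}$ to select, following the logic illustrated in Fig. 2a and Fig. 2b, the disease- and control-specific components: $\mathbf{s}^{\mathrm{disease}}_{\mathrm{control\ ref.};n}$, $\mathbf{s}^{\mathrm{control}}_{\mathrm{control\ ref.};n}$, $\mathbf{s}^{\mathrm{control}}_{\mathrm{disease\ ref.};n}$ and $\mathbf{s}^{\mathrm{disease}}_{\mathrm{disease\ ref.};n}$.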

   End of component selection.

End of nested two-fold cross-validation.
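
The cross-validation parameter grid and steps 3 and 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each LMM is assumed to have two rows (a reference sample and a test sample), so that every SCP defines one mixing angle; the threshold values are taken as 1, 3 and 5 degrees, which is an assumed reading of the set given in the table; a simple centroid-based agglomerative merge stands in for hierarchical clustering; and all function names are hypothetical. SCP detection (step 2) and the regularized problems (3a)/(3b) of step 4 are omitted because their definitions are not given in the table.

```python
import math
from itertools import product

# Hyperparameter grid from the table. The degree values are an assumed
# reading of the threshold set; lambda is a fraction of lambda_max.
DELTA_THETA_DEG = (1.0, 3.0, 5.0)
LAMBDA_FRACTIONS = (1e-2, 1e-4, 1e-6)
NUM_COMPONENTS = (2, 3, 4, 5)


def parameter_grid():
    """Yield every (delta_theta in radians, lambda fraction, M) combination."""
    for deg, frac, m in product(DELTA_THETA_DEG, LAMBDA_FRACTIONS, NUM_COMPONENTS):
        yield math.radians(deg), frac, m


def cluster_angles(angles, m):
    """Agglomerative clustering of 1-D angles into at most m groups.

    Starts with one cluster per angle and repeatedly merges the two
    neighbouring clusters whose centroids are closest. Returns the
    cluster centroids in ascending order.
    """
    clusters = [[a] for a in sorted(angles)]
    while len(clusters) > m:
        centroids = [sum(c) / len(c) for c in clusters]
        gaps = [centroids[i + 1] - centroids[i] for i in range(len(centroids) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return [sum(c) / len(c) for c in clusters]


def mixing_matrix_from_scps(ref_values, sample_values, m):
    """Estimate a 2 x m mixing matrix from the SCPs of a two-row mixture.

    Assumption: row 0 is the reference sample, row 1 the test sample, and
    each SCP (r, s) is a scaled copy of one mixing-matrix column.
    """
    angles = [math.atan2(s, r) for r, s in zip(ref_values, sample_values)]
    thetas = cluster_angles(angles, m)
    matrix = [[math.cos(t) for t in thetas], [math.sin(t) for t in thetas]]
    return matrix, thetas


def extreme_components(thetas):
    """Indices of the components with minimal and maximal mixing angle."""
    return thetas.index(min(thetas)), thetas.index(max(thetas))


# Toy usage with made-up SCP values (not data from the paper).
grid = list(parameter_grid())  # 3 * 3 * 4 = 36 parameter combinations
ref = [1.0, 2.0, 1.0, 0.1, 0.2]
sam = [0.1, 0.2, 1.0, 1.0, 2.0]
A, thetas = mixing_matrix_from_scps(ref, sam, 3)
i_min, i_max = extreme_components(thetas)
```

Which of the two extreme components is labelled disease-specific or control-specific depends on the reference sample used and on the logic of Fig. 2a and Fig. 2b, which this sketch does not reproduce.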