Fig. 1 | BMC Bioinformatics

Fig. 1

From: LANDMark: an ensemble approach to the supervised selection of biomarkers in high-throughput sequencing data


A geometric overview of how LANDMark trees can partition samples. Oblique (straight line) and non-linear (curved line) splits are created by linear and neural network models, respectively. Unlike the axis-aligned splits used in Random Forests, LANDMark nodes consider multiple features. This can allow each model to take advantage of the additional information and use it to learn more appropriate decision rules. Since multiple models are considered at each node, only those which partition samples into smaller, purer regions are selected. Random linear oracles (left) can be used to add additional randomness to LANDMark. This approach selects two samples at random without replacement and calculates the midpoint between them. This midpoint is then used to find the hypersurface orthogonal to the line joining the two samples. Samples are then partitioned according to the side of the hypersurface on which they are found. Following this, different randomized subsets of features and/or a bootstrapped sample of the data are used to train different supervised models in each node (middle). This process is repeated until a stopping criterion is met (right). Many trees are constructed in this way and their decisions are combined to produce the final prediction.
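
The random linear oracle step described in the caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the LANDMark implementation: the function name `random_linear_oracle`, its return values, and the synthetic data are hypothetical and chosen only for this example. It assumes the hypersurface is the hyperplane through the midpoint whose normal is the difference between the two selected samples.

```python
import numpy as np


def random_linear_oracle(X, rng):
    """Split the rows of X with a hyperplane defined by two random samples.

    Hypothetical helper for illustration only; not part of LANDMark's API.
    """
    # Select two distinct samples at random, without replacement.
    i, j = rng.choice(X.shape[0], size=2, replace=False)
    a, b = X[i], X[j]

    # The hyperplane passes through the midpoint of the two samples ...
    midpoint = (a + b) / 2.0
    # ... and is orthogonal to the line joining them.
    normal = a - b

    # The sign of the projection onto the normal gives each sample's side.
    # If the two selected samples are identical, the normal is zero and
    # every sample falls on the same side (no useful split).
    side = (X - midpoint) @ normal > 0
    return side, midpoint, normal


# Synthetic data: 20 samples with 5 features (assumed for the example).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))

side, midpoint, normal = random_linear_oracle(X, rng)
left, right = X[~side], X[side]
```

Each of the two resulting subsets would then be passed to the supervised models trained at the child nodes. Because the two selected samples always land on opposite sides of the hyperplane, neither subset is empty as long as the samples differ.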
