Fig. 2
From: Taxonomy-aware feature engineering for microbiome classification

The HFE algorithm. Note that OTUs may be associated with higher taxonomic ranks (e.g. OTU 2) due to incomplete taxonomic classification; we refer to these as leaves in incomplete paths. The feature space first grows from $\mathbb{R}^{m}$ to $\mathbb{R}^{m + m'}$, where $m'$ is the number of internal nodes in $T$ (phase 1). Subsequently, the feature space is reduced by the number of sufficiently correlated child nodes ($s_1$, phase 2) and of relatively uninformative features ($s_2$ and $s_3$, phases 3 and 4, respectively), yielding the final feature space $\mathbb{R}^{m + m' - s_1 - s_2 - s_3}$. The $n$ samples represent the training dataset.
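The dimension bookkeeping in the caption can be illustrated with a small sketch of phases 1 and 2 only. Everything concrete here is an assumption made for illustration and is not taken from the caption: the toy taxonomy and abundance values, the helper names `grow_features` and `prune_correlated_children`, the rule that an internal node's value is the sum of its children's values, the use of Pearson correlation between child and parent, and the threshold of 0.7. Phases 3 and 4 (removal of relatively uninformative features) are omitted.

```python
import numpy as np

# Hypothetical toy taxonomy T, given as child -> parent.
# "otu2" hangs directly below the family: a leaf in an incomplete path.
parent = {
    "otu1": "genusA",
    "otu2": "familyX",
    "otu3": "genusA",
    "genusA": "familyX",
}
leaves = ["otu1", "otu2", "otu3"]          # m = 3 OTU features

# n = 5 training samples (rows) by m = 3 OTUs (columns); made-up abundances.
X = np.array([
    [10, 1, 0],
    [ 8, 3, 1],
    [ 6, 2, 0],
    [ 4, 5, 1],
    [ 2, 4, 0],
])


def grow_features(X, leaves, parent):
    """Phase 1 sketch: add one feature per internal node of the taxonomy.

    Assumption: an internal node's value is the sum of its children's values.
    """
    cols = {name: X[:, j].astype(float) for j, name in enumerate(leaves)}
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    def value(node):
        # Compute (and cache) the column of an internal node recursively.
        if node not in cols:
            cols[node] = sum(value(c) for c in children[node])
        return cols[node]

    for node in children:
        value(node)
    return cols


def prune_correlated_children(cols, parent, threshold=0.7):
    """Phase 2 sketch: drop a child whose Pearson correlation with its
    parent exceeds `threshold` (threshold value is an assumption)."""
    kept = dict(cols)
    removed = []
    for child, par in parent.items():
        r = np.corrcoef(cols[child], cols[par])[0, 1]
        if r > threshold:
            removed.append(child)
            del kept[child]
    return kept, removed


cols = grow_features(X, leaves, parent)
kept, removed = prune_correlated_children(cols, parent)

m = len(leaves)
m_prime = len(cols) - m                    # number of internal nodes in T
s1 = len(removed)                          # correlated children dropped

# Dimension check matching the caption, before phases 3 and 4.
assert len(kept) == m + m_prime - s1
```

Only the count of features matters for the caption's formula; which nodes are dropped depends entirely on the assumed data and threshold.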