Hierarchical cluster analysis (HCA) and principal component analysis (PCA). (A) Dendrogram from HCA (Euclidean distance; average linkage) and (B) scatterplot of the first two principal components from PCA of the data obtained by applying AMDIS and MetaBox to the raw data from 10 standard mixture samples analysed by GC-MS (5 × 50 μL + 50 μL water and 5 × 100 μL aliquots). Reference datasets (Control) were obtained using the R package XCMS. Samples are labelled with the sample number (e.g. S1 = sample 1) and the algorithm applied (MB = MetaBox, Ref = reference, f# = AMDIS with match factor # = 70, 80 or 90).
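
The two analyses named in the caption can be illustrated with a minimal Python sketch, assuming a samples × features intensity matrix. This is not the authors' code: the function names `hca_average_euclidean` and `pca_scores`, the use of NumPy and SciPy, and the choice to centre the columns without scaling are all assumptions, and the matrix `X` is synthetic placeholder data, not the study's measurements.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage
from scipy.spatial.distance import pdist


def hca_average_euclidean(X):
    """Average-linkage HCA on Euclidean distances between the rows of X."""
    return linkage(pdist(X, metric="euclidean"), method="average")


def pca_scores(X, n_components=2):
    """Scores on the leading principal components via SVD.

    Columns are centred but not scaled (an assumption; the caption
    does not state which pre-treatment was used).
    """
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return scores, explained[:n_components]


# Synthetic placeholder data (NOT the study's data): 10 samples x 8 features,
# five samples at a lower level and five at roughly twice that level.
rng = np.random.default_rng(0)
low = rng.normal(loc=1.0, scale=0.05, size=(5, 8))
high = rng.normal(loc=2.0, scale=0.05, size=(5, 8))
X = np.vstack([low, high])
labels = [f"S{i + 1}" for i in range(10)]

Z = hca_average_euclidean(X)                      # linkage matrix for panel (A)
tree = dendrogram(Z, labels=labels, no_plot=True)  # leaf order without plotting
scores, explained = pca_scores(X)                  # coordinates for panel (B)
```

Calling `dendrogram(Z, labels=labels)` without `no_plot=True` draws the tree when matplotlib is installed, and plotting `scores[:, 0]` against `scores[:, 1]` gives the corresponding score plot.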