Comparison of ML performance for data set S1. The y-axis shows the Q-Score. The x-axis is divided into four blocks, one for each combination of two conditions: feature selection (none or t-test filter) and measurement noise level (0 or 2). Within each block, the number of input features increases from 8 (all causal for the endpoint) to 308 (the same 8 plus 300 irrelevant features). Each line shows the performance of one ML method. Error bars (± 1 SD) are shown for LDA to indicate the typical size of the variation.