Figure 4 | BMC Bioinformatics

Figure 4

From: A comparison of machine learning algorithms for chemical toxicity classification using a simulated multi-scale data model

Schematic view of the learning method employed. A large simulated data set is created from the model. From this data pool, multiple independent samples are drawn and used either for cross-validation training and validation (X-val; left-hand branch) or for independent model validation (I-val; right-hand branch). For cross-validation training, we use standard K-fold cross-validation with K = 10. The cross-validation performance is the average over the 10 partitions. The classification model ("fit") used in the right-hand, independent validation branch is trained on the entire data set drawn for the left-hand branch. For each classifier and each set of conditions, a total of 10 samples are drawn for the cross-validation process and 10 for the independent validation process. From this collection of results, we derive means and standard deviations for the balanced accuracy or Q-score.
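The two branches of the schematic can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the simulated pool (`make_pool`), the nearest-centroid classifier (`fit_centroids`, `predict`), the sample size of 200, and the pool size of 20,000 are all hypothetical stand-ins, and balanced accuracy is taken here as the mean of sensitivity and specificity. Drawing the X-val and I-val samples as disjoint halves of one random draw is likewise an assumption made to keep the two samples independent.

```python
import random
import statistics


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels 0/1."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sens = sum(1 for p in pos if p == 1) / len(pos) if pos else 0.0
    spec = sum(1 for p in neg if p == 0) / len(neg) if neg else 0.0
    return (sens + spec) / 2


def fit_centroids(X, y):
    """Hypothetical stand-in classifier: per-class feature means."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        if rows:
            centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, X):
    """Assign each row to the class with the nearest centroid."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: dist(centroids[c], x)) for x in X]


def make_pool(n, rng, n_features=5):
    """Hypothetical simulated data pool: two Gaussian classes."""
    pool = []
    for _ in range(n):
        label = rng.randint(0, 1)
        features = [rng.gauss(float(label), 1.0) for _ in range(n_features)]
        pool.append((features, label))
    return pool


def split_xy(sample):
    return [f for f, _ in sample], [lab for _, lab in sample]


def cross_validate(sample, k=10):
    """Left-hand branch (X-val): average score over k folds."""
    folds = [sample[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        X_tr, y_tr = split_xy(train)
        X_te, y_te = split_xy(folds[i])
        model = fit_centroids(X_tr, y_tr)
        scores.append(balanced_accuracy(y_te, predict(model, X_te)))
    return sum(scores) / k


def run(n_repeats=10, sample_size=200, seed=0):
    rng = random.Random(seed)
    pool = make_pool(20000, rng)
    xval, ival = [], []
    for _ in range(n_repeats):
        # Two disjoint samples from the pool: one per branch.
        drawn = rng.sample(pool, 2 * sample_size)
        xval_sample, ival_sample = drawn[:sample_size], drawn[sample_size:]
        xval.append(cross_validate(xval_sample, k=10))
        # Right-hand branch (I-val): fit on the whole X-val sample,
        # then score on the independent sample.
        X, y = split_xy(xval_sample)
        model = fit_centroids(X, y)
        X_i, y_i = split_xy(ival_sample)
        ival.append(balanced_accuracy(y_i, predict(model, X_i)))
    return {
        "xval": (statistics.mean(xval), statistics.stdev(xval)),
        "ival": (statistics.mean(ival), statistics.stdev(ival)),
    }


if __name__ == "__main__":
    for branch, (mean, sd) in run().items():
        print(f"{branch}: mean={mean:.3f} sd={sd:.3f}")
```

Comparing the X-val and I-val means from `run()` shows how closely the cross-validation estimate tracks performance on data the model never saw. In this toy setting the two should be close; a large gap would suggest an optimistic cross-validation estimate.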
