Fig. 4 | BMC Bioinformatics

Fig. 4

From: AIKYATAN: mapping distal regulatory elements using convolutional learning on GPU

Average VR performance is shown for Aikyatan. Unlike RFECS, whose training set contains only peaks, we include the gray area in our training set to obtain a larger data set. By varying the threshold used to convert the raw real-valued prediction into one of the two classes, we generate a VR curve, where the X-axis is the number of samples predicted as positive and the Y-axis is the portion of those predicted positive samples that are validated, i.e., the validation rate (VR). To compare prediction performance across the ML models, we fix the same number of predictions for every model. We took this target number from RFECS, whose best validation on its original data set occurs at around 100K predictions. Because we used 70% of the original data set as the training set and 30% as the test set, and further divided the test set into 5 non-overlapping test sets, the target becomes 100K × 0.3 / 5 = 6000 predictions in each sub-sampled test set.
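The caption's VR curve and target count can be illustrated with a minimal sketch. This is not the authors' code: the function names `validation_rate_curve` and `target_predictions`, the use of NumPy, and the tie-breaking by input order are all assumptions made for illustration. Sweeping the threshold from high to low is treated here as equivalent to taking the top-k samples ranked by score.

```python
import numpy as np


def validation_rate_curve(scores, validated):
    """Return (n_predicted, vr) for every cutoff k = 1..N.

    scores    : raw real-valued predictions, one per sample
    validated : truthy where a sample is validated (hypothetical labels)

    Lowering the threshold one sample at a time is the same as taking the
    top-k samples by score, so the curve is built from a single sort.
    """
    scores = np.asarray(scores, dtype=float)
    validated = np.asarray(validated, dtype=bool)
    # Sort by descending score; a stable sort keeps input order for ties
    # (an assumption, the caption does not say how ties are handled).
    order = np.argsort(-scores, kind="stable")
    hits = np.cumsum(validated[order])           # validated among top-k
    n_predicted = np.arange(1, len(scores) + 1)  # X-axis
    vr = hits / n_predicted                      # Y-axis
    return n_predicted, vr


def target_predictions(reference_predictions, test_fraction, n_subsets):
    """Scale a reference prediction count down to one test subset."""
    return round(reference_predictions * test_fraction / n_subsets)


if __name__ == "__main__":
    # Toy data, invented purely to exercise the functions.
    n, vr = validation_rate_curve([0.9, 0.1, 0.8, 0.4], [1, 0, 0, 1])
    print(n, vr)
    # Caption's arithmetic: about 100K predictions, 30% test split, 5 subsets.
    print(target_predictions(100_000, 0.3, 5))
```

To compare models at a fixed number of predictions, one would read each model's `vr` at index `k - 1`, where `k` is the value returned by `target_predictions`.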
