Fig. 8 | BMC Bioinformatics

Fig. 8

From: Spliceator: multi-species splice site prediction using convolutional neural networks

Overview of the construction of the training and test sets. (A) DNA sequences and exon maps are recovered for each G3PO+ gene. (B) The AS (All Sequences) positive subset includes the splice sites (SS) of all G3PO+ ‘Confirmed’ and ‘Unconfirmed’ sequences. The GS (Gold Standard) positive subset includes only the SS of the ‘Confirmed’ sequences. Ten negative AS subsets and ten negative GS subsets are then constructed by random sampling of the exon, intron and FP regions of the corresponding genomic sequences. (C) Four AS and four GS datasets are then constructed with different ratios of positive and negative SS (described in Table 4). (D) Finally, the training and test sets are formed by shuffling the positive and negative sequences (10 times for each AS and GS dataset).
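
The last two steps of the caption (mixing positives and negatives at a chosen ratio, then shuffling repeatedly to form training and test sets) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `build_dataset` and `shuffled_splits`, the negatives-per-positive ratio, the test fraction and the seed are all hypothetical. The actual ratios are those described in Table 4, and the caption does not state the train/test proportions.

```python
import random


def build_dataset(positives, negative_pool, neg_per_pos, rng):
    """Label all positives 1 and add a random sample of negatives labelled 0.

    neg_per_pos is a hypothetical ratio; the real ratios are in Table 4.
    """
    n_neg = min(len(negative_pool), int(len(positives) * neg_per_pos))
    negatives = rng.sample(negative_pool, n_neg)
    return [(seq, 1) for seq in positives] + [(seq, 0) for seq in negatives]


def shuffled_splits(dataset, n_repeats=10, test_fraction=0.2, seed=0):
    """Shuffle the dataset n_repeats times and cut each copy into (train, test).

    test_fraction is an assumption; it is not given in the caption.
    """
    splits = []
    for i in range(n_repeats):
        rng = random.Random(seed + i)  # one reproducible shuffle per repeat
        shuffled = dataset[:]          # copy so the input is left untouched
        rng.shuffle(shuffled)
        cut = len(shuffled) - int(len(shuffled) * test_fraction)
        splits.append((shuffled[:cut], shuffled[cut:]))
    return splits


# Toy sequences standing in for real splice-site windows.
rng = random.Random(42)
positives = [f"POS{i}" for i in range(10)]
negative_pool = [f"NEG{i}" for i in range(100)]

dataset = build_dataset(positives, negative_pool, neg_per_pos=2, rng=rng)
splits = shuffled_splits(dataset, n_repeats=10, test_fraction=0.2, seed=1)
```

Seeding each repeat separately keeps the ten shuffles reproducible while still giving ten different orderings of the same dataset.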
