Fig. 3 | BMC Bioinformatics

Fig. 3

From: Fangorn Forest (F2): a machine learning approach to classify genes and genera in the family Geminiviridae

Schematic representation of the VM Algorithm. Initially, the user submits a putative genomic sequence (a). The algorithm then scans the full-length sequence to identify all initiation codons [ATG (5′ → 3′) and CAT (3′ → 5′)], which are highlighted in blue boxes and labeled with odd numbers, and all stop codons [TAA, TAG, TGA (5′ → 3′) and TTA, CTA, TCA (3′ → 5′)], which are denoted in red and labeled with even numbers. The initiation and stop codons are clustered separately and organized according to their numbering scheme (b, e, c). Each initiation codon is tested against all stop codons to verify whether the pair can form a full-length ORF (d). All possible GT and AG splice sites within the ORF are located (highlighted in green). Filters are then applied to evaluate the consistency of candidate ORFs and to confirm that they are not truncated (e).
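
The scan-and-pair steps described in the caption can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`find_motifs`, `candidate_orfs`, `splice_sites`) are hypothetical, and the truncation filter used here (an ORF is kept only if the pair is in frame and no other in-frame stop codon lies between start and stop) is an assumption standing in for the filters of panel (e), which the caption does not specify. The reverse strand is handled by reverse-complementing the input, which is equivalent to searching for CAT and TTA/CTA/TCA on the submitted strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motifs(seq, motifs):
    """0-based positions of every occurrence (overlaps included) of any motif.

    All motifs in the set are assumed to have the same length.
    """
    k = len(next(iter(motifs)))
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in motifs]


def candidate_orfs(seq):
    """Test each ATG against every stop codon on the given strand.

    Returns (start, end) pairs with end exclusive, so seq[start:end]
    begins with ATG and ends with the stop codon.
    """
    seq = seq.upper()
    starts = find_motifs(seq, {"ATG"})
    stops = find_motifs(seq, STOP_CODONS)
    orfs = []
    for s in starts:
        for e in stops:
            # The stop must be downstream of the start and in the same frame.
            if e <= s or (e - s) % 3 != 0:
                continue
            # Assumed filter: reject pairs interrupted by an earlier in-frame stop.
            if any(s < x < e and (x - s) % 3 == 0 for x in stops):
                continue
            orfs.append((s, e + 3))
    return orfs


def splice_sites(seq, orf):
    """Locate all GT (donor) and AG (acceptor) dinucleotides inside one ORF."""
    s, e = orf
    sub = seq[s:e].upper()
    donors = [s + i for i in find_motifs(sub, {"GT"})]
    acceptors = [s + i for i in find_motifs(sub, {"AG"})]
    return donors, acceptors


if __name__ == "__main__":
    toy = "ATGGTAAAGCCCTAA"  # made-up sequence, not taken from the paper
    for orf in candidate_orfs(toy):
        print("forward ORF", orf, splice_sites(toy, orf))
    rc = reverse_complement(toy)
    for orf in candidate_orfs(rc):
        print("reverse-strand ORF (coordinates on the reverse complement)", orf)
```

Pairing every start with every stop mirrors the exhaustive test in panel (d); a production tool would more likely walk each reading frame once, but the nested loop keeps the correspondence with the figure easy to follow.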
