Figure 2

Flowchart of the steps in defining criteria to find novel uORFs that share characteristics with known functional uORFs. Solid arrows denote partition of a gene set into subsets; dotted arrows denote that a gene set or an algorithm is influenced by or operates on something. Letters within brackets identify the different subsets referred to in the text. Set A was the initial training set; set A + B was the training set for the refined rule set.