Structure of the Synteny-based Function Prediction System (SynFPS). The dotted line represents the system boundary; the system inputs and outputs lie outside it. A set of gene functions (A), specified as regular expressions, is matched against the genome database (B) by the text processing unit (D), and the resulting matches may then be refined (C). A clustering system (E), based on the synteny scores of the matching genes, brings together genomes that show conservation of gene order and position (G). This information is used to generate sets of positive and negative data (genes) for training the classification system (F), which produces the function prediction results (H).
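The matching step (A, B, D) can be illustrated with a minimal sketch. It is not the SynFPS implementation: the toy database layout, the annotation strings, and the names `GENOME_DB` and `match_functions` are all assumptions made for illustration. The only idea taken from the caption is that gene functions are given as regular expressions and matched against genome annotations.

```python
import re
from collections import defaultdict

# Hypothetical toy "genome database" (B): one record per gene,
# as (genome_id, gene_id, position, annotation). The layout is assumed.
GENOME_DB = [
    ("genomeA", "a1", 1, "ABC transporter ATP-binding protein"),
    ("genomeA", "a2", 2, "ABC transporter permease"),
    ("genomeA", "a3", 9, "hypothetical protein"),
    ("genomeB", "b1", 4, "abc transporter, ATP-binding subunit"),
    ("genomeB", "b2", 5, "DNA polymerase III"),
]


def match_functions(patterns, database):
    """Return {genome_id: [(gene_id, position), ...]} for every gene whose
    annotation matches at least one of the regular expressions (A)."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    hits = defaultdict(list)
    for genome_id, gene_id, position, annotation in database:
        if any(rx.search(annotation) for rx in compiled):
            hits[genome_id].append((gene_id, position))
    return dict(hits)


# Gene functions (A) expressed as regular expressions.
matches = match_functions(
    [r"ABC transporter.*ATP-binding", r"permease"], GENOME_DB
)
```

The per-genome gene positions collected here are the kind of input a synteny-based clustering step (E) would need. How SynFPS computes its synteny scores is not described in the caption, so that step is not sketched.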