Sample Abstract and "Noisy" Gene List. Underlined and bold words in the PubMed abstract are the genes that were found in the text. The answer key consists of four columns. Column 1 is the file name; column 2 holds the unique gene identifiers from the model organism database. Column 3 shows whether the gene was found automatically in the abstract (Y), not found and pruned (N), or added by hand (X). Column 4 shows the final set of genes in the answer key. This answer key shows that two genes were given by the database curators (FBgn0000592 and FBgn0002722); the first was found in the abstract, the second was not. The third gene (FBgn0026412) was added by our annotators based on the guidelines.
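The column-3 codes described above can be read as a simple filtering rule: genes marked Y or X stay in the final answer key, and genes marked N are pruned. The following is a minimal sketch of that rule, not the authors' tooling. The `KeyRow` type, the `final_genes` helper, and the file name `example_abstract` are hypothetical; only the three gene identifiers and their codes come from the caption.

```python
from typing import List, NamedTuple


class KeyRow(NamedTuple):
    """One answer-key row (hypothetical type; column 4 is derived, not stored)."""
    file_name: str  # column 1: file name (placeholder value below)
    gene_id: str    # column 2: model organism database identifier
    status: str     # column 3: "Y" found, "N" pruned, "X" added by hand


def final_genes(rows: List[KeyRow]) -> List[str]:
    """Keep genes found automatically (Y) or added by hand (X); drop pruned (N)."""
    return [row.gene_id for row in rows if row.status in ("Y", "X")]


# Rows mirroring the example in the caption; the file name is an assumption.
rows = [
    KeyRow("example_abstract", "FBgn0000592", "Y"),  # curated, found in abstract
    KeyRow("example_abstract", "FBgn0002722", "N"),  # curated, not found, pruned
    KeyRow("example_abstract", "FBgn0026412", "X"),  # added by annotators
]

print(final_genes(rows))
```

Under these assumptions the final set contains the first and third genes, matching the description of the example answer key.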