Figure 1 | BMC Bioinformatics

Figure 1

From: CATMA, a comprehensive genome-scale resource for silencing and transcript profiling of Arabidopsis genes

Overview of the GST classification and design process yielding the CATMAv3 repertoire. The design and classification process started with the creation of a MySQL database containing three types of information: the exon coordinates of the TIGR5 annotated protein-coding nuclear genes; the exon coordinates of Eugène 040917, an in-house generated and curated annotation; and the BLAST hit coordinates of the CATMAv2 GSTs against the Arabidopsis genome. For each annotation source, regions of overlapping genes were marked, and gene models that ended with the ORF stop codon were extended with an artificial 3' UTR of 150 bp. Information on the prior CATMAv2 GST amplification success or failure was also added to the database. In a second step, both GSTs and genes were classified into five different categories. The classification routine is depicted in Additional File 2 and the categories themselves are described in detail in Table 2. Only successfully amplified GSTs were taken into consideration for the gene classification. A gene classified as GE5 was considered to have a 'unique' tag, and a GST classified as GST5 was considered to be 'tagging uniquely'. The GST classification was added to the CATMA database, flagging the non-tagging GSTs without actually removing them from the repository. The gene classification was used as the basis for the third and final step, the design of new GSTs for all genes not classified as GE5. To this end, we used the SPADS 1.1.5 software on virtual gene models from which all overlapping exon regions and all exon regions not common to all of the gene's alternative splice forms were removed. When no GST can be designed in the most divergent exon regions, SPADS incorporates increasingly less divergent exon regions into its search space (producing GSTs with progressively lower specificity: high, medium or low) and at one point also allows the design of intron-spanning GSTs. At each design level, SPADS scans the gene model from the 3' end to the 5' end. Newly designed GSTs were added to the CATMA database.
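The construction of a virtual gene model described in the caption can be illustrated with a minimal Python sketch. It is not the CATMA or SPADS code: the function names, the half-open `(start, end)` coordinate convention and the example coordinates are all assumptions made for illustration. The sketch keeps only the exon regions shared by every splice form of a gene and then removes any region that overlaps exons of a neighbouring gene.

```python
from typing import List, Tuple

# Hypothetical representation: half-open [start, end) genomic intervals.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge those that overlap or touch."""
    out: List[Interval] = []
    for s, e in sorted(intervals):
        if out and s <= out[-1][1]:
            out[-1] = (out[-1][0], max(out[-1][1], e))
        else:
            out.append((s, e))
    return out


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Regions present in both interval lists."""
    a, b = merge(a), merge(b)
    i = j = 0
    out: List[Interval] = []
    while i < len(a) and j < len(b):
        s, e = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if s < e:
            out.append((s, e))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Regions of a that are not covered by b."""
    b = merge(b)
    out: List[Interval] = []
    for s, e in merge(a):
        cur = s
        for bs, be in b:
            if be <= cur or bs >= e:
                continue  # no overlap with the remaining part of (cur, e)
            if bs > cur:
                out.append((cur, bs))
            cur = max(cur, be)
        if cur < e:
            out.append((cur, e))
    return out


def virtual_gene_model(
    splice_forms: List[List[Interval]],
    other_gene_exons: List[Interval],
) -> List[Interval]:
    """Exon regions common to all splice forms, minus regions that
    overlap exons of other genes (an assumed reading of the caption)."""
    common = merge(splice_forms[0])
    for form in splice_forms[1:]:
        common = intersect(common, form)
    return subtract(common, other_gene_exons)


# Made-up coordinates: two splice forms and one overlapping neighbour exon.
forms = [[(100, 200), (300, 400)], [(150, 200), (300, 450)]]
neighbour = [(380, 500)]
model = virtual_gene_model(forms, neighbour)
```

With these made-up coordinates, `model` contains the two regions `(150, 200)` and `(300, 380)`, which would then form the search space handed to the design software.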
