Figure 2 | BMC Bioinformatics

Figure 2

From: Non-coding sequence retrieval system for comparative genomic analysis of gene regulatory elements

Workflow diagram of the NCSRS. The RefSeq annotation uses Entrez Gene IDs as the database key, whereas the Ensembl annotation uses gene stable IDs. The input ID is converted into the appropriate database key if necessary: Entrez Gene IDs are used directly for the RefSeq annotation but are converted to gene stable IDs for the Ensembl annotation, and gene symbols are translated into both Entrez Gene IDs and gene stable IDs. Once the database keys are acquired, homologous genes can be identified from the available homology databases if the "pull ortholog" option is activated. Each database key is then used to access the mapping information compiled from the annotation data, and that mapping information is used to locate the relevant sequences. The sequences are extracted and copied to a new ".fa" file in FASTA format, and the annotation information about the exons is written to a ".exon" file. Thus, for each requested gene, there is one pair of files for each genome.
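
The workflow described in this caption can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the NCSRS implementation: every table, identifier, genome name, function name (`to_database_key`, `retrieve`) and the layout of the ".fa" and ".exon" files are hypothetical placeholders, and only the RefSeq-style mapping table is filled in.

```python
import os
import tempfile

# Toy lookup tables. All identifiers are made-up placeholders, not real IDs.
SYMBOL_TO_ENTREZ = {"GENE1": "1001"}
ENTREZ_TO_STABLE = {"1001": "TOYG0001"}
# Homology table: database key -> {genome name: orthologous database key}.
ORTHOLOGS = {"1001": {"toy_mouse": "2001"}}
# Mapping information: (genome, key) -> (chromosome, start, end, exon list).
MAPPING = {
    ("toy_human", "1001"): ("chr1", 2, 14, [(4, 7), (10, 12)]),
    ("toy_mouse", "2001"): ("chr1", 0, 10, [(3, 6)]),
}
GENOMES = {
    "toy_human": {"chr1": "AACCGGTTAACCGGTT"},
    "toy_mouse": {"chr1": "TTGGCCAATTGG"},
}


def to_database_key(input_id, id_type, annotation):
    """Convert a gene symbol or Entrez ID into the key used by the annotation."""
    entrez = SYMBOL_TO_ENTREZ[input_id] if id_type == "symbol" else input_id
    # RefSeq-style annotation is keyed by Entrez ID, Ensembl-style by stable ID.
    return entrez if annotation == "refseq" else ENTREZ_TO_STABLE[entrez]


def retrieve(input_id, id_type, genome, out_dir, pull_ortholog=False):
    """Write one (.fa, .exon) file pair per genome and return the paths."""
    key = to_database_key(input_id, id_type, "refseq")
    targets = {genome: key}
    if pull_ortholog:
        # Add the orthologous gene in every other genome that lists one.
        targets.update(ORTHOLOGS.get(key, {}))
    written = []
    for name, gene_key in targets.items():
        chrom, start, end, exons = MAPPING[(name, gene_key)]
        sequence = GENOMES[name][chrom][start:end]
        base = os.path.join(out_dir, f"{gene_key}_{name}")
        with open(base + ".fa", "w") as fasta:
            fasta.write(f">{gene_key} {name} {chrom}:{start}-{end}\n{sequence}\n")
        with open(base + ".exon", "w") as exon_file:
            for exon_start, exon_end in exons:
                exon_file.write(f"{exon_start}\t{exon_end}\n")
        written.append((base + ".fa", base + ".exon"))
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for fa_path, exon_path in retrieve("GENE1", "symbol", "toy_human", tmp,
                                           pull_ortholog=True):
            print(os.path.basename(fa_path), os.path.basename(exon_path))
```

With the "pull ortholog" flag set, the sketch writes one file pair for the requested genome and one for each genome with a listed ortholog, mirroring the one-pair-per-genome outcome stated in the caption.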
