Figure 2 | BMC Bioinformatics

Figure 2

From: SQUAT: A web tool to mine human, murine and avian SAGE data

A schematic view of the pipeline that links RefSeq transcripts to their promoter sequences for the three species. For human and mouse, data are available through DBTSS (DataBase of Transcriptional Start Sites; [39]), which provides the exact TSS (Transcriptional Start Site) position of each RefSeq transcript on a genome assembly and, when they exist, alternative TSS positions for that transcript. DBTSS provides at least one TSS position for 53% of the human transcripts and 46% of the mouse transcripts. To provide TSS positions for the remaining transcripts, we used BLAT [40]; 83% of the human transcripts and 75% of the mouse transcripts were thereby assigned a TSS position. Since no data are available in DBTSS for the chicken, we first used Ensembl data [41] to link, where possible, RefSeq transcripts to Ensembl transcripts. A few RefSeq transcripts correspond to several Ensembl transcripts, which gives our database alternative TSS positions for the chicken as well. Transcripts that could not be linked to Ensembl were aligned with BLAT against the same genome assembly version used by the Ensembl release. In total, 85% of the chicken RefSeq transcripts were assigned a TSS position with this pipeline, which is close to the values obtained for the other two species.
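
The two-tier logic described in this caption (a curated TSS source first, BLAT alignment as a fallback) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `assign_tss` and `coverage`, the dictionary-based inputs, and the transcript identifiers and coordinates are all hypothetical, chosen only to show how a per-species coverage percentage could be computed.

```python
from typing import Dict, List, Tuple


def assign_tss(
    transcripts: List[str],
    curated: Dict[str, List[int]],
    blat_hits: Dict[str, int],
) -> Dict[str, Tuple[str, List[int]]]:
    """Assign TSS positions to transcripts (hypothetical helper).

    `curated` stands in for DBTSS (human, mouse) or Ensembl-linked
    positions (chicken) and may hold several alternative TSS positions
    per transcript. `blat_hits` stands in for the start coordinate of a
    BLAT alignment and is used only when no curated position exists.
    """
    assigned: Dict[str, Tuple[str, List[int]]] = {}
    for tx in transcripts:
        if curated.get(tx):
            # Curated source wins and keeps all alternative positions.
            assigned[tx] = ("curated", list(curated[tx]))
        elif tx in blat_hits:
            # Fallback: a single position derived from the alignment.
            assigned[tx] = ("blat", [blat_hits[tx]])
        # Transcripts found in neither source stay unassigned.
    return assigned


def coverage(transcripts: List[str], assigned: Dict[str, Tuple[str, List[int]]]) -> float:
    """Fraction of transcripts that received at least one TSS position."""
    return len(assigned) / len(transcripts) if transcripts else 0.0


# Toy data: identifiers and coordinates are made up for illustration.
transcripts = ["NM_A", "NM_B", "NM_C", "NM_D"]
curated = {"NM_A": [100, 250]}
blat_hits = {"NM_B": 500, "NM_C": 900}

assigned = assign_tss(transcripts, curated, blat_hits)
print(coverage(transcripts, assigned))  # 0.75
```

In this toy run, one transcript is covered by the curated source, two by the fallback, and one remains without a TSS position, so three of the four transcripts are covered.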
