Sequence Analysis In Promoter Regions

A gene's promoter region is a prominent factor in determining that gene's expression networks and cycle. The significant elements that determine the promoter's effect are generally discovered through laboratory methods. This project's aim is to develop high-throughput sequence analysis methods for identifying the important promoter features and classifying promoters according to the occurrence of these features.

Saccharomyces cerevisiae, as one of the first fully sequenced organisms that also has publicly available expression data, was chosen as a case study for this work. Motifs of over-represented sequence segments were sought using a bespoke implementation of the Smith-Waterman algorithm running on the Cambridge-Cranfield high performance computer facility. The algorithm identifies all pairs of sequence segments whose similarity surpasses a given threshold. The segments found are then grouped by similarity, and the groups with many members are the over-represented motifs.
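The pairing and grouping steps described above can be illustrated with a minimal Python sketch. It is not the bespoke implementation used in this work: the scoring values (`match`, `mismatch`, `gap`), the linear gap penalty, the function names, and the use of connected components to form groups are all assumptions made for illustration. It also assumes the candidate segments have already been extracted from the promoter sequences.

```python
from itertools import combinations


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between a and b.

    Uses a linear gap penalty; the scoring values are illustrative defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def group_similar(segments, threshold):
    """Group segments whose pairwise score surpasses the threshold.

    Segments linked directly or through a chain of similar pairs end up in
    the same group (union-find over the similarity graph). Groups are
    returned largest first.
    """
    parent = list(range(len(segments)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(len(segments)), 2):
        if smith_waterman_score(segments[i], segments[j]) > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, seg in enumerate(segments):
        groups.setdefault(find(i), []).append(seg)
    return sorted(groups.values(), key=len, reverse=True)


# Toy input, invented for illustration only.
segments = ["TGACTCA", "TGACTCA", "TGACTAA", "CCCCGGG"]
groups = group_similar(segments, threshold=10)
```

In this sketch the groups with the most members play the role of the over-represented motifs. Comparing all pairs costs time quadratic in the number of segments, which is consistent with the need for a high performance computing facility at genome scale.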

The motifs identified using this approach are compared to known transcription factor binding sites from the S. cerevisiae Promoter Database. Most of the motifs produced are expected to correspond to known binding sites; others could be new discoveries. Once validated on S. cerevisiae, the method can be used with some confidence to analyse promoters from other species whose characteristics are less well understood.
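The comparison against known binding sites could be sketched as follows. This is a hedged illustration rather than the procedure used in this work: the function names, the exact-substring matching rule, the check of both strands, and the site names and sequences in `known_sites` are all hypothetical placeholders, not entries taken from the database.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def classify_motifs(motifs, known_sites):
    """Split motifs into those containing a known site and the rest.

    A motif counts as matched if it contains a known site on either
    strand. Unmatched motifs are the candidates for new discoveries.
    """
    matched, unmatched = {}, []
    for motif in motifs:
        hits = [
            name
            for name, site in known_sites.items()
            if site in motif or reverse_complement(site) in motif
        ]
        if hits:
            matched[motif] = hits
        else:
            unmatched.append(motif)
    return matched, unmatched


# Hypothetical placeholder data for illustration only.
known_sites = {"siteA": "TGACTCA"}
motifs = ["ATGACTCAT", "CTGAGTCAG", "CCCCGGG"]
matched, unmatched = classify_motifs(motifs, known_sites)
```

Exact substring matching is the simplest possible rule; a thresholded alignment score would tolerate the sequence variation typical of real binding sites.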

Author information

Corresponding author

Correspondence to Jeremy Austin.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Austin, J. Sequence Analysis In Promoter Regions. BMC Bioinformatics 6 (Suppl 3), P2 (2005).
