From: A specialized learner for inferring structured cis-regulatory modules
Data Set | Organism | Description |
---|---|---|
Lee et al. | S. cerevisiae | 25 sets of genes with strong evidence (p-value ≤ 0.01) from the genome-wide location analysis of Lee et al. [15] that a specific pair of regulators binds to their upstream regions. This is a recreation of the data sets used by Segal et al. [2]. For each data set, we use 100 yeast promoters chosen at random as negative examples. |
Gasch et al. | S. cerevisiae | Three sets of genes associated with the environmental stress response (ESR) in yeast, described in [16]. We use promoter sequences from non-ESR yeast genes as negative examples. |
Sinha et al.-Yeast | S. cerevisiae | A set of six yeast sequences where MCM1 and MATα2 are known to bind, described in Sinha et al. [3]. For negative examples, we use nine promoter sequences that contain binding sites for either MCM1 or MATα2, but not both. |
Sinha et al.-Fly | D. melanogaster | A set of eight fly genes associated with the gap gene system, described in Sinha et al. [3]. We use 10 kb promoter sequences, with 100 promoter sequences selected at random from the fly genome as negative examples. |