Table 1 Comparison of motifs identified by different programs for the muscle genes (see notes 1 to 5).

From: Combining comparative genomics with de novo motif discovery to identify human transcription factor DNA-binding motifs

| | Myf | SRF | Mef2 | Tef | NVL | Combined |
| --- | --- | --- | --- | --- | --- | --- |
| masking repeats (note 6) | GGGACATG 14/2/68 | TCAGCCCT 4/1/63 | N | N | ATCAGCCC 4/2/60 | 34/5/191 |
| 1/7 | AGGGGGCATG 14/1/19 | N | N | N | N | 34/1/19 |
| 2/7 | GACAGCTG 14/9/41 | ACAAGG 4/1/5 | AAATAGCCCC 7/1/4 | GACATCTGGC 4/1/14 | N | 34/12/64 |
| 3/7 | CAGCTGTT 14/10/19 | CCTTATTTGG 4/2/12 | GCTAAAAATAGC 7/6/12 | N | CATACAAGGC 4/1/2 | 34/19/45 |
| 4/7 | GACAGCTG 14/9/19 | CCCAAAATAGCC 4/1/5 | CTATAAATAC 7/6/13 | N | CCATACAAGGCC 4/1/3 | 34/17/40 |
| 2/7 W | GACAGCTG 14/6/43 | TGCCCT 4/1/15 | N | GACAGCTGAG 4/1/15 | ACAAGGCC 4/1/31 | 34/9/104 |
| 3/7 W | ACAGCTGC 14/8/21 | AGGGCA 4/1/12 | GGGCTATAAA 7/2/9 | AGGGCAGC 4/1/37 | N | 34/12/79 |
| 4/7 W | CAGCTGTT 14/9/15 | CCAAATATGG 4/2/3 | CCTAAGAATAGC 7/2/5 | N | CATACAAGGC 4/1/2 | 34/14/25 |
| CompareProspector | CTGTSA 14/1/4 | KAGCYATA 4/1/1 | GYTATW 7/5/7 | CAGCTGTS 4/1/4 | N | 34/8/16 |
| Toucan (note 7) | GGGrmAGG 14/1/5 | N | N | N | CCTGCT 4/2/12 | 34/3/17 |
  1. x/y/z in the table denotes: experimentally determined binding sites / overlap between experimental and predicted sites / sites predicted by a discovered motif.
  2. Refer to Figure 1 for the description of 't/7' as well as 't/7 W'.
  3. 'N' in a table cell indicates that the corresponding motif was not detected.
  4. Representation of degenerate nucleotides: M = (AC), S = (GC), V = (AGC), R = (AG), Y = (CT), H = (ACT), W = (AT), K = (GT), D = (AGT), B = (GCT), N = (AGCT)
  5. None of the motifs reported by our approach, CompareProspector or Toucan predicted the experimentally determined Sp1 binding site.
  6. 'Masking repeats' indicates that the 5000-bp upstream sequences (no gap) were used, with only the repeat regions masked.
  7. Motifs identified by Toucan were taken from their report [7].
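
The degenerate-nucleotide codes in note 4 and the x/y/z cell format in note 1 can be read mechanically. The following is a minimal Python sketch, not part of the original study: the helper names `motif_to_regex` and `parse_counts` are hypothetical, and treating the lowercase letters in the Toucan motif (GGGrmAGG) as the same degenerate codes is an assumption.

```python
import re

# Degenerate codes exactly as listed in note 4, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "S": "GC", "V": "AGC", "R": "AG", "Y": "CT", "H": "ACT",
    "W": "AT", "K": "GT", "D": "AGT", "B": "GCT", "N": "AGCT",
}

def motif_to_regex(motif: str) -> str:
    """Translate a degenerate motif into a regular-expression pattern.

    Hypothetical helper. Lowercase symbols are upper-cased first, which
    assumes they carry the same meaning as the uppercase codes.
    """
    parts = []
    for symbol in motif.upper():
        bases = IUPAC[symbol]
        # A single base stays literal; several bases become a character class.
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

def parse_counts(cell: str) -> tuple[int, int, int]:
    """Split an 'x/y/z' cell into (experimental, overlap, predicted)."""
    x, y, z = (int(n) for n in cell.split("/"))
    return x, y, z

if __name__ == "__main__":
    pattern = motif_to_regex("CTGTSA")  # CompareProspector motif in the Myf column
    print(pattern)
    print(bool(re.fullmatch(pattern, "CTGTGA")))
    print(parse_counts("34/19/45"))  # Combined column of the 3/7 row
```

Compiling a motif to a character-class pattern keeps the matching logic in the standard `re` module; scanning both strands of a promoter sequence would additionally require the reverse complement, which this sketch does not cover.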