Table 1 Comparison of motifs identified by different programs for the muscle genes 1,2,3,4,5.

From: Combining comparative genomics with de novo motif discovery to identify human transcription factor DNA-binding motifs

 

| | Myf | SRF | Mef2 | Tef | NVL | Combined |
| --- | --- | --- | --- | --- | --- | --- |
| masking repeats 6 | GGGACATG 14/2/68 | TCAGCCCT 4/1/63 | N | N | ATCAGCCC 4/2/60 | 34/5/191 |
| 1/7 | AGGGGGCATG 14/1/19 | N | N | N | N | 34/1/19 |
| 2/7 | GACAGCTG 14/9/41 | ACAAGG 4/1/5 | AAATAGCCCC 7/1/4 | GACATCTGGC 4/1/14 | N | 34/12/64 |
| 3/7 | CAGCTGTT 14/10/19 | CCTTATTTGG 4/2/12 | GCTAAAAATAGC 7/6/12 | N | CATACAAGGC 4/1/2 | 34/19/45 |
| 4/7 | GACAGCTG 14/9/19 | CCCAAAATAGCC 4/1/5 | CTATAAATAC 7/6/13 | N | CCATACAAGGCC 4/1/3 | 34/17/40 |
| 2/7 W | GACAGCTG 14/6/43 | TGCCCT 4/1/15 | N | GACAGCTGAG 4/1/15 | ACAAGGCC 4/1/31 | 34/9/104 |
| 3/7 W | ACAGCTGC 14/8/21 | AGGGCA 4/1/12 | GGGCTATAAA 7/2/9 | AGGGCAGC 4/1/37 | N | 34/12/79 |
| 4/7 W | CAGCTGTT 14/9/15 | CCAAATATGG 4/2/3 | CCTAAGAATAGC 7/2/5 | N | CATACAAGGC 4/1/2 | 34/14/25 |
| CompareProspector | CTGTSA 14/1/4 | KAGCYATA 4/1/1 | GYTATW 7/5/7 | CAGCTGTS 4/1/4 | N | 34/8/16 |
| Toucan 7 | GGGrmAGG 14/1/5 | N | N | N | CCTGCT 4/2/12 | 34/3/17 |

  1. x/y/z in the table denotes: experimentally determined binding sites / overlap between experimental sites and predicted sites / sites predicted by a discovered motif.
  2. Refer to Figure 1 for the description of 't/7' and 't/7 W'.
  3. 'N' in a table cell indicates that the corresponding motif was not detected.
  4. Representation of degenerate nucleotides: M = (AC), S = (GC), V = (AGC), R = (AG), Y = (CT), H = (ACT), W = (AT), K = (GT), D = (AGT), B = (GCT), N = (AGCT).
  5. None of the motifs reported by our approach, CompareProspector or Toucan predicted the experimentally determined Sp1 binding site.
  6. 'Masking repeats' indicates that the 5000-bp upstream sequences (no gap) were used, with only the repeat regions masked.
  7. Motifs identified by Toucan were taken from their report [7].
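
The degenerate-nucleotide codes and the x/y/z cell notation described in the notes above can be made concrete with a small sketch. This is only an illustration and not the authors' software: the names `motif_to_regex`, `SiteCounts` and `parse_counts` are hypothetical, and the sketch assumes that a motif is matched on the given strand only and that each count cell holds exactly three integers separated by slashes.

```python
import re
from typing import NamedTuple

# Degenerate-nucleotide codes as listed in the table notes,
# plus the four unambiguous bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "S": "GC", "V": "AGC", "R": "AG", "Y": "CT", "H": "ACT",
    "W": "AT", "K": "GT", "D": "AGT", "B": "GCT", "N": "AGCT",
}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a consensus motif into a compiled regular expression.

    Lower-case symbols (as in 'GGGrmAGG') are treated like upper-case ones.
    """
    parts = []
    for symbol in motif.upper():
        bases = IUPAC[symbol]  # raises KeyError for an unknown symbol
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return re.compile("".join(parts))


class SiteCounts(NamedTuple):
    experimental: int  # x: experimentally determined binding sites
    overlap: int       # y: overlap between experimental and predicted sites
    predicted: int     # z: sites predicted by the discovered motif


def parse_counts(cell: str) -> SiteCounts:
    """Parse an 'x/y/z' table cell into its three integer counts."""
    x, y, z = (int(value) for value in cell.split("/"))
    return SiteCounts(x, y, z)


if __name__ == "__main__":
    pattern = motif_to_regex("CTGTSA")
    print(pattern.pattern)
    print(parse_counts("34/8/16"))
```

For example, `motif_to_regex("GYTATW")` yields a pattern in which Y and W each expand to a two-base character class, and `parse_counts("14/9/41")` separates the three counts of a table cell so they can be compared across rows.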