
Table 1 Results of the small-scale evaluation.

From: MAPPER: a search engine for the computational identification of putative transcription factor binding sites in multiple genomes

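| Factor | Sites | Method | Model | Distinct sequences identified | Target | True positives | False positives | False positive ratio |
|---|---|---|---|---|---|---|---|---|
| E2F | 27 | HMMER | M00050 | 27 | 27 | 36 | 8 | 18.20% |
| | | | T05206 | 27 | 27 | 27 | 4 | 12.90% |
| | | Match | V$E2F_02 | 27 | 27 | 36 | 7 | 16.30% |
| | | Patser | V$E2F_02 | 27 | 27 | 36 | 8 | 18.20% |
| | | LMM | V$E2F_02 | 18 | 27 | 18 | 3 | n/a |
| | | ScanACE | M00050 | 12 | 27 | 12 | 3 | n/a |
| ER | 17 | HMMER | M00959 | 16 | 16 | 22 | 7 | 24.10% |
| | | | T00258 | 17 | 16 | 16 | 2 | 11.10% |
| | | Match | V$ER_Q6_02 | 17 | 16 | 24 | 7 | 22.60% |
| | | Patser | V$ER_Q6_02 | 17 | 16 | 24 | 8 | 25.00% |
| | | LMM | V$ER_Q6 | 15 | 16 | 15 | 0 | 0.00% |
| | | ScanACE | M00959 | 8 | 16 | 11 | 1 | n/a |
| GR | 7 | HMMER | M00921 | 7 | 7 | 10 | 2 | 16.70% |
| | | | T05076 | 7 | 7 | 7 | 1 | 12.50% |
| | | Match | V$GR_Q6_01 | 7 | 7 | 9 | 3 | 25.00% |
| | | Patser | V$GR_Q6_01 | 7 | 7 | 9 | 7 | 43.70% |
| | | LMM | V$GR_Q6 | 6 | 7 | 6 | 1 | 14.30% |
| | | ScanACE | M00921 | 4 | 7 | 4 | 1 | n/a |
| HNF-1 | 18 | HMMER | M00790 | 18 | 18 | 19 | 0 | 0.00% |
| | | | T01211 | 18 | 18 | 18 | 0 | 0.00% |
| | | Match | V$HNF1_Q6 | 18 | 18 | 22 | 3 | 12.00% |
| | | Patser | V$HNF1_Q6 | 18 | 18 | 29 | 1 | 3.30% |
| | | LMM | V$HNF1_01 | 16 | 18 | 16 | 0 | 0.00% |
| | | ScanACE | M00790 | 11 | 18 | 11 | 0 | n/a |
| HNF-3 | 10 | HMMER | M00724 | 10 | 10 | 10 | 1 | 9.10% |
| | | | T01049 | 10 | 10 | 10 | 2 | 16.70% |
| | | Match | V$HNF3ALPHA_Q6 | 10 | 10 | 12 | 4 | 25.00% |
| | | Patser | V$HNF3ALPHA_Q6 | 10 | 10 | 10 | 1 | 9.10% |
| | | LMM | V$HNF3ALPHA_Q6 | 9 | 10 | 9 | 4 | 30.80% |
| | | ScanACE | M00724 | 8 | 10 | 8 | 0 | 0.00% |
| HNF-4 | 10 | HMMER | M00638 | 9 | 9 | 9 | 2 | 18.20% |
| | | | T00372 | 10 | 9 | 9 | 0 | 0.00% |
| | | Match | V$HNF4ALPHA_Q6 | 9 | 9 | 9 | 2 | 18.20% |
| | | Patser | V$HNF4ALPHA_Q6 | 10 | 9 | 9 | 2 | 18.20% |
| | | LMM | V$HNF4ALPHA_Q6 | 7 | 9 | 7 | 0 | 0.00% |
| | | ScanACE | M00638 | 3 | 9 | 3 | 0 | n/a |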

  1. The "Sites" column contains the number of input sequences containing experimentally validated binding sites. "Target" is the number of binding sites a method must retrieve to be considered successful, and "Distinct sequences identified" is the number of distinct sequences in which at least one true positive was detected. Because hits may partially overlap, the number of true positives reported may exceed the true number of sites. Not all matrices tested were available in the LMM matrix library; for those cases, the results obtained using the closest available LMM matrix are displayed in italics.
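
The table does not state how the "False positive ratio" column is derived. The percentages shown are consistent with false positives divided by all reported hits, FP / (TP + FP), but that definition is an assumption inferred from the numbers, not something the source confirms. The sketch below uses a hypothetical helper, `false_positive_ratio`, to recompute a few rows under that assumption. It does not attempt to explain the rows marked n/a.

```python
def false_positive_ratio(true_positives: int, false_positives: int) -> float:
    """Return FP / (TP + FP) as a percentage.

    Assumed definition, inferred from the table values rather than
    stated in the source.
    """
    total_hits = true_positives + false_positives
    if total_hits == 0:
        raise ValueError("no hits reported, ratio is undefined")
    return 100.0 * false_positives / total_hits


# (factor, method, model, true positives, false positives, ratio printed in the table)
rows = [
    ("E2F", "HMMER", "M00050", 36, 8, 18.2),
    ("E2F", "HMMER", "T05206", 27, 4, 12.9),
    ("ER", "Patser", "V$ER_Q6_02", 24, 8, 25.0),
    ("HNF-1", "Patser", "V$HNF1_Q6", 29, 1, 3.3),
]

for factor, method, model, tp, fp, reported in rows:
    computed = false_positive_ratio(tp, fp)
    print(f"{factor} {method} {model}: computed {computed:.1f}%, table {reported:.1f}%")
```

Under this assumed definition, the ratio describes the share of reported hits that are false, so a method that reports few hits overall can show a low ratio while still missing part of the target.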