
Table 6 Application of the PHEP_PMSprune(ons) on the Blanchette real dataset

From: A hybrid method for the exact planted (l, d) motif finding problem and its parallelization

| DNA region | Seq. no. | Detected motif (l,d) | Published motif | Time (s) |
|---|---|---|---|---|
| Insulin family, 5' promoter (500 bp) | 8 | CCTCAGCCCC (10,1) | CCTCAGCCCC [10, 40] | 87 (10%) |
| | | AAGACTCTAA (10,2) | AAGACTCTAA [36, 40] | |
| | | GCCATCTGCC (10,1) | GCCATCTGCC [36, 40] | |
| | | CTATAAAG (8,0) | CTATAAAG [36, GB] | |
| | | GGGAAATG (8,1) | GGGAAATG [36, 40] | |
| Metallothionein, 5'UTR+Promoter (590 bp) | 26 | TTTGCACACGC (11,3) | TTTGCACACG [36, 40] | 7.87 (1%) |
| | | TGCACAC (7,1) | TGCACACGG [36, 40] | |
| Interleukin-3, 5'UTR+Promoter (490 bp) | 6 | TTGAGTACT (9,2) | TTGAGTACT [36, 40] | |
| | | GATGAATAAT (10,1) | GATGAATAAT [36, 40] | |
| | | TCTTCAGAG (9,2) | TCTTCAGAG [36, 40] | |
| | | AGGACCAG (8,1) | AGGACCAG [36, 40] | 466 (10%) |
| | | AGGTTCCATGTCAGATAAAG, ATGGAGGTTCCATGTCAGAT, CTATGGAGGTTCCATGTCAG, GAGGTTCCATGTCAGATAAA, GGAGGTTCCATGTCAGATAA, TATGGAGGTTCCATGTCAGA, TGGAGGTTCCATGTCAGATA (all found with (20,0)) | Novel | |
| Growth-hormone, 5'UTR+promoter (380 bp) | 16 | AACTTATCCAT (11,3) | ATTATCCAT [36, 40] | 3.43 (0%) |
| | | ATAAATGTAAA (11,3) | ATAAATGTA [36, 40] | |
| | | TATAAAAAG (9,2) | TATAAAAAG [36, 40] | |
| c-fos, 5' UTR+promoter (800 bp) | 6 | CCATATTAGGAC (12,3) | CCATATTAGGACATCT [10, 41] | 350 (15%) |
| | | GAGTTGGCTGC (11,3) | GAGTTGGCTG [36] | |
| | | CACAGGATGT (10,2) | CACAGGATGT [36, 40] | |
| | | AGGACATCTGCT (12,3) | AGGACATCTG [36, 40] | |
| c-myc, 5'+promoter (100 bp) | 7 | GTTTATTC (8,1) | GTTTATTC [36] | 83.5 (42%) |
| | | CTTGCTGGG (9,2) | TTGCTGGG [36] | |
| | | TGTTTACATC (10,2) | TGTTTACATC [36, 40] | |
| | | CCCTCCCC (8,1) | CCCTCCCC [36, 40] | |
| Histone H1, 5'UTR+Promoter (650 bp) | 4 | CAATCACCAC (10,2) | CAATCACCAC [36, GB] | 47.6 (9%) |
| | | AAACAAAAGT (10,1) | AAACAAAAGT [36, GB] | |
 
  1. The first column gives the gene family and the length of the upstream sequences. The second column gives the number of sequences. The third column gives the motif detected by our tool and the respective parameters (l, d). The fourth column gives the published motifs and their references; "GB" stands for GenBank annotation. The final column gives the running time in seconds needed to run our program over the parameter range from (6,0) to (21,3), i.e., 64 invocations of our program (16 values of l times 4 values of d). The percentages in brackets are the percentage improvements in run time compared to the PMSprune method.
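
The parameter sweep described in the footnote can be illustrated with a short Python sketch. It is not the authors' tool: the function names `parameter_grid`, `hamming`, and `occurs_within` are hypothetical, and the occurrence check assumes the standard planted (l, d) definition, in which a motif of length l matches a window of a sequence if they differ in at most d positions.

```python
from itertools import product


def parameter_grid(l_min=6, l_max=21, d_min=0, d_max=3):
    """Return every (l, d) pair in the inclusive ranges (hypothetical helper)."""
    return list(product(range(l_min, l_max + 1), range(d_min, d_max + 1)))


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, sequence, d):
    """True if some window of `sequence` is within d mismatches of `motif`."""
    l = len(motif)
    return any(
        hamming(motif, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


grid = parameter_grid()
print(len(grid))           # 16 values of l times 4 values of d
print(grid[0], grid[-1])   # first and last (l, d) pair of the sweep
```

Each pair in `grid` corresponds to one invocation of the motif finder, which is where the count of 64 in the footnote comes from.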