Table 5 Effect of parameter set indexing strategy on PFCWLLKR performance using non-TIS-containing data

From: MetWAMer: eukaryotic translation initiation site prediction

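| k | Indexing strategy | Cluster fit measure | TN | FP | Sn |
|---|---|---|---|---|---|
| 3 | modulating | edit | 5,074 | 11,047 | 0.3147 |
| | | PWM | 5,134 | 10,987 | 0.3185 |
| | | WAM | 5,069 | 11,052 | 0.3144 |
| | static | edit | 6,170 | 9,951 | 0.3827 |
| | | PWM | 6,279 | 9,842 | 0.3895 |
| | | WAM | 6,119 | 10,002 | 0.3796 |
| 5 | modulating | edit | 4,537 | 11,584 | 0.2814 |
| | | PWM | 4,484 | 11,637 | 0.2781 |
| | | WAM | 4,262 | 11,859 | 0.2644 |
| | static | edit | 5,993 | 10,128 | 0.3718 |
| | | PWM | 6,065 | 10,056 | 0.3762 |
| | | WAM | 5,679 | 10,442 | 0.3523 |
| 10 | modulating | edit | 4,190 | 11,931 | 0.2599 |
| | | PWM | 3,708 | 12,413 | 0.2300 |
| | | WAM | 3,533 | 12,588 | 0.2192 |
| | static | edit | 6,345 | 9,776 | 0.3936 |
| | | PWM | 5,537 | 10,584 | 0.3435 |
| | | WAM | 5,199 | 10,922 | 0.3225 |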

  1. 16,121 non-TIS-containing instances were used in five-fold cross-validation experiments, in which parameter sets were selected for putative TIS evaluation according to best cluster fit established by either the Hamming distance relative to cached medoids (edit), position weight matrix scores (PWM), or weight array matrix scores (WAM). Parameter indexing was tested under both modulating (cluster assignment for each site separately) and static (cluster assignment based on the leftmost ATG) approaches. k denotes the number of clusters considered. TN represents the number of instances for which the method (correctly) refused to predict a TIS, and FP the number for which some prediction was made, though always incorrect (see Figure 2). $S_n = \frac{TN}{TN + FP}$.
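
As a rough check on the footnote's definitions, the Python sketch below recomputes Sn from TN and FP for one row of the table and illustrates the "edit" criterion as a nearest-medoid lookup under Hamming distance. The function names (`sn`, `hamming`, `nearest_medoid`) and the toy medoid strings are hypothetical and chosen only for illustration; this is not MetWAMer's implementation, and the tie-breaking rule is an assumption.

```python
from typing import Sequence


def sn(tn: int, fp: int) -> float:
    """Sn as defined in the table footnote: TN / (TN + FP)."""
    total = tn + fp
    if total == 0:
        raise ValueError("TN + FP must be positive")
    return tn / total


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def nearest_medoid(site: str, medoids: Sequence[str]) -> int:
    """Index of the medoid closest to `site` by Hamming distance.

    Assumption: ties are resolved in favour of the lowest index.
    """
    return min(range(len(medoids)), key=lambda i: hamming(site, medoids[i]))


# Row k = 3, static, PWM from the table: TN = 6,279 and FP = 9,842.
row_sn = round(sn(6279, 9842), 4)
print(row_sn)  # 0.3895, matching the Sn column

# Toy medoids, purely illustrative (not taken from the paper).
toy_medoids = ["ACCATGG", "TTTATGA", "GCGATGC"]
print(nearest_medoid("ACGATGG", toy_medoids))  # 0 (one mismatch to the first medoid)
```

Every row of the table satisfies TN + FP = 16,121, so the same `sn` call reproduces each Sn entry from its TN and FP values.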