Table 7 List of siRNA with bad predictions.

From: An accurate and interpretable model for siRNA efficacy prediction

| Id | Target | Target site | M.A. | P.A. | Status | ΔG loc (kcal/mol) | s.d. | Begin | End |
|---|---|---|---|---|---|---|---|---|---|
| 1 | NM_012864 | CAGGCGCAGAAUUAUCUUAGG | 1.341 | 0.765 | under | -17 | 2.86 | 63 | 83 |
| 9 |  | GUUUGCCGGAGACUGGAAAGC | 0.019 | 0.373 | over | -18.2 | 5.75 | 163 | 183 |
| 2 | NM_017346 | AUCGAGCGCUCCAACACUCGC | 0.152 | 0.684 | over | -19.9 | 0.00 | 831 | 851 |
| 3 |  | UGACGCCACCUCAGGGCACCU | 0.086 | 0.563 | over | -17.7 | 0.00 | 209 | 229 |
| 11 | NM_002559 | UGCGUGAACUACAGCUCUGUG | 0.420 | 0.743 | over | -4.8 | 0.00 | 376 | 396 |
| 15 |  | AGCUCUGUGCUCCGGACCUGU | 0.419 | 0.713 | over | -29.8 | 0.00 | 388 | 408 |
| 4 | NM_003342 | GGGAAGUCCUUAUUAUUGGCC | 0.876 | 0.435 | under | -17.9 | 0.06 | 69 | 89 |
| 5 |  | UUCCUGAGCUGGAUGGAAAGA | 1.201 | 0.799 | under | -2.4 | 0.98 | 246 | 266 |
| 7 | NM_016406 | GUGGCAAAAUAUGCCUGACGG | 0.999 | 0.616 | under | -10.4 | 0.00 | 285 | 305 |
| 22 |  | AUGCCUGACGGAUCAUUUCAA | 1.173 | 0.907 | under | -8.8 | 2.12 | 295 | 315 |
| 6 | NM_003340 | UUCUUUUAUCCAUUUGUUCAC | 0.270 | 0.672 | over | -7.2 | 1.41 | 255 | 275 |
| 8 | NM_003347 | UAUGAUAAGGGAGCCUUCAGA | 1.099 | 0.728 | under | -8.3 | 0.06 | 86 | 106 |
| 10 | NM_018426 (XM_371822) | GAUGCCACCCGACGCCCUCAC | 0.127 | 0.454 | over | -9.5 | 1.18 | 2148 | 2168 |
| 12 | NM_001009264 (XM_214061) | CCAGGGCGGAGAAGGCCGACG | 0.239 | 0.548 | over | -25.1 | 0.00 | 371 | 391 |
| 14 |  | UGAACUUUGGGUCCCUGUGAC | 0.268 | 0.568 | over | -11.1 | 0.00 | 865 | 885 |
| 16 | NM_001001481 | UGUAACAAGAAUCCAAAGAAA | 1.146 | 0.853 | under | -10.9 | 0.19 | 353 | 373 |
| 17 | NM_016021 | CAACAAAAGGAGAGGGAGCCA | 0.417 | 0.709 | over | -19.7 | 0.00 | 309 | 329 |
| 18 | NM_022005 | CCUGUGACCUCCAUCUACUCU | 0.968 | 0.682 | under | -16.5 | 0.00 | 79 | 99 |
| 19 | NM_007019 | UGUAUGAUGUCAGGACCAUUC | 0.321 | 0.601 | over | -10.1 | 1.02 | 185 | 205 |
| 20 | NM_006357 | UAAAGGAGAUAACAUUUAUGA | 0.520 | 0.795 | over | -9.6 | 0.00 | 211 | 231 |

This table lists the siRNAs for which the discrepancy between the predicted and the measured potency was particularly large. Each siRNA sequence has been arbitrarily numbered from 1 to 22. Two sequences were discarded because their target sequences have been modified in GenBank since the publication of Huesken et al. M.A. is the measured activity as given in Huesken et al. [33]. P.A. is the activity predicted by our LASSO model. The status column indicates whether the model over- or under-predicted the activity of the siRNA relative to the activity measured experimentally. The begin and end columns give the position of the siRNA guide strand within the target sequence. ΔG loc is the mean local free energy, in kcal/mol, of the motifs in which nucleotides of the target sequence are involved, computed over the 10 lowest-energy structures. s.d. is the standard deviation of each mean ΔG loc.
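As an illustration of how the status, begin and end columns relate to the other columns, the following minimal Python sketch recomputes them for four rows copied from the table. The rule used here ("over" when P.A. exceeds M.A., "under" otherwise) and the inclusive begin/end convention are assumptions inferred from the table rows and the footnote, not a description of the authors' code. The names `Row`, `prediction_status` and `site_length` are hypothetical.

```python
from typing import NamedTuple


class Row(NamedTuple):
    sirna_id: int
    target_site: str
    measured: float   # M.A. column
    predicted: float  # P.A. column
    status: str       # status column as printed in the table
    begin: int
    end: int


# Four rows copied from the table (Ids 1, 9, 2 and 4).
ROWS = [
    Row(1, "CAGGCGCAGAAUUAUCUUAGG", 1.341, 0.765, "under", 63, 83),
    Row(9, "GUUUGCCGGAGACUGGAAAGC", 0.019, 0.373, "over", 163, 183),
    Row(2, "AUCGAGCGCUCCAACACUCGC", 0.152, 0.684, "over", 831, 851),
    Row(4, "GGGAAGUCCUUAUUAUUGGCC", 0.876, 0.435, "under", 69, 89),
]


def prediction_status(measured: float, predicted: float) -> str:
    """Assumed rule: 'over' if the model predicts more activity than measured."""
    return "over" if predicted > measured else "under"


def site_length(begin: int, end: int) -> int:
    """Length of the target site, assuming begin and end are both inclusive."""
    return end - begin + 1


for row in ROWS:
    # The recomputed status should match the printed status column.
    assert prediction_status(row.measured, row.predicted) == row.status
    # The begin/end span should match the length of the listed target site.
    assert site_length(row.begin, row.end) == len(row.target_site)
```

Under these assumptions, the signed difference P.A. minus M.A. is positive for over-predicted siRNAs and negative for under-predicted ones.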