
Table 1 Comparison of eight-letter vs. four-letter alphabet predictions of acceptor (3') and donor (5') sites for different datasets. Where the eight-letter alphabet improves the prediction over the conventional four-letter alphabet, the data pair is shown in bold.

From: Impact of RNA structure on the prediction of donor and acceptor splice sites

                          Markov Models:
                          Zero-order                 First-order                Second-order
Dataset   Alphabet  Site  CC       Sp       Sn       CC       Sp       Sn       CC       Sp       Sn
AraClean: 4-        3'    0.8540*  0.8923   0.9039   0.8821   0.8906   0.9504   0.9011   0.9138   0.9511
          8-        3'    0.8536*  0.8911   0.9048   0.9002   0.9043   0.9611   0.9414   0.9456   0.9744
          4-        5'    0.8832   0.8871   0.9565*  0.8917   0.8989   0.9550   0.9066   0.9019   0.9737
          8-        5'    0.8817   0.8853   0.9565*  0.9088   0.9010   0.9781   0.9422   0.9351   0.9874
AtGS:     4-        3'    0.7025   0.8338   0.8763   0.7450   0.8539   0.8976   0.7787   0.8769   0.9053
          8-        3'    0.7002   0.8359   0.8705   0.7455   0.8495   0.9041   0.7943   0.8776   0.9219
          4-        5'    0.8029   0.8846   0.9224   0.8224   0.8948   0.9311   0.8455   0.9048   0.9440
          8-        5'    0.7986   0.8787   0.9252   0.8222   0.8997   0.9250   0.8576   0.9138   0.9464
HsGS:     4-        3'    0.7848   0.8649   0.9287   0.8167   0.8831   0.9403   0.8450   0.8979   0.9524
          8-        3'    0.7843   0.8635   0.9299   0.8207   0.8888   0.9375   0.8574   0.9085   0.9530
          4-        5'    0.8117   0.8779   0.9409   0.8449   0.8988   0.9510   0.8740   0.9139   0.9639
          8-        5'    0.8070   0.8788   0.9348   0.8446   0.8935   0.9570   0.8842   0.9249   0.9619
BG570:    4-        3'    0.9002   0.9046   0.9597   0.9163   0.9276   0.9570   0.9346   0.9347   0.9761
          8-        3'    0.8981   0.8996   0.9626   0.9174   0.9137   0.9748   0.9427   0.9512   0.9696
          4-        5'    0.8845   0.9056   0.9369   0.9105   0.9153   0.9644   0.9353   0.9351   0.9780
          8-        5'    0.8841   0.9070   0.9346   0.9136   0.9218   0.9615   0.9437   0.9406   0.9838
HMR195:   4-        3'    0.8812   0.8850   0.9596   0.9054   0.9169   0.9567   0.9406   0.9432   0.9774
          8-        3'    0.8832   0.8868   0.9603   0.9120   0.9172   0.9661   0.9556   0.9525   0.9885
          4-        5'    0.8775   0.8980   0.9378   0.9204   0.9290   0.9645   0.9532   0.9526   0.9850
          8-        5'    0.8783   0.9224   0.9095   0.9275   0.9458   0.9558   0.9633   0.9618   0.9892
* Data pairs with insignificant differences (Mann-Whitney test).
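
The bold highlighting mentioned in the caption does not survive in plain text, so the pairwise comparison can be redone directly from the numbers. The following is a minimal sketch, not the authors' procedure: it copies a few second-order rows from the table and lists the metrics where the eight-letter value is numerically higher than the four-letter value. The names `SECOND_ORDER`, `improved_metrics`, and `summarize` are hypothetical, and a plain greater-than comparison is an assumption, since it ignores the significance test named in the footnote.

```python
# Second-order Markov model values copied from Table 1.
# Keys: (dataset, site); values: {alphabet: (CC, Sp, Sn)}.
# Only a subset of rows is included here for illustration.
SECOND_ORDER = {
    ("AraClean", "3'"): {"4": (0.9011, 0.9138, 0.9511), "8": (0.9414, 0.9456, 0.9744)},
    ("AraClean", "5'"): {"4": (0.9066, 0.9019, 0.9737), "8": (0.9422, 0.9351, 0.9874)},
    ("HsGS", "5'"): {"4": (0.8740, 0.9139, 0.9639), "8": (0.8842, 0.9249, 0.9619)},
    ("BG570", "3'"): {"4": (0.9346, 0.9347, 0.9761), "8": (0.9427, 0.9512, 0.9696)},
}

METRICS = ("CC", "Sp", "Sn")


def improved_metrics(row):
    """Return the metrics where the eight-letter value exceeds the four-letter value.

    Assumption: 'improved' means numerically higher; statistical
    significance is not checked.
    """
    return [m for m, v4, v8 in zip(METRICS, row["4"], row["8"]) if v8 > v4]


def summarize(table):
    """Map each (dataset, site) key to the list of improved metrics."""
    return {key: improved_metrics(row) for key, row in table.items()}


if __name__ == "__main__":
    for (dataset, site), metrics in summarize(SECOND_ORDER).items():
        print(f"{dataset} {site}: improved on {', '.join(metrics) or 'none'}")
```

Extending `SECOND_ORDER` with the remaining rows, or adding the zero-order and first-order columns, follows the same pattern.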