
Table 1 Distribution of sequences according to the mean percent identity

From: Spliceator: multi-species splice site prediction using convolutional neural networks

| Pairwise sequence identity | 600 nt, Donor, AS | 600 nt, Donor, GS | 600 nt, Acceptor, AS | 600 nt, Acceptor, GS | 20 nt, Donor, AS | 20 nt, Donor, GS | 20 nt, Acceptor, AS | 20 nt, Acceptor, GS |
|---|---|---|---|---|---|---|---|---|
| 0–10% | 0.0% | 0.0% | 0.0% | 0.0% | 0.04% | 0.03% | 0.04% | 0.03% |
| 10–20% | 0.35% | 0.27% | 0.33% | 0.25% | 1.0% | 0.71% | 1.65% | 1.32% |
| 20–30% | 92.77% | 94.88% | 92.45% | 94.7% | 10.27% | 8.93% | 13.92% | 12.44% |
| 30–40% | 6.83% | 4.78% | 7.18% | 4.99% | 32.62% | 31.57% | 34.05% | 33.20% |
| 40–50% | 0.03% | 0.04% | 0.03% | 0.04% | 36.24% | 37.17% | 32.89% | 34.09% |
| 50–60% | 0.01% | 0.01% | 0.01% | 0.01% | 16.28% | 17.47% | 14.22% | 15.34% |
| 60–70% | 0.0% | 0.0% | 0.0% | 0.0% | 3.2% | 3.65% | 2.9% | 3.2% |
| 70–80% | 0.0% | 0.0% | 0.0% | 0.0% | 0.3% | 0.39% | 0.29% | 0.33% |
| 80–90% | 0.0% | 0.0% | 0.0% | 0.0% | 0.03% | 0.04% | 0.03% | 0.03% |
| 90–100% | 0.0% | 0.0% | 0.0% | 0.0% | 0.02% | 0.03% | 0.02% | 0.02% |
Pairwise sequence percent identity of the positive subsets (AS: All Sequences; GS: Gold Standard) for donor and acceptor splice sites (SS), at sequence lengths of 600 nt and 20 nt. Values in bold correspond to the highest percentage of identity.
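
To make the binning in Table 1 concrete, here is a minimal sketch of how a distribution of pairwise percent identities could be tabulated. It rests on assumptions that the table does not confirm: identity is taken as the fraction of matching positions between two equal-length sequences, with no alignment step, and a pair at exactly 100% is counted in the 90–100% bin. The function names `percent_identity` and `identity_distribution` are hypothetical and are not taken from Spliceator.

```python
from collections import Counter
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Position-wise identity (in %) between two equal-length sequences.

    Assumption: no alignment is performed; positions are compared directly.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def identity_distribution(seqs: list[str]) -> dict[str, float]:
    """Share of sequence pairs (in %) falling into each 10%-wide identity bin."""
    counts = Counter()
    for a, b in combinations(seqs, 2):
        # Assumption: a pair at exactly 100% goes into the last bin (90–100%).
        bin_index = min(int(percent_identity(a, b) // 10), 9)
        counts[bin_index] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least two sequences are required")
    return {
        f"{10 * i}–{10 * (i + 1)}%": 100.0 * counts[i] / total
        for i in range(10)
    }


if __name__ == "__main__":
    # Toy sequences for illustration only, not data from the paper.
    toy = ["ACGT", "ACGA", "TTTT"]
    for label, share in identity_distribution(toy).items():
        print(f"{label}: {share:.2f}%")
```

The number of pairs grows quadratically with the number of sequences, so a large subset would probably call for random sampling of pairs or a vectorised implementation rather than this exhaustive loop.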