
Table 1 Distribution of sequences according to the mean percent identity

From: Spliceator: multi-species splice site prediction using convolutional neural networks

| Pairwise sequence identity | 600 nt, Donor, AS | 600 nt, Donor, GS | 600 nt, Acceptor, AS | 600 nt, Acceptor, GS | 20 nt, Donor, AS | 20 nt, Donor, GS | 20 nt, Acceptor, AS | 20 nt, Acceptor, GS |
|---|---|---|---|---|---|---|---|---|
| 0–10% | 0.0% | 0.0% | 0.0% | 0.0% | 0.04% | 0.03% | 0.04% | 0.03% |
| 10–20% | 0.35% | 0.27% | 0.33% | 0.25% | 1.0% | 0.71% | 1.65% | 1.32% |
| 20–30% | 92.77% | 94.88% | 92.45% | 94.7% | 10.27% | 8.93% | 13.92% | 12.44% |
| 30–40% | 6.83% | 4.78% | 7.18% | 4.99% | 32.62% | 31.57% | 34.05% | 33.20% |
| 40–50% | 0.03% | 0.04% | 0.03% | 0.04% | 36.24% | 37.17% | 32.89% | 34.09% |
| 50–60% | 0.01% | 0.01% | 0.01% | 0.01% | 16.28% | 17.47% | 14.22% | 15.34% |
| 60–70% | 0.0% | 0.0% | 0.0% | 0.0% | 3.2% | 3.65% | 2.9% | 3.2% |
| 70–80% | 0.0% | 0.0% | 0.0% | 0.0% | 0.3% | 0.39% | 0.29% | 0.33% |
| 80–90% | 0.0% | 0.0% | 0.0% | 0.0% | 0.03% | 0.04% | 0.03% | 0.03% |
| 90–100% | 0.0% | 0.0% | 0.0% | 0.0% | 0.02% | 0.03% | 0.02% | 0.02% |
  1. Pairwise sequence percent identity of the positive subsets (AS: All Sequences; GS: Gold Standard) for donor and acceptor splice sites (SS), computed for sequence lengths of 600 nt and 20 nt. Values in bold correspond to the highest percentage of identity.
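
A distribution of this kind can be illustrated with a minimal sketch. It assumes that percent identity is the fraction of matching positions between two equal-length, ungapped sequences, and that each pair is assigned to a 10%-wide bin; the table does not state how identity was actually computed, so the function names `percent_identity` and `identity_distribution` and the binning rule are hypothetical.

```python
from itertools import combinations

def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two equal-length sequences match.

    Assumption: no alignment or gaps; positions are compared one to one.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def identity_distribution(seqs: list[str]) -> dict[str, float]:
    """Percentage of sequence pairs falling in each 10%-wide identity bin."""
    labels = [f"{lo}–{lo + 10}%" for lo in range(0, 100, 10)]
    counts = [0] * len(labels)
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        pid = percent_identity(a, b)
        # Assumption: a pair with exactly 100% identity goes in the last bin.
        idx = min(int(pid // 10), len(labels) - 1)
        counts[idx] += 1
        n_pairs += 1
    if n_pairs == 0:
        raise ValueError("need at least two sequences")
    return {label: 100.0 * c / n_pairs for label, c in zip(labels, counts)}

if __name__ == "__main__":
    # Toy input, not data from the study.
    for label, pct in identity_distribution(["ACGT", "ACGA", "TTTT"]).items():
        print(f"{label}: {pct:.2f}%")
```

Comparing every pair is quadratic in the number of sequences, so on a large set one would likely sample pairs rather than enumerate them all.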