
Table 3 MisPred analysis of EnsEMBL entries

From: Identification and correction of abnormal, incomplete and mispredicted proteins in public databases

EnsEMBL

| Conflict 1 | Number of proteins | Identified as containing an extracellular domain | Percentage | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
|---|---|---|---|---|---|---|---|---|---|
| Homo sapiens | 48403 | 3449 | 7.13% | 277 | 8.03% | ND | ND | ND | ND |
| Mus musculus | 31302 | 2038 | 6.51% | 151 | 7.41% | ND | ND | ND | ND |
| Rattus norvegicus | 33745 | 2390 | 7.08% | 325 | 13.6% | ND | ND | ND | ND |
| Monodelphis domestica | 32690 | 2369 | 7.25% | 661 | 27.9% | ND | ND | ND | ND |
| Gallus gallus | 24168 | 1519 | 6.29% | 413 | 27.19% | ND | ND | ND | ND |
| Xenopus tropicalis | 28324 | 2383 | 8.41% | 931 | 39.07% | ND | ND | ND | ND |
| Fugu rubripes | 22102 | 1612 | 7.29% | 627 | 38.9% | ND | ND | ND | ND |
| Danio rerio | 36065 | 3312 | 9.18% | 1224 | 36.96% | ND | ND | ND | ND |
| Ciona intestinalis | 20000 | 1452 | 7.26% | 670 | 46.14% | ND | ND | ND | ND |
| Caenorhabditis elegans | 26439 | 918 | 3.47% | 117 | 12.75% | ND | ND | ND | ND |
| Drosophila melanogaster | 19789 | 1071 | 5.41% | 120 | 11.2% | ND | ND | ND | ND |

| Conflict 2 | Number of proteins | Identified as containing an extra- and an intracellular domain | Percentage | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
|---|---|---|---|---|---|---|---|---|---|
| Homo sapiens | 48403 | 101 | 0.21% | 18 | 17.82% | 18 | 17.82% | 0 | 0.00% |
| Mus musculus | 31302 | 50 | 0.16% | 4 | 8.00% | 4 | 8.00% | 0 | 0.00% |
| Rattus norvegicus | 33745 | 67 | 0.2% | 12 | 17.91% | 10 | 14.93% | 2 | 2.99% |
| Monodelphis domestica | 32690 | 101 | 0.31% | 25 | 24.75% | 9 | 8.91% | 16 | 15.84% |
| Gallus gallus | 24168 | 45 | 0.19% | 5 | 11.11% | 4 | 8.89% | 1 | 2.22% |
| Xenopus tropicalis | 28324 | 57 | 0.2% | 11 | 19.3% | 5 | 8.77% | 6 | 10.53% |
| Fugu rubripes | 22102 | 58 | 0.26% | 19 | 32.76% | 12 | 20.69% | 7 | 12.07% |
| Danio rerio | 36065 | 75 | 0.21% | 8 | 10.67% | 7 | 9.33% | 1 | 1.33% |
| Ciona intestinalis | 20000 | 29 | 0.15% | 2 | 6.90% | 2 | 6.90% | 0 | 0.00% |
| Caenorhabditis elegans | 26439 | 12 | 0.05% | 1 | 8.33% | 1 | 8.33% | 0 | 0.00% |
| Drosophila melanogaster | 19789 | 16 | 0.08% | 1 | 6.25% | 1 | 6.25% | 0 | 0.00% |

| Conflict 3 | Number of proteins | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
|---|---|---|---|---|---|---|---|
| Homo sapiens | 48403 | 1 | 0.002% | 0 | 0.00% | 1 | 0.002% |
| Mus musculus | 31302 | 3 | 0.01% | 0 | 0.00% | 3 | 0.01% |
| Rattus norvegicus | 33745 | 3 | 0.01% | 0 | 0.00% | 3 | 0.01% |
| Monodelphis domestica | 32690 | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% |
| Gallus gallus | 24168 | 1 | 0.004% | 0 | 0.00% | 1 | 0.004% |
| Xenopus tropicalis | 28324 | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% |
| Fugu rubripes | 22102 | 2 | 0.01% | 0 | 0.00% | 2 | 0.01% |
| Danio rerio | 36065 | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% |
| Ciona intestinalis | 20000 | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% |
| Caenorhabditis elegans | 26439 | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% |
| Drosophila melanogaster | 19789 | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% |

| Conflict 4 | Number of proteins | Proteins containing domains suitable for the study of domain integrity | Percentage | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
|---|---|---|---|---|---|---|---|---|---|
| Homo sapiens | 48403 | 16681 | 34.46% | 850 | 5.1% | ND | ND | ND | ND |
| Mus musculus | 31302 | 9955 | 31.80% | 306 | 3.07% | ND | ND | ND | ND |
| Rattus norvegicus | 33745 | 11826 | 35.05% | 474 | 4.01% | ND | ND | ND | ND |
| Monodelphis domestica | 32690 | 11847 | 36.24% | 381 | 3.22% | ND | ND | ND | ND |
| Gallus gallus | 24168 | 6261 | 25.91% | 383 | 6.12% | ND | ND | ND | ND |
| Xenopus tropicalis | 28324 | 6733 | 23.78% | 318 | 4.72% | ND | ND | ND | ND |
| Fugu rubripes | 22102 | 5464 | 24.72% | 278 | 5.09% | ND | ND | ND | ND |
| Danio rerio | 36065 | 9402 | 26.07% | 591 | 6.29% | ND | ND | ND | ND |
| Ciona intestinalis | 20000 | 2114 | 10.57% | 147 | 6.95% | ND | ND | ND | ND |
| Caenorhabditis elegans | 26439 | 3039 | 11.49% | 86 | 2.83% | ND | ND | ND | ND |
| Drosophila melanogaster | 19789 | 3341 | 16.88% | 58 | 1.74% | ND | ND | ND | ND |

| Conflict 5 | Number of proteins | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
|---|---|---|---|---|---|---|---|
| Homo sapiens | 48403 | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% |
| Danio rerio | 36065 | 9 | 0.02% | 7 | 0.02% | 2 | 0.01% |

  1. *Values for suspicious, false positive and true positive sequences are expressed as a percentage of the proteins relevant for the given conflict.
  2. ND – not determined
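
The percentage convention in footnote 1 can be checked against the published counts with a short sketch. The helper name `pct` is hypothetical, and the choice of denominators is an inference from the table rather than something the source states explicitly: for Conflicts 1, 2 and 4 the starred percentages appear to be taken over the proteins relevant to the conflict (the third column), while for Conflicts 3 and 5, which list no such subset, they appear to be taken over the total number of proteins.

```python
def pct(part: int, whole: int, digits: int = 2) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals."""
    if whole == 0:
        raise ValueError("denominator must be non-zero")
    return round(100 * part / whole, digits)


# Homo sapiens, Conflict 1 (counts copied from the table).
total, relevant, suspicious = 48403, 3449, 277

# Unstarred "Percentage": relevant proteins as a share of all proteins.
share_relevant = pct(relevant, total)

# Starred "Percentage*": suspicious proteins as a share of the relevant ones.
share_suspicious = pct(suspicious, relevant)

# Homo sapiens, Conflict 3: no relevant subset is listed, so the total
# number of proteins is assumed to be the denominator.
conflict3_suspicious = pct(1, total, digits=3)

print(share_relevant, share_suspicious, conflict3_suspicious)
```

Under these assumptions the computed values match the Homo sapiens entries reported for Conflict 1 (7.13% and 8.03%) and Conflict 3 (0.002%).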