
Table 4 MisPred analysis of NCBI's GNOMON-predicted proteins

From: Identification and correction of abnormal, incomplete and mispredicted proteins in public databases

NCBI/GNOMON

| Conflict 1 | Number of proteins | Identified as containing an extracellular domain | Percentage | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Homo sapiens | 10125 | 287 | 2.83% | 93 | 32.4% | ND | ND | ND | ND |
| Monodelphis domestica | 20110 | 1293 | 6.43% | 253 | 19.57% | ND | ND | ND | ND |
| Gallus gallus | 14816 | 909 | 6.14% | 246 | 27.06% | ND | ND | ND | ND |
| Danio rerio | 25356 | 2108 | 8.31% | 562 | 26.66% | ND | ND | ND | ND |

| Conflict 2 | Number of proteins | Identified as containing an extra- and an intracellular domain | Percentage | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Homo sapiens | 10125 | 4 | 0.04% | 0 | 0% | 0 | 0.00% | 0 | 0.00% |
| Monodelphis domestica | 20110 | 32 | 0.16% | 6 | 18.75% | 3 | 9.38% | 3 | 9.38% |
| Gallus gallus | 14816 | 22 | 0.15% | 5 | 22.73% | 3 | 13.64% | 2 | 9.09% |
| Danio rerio | 25356 | 31 | 0.12% | 11 | 35.48% | 5 | 16.13% | 6 | 19.35% |

| Conflict 3 | Number of proteins | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Homo sapiens | 10125 | 0 | 0.00% | 0 | 0.00% | 0 | 0.00% |
| Monodelphis domestica | 20110 | 2 | 0.01% | 0 | 0.00% | 2 | 0.01% |
| Gallus gallus | 14816 | 2 | 0.01% | 1 | 0.01% | 1 | 0.01% |
| Danio rerio | 25356 | 7 | 0.03% | 3 | 0.01% | 4 | 0.02% |

| Conflict 4 | Number of proteins | Proteins containing domains suitable for the study of domain integrity | Percentage | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Homo sapiens | 10125 | 1632 | 16.12% | 255 | 15.63% | ND | ND | ND | ND |
| Monodelphis domestica | 20110 | 6224 | 30.95% | 111 | 1.78% | ND | ND | ND | ND |
| Gallus gallus | 14816 | 3564 | 24.06% | 370 | 10.38% | ND | ND | ND | ND |
| Danio rerio | 25356 | 4387 | 17.31% | 385 | 8.78% | ND | ND | ND | ND |

| Conflict 5 | Number of proteins | Identified as suspicious by MisPred | Percentage* | False Positives | Percentage* | True errors | Percentage* |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Homo sapiens | 10125 | 1 | 0.01% | 0 | 0.00% | 1 | 0.01% |
| Danio rerio | 25356 | 25 | 0.10% | 24 | 0.09% | 1 | 0.004% |

  1. *Values for suspicious, false positive and true positive sequences are expressed as percentage of the proteins relevant for the given conflict.
  2. ND – not determined
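
The percentage convention in footnote 1 can be checked with a few lines of arithmetic. The sketch below is an illustration only and is not part of the original analysis. The helper name `pct` is hypothetical, and the code assumes that "proteins relevant for the given conflict" means the count in the domain column of that conflict (31 for Danio rerio under Conflict 2), while the unstarred percentage is taken over all proteins of the species.

```python
def pct(count, denominator, digits=2):
    """Return count as a percentage of denominator, rounded to `digits` places.

    Hypothetical helper for illustration; not taken from MisPred.
    """
    return round(100.0 * count / denominator, digits)


# Conflict 2, Danio rerio, values copied from the table.
total_proteins = 25356   # all GNOMON-predicted proteins for the species
relevant = 31            # proteins with an extra- and an intracellular domain
suspicious = 11          # flagged by MisPred
false_positives = 5
true_errors = 6

# Unstarred percentage: relevant proteins as a share of all proteins.
share_relevant = pct(relevant, total_proteins)

# Starred percentages: shares of the relevant proteins (assumed reading
# of the footnote).
share_suspicious = pct(suspicious, relevant)
share_false_positives = pct(false_positives, relevant)
share_true_errors = pct(true_errors, relevant)

# Suspicious proteins split into false positives and true errors.
assert false_positives + true_errors == suspicious

print(share_relevant, share_suspicious, share_false_positives, share_true_errors)
```

Under these assumptions the four values match the Danio rerio row of Conflict 2 (0.12%, 35.48%, 16.13%, 19.35%). For Conflicts 3 and 5, which have no domain column, the tabulated percentages appear to be consistent with using the total number of proteins as the denominator.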