Table 1 Per-residue predictions: secondary structure and disorder

From: Modeling aspects of the language of life through transfer-learning protein sequences

| Data | Method | Secondary structure: Q3 (%) | Secondary structure: Q8 (%) | Disorder: MCC | Disorder: FPR |
|---|---|---|---|---|---|
| CASP12 | NetSurfP-2.0 (hhblits)^a,b | 82.4 | 71.1 | 0.604 | 0.011 |
| CASP12 | NetSurfP-1.0^a,b | 70.9 | – | – | – |
| CASP12 | Spider3^a,b | 79.1 | – | 0.582 | 0.026 |
| CASP12 | RaptorX^a,b | 78.6 | 66.1 | 0.621 | 0.045 |
| CASP12 | Jpred4^a,b | 76.0 | – | – | – |
| CASP12 | DeepSeqVec | 73.1 ± 1.3 | 61.2 ± 1.6 | 0.575 ± 0.075 | 0.026 ± 0.008 |
| CASP12 | DeepProf^b | 76.4 ± 2.0 | 62.7 ± 2.2 | 0.506 ± 0.057 | 0.022 ± 0.009 |
| CASP12 | DeepProf + SeqVec^b | 76.5 ± 1.5 | 64.1 ± 1.5 | 0.556 ± 0.080 | 0.022 ± 0.008 |
| CASP12 | DeepProtVec | 62.8 ± 1.7 | 50.5 ± 2.4 | 0.505 ± 0.064 | 0.016 ± 0.006 |
| CASP12 | DeepOneHot | 67.1 ± 1.6 | 54.2 ± 2.1 | 0.461 ± 0.064 | 0.012 ± 0.005 |
| CASP12 | DeepBLOSUM65 | 67.0 ± 1.6 | 54.5 ± 2.0 | 0.465 ± 0.065 | 0.012 ± 0.005 |
| TS115 | NetSurfP-2.0 (hhblits)^a,b | 85.3 | 74.4 | 0.663 | 0.006 |
| TS115 | NetSurfP-1.0^a,b | 77.9 | – | – | – |
| TS115 | Spider3^a,b | 83.9 | – | 0.575 | 0.008 |
| TS115 | RaptorX^a,b | 82.2 | 71.6 | 0.567 | 0.027 |
| TS115 | Jpred4^a,b | 76.7 | – | – | – |
| TS115 | DeepSeqVec | 79.1 ± 0.8 | 67.6 ± 1.0 | 0.591 ± 0.028 | 0.012 ± 0.001 |
| TS115 | DeepProf^b | 81.1 ± 0.6 | 68.3 ± 0.9 | 0.516 ± 0.028 | 0.012 ± 0.002 |
| TS115 | DeepProf + SeqVec^b | 82.4 ± 0.7 | 70.3 ± 1.0 | 0.585 ± 0.029 | 0.013 ± 0.003 |
| TS115 | DeepProtVec | 66.0 ± 1.0 | 54.4 ± 1.3 | 0.470 ± 0.028 | 0.011 ± 0.002 |
| TS115 | DeepOneHot | 70.1 ± 0.8 | 58.5 ± 1.1 | 0.476 ± 0.028 | 0.008 ± 0.001 |
| TS115 | DeepBLOSUM65 | 70.3 ± 0.8 | 58.1 ± 1.1 | 0.488 ± 0.029 | 0.007 ± 0.001 |
| CB513 | NetSurfP-2.0 (hhblits)^a,b | 85.3 | 72.0 | – | – |
| CB513 | NetSurfP-1.0^a,b | 78.8 | – | – | – |
| CB513 | Spider3^a,b | 84.5 | – | – | – |
| CB513 | RaptorX^a,b | 82.7 | 70.6 | – | – |
| CB513 | Jpred4^a,b | 77.9 | – | – | – |
| CB513 | DeepSeqVec | 76.9 ± 0.5 | 62.5 ± 0.6 | – | – |
| CB513 | DeepProf^b | 80.2 ± 0.4 | 64.9 ± 0.5 | – | – |
| CB513 | DeepProf + SeqVec^b | 80.7 ± 0.5 | 66.0 ± 0.5 | – | – |
| CB513 | DeepProtVec | 63.5 ± 0.4 | 48.9 ± 0.5 | – | – |
| CB513 | DeepOneHot | 67.5 ± 0.4 | 52.9 ± 0.5 | – | – |
| CB513 | DeepBLOSUM65 | 67.4 ± 0.4 | 53.0 ± 0.5 | – | – |

Performance comparison for secondary structure prediction (3 vs. 8 classes) and disorder prediction (binary) on the CASP12, TS115 and CB513 data sets. Accuracy (Q3, Q8) is given as a percentage. Results marked by superscript a are taken from NetSurfP-2.0 [46], whose authors did not provide standard errors. The highest numerical value in each column is shown in bold. The methods DeepSeqVec, DeepProtVec, DeepOneHot and DeepBLOSUM65 use only information from single protein sequences. Methods using evolutionary information (MSA profiles) are marked by superscript b; these performed best throughout.
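
For readers unfamiliar with the column headers, the following is a minimal sketch of how the four metrics are conventionally computed from per-residue labels. It is not the evaluation code behind the table: the function names (`q_accuracy`, `confusion_counts`, `mcc`, `fpr`) and the toy sequences are hypothetical, and it assumes accuracy is pooled over residues, which the caption does not specify. Q3 and Q8 share one formula and differ only in the size of the state alphabet (3 or 8 classes); MCC (Matthews correlation coefficient) and FPR (false positive rate) apply to the binary disorder labels.

```python
import math
from typing import Sequence, Tuple


def q_accuracy(true_states: str, pred_states: str) -> float:
    """Percentage of residues whose predicted state equals the observed one.

    With a 3-letter alphabet this is Q3; with an 8-letter alphabet, Q8.
    """
    if len(true_states) != len(pred_states):
        raise ValueError("sequences must have equal length")
    if not true_states:
        raise ValueError("sequences must not be empty")
    matches = sum(t == p for t, p in zip(true_states, pred_states))
    return 100.0 * matches / len(true_states)


def confusion_counts(
    true_labels: Sequence[int], pred_labels: Sequence[int]
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels, where 1 = disordered."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label lists must have equal length")
    tp = fp = tn = fn = 0
    for t, p in zip(true_labels, pred_labels):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; defined as 0.0 if a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def fpr(fp: int, tn: int) -> float:
    """False positive rate: share of ordered residues predicted as disordered."""
    return 0.0 if fp + tn == 0 else fp / (fp + tn)


# Toy data (hypothetical): H = helix, E = strand, C = other
observed_ss = "HHHEEECCC"
predicted_ss = "HHHEECCCC"
q3 = q_accuracy(observed_ss, predicted_ss)  # 8 of 9 residues match

observed_dis = [1, 1, 0, 0, 0, 0, 1, 0]
predicted_dis = [1, 0, 0, 0, 1, 0, 1, 0]
tp, fp, tn, fn = confusion_counts(observed_dis, predicted_dis)
print(round(q3, 1), round(mcc(tp, fp, tn, fn), 3), fpr(fp, tn))
```

Because disordered residues are typically a small minority, MCC and FPR are reported together: a low FPR alone can be achieved by rarely predicting disorder, whereas MCC also penalizes the resulting missed disordered residues.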