
Table 2 Performance of the compared feature sets on datasets with different positive-to-negative ratios

From: Predicting protein-protein interactions in unbalanced data using the primary structure of proteins

| Feature | Accuracy (%) | F-measure¹ (%) | Precision (%) | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|---|
| Datasets with 1:1 positive-to-negative ratio | | | | | |
| Shen et al.² | 77.1 ± 0.8 | 77.9 ± 0.8 | 75.2 ± 0.9 | 80.9 ± 1.4 | 73.3 ± 1.4 |
| Guo et al.³ | 77.2 ± 0.9 | 77.6 ± 0.9 | 76.2 ± 1.0 | 79.1 ± 1.3 | 75.4 ± 1.4 |
| This work⁴ | 80.1 ± 0.8 | 80.4 ± 0.8 | 79.4 ± 1.0 | 81.4 ± 1.4 | 78.8 ± 1.4 |
| Datasets with 1:3 positive-to-negative ratio | | | | | |
| Shen et al. | 82.2 ± 0.3 | 58.6 ± 1.1 | 69.9 ± 0.8 | 50.4 ± 1.6 | 92.7 ± 0.3 |
| Guo et al. | 82.1 ± 0.6 | 58.3 ± 1.7 | 69.8 ± 1.6 | 50.1 ± 1.8 | 92.8 ± 0.4 |
| This work | 83.6 ± 0.5 | 66.7 ± 1.2 | 67.9 ± 0.9 | 65.5 ± 1.7 | 89.7 ± 0.4 |
| Datasets with 1:7 positive-to-negative ratio | | | | | |
| Shen et al. | 88.0 ± 0.3 | 45.4 ± 1.7 | 52.8 ± 1.8 | 39.9 ± 1.9 | 94.9 ± 0.3 |
| Guo et al. | 87.2 ± 0.3 | 45.5 ± 1.3 | 48.8 ± 1.5 | 42.6 ± 1.3 | 93.6 ± 0.3 |
| This work | 90.6 ± 0.2 | 52.8 ± 1.7 | 71.5 ± 1.5 | 41.8 ± 1.8 | 97.6 ± 0.2 |
| Datasets with 1:15 positive-to-negative ratio | | | | | |
| Shen et al. | 92.5 ± 0.1 | 33.1 ± 1.4 | 37.5 ± 1.3 | 29.7 ± 1.5 | 96.7 ± 0.1 |
| Guo et al. | 91.7 ± 0.2 | 36.6 ± 1.5 | 35.1 ± 1.5 | 38.3 ± 1.9 | 95.3 ± 0.2 |
| This work | 93.7 ± 0.2 | 43.6 ± 1.3 | 49.5 ± 1.7 | 39.0 ± 1.3 | 97.3 ± 0.1 |

The best performance within each positive-to-negative ratio is highlighted with bold font.
¹ Parameter selection is based on five-fold cross-validation on the training dataset to maximize the F-measure.
² Using triad frequency as the feature set.
³ Using auto cross covariance as the feature set.
⁴ Using triad significance as the feature set.
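
For readers who want to sanity-check how the columns relate, the following is a minimal sketch using the standard definitions of the F-measure (harmonic mean of precision and sensitivity) and of accuracy at a fixed 1:k positive-to-negative ratio. The function names are illustrative and not taken from the article. Because the table reports means over repeated runs, values recomputed from the mean precision, sensitivity, and specificity should only be expected to land close to the reported F-measure and accuracy, not to match them exactly.

```python
def f_measure(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity, in the same units as the inputs."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


def accuracy_from_rates(sensitivity: float, specificity: float,
                        negatives_per_positive: float) -> float:
    """Accuracy implied by sensitivity and specificity at a 1:k class ratio.

    With one positive for every k negatives, accuracy is the class-weighted
    average of the two rates.
    """
    k = negatives_per_positive
    return (sensitivity + k * specificity) / (1 + k)


# Mean values (in %) from the "This work" row at the 1:7 ratio.
print(f_measure(71.5, 41.8))
print(accuracy_from_rates(41.8, 97.6, 7))
```

The same check illustrates why accuracy stays high as the ratio becomes more unbalanced: with k negatives per positive, specificity carries k times the weight of sensitivity, so the F-measure is the more informative column for the minority (interacting) class.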