Table 5. Percentage of true positives (%TP) identified by the best-performing supervised and unsupervised classifiers when detecting ortholog pairs in the twilight zone (< 30% identity)

From: Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers

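| Algorithm / Dataset | Alignment-based Features %TP, Scer-Cgla | Alignment-based Features %TP, Cgla-Klac | Alignment-based Features %TP, Klac-Kwal | Alignment-free Features %TP, Scer-Cgla | Alignment-free Features %TP, Cgla-Klac | Alignment-free Features %TP, Klac-Kwal | Alignment-based + Alignment-free Features %TP, Scer-Cgla | Alignment-based + Alignment-free Features %TP, Cgla-Klac | Alignment-based + Alignment-free Features %TP, Klac-Kwal |
|---|---|---|---|---|---|---|---|---|---|
| Supervised Algorithms | | | | | | | | | |
| Spark Random Forest MLlib 1.6 | | | | | | | | | |
| Normal | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 3.54 | 0.00 | 2.25 |
| ROS-100 | 97.43 | 96.26 | 98.03 | 71.06 | 64.29 | 57.87 | 96.14 | 91.84 | 93.54 |
| ROS-130 | 98.71 | 96.94 | 98.31 | 76.21 | 64.97 | 65.45 | 95.18 | 93.88 | 93.54 |
| RUS | 99.04 | 96.26 | 98.60 | 74.60 | 64.29 | 61.24 | 96.78 | 93.88 | 95.51 |
| Spark Decision Trees MLlib 1.6 | | | | | | | | | |
| Normal | 0.32 | 0.68 | 0.28 | 0.00 | 0.00 | 0.56 | 12.54 | 7.82 | 9.55 |
| ROS-100 | 95.18 | 94.56 | 97.19 | 72.67 | 62.93 | 55.62 | 97.75 | 84.69 | 96.07 |
| ROS-130 | 95.82 | 91.50 | 97.47 | 79.74 | 61.56 | 63.48 | 98.71 | 87.41 | 96.35 |
| RUS | 98.07 | 95.24 | 99.16 | 76.53 | 67.01 | 65.45 | 98.07 | 90.82 | 97.47 |
| Unsupervised Algorithms | | | | | | | | | |
| RBH | 57.56 | 58.84 | 73.31 | | | | | | |
| RSD 0.2 1e-20 | 46.95 | 45.92 | 62.36 | | | | | | |
| RSD 0.5 1e-10 | 61.41 | 61.90 | 80.34 | | | | | | |
| RSD 0.8 1e-05 | 68.17 | 70.41 | 85.96 | | | | | | |
| OMA | 42.77 | 45.24 | 46.91 | | | | | | |

1. The best results in each dataset are in bold face and the general best results are underlined. Supervised algorithm performance is presented for the alignment-based, alignment-free and alignment-based + alignment-free feature combinations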
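
As a rough illustration of how a %TP value like those in the table could be computed, the following minimal sketch assumes that %TP is the share of true ortholog pairs below the identity cutoff that a classifier labels as orthologs. That definition, the function name `percent_true_positives`, the tuple layout, and the toy data are all assumptions made for this example; they are not taken from the source.

```python
from typing import Iterable, Tuple

# Each pair is (percent identity, is a true ortholog, predicted as ortholog).
# This layout is hypothetical and chosen only for the illustration.
Pair = Tuple[float, bool, bool]


def percent_true_positives(pairs: Iterable[Pair], identity_cutoff: float = 30.0) -> float:
    """Percentage of true ortholog pairs below the cutoff that were predicted as orthologs."""
    # Keep only true ortholog pairs in the twilight zone (identity below the cutoff).
    twilight = [p for p in pairs if p[0] < identity_cutoff and p[1]]
    if not twilight:
        return float("nan")  # no twilight-zone orthologs to evaluate
    hits = sum(1 for p in twilight if p[2])
    return 100.0 * hits / len(twilight)


# Toy data, invented for the example.
toy_pairs = [
    (25.0, True, True),
    (18.5, True, True),
    (29.9, True, False),
    (22.0, True, True),
    (45.0, True, True),    # above the cutoff, ignored
    (20.0, False, True),   # not a true ortholog, ignored
]

tp_percent = percent_true_positives(toy_pairs)
print(f"%TP in the twilight zone: {tp_percent:.2f}")
```

Under this assumed definition, pairs above the cutoff and non-ortholog pairs do not enter the denominator, so the value reflects only how many hard-to-detect orthologs are recovered.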