Table 5 Percentage of true positives (%TP) identified by the best-performing supervised classifiers and by the unsupervised classifiers when detecting ortholog pairs in the twilight zone (< 30% identity)

From: Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers

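| Algorithm / Dataset | Alignment-based features, Scer-Cgla (%TP) | Alignment-based features, Cgla-Klac (%TP) | Alignment-based features, Klac-Kwal (%TP) | Alignment-free features, Scer-Cgla (%TP) | Alignment-free features, Cgla-Klac (%TP) | Alignment-free features, Klac-Kwal (%TP) | Alignment-based + alignment-free features, Scer-Cgla (%TP) | Alignment-based + alignment-free features, Cgla-Klac (%TP) | Alignment-based + alignment-free features, Klac-Kwal (%TP) |
|---|---|---|---|---|---|---|---|---|---|
| Supervised Algorithms | | | | | | | | | |
| Spark Random Forest MLlib 1.6 | | | | | | | | | |
| Normal | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 3.54 | 0.00 | 2.25 |
| ROS-100 | 97.43 | 96.26 | 98.03 | 71.06 | 64.29 | 57.87 | 96.14 | 91.84 | 93.54 |
| ROS-130 | 98.71 | 96.94 | 98.31 | 76.21 | 64.97 | 65.45 | 95.18 | 93.88 | 93.54 |
| RUS | 99.04 | 96.26 | 98.60 | 74.60 | 64.29 | 61.24 | 96.78 | 93.88 | 95.51 |
| Spark Decision Trees MLlib 1.6 | | | | | | | | | |
| Normal | 0.32 | 0.68 | 0.28 | 0.00 | 0.00 | 0.56 | 12.54 | 7.82 | 9.55 |
| ROS-100 | 95.18 | 94.56 | 97.19 | 72.67 | 62.93 | 55.62 | 97.75 | 84.69 | 96.07 |
| ROS-130 | 95.82 | 91.50 | 97.47 | 79.74 | 61.56 | 63.48 | 98.71 | 87.41 | 96.35 |
| RUS | 98.07 | 95.24 | 99.16 | 76.53 | 67.01 | 65.45 | 98.07 | 90.82 | 97.47 |
| Unsupervised Algorithms | | | | | | | | | |
| RBH | 57.56 | 58.84 | 73.31 | | | | | | |
| RSD 0.2 1e-20 | 46.95 | 45.92 | 62.36 | | | | | | |
| RSD 0.5 1e-10 | 61.41 | 61.90 | 80.34 | | | | | | |
| RSD 0.8 1e-05 | 68.17 | 70.41 | 85.96 | | | | | | |
| OMA | 42.77 | 45.24 | 46.91 | | | | | | |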
  1. The best results in each dataset are in bold face and the general best results are underlined. Supervised algorithm performance is presented for the alignment-based, alignment-free and alignment-based + alignment-free feature combinations
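
As a rough illustration of how a %TP value of this kind could be computed, the sketch below assumes that %TP is the share of true ortholog pairs below the 30% identity cutoff that a classifier labels as orthologs. That definition, the function name `percent_true_positives`, the tuple layout, and the toy values are all assumptions made for illustration; none of them are taken from the article or from the table.

```python
from typing import Iterable, Tuple

# Hypothetical record layout: (percent identity, is a true ortholog pair, predicted as ortholog)
Pair = Tuple[float, bool, bool]


def percent_true_positives(pairs: Iterable[Pair], identity_cutoff: float = 30.0) -> float:
    """Return %TP over true ortholog pairs whose identity is below the cutoff.

    Assumed definition: 100 * TP / (TP + FN), restricted to the twilight zone.
    """
    tp = 0  # true ortholog pairs in the twilight zone that were predicted as orthologs
    fn = 0  # true ortholog pairs in the twilight zone that were missed
    for identity, is_ortholog, predicted in pairs:
        if identity >= identity_cutoff or not is_ortholog:
            continue  # outside the twilight zone, or not a true ortholog pair
        if predicted:
            tp += 1
        else:
            fn += 1
    total = tp + fn
    return 100.0 * tp / total if total else float("nan")


# Toy data, invented for illustration only.
toy_pairs = [
    (25.0, True, True),
    (28.5, True, True),
    (12.0, True, False),
    (29.9, True, True),
    (45.0, True, True),    # ignored: identity is above the cutoff
    (20.0, False, True),   # ignored: not a true ortholog pair
]

print(f"{percent_true_positives(toy_pairs):.2f}")  # 75.00
```

Under this assumed definition, pairs above the cutoff and non-ortholog pairs do not affect the value, so the figure reflects only how many hard-to-detect orthologs are recovered.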