
Table 6 Run times (hh:mm:ss) of the learning and classifying steps for the highest-quality Spark supervised algorithms (Decision Trees and Random Forest), together with the corresponding values for the Hadoop MapReduce Random Forest implementation. Supervised algorithm run times are presented for the alignment-based, alignment-free, and alignment-based + alignment-free feature combinations. The Random Oversampling pre-processing (ROS) label is followed by the corresponding resampling size value (e.g., ROS-100).

From: Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers

| Algorithm / Pre-processing | Step | Alignment-based: ScerCgla | Alignment-based: CglaKlac | Alignment-based: KlacKwal | Alignment-free: ScerCgla | Alignment-free: CglaKlac | Alignment-free: KlacKwal | Alignment-based + Alignment-free: ScerCgla | Alignment-based + Alignment-free: CglaKlac | Alignment-based + Alignment-free: KlacKwal |
|---|---|---|---|---|---|---|---|---|---|---|
| Spark Random Forest MLlib 1.6 | | | | | | | | | | |
| NORMAL | Learn | 00:00:49 | 00:00:57 | 00:00:57 | 00:01:03 | 00:01:05 | 00:01:07 | 00:00:57 | 00:00:57 | 00:01:00 |
| NORMAL | Classify | 00:00:19 | 00:00:38 | 00:00:24 | 00:00:31 | 00:00:26 | 00:00:25 | 00:00:34 | 00:00:30 | 00:00:32 |
| ROS-100 | Learn | 00:01:43 | 00:02:34 | 00:02:29 | 00:01:48 | 00:01:43 | 00:01:47 | 00:01:48 | 00:01:50 | 00:01:48 |
| ROS-100 | Classify | 00:00:20 | 00:00:19 | 00:00:19 | 00:00:33 | 00:00:28 | 00:00:29 | 00:00:33 | 00:00:31 | 00:00:31 |
| ROS-130 | Learn | 00:02:09 | 00:02:15 | 00:02:43 | 00:02:03 | 00:01:57 | 00:02:00 | 00:02:06 | 00:02:03 | 00:01:57 |
| ROS-130 | Classify | 00:00:19 | 00:00:18 | 00:00:18 | 00:00:39 | 00:00:30 | 00:00:34 | 00:00:41 | 00:00:31 | 00:00:34 |
| RUS | Learn | 00:00:09 | 00:00:09 | 00:00:09 | 00:00:11 | 00:00:11 | 00:00:11 | 00:00:10 | 00:00:10 | 00:00:10 |
| RUS | Classify | 00:00:14 | 00:00:14 | 00:00:13 | 00:00:39 | 00:00:31 | 00:00:42 | 00:00:41 | 00:00:39 | 00:00:39 |
| Spark Decision Trees MLlib 1.6 | | | | | | | | | | |
| NORMAL | Learn | 00:00:31 | 00:00:31 | 00:00:35 | 00:00:35 | 00:00:33 | 00:00:35 | 00:00:49 | 00:00:38 | 00:00:40 |
| NORMAL | Classify | 00:00:13 | 00:00:12 | 00:00:15 | 00:00:23 | 00:00:20 | 00:00:20 | 00:00:25 | 00:00:25 | 00:00:24 |
| ROS-100 | Learn | 00:00:57 | 00:00:58 | 00:00:56 | 00:00:59 | 00:01:03 | 00:01:01 | 00:01:11 | 00:01:07 | 00:01:07 |
| ROS-100 | Classify | 00:00:11 | 00:00:15 | 00:00:13 | 00:00:22 | 00:00:20 | 00:00:24 | 00:00:24 | 00:00:23 | 00:00:24 |
| ROS-130 | Learn | 00:00:57 | 00:00:58 | 00:00:57 | 00:01:14 | 00:01:06 | 00:01:05 | 00:01:15 | 00:01:13 | 00:01:16 |
| ROS-130 | Classify | 00:00:12 | 00:00:19 | 00:00:11 | 00:00:23 | 00:00:22 | 00:00:22 | 00:00:25 | 00:00:24 | 00:00:23 |
| RUS | Learn | 00:00:08 | 00:00:08 | 00:00:08 | 00:00:09 | 00:00:09 | 00:00:09 | 00:00:09 | 00:00:09 | 00:00:09 |
| RUS | Classify | 00:00:12 | 00:00:11 | 00:00:11 | 00:00:33 | 00:00:26 | 00:00:34 | 00:00:36 | 00:00:34 | 00:00:35 |
| MapReduce Random Forest Mahout 0.9 | | | | | | | | | | |
| NORMAL | Learn | 23:25:10 | 23:25:10 | 23:25:10 | | | | | | |
| NORMAL | Classify | 00:14:25 | 00:13:07 | 00:13:04 | | | | | | |
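
The hh:mm:ss entries are easier to compare once converted to seconds. The following is a minimal sketch, not part of the original article: the helper name `to_seconds` is hypothetical, and it assumes that the MapReduce values belong to the alignment-based feature columns, as their position in the table suggests. It only does arithmetic on the table values and does not account for any differences in cluster configuration between the runs.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'hh:mm:ss' string to a whole number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the table: ScerCgla, NORMAL (no resampling).
# Assumption: the Mahout row refers to alignment-based features.
spark_rf_total = to_seconds("00:00:49") + to_seconds("00:00:19")   # learn + classify
mahout_rf_total = to_seconds("23:25:10") + to_seconds("00:14:25")  # learn + classify

# Ratio of total run times (MapReduce / Spark) for this single cell pair.
ratio = mahout_rf_total / spark_rf_total
print(f"Spark RF: {spark_rf_total} s, Mahout RF: {mahout_rf_total} s, ratio: {ratio:.0f}x")
```

The same helper can be applied to any other pair of cells, for example to compare ROS-130 against RUS learning times within one Spark algorithm.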