Table 6 Run time values (hh:mm:ss) comprising learning and classifying steps obtained by the highest quality Spark supervised algorithms (Decision Trees and Random Forest) together with the corresponding values of the Hadoop MapReduce Random Forest implementation. Supervised algorithm run time values are presented for the alignment-based, alignment-free and alignment-based + alignment-free feature combinations. The Random Oversampling pre-processing (ROS) is accompanied by the corresponding resampling size value

From: Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers

| Algorithm/Dataset | Alignment-based: ScerCgla | Alignment-based: CglaKlac | Alignment-based: KlacKwal | Alignment-free: ScerCgla | Alignment-free: CglaKlac | Alignment-free: KlacKwal | Alignment-based + Alignment-free: ScerCgla | Alignment-based + Alignment-free: CglaKlac | Alignment-based + Alignment-free: KlacKwal |
|---|---|---|---|---|---|---|---|---|---|
| Spark Random Forest MLlib 1.6 | | | | | | | | | |
| NORMAL Learn | 00:00:49 | 00:00:57 | 00:00:57 | 00:01:03 | 00:01:05 | 00:01:07 | 00:00:57 | 00:00:57 | 00:01:00 |
| NORMAL Classify | 00:00:19 | 00:00:38 | 00:00:24 | 00:00:31 | 00:00:26 | 00:00:25 | 00:00:34 | 00:00:30 | 00:00:32 |
| ROS-100 Learn | 00:01:43 | 00:02:34 | 00:02:29 | 00:01:48 | 00:01:43 | 00:01:47 | 00:01:48 | 00:01:50 | 00:01:48 |
| ROS-100 Classify | 00:00:20 | 00:00:19 | 00:00:19 | 00:00:33 | 00:00:28 | 00:00:29 | 00:00:33 | 00:00:31 | 00:00:31 |
| ROS-130 Learn | 00:02:09 | 00:02:15 | 00:02:43 | 00:02:03 | 00:01:57 | 00:02:00 | 00:02:06 | 00:02:03 | 00:01:57 |
| ROS-130 Classify | 00:00:19 | 00:00:18 | 00:00:18 | 00:00:39 | 00:00:30 | 00:00:34 | 00:00:41 | 00:00:31 | 00:00:34 |
| RUS Learn | 00:00:09 | 00:00:09 | 00:00:09 | 00:00:11 | 00:00:11 | 00:00:11 | 00:00:10 | 00:00:10 | 00:00:10 |
| RUS Classify | 00:00:14 | 00:00:14 | 00:00:13 | 00:00:39 | 00:00:31 | 00:00:42 | 00:00:41 | 00:00:39 | 00:00:39 |
| Spark Decision Trees MLlib 1.6 | | | | | | | | | |
| NORMAL Learn | 00:00:31 | 00:00:31 | 00:00:35 | 00:00:35 | 00:00:33 | 00:00:35 | 00:00:49 | 00:00:38 | 00:00:40 |
| NORMAL Classify | 00:00:13 | 00:00:12 | 00:00:15 | 00:00:23 | 00:00:20 | 00:00:20 | 00:00:25 | 00:00:25 | 00:00:24 |
| ROS-100 Learn | 00:00:57 | 00:00:58 | 00:00:56 | 00:00:59 | 00:01:03 | 00:01:01 | 00:01:11 | 00:01:07 | 00:01:07 |
| ROS-100 Classify | 00:00:11 | 00:00:15 | 00:00:13 | 00:00:22 | 00:00:20 | 00:00:24 | 00:00:24 | 00:00:23 | 00:00:24 |
| ROS-130 Learn | 00:00:57 | 00:00:58 | 00:00:57 | 00:01:14 | 00:01:06 | 00:01:05 | 00:01:15 | 00:01:13 | 00:01:16 |
| ROS-130 Classify | 00:00:12 | 00:00:19 | 00:00:11 | 00:00:23 | 00:00:22 | 00:00:22 | 00:00:25 | 00:00:24 | 00:00:23 |
| RUS Learn | 00:00:08 | 00:00:08 | 00:00:08 | 00:00:09 | 00:00:09 | 00:00:09 | 00:00:09 | 00:00:09 | 00:00:09 |
| RUS Classify | 00:00:12 | 00:00:11 | 00:00:11 | 00:00:33 | 00:00:26 | 00:00:34 | 00:00:36 | 00:00:34 | 00:00:35 |
| MapReduce Random Forest Mahout 0.9 | | | | | | | | | |
| NORMAL Learn | 23:25:10 | 23:25:10 | 23:25:10 | | | | | | |
| NORMAL Classify | 00:14:25 | 00:13:07 | 00:13:04 | | | | | | |