Table 4 Confusion matrices of the RF model on the validation dataset for prediction of (a) clades and (b) continents

From: Utilizing genomic signatures to gain insights into the dynamics of SARS-CoV-2 through Machine and Deep Learning techniques

(a)

| Prediction/Ref | Clade_G (%) | Clade_GH (%) | Clade_GK (%) | Clade_GR (%) | Clade_GRA (%) | Clade_GRY (%) | Clade_GV (%) |
|---|---|---|---|---|---|---|---|
| Clade_G | 61.47 | 8.11 | 0.29 | 5.79 | 0.21 | 0.46 | 4.53 |
| Clade_GH | 10.47 | 80.27 | 1.40 | 7.91 | 3.00 | 0.14 | 11.33 |
| Clade_GK | 2.97 | 2.33 | 96.60 | 0.85 | 5.42 | 0.03 | 0.91 |
| Clade_GR | 7.31 | 3.39 | 0.79 | 71.63 | 1.15 | 1.67 | 2.42 |
| Clade_GRA | 10.72 | 1.82 | 0.76 | 1.03 | 90.13 | 0.36 | 0.00 |
| Clade_GRY | 1.04 | 0.44 | 0.02 | 11.23 | 0.06 | 97.34 | 0.45 |
| Clade_GV | 6.01 | 3.65 | 0.15 | 1.56 | 0.02 | 0.00 | 80.36 |

(b)

| Prediction/Ref | Africa (%) | Asia (%) | Europe (%) | North_America (%) | Oceania (%) | South_America (%) | Unknown (%) |
|---|---|---|---|---|---|---|---|
| Africa | 0.67 | 0.01 | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 |
| Asia | 0.45 | 5.29 | 0.52 | 0.46 | 1.52 | 0.25 | 1.19 |
| Europe | 61.80 | 51.12 | 77.48 | 31.33 | 65.24 | 33.90 | 41.52 |
| North_America | 36.94 | 43.42 | 21.65 | 67.25 | 28.69 | 34.00 | 56.55 |
| Oceania | 0.00 | 0.01 | 0.00 | 0.56 | 4.21 | 0.21 | 0.00 |
| South_America | 0.13 | 0.16 | 0.34 | 0.39 | 0.00 | 31.64 | 0.34 |
| Unknown | 0.00 | 0.00 | 0.00 | 0.00 | 0.34 | 0.00 | 0.40 |
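
Reading note: in both parts of Table 4, each reference column sums to about 100%, which suggests the values are normalized per reference class. Under that assumption, the diagonal entry for a class is its recall (the share of that reference class predicted correctly). The Python sketch below is not from the article; it transcribes the table values and computes the column sums, the per-class recall, and the unweighted (macro) mean recall. The function and variable names are hypothetical, and the macro mean is not the overall accuracy, because the table does not report class sizes.

```python
# Values transcribed from Table 4: rows = predicted class, columns = reference class (%).
# Assumption (not stated in the table): columns are normalized per reference class.
CLADES = ["G", "GH", "GK", "GR", "GRA", "GRY", "GV"]
MATRIX_A = [
    [61.47, 8.11, 0.29, 5.79, 0.21, 0.46, 4.53],
    [10.47, 80.27, 1.40, 7.91, 3.00, 0.14, 11.33],
    [2.97, 2.33, 96.60, 0.85, 5.42, 0.03, 0.91],
    [7.31, 3.39, 0.79, 71.63, 1.15, 1.67, 2.42],
    [10.72, 1.82, 0.76, 1.03, 90.13, 0.36, 0.00],
    [1.04, 0.44, 0.02, 11.23, 0.06, 97.34, 0.45],
    [6.01, 3.65, 0.15, 1.56, 0.02, 0.00, 80.36],
]

CONTINENTS = ["Africa", "Asia", "Europe", "North_America",
              "Oceania", "South_America", "Unknown"]
MATRIX_B = [
    [0.67, 0.01, 0.01, 0.00, 0.00, 0.00, 0.00],
    [0.45, 5.29, 0.52, 0.46, 1.52, 0.25, 1.19],
    [61.80, 51.12, 77.48, 31.33, 65.24, 33.90, 41.52],
    [36.94, 43.42, 21.65, 67.25, 28.69, 34.00, 56.55],
    [0.00, 0.01, 0.00, 0.56, 4.21, 0.21, 0.00],
    [0.13, 0.16, 0.34, 0.39, 0.00, 31.64, 0.34],
    [0.00, 0.00, 0.00, 0.00, 0.34, 0.00, 0.40],
]


def column_sums(matrix):
    """Sum each reference column; close to 100 if columns are normalized."""
    return [sum(row[j] for row in matrix) for j in range(len(matrix[0]))]


def per_class_recall(labels, matrix):
    """Map each label to its diagonal entry (% of that reference class predicted correctly)."""
    return {label: matrix[i][i] for i, label in enumerate(labels)}


def macro_recall(matrix):
    """Unweighted mean of the diagonal entries (macro-averaged recall, in %)."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


if __name__ == "__main__":
    for name, labels, matrix in (("clades", CLADES, MATRIX_A),
                                 ("continents", CONTINENTS, MATRIX_B)):
        print(name, "column sums:", [round(s, 2) for s in column_sums(matrix)])
        print(name, "per-class recall:", per_class_recall(labels, matrix))
        print(name, "macro recall: %.2f" % macro_recall(matrix))
```

Read this way, the off-diagonal entries in a column show where a reference class is misassigned; for example, the Europe and North_America rows of part (b) hold most of the mass in every reference column.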