Table 4 Codons that played an important role in distinguishing the other clusters from cluster 9, the reference sequence that first occurred in Wuhan, China

From: NGS data vectorization, clustering, and finding key codons in SARS-CoV-2 variations

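| Cluster (group) | Codon (amino acid) | Cluster 1 (B, Ref): Random forest | Cluster 1 (B, Ref): SHAP | Cluster 125 (A, Alpha): Random forest | Cluster 125 (A, Alpha): SHAP | Cluster 140 (C, Delta): Random forest | Cluster 140 (C, Delta): SHAP | Cluster 536 (H, 490R-GH): Random forest | Cluster 536 (H, 490R-GH): SHAP | Cluster 650 (I, Omicron): Random forest | Cluster 650 (I, Omicron): SHAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster 9 (A, Ref) | AGC (SER) | 0.26 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.03 | 0 |
| | CAG (GLN) | 0.21 | 1.915 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | CCU (PRO) | 0 | 0 | 0.088 | 15.69 | 0.13 | 0 | 0.032 | 0 | 0 | 0 |
| | CUG (LEU) | 0 | 0 | 0.012 | 0.018 | 0.12 | 38.2 | 0 | 0 | 0 | 0 |
| | GGU (GLY) | 0 | 0 | 0.01 | 0 | 0.08 | 5.627 | 0 | 0 | 0.05 | 0 |
| | ACU (THR) | 0 | 0 | 0 | 0 | 0.06 | 11.35 | 0.061 | 0 | 0.03 | 0 |
| | GAU (ASP) | 0 | 0 | 0.024 | 0.160 | 0.04 | 0 | 0.041 | 0 | 0 | 0 |
| | ACA (THR) | 0 | 0 | 0.137 | 22.160 | 0.02 | 8.931 | 0 | 0 | 0.05 | 0 |
| | GAC (ASP) | 0 | 0 | 0.074 | 0.297 | 0 | 0 | 0.068 | 0 | 0.03 | 0 |
| | AAU (ASN) | 0 | 0 | 0.08 | 0.181 | 0 | 0 | 0.058 | 0 | 0.05 | 0 |
| | AGA (ARG) | 0 | 0 | 0 | 0 | 0 | 0 | 0.05 | 0 | 0 | 0 |
| | CCA (PRO) | 0 | 0 | 0 | 0 | 0 | 0 | 0.04 | 0 | 0 | 0 |
| | UUU (PHE) | 0 | 0 | 0 | 0 | 0 | 0 | 0.04 | 7.758 | 0 | 0 |
| | AUU (ILE) | 0 | 0 | 0 | 0 | 0 | 0 | 0.038 | 0 | 0 | 0 |
| | UAU (TYR) | 0 | 0 | 0 | 0 | 0 | 0 | 0.032 | 0 | 0 | 0 |
| | GAG (GLU) | 0 | 0 | 0 | 0 | 0 | 0 | 0.03 | 0 | 0 | 58.34 |
| | AAG (LYS) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.05 | 0 |
| | UCA (SER) | 0 | 0 | 0.033 | 0.043 | 0 | 0 | 0 | 0 | 0.04 | 0 |
| | CAA (GLN) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.03 | 0 |
| | GCU (ALA) | 0 | 0 | 0.097 | 0.144 | 0 | 0 | 0 | 0 | 0.03 | 0 |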
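As a reading aid, the following minimal Python sketch picks out the codon with the largest SHAP value for each comparison cluster. It is not part of the original analysis. The SHAP values are transcribed from the table above, with rows whose SHAP values are all zero omitted. The helper name `top_codon_per_cluster` is hypothetical, and the sketch assumes that a larger SHAP value indicates a larger contribution to separating that cluster from cluster 9.

```python
# SHAP columns transcribed from Table 4 (cluster 9 vs. each listed cluster).
# Rows whose SHAP values are zero in every column are omitted.
CLUSTERS = ["Cluster 1", "Cluster 125", "Cluster 140", "Cluster 536", "Cluster 650"]

SHAP = {
    "CAG (GLN)": [1.915, 0, 0, 0, 0],
    "CCU (PRO)": [0, 15.69, 0, 0, 0],
    "CUG (LEU)": [0, 0.018, 38.2, 0, 0],
    "GGU (GLY)": [0, 0, 5.627, 0, 0],
    "ACU (THR)": [0, 0, 11.35, 0, 0],
    "GAU (ASP)": [0, 0.160, 0, 0, 0],
    "ACA (THR)": [0, 22.160, 8.931, 0, 0],
    "GAC (ASP)": [0, 0.297, 0, 0, 0],
    "AAU (ASN)": [0, 0.181, 0, 0, 0],
    "UUU (PHE)": [0, 0, 0, 7.758, 0],
    "GAG (GLU)": [0, 0, 0, 0, 58.34],
    "UCA (SER)": [0, 0.043, 0, 0, 0],
    "GCU (ALA)": [0, 0.144, 0, 0, 0],
}


def top_codon_per_cluster(shap_table, cluster_names):
    """Return {cluster: (codon, value)} for the largest SHAP value in each column."""
    result = {}
    for col, name in enumerate(cluster_names):
        # Hypothetical helper: choose the codon whose value in this column is largest.
        codon = max(shap_table, key=lambda c: shap_table[c][col])
        result[name] = (codon, shap_table[codon][col])
    return result


top = top_codon_per_cluster(SHAP, CLUSTERS)
for name, (codon, value) in top.items():
    print(f"{name}: {codon} (SHAP {value})")
```

The same lookup could be applied to the random forest columns by transcribing those values instead. The two measures are on different scales in the table, so they are best compared within a column rather than against each other.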