
Table 4 Codons that played an important role in distinguishing other clusters from cluster 9, a reference sequence that first occurred in Wuhan, China

From: NGS data vectorization, clustering, and finding key codons in SARS-CoV-2 variations

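| Cluster (group) | Codon (amino acid) | Cluster 1 (B, Ref): Random forest | Cluster 1 (B, Ref): SHAP | Cluster 125 (A, Alpha): Random forest | Cluster 125 (A, Alpha): SHAP | Cluster 140 (C, Delta): Random forest | Cluster 140 (C, Delta): SHAP | Cluster 536 (H, 490R-GH): Random forest | Cluster 536 (H, 490R-GH): SHAP | Cluster 650 (I, Omicron): Random forest | Cluster 650 (I, Omicron): SHAP |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster 9 (A, Ref) | AGC (SER) | 0.26 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.03 | 0 |
| | CAG (GLN) | 0.21 | 1.915 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | CCU (PRO) | 0 | 0 | 0.088 | 15.69 | 0.13 | 0 | 0.032 | 0 | 0 | 0 |
| | CUG (LEU) | 0 | 0 | 0.012 | 0.018 | 0.12 | 38.2 | 0 | 0 | 0 | 0 |
| | GGU (GLY) | 0 | 0 | 0.01 | 0 | 0.08 | 5.627 | 0 | 0 | 0.05 | 0 |
| | ACU (THR) | 0 | 0 | 0 | 0 | 0.06 | 11.35 | 0.061 | 0 | 0.03 | 0 |
| | GAU (ASP) | 0 | 0 | 0.024 | 0.160 | 0.04 | 0 | 0.041 | 0 | 0 | 0 |
| | ACA (THR) | 0 | 0 | 0.137 | 22.160 | 0.02 | 8.931 | 0 | 0 | 0.05 | 0 |
| | GAC (ASP) | 0 | 0 | 0.074 | 0.297 | 0 | 0 | 0.068 | 0 | 0.03 | 0 |
| | AAU (ASN) | 0 | 0 | 0.08 | 0.181 | 0 | 0 | 0.058 | 0 | 0.05 | 0 |
| | AGA (ARG) | 0 | 0 | 0 | 0 | 0 | 0 | 0.05 | 0 | 0 | 0 |
| | CCA (PRO) | 0 | 0 | 0 | 0 | 0 | 0 | 0.04 | 0 | 0 | 0 |
| | UUU (PHE) | 0 | 0 | 0 | 0 | 0 | 0 | 0.04 | 7.758 | 0 | 0 |
| | AUU (ILE) | 0 | 0 | 0 | 0 | 0 | 0 | 0.038 | 0 | 0 | 0 |
| | UAU (TYR) | 0 | 0 | 0 | 0 | 0 | 0 | 0.032 | 0 | 0 | 0 |
| | GAG (GLU) | 0 | 0 | 0 | 0 | 0 | 0 | 0.03 | 0 | 0 | 58.34 |
| | AAG (LYS) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.05 | 0 |
| | UCA (SER) | 0 | 0 | 0.033 | 0.043 | 0 | 0 | 0 | 0 | 0.04 | 0 |
| | CAA (GLN) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.03 | 0 |
| | GCU (ALA) | 0 | 0 | 0.097 | 0.144 | 0 | 0 | 0 | 0 | 0.03 | 0 |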
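
As a reading aid, the sketch below picks out the codon with the largest SHAP value for each compared cluster. It is not part of the original article: the dictionary simply transcribes the non-zero SHAP entries from Table 4, and the names `shap_by_cluster` and `top_codon` are hypothetical, introduced only for this illustration. It assumes that a larger SHAP value means a larger contribution to separating that cluster from cluster 9, which is how the table appears to be used.

```python
# Non-zero SHAP values transcribed from Table 4 (codon -> SHAP value),
# grouped by the cluster being distinguished from cluster 9.
# Illustrative only; the variable and function names are assumptions.
shap_by_cluster = {
    "Cluster 1 (B, Ref)": {"CAG": 1.915},
    "Cluster 125 (A, Alpha)": {
        "CCU": 15.69, "CUG": 0.018, "GAU": 0.160, "ACA": 22.160,
        "GAC": 0.297, "AAU": 0.181, "UCA": 0.043, "GCU": 0.144,
    },
    "Cluster 140 (C, Delta)": {
        "CUG": 38.2, "GGU": 5.627, "ACU": 11.35, "ACA": 8.931,
    },
    "Cluster 536 (H, 490R-GH)": {"UUU": 7.758},
    "Cluster 650 (I, Omicron)": {"GAG": 58.34},
}


def top_codon(values):
    """Return the (codon, SHAP value) pair with the largest SHAP value."""
    return max(values.items(), key=lambda item: item[1])


# Map each cluster to its highest-SHAP codon.
top = {cluster: top_codon(values) for cluster, values in shap_by_cluster.items()}

for cluster, (codon, value) in top.items():
    print(f"{cluster}: {codon} (SHAP {value})")
```

The same lookup could be applied to the random forest importance columns to see where the two rankings agree or differ.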