Table 1 Classification performance and identified motifs for common lectins

From: Identifying glycan motifs using a novel subtree mining approach

| Lectin | Conc. (μg/ml) | AUC (Validation) | AUC (Train) | Top Motif* |
| --- | --- | --- | --- | --- |
| Agaricus bisporus agglutinin (ABA) | 100 | 0.934 (0.034) | 0.947 (0.006) | (*3,4,6)GlcNAc α |
| Concanavalin A (Con A) | 10 | 0.971 (0.031) | 0.982 (0.015) | Man α1-3(*2,4)Man |
| Dolichos biflorus agglutinin (DBA) | 100 | 0.839 (0.069) | 0.897 (0.042) | (*3,4,6)GalNAc |
| Human DC-SIGN tetramer | 200 | 0.841 (0.062) | 0.955 (0.026) | Man α1-3(Man α1-6)(*2,4)Man α |
| Griffonia simplicifolia Lectin I isolectin B4 (GSL I-B4) | 10 | 0.867 (0.061) | 0.953 (0.014) | (*2,3,4,6)Gal α1-3Gal β |
| Influenza hemagglutinin (HA) (A/Puerto Rico/8/34) (H1N1) | 200 | 0.913 (0.105) | 0.973 (0.023) | (*8,9)Neu5Ac α |
| Influenza HA (A/harbor seal/Massachusetts/1/2011) (H3N8) | 200 | 0.959 (0.028) | 0.958 (0.007) | (*8,9)Neu5Ac α2-3(*2,4,6)Gal |
| Jacalin | 1 | 0.882 (0.055) | 0.896 (0.009) | (*4,6)GalNAc α/ β |
| Lens culinaris agglutinin (LCA) | 10 | 0.964 (0.032) | 0.976 (0.008) | Man α1-3Man α |
| Maackia amurensis lectin I (MAL-I) | 10 | 0.833 (0.035) | 0.848 (0.053) | (*2,4,6)Gal β1-4(*3,6)GlcNAc α/ β |
| Maackia amurensis lectin II (MAL-II) | 10 | 0.718 (0.078) | 0.814 (0.074) | Gal β1-3GalNAc α |
| Phaseolus vulgaris erythroagglutinin (PHA-E) | 10 | 0.959 (0.018) | 0.975 (0.009) | (*2,4,6)Gal β1-4(*3,6)GlcNAc β1-2Man α1-3(Man α1-6)Man |
| Phaseolus vulgaris leucoagglutinin (PHA-L) | 10 | 0.914 (0.126) | 0.967 (0.030) | GlcNAc β1-6(*3,4)Man |
| Peanut agglutinin (PNA) | 10 | 0.914 (0.048) | 0.943 (0.021) | (*2,3,4,6)Gal β1-3GalNAc |
| Pisum sativum agglutinin (PSA) | 10 | 0.890 (0.053) | 0.929 (0.028) | Man α1-3(*2,4)Man |
| Ricinus communis agglutinin I (RCA I/RCA120) | 10 | 0.953 (0.026) | 0.958 (0.008) | (*2,3,4,6)Gal β1-4(*3,6)GlcNAc |
| Soybean agglutinin (SBA) | 10 | 0.875 (0.061) | 0.938 (0.026) | (*3,4,6)GalNAc |
| Sambucus nigra agglutinin (SNA) | 10 | 0.950 (0.060) | 0.979 (0.010) | Neu5Ac α2-6Gal β1-4GlcNAc |
| Ulex europaeus agglutinin I (UEA I) | 100 | 0.861 (0.049) | 0.895 (0.042) | (*3)Fuc |
| Wheat germ agglutinin (WGA) | 1 | 0.882 (0.021) | 0.901 (0.004) | GlcNAc β1-3Gal β1-4(*3,6)GlcNAc β1-3(*2,4,6)Gal β1-4(*3,6)GlcNAc |
 
  1. Model performance was assessed using stratified 5-fold cross-validation, with area under the curve (AUC) values calculated for both validation and training folds and reported as mean (s.d.). The top motif is the feature with the highest coefficient in the logistic regression classification model and is shown for a single test/training split. Experimentally determined lectin specificities and associated citations are provided in Additional file 7.
  2. *Note: Motifs are written in a modified CFG linear text nomenclature. A set of parentheses containing connection types preceded by an asterisk indicates restricted connection types for the residue that follows. For example, a GlcNAc motif with restricted connections on C3 and C4 is written (*3,4)GlcNAc.
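
The restricted-connection notation described in the note can be read mechanically. The following is a minimal sketch, not the authors' parser: the function name `restricted_positions` and the regular expression are assumptions made for illustration. It extracts each asterisk-prefixed group together with the residue that follows it, and ignores ordinary branch parentheses such as `(Man α1-6)`.

```python
import re

# Hypothetical pattern: an asterisk-prefixed group such as "(*3,4)"
# followed immediately by a residue name such as "GlcNAc" or "Neu5Ac".
RESTRICTION = re.compile(r"\(\*(\d+(?:,\d+)*)\)([A-Za-z][A-Za-z0-9]*)")


def restricted_positions(motif):
    """Return (residue, restricted positions) pairs in left-to-right order.

    Parentheses that do not start with '*' are branches, not restrictions,
    so they never match the pattern above.
    """
    return [
        (m.group(2), tuple(int(p) for p in m.group(1).split(",")))
        for m in RESTRICTION.finditer(motif)
    ]


if __name__ == "__main__":
    # Example from the note, then two motifs taken from the table.
    for text in (
        "(*3,4)GlcNAc",
        "Man α1-3(Man α1-6)(*2,4)Man α",
        "(*8,9)Neu5Ac α2-3(*2,4,6)Gal",
    ):
        print(text, "->", restricted_positions(text))
```

A motif without any asterisk-prefixed group, such as `Man α1-3Man α`, yields an empty list under this sketch.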