Identifying glycan motifs using a novel subtree mining approach

Coff, Lachlan; Chan, Jeffrey; Ramsland, Paul A.; Guy, Andrew J.

doi:10.1186/s12859-020-3374-4

BMC Bioinformatics

Table 2 Comparison of classifier performance across different motif generation tools

Lectin	GLYMMR (mean)	GLYMMR (best)	Glycan Miner Tool	MotifFinder	CCARL
Agaricus bisporus agglutinin (ABA)	0.607 (0.151)	0.776 (0.088)	0.888 (0.067)	0.905	0.934 (0.034)
Concanavalin A (Con A)	0.760 (0.083)	0.875 (0.048)	0.951 (0.042)	0.937	0.971 (0.031)
Dolichos biflorus agglutinin (DBA)	0.630 (0.098)	0.674 (0.126)	0.722 (0.083)	0.936	0.839 (0.069)
Human DC-SIGN tetramer	0.634 (0.132)	0.727 (0.125)	0.823 (0.130)	0.538	0.841 (0.062)
Griffonia simplicifolia Lectin I isolectin B₄ (GSL I-B₄)	0.773 (0.103)	0.847 (0.086)	0.875 (0.066)	0.875	0.867 (0.061)
Influenza hemagglutinin (HA) (A/Puerto Rico/8/34) (H1N1)	0.851 (0.140)	0.889 (0.103)	0.838 (0.144)	0.643	0.917 (0.104)
Influenza HA (A/harbor seal/Massachusetts/1/2011) (H3N8)	0.925 (0.059)	0.935 (0.034)	0.947 (0.021)	0.717	0.958 (0.028)
Jacalin	0.782 (0.061)	0.804 (0.050)	0.848 (0.026)	0.726	0.882 (0.055)
Lens culinaris agglutinin (LCA)	0.772 (0.092)	0.811 (0.083)	0.908 (0.083)	0.832	0.956 (0.037)
Maackia amurensis lectin I (MAL-I)	0.700 (0.054)	0.758 (0.057)	0.868 (0.050)	0.873	0.833 (0.035)
Maackia amurensis lectin II (MAL-II)	0.600 (0.162)	0.827 (0.056)	0.850 (0.091)	0.830	0.721 (0.073)
Phaseolus vulgaris erythroagglutinin (PHA-E)	0.817 (0.061)	0.875 (0.044)	0.910 (0.016)	0.496	0.965 (0.021)
Phaseolus vulgaris leucoagglutinin (PHA-L)	0.805 (0.095)	0.829 (0.089)	0.858 (0.110)	0.636	0.875 (0.132)
Peanut agglutinin (PNA)	0.668 (0.116)	0.751 (0.133)	0.894 (0.041)	0.617	0.914 (0.048)
Pisum sativum agglutinin (PSA)	0.796 (0.070)	0.830 (0.050)	0.858 (0.064)	0.694	0.891 (0.053)
Ricinus communis agglutinin I (RCA I/RCA₁₂₀)	0.696 (0.053)	0.751 (0.032)	0.848 (0.034)	0.909	0.953 (0.026)
Soybean agglutinin (SBA)	0.542 (0.061)	0.582 (0.049)	0.781 (0.046)	0.775	0.875 (0.061)
Sambucus nigra agglutinin (SNA)	0.962 (0.051)	0.963 (0.057)	0.962 (0.050)	0.820	0.961 (0.059)
Ulex europaeus agglutinin I (UEA I)	0.703 (0.099)	0.734 (0.057)	0.866 (0.023)	0.951	0.859 (0.047)
Wheat germ agglutinin (WGA)	0.663 (0.048)	0.697 (0.055)	0.831 (0.034)	0.817	0.883 (0.021)

Model performance was assessed using stratified 5-fold cross-validation, with mean area under the curve (AUC) values calculated across all validation folds and shown as mean (s.d.). The best-performing tool for each sample is highlighted in bold. Note that MotifFinder was evaluated with a single train-test split because the tool was difficult to automate, so no s.d. is reported for it. GLYMMR was evaluated across a range of minimum support thresholds; AUC values are reported for the best threshold and as the mean across all thresholds.
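
The evaluation scheme described in the caption can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names (`auc`, `stratified_folds`, `cross_validated_auc`, `fit_centroid`) are hypothetical, the AUC is assumed to be the area under the ROC curve computed from ranks, the s.d. is assumed to be the sample standard deviation, and the toy one-feature classifier stands in for whatever model is trained on the motif features.

```python
import random
from statistics import mean, stdev


def auc(labels, scores):
    """Rank-based ROC AUC: fraction of (positive, negative) pairs in which
    the positive has the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, dealing each class out in turn
    so that every fold keeps roughly the overall class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cross_validated_auc(X, y, fit, k=5, seed=0):
    """Return (mean, s.d.) of the AUC over k stratified validation folds.

    `fit(train_X, train_y)` must return a function that maps one sample
    to a score. Each class needs at least k samples so that every
    validation fold contains both positives and negatives.
    """
    aucs = []
    for fold in stratified_folds(y, k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        score = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([y[i] for i in fold], [score(X[i]) for i in fold]))
    return mean(aucs), stdev(aucs)


def fit_centroid(train_X, train_y):
    """Toy one-feature classifier (an assumption for illustration only):
    the score is the signed distance from the midpoint of the class means."""
    pos = mean(x for x, t in zip(train_X, train_y) if t == 1)
    neg = mean(x for x, t in zip(train_X, train_y) if t == 0)
    mid = (pos + neg) / 2
    sign = 1 if pos >= neg else -1
    return lambda x: sign * (x - mid)
```

Calling `cross_validated_auc(X, y, fit_centroid)` on a labelled data set returns a pair that would be formatted as "mean (s.d.)" in the table. For a tool evaluated over several settings, such as GLYMMR's minimum support thresholds, the same call could be repeated per setting and the per-setting means summarised by their maximum and their average.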

ISSN: 1471-2105
