Fig. 2
From: PhyloSophos: a high-throughput scientific name mapping algorithm augmented with explicit consideration of taxonomic science, and its application on natural product (NP) occurrence database processing

Assessment of the effects of PhyloSophos' core concepts on scientific name mapping.

A. Partial mapping coverage can be improved by using multiple databases. Whole: all canonical scientific names that appear uniquely in CoL, EoL or GBIF. CoL: canonical scientific names that appear uniquely in CoL. EoL: canonical scientific names that appear uniquely in EoL. GBIF: canonical scientific names that appear uniquely in GBIF. Dark blue: theoretical maximum mapping coverage achievable with a single taxonomic database. Gold: mapping coverage achieved with multiple databases.

B. Phylogenetic domain matching accuracy for scientific names with homonymic generic epithets. Dark blue: random choice (null hypothesis). Gold: PhyloSophos mapping result.

C. Identification of name inputs with strain-like elements (n = 2,988). Fractions of name inputs that were assigned mapping status codes 0–5 (exact match codes) or 40 (strain name code) were calculated for each taxonomic reference. Dark blue: exact match. Gold: nearest match. Grey: strain-like element identified.

D. Reconstruction accuracy for name inputs with Latin declension (n = 353). Fractions of name inputs that were assigned mapping status codes 30/31 were calculated for each taxonomic reference. Dark blue: fraction of name inputs mapped with edit distance (Damerau-Levenshtein) based correction. Gold: fraction of name inputs mapped with Latin declension correction.
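
For readers unfamiliar with the edit distance named in panel D, the following is a minimal sketch of the optimal string alignment distance, a commonly used restricted form of Damerau-Levenshtein distance. It is an illustration only and not PhyloSophos' own code: the function name osa_distance and the example strings are hypothetical, and the caption does not say which variant of the distance the tool uses.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions and transpositions of two
    adjacent characters. Illustrative sketch, not the PhyloSophos code.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]

    # Distance from a prefix to the empty string is the prefix length.
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "en" <-> "ne".
            if (
                i > 1
                and j > 1
                and a[i - 1] == b[j - 2]
                and a[i - 2] == b[j - 1]
            ):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)

    return d[-1][-1]


if __name__ == "__main__":
    # Hypothetical inputs: a swapped letter pair and a declined ending.
    print(osa_distance("ginseng", "ginsneg"))
    print(osa_distance("glycyrrhizae", "glycyrrhiza"))
```

A purely distance-based correction treats a declined ending such as a trailing "-ae" like any other character edit, which is one plausible reason a dedicated Latin declension rule is compared against it in panel D.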