Table 5 Validation of the seven rules using randomly sub-sampled test sets from specialized databases. Performance is given assuming mass spectrometric measurement errors of ±5% for isotope abundance and ±3 ppm for mass accuracy, and calculating element combinations of C, H, N, S, O, P, F, Cl and Br.

From: Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry

| Test set and source | Number of random formulas | Mass range [Da] | Target DB top hit [%] | PubChem top hit [%] | PubChem false top hit [%] | No DB query, top 3 hits [%] |
|---|---|---|---|---|---|---|
| Pharmaceuticals (DrugBank) | 2400 | 30–1093 | 99 | 90 | 8 | 78 |
| Natural Products (DNP) | 1200 | 92–2020 | 99 | 84 | 10 | 81 |
| Toxic Chemicals (TSCA) | 1200 | 56–2170 | 98 | 87 | 8 | 78 |
| Unknowns taken from Wiley+NIST | 1200 | 150–1536 | - | - | 78 | 65 |
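
To make the ±3 ppm mass accuracy assumption in the caption concrete, the following minimal Python sketch converts a ppm tolerance into an absolute mass window and checks whether a measured mass falls within it. The function names `ppm_window` and `within_ppm` are hypothetical and chosen for illustration only; they are not taken from the Seven Golden Rules software.

```python
from typing import Tuple


def ppm_window(mass_da: float, tol_ppm: float = 3.0) -> Tuple[float, float]:
    """Return the (low, high) mass window in Da for a symmetric ppm tolerance.

    Hypothetical helper: 1 ppm is one millionth of the reference mass.
    """
    delta = mass_da * tol_ppm / 1e6
    return mass_da - delta, mass_da + delta


def within_ppm(measured_da: float, theoretical_da: float, tol_ppm: float = 3.0) -> bool:
    """Check whether a measured mass lies within tol_ppm of a theoretical mass."""
    error_ppm = (measured_da - theoretical_da) / theoretical_da * 1e6
    return abs(error_ppm) <= tol_ppm


if __name__ == "__main__":
    low, high = ppm_window(500.0)
    # At 500 Da, a 3 ppm tolerance corresponds to roughly +/- 0.0015 Da.
    print(f"window: {low:.4f} to {high:.4f} Da")
    print(within_ppm(500.001, 500.0))  # about 2 ppm off the reference mass
    print(within_ppm(500.002, 500.0))  # about 4 ppm off the reference mass
```

Because the tolerance is relative, the absolute window widens with mass: roughly ±0.0015 Da at 500 Da but ±0.006 Da at 2000 Da, near the upper end of the mass ranges listed in the table.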