
Table 5. Validation of the seven rules using randomly sub-sampled test sets from specialized databases. Performance is given assuming mass spectrometry errors of ±5% in isotope abundance and ±3 ppm in mass accuracy, with element combinations of C, H, N, S, O, P, F, Cl and Br calculated.

From: Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry

| Test set and source | Number of random formulas | Mass range [Da] | Target DB top hit [%] | PubChem top hit [%] | PubChem false top hit [%] | No DB query, top 3 hits [%] |
| --- | --- | --- | --- | --- | --- | --- |
| Pharmaceuticals (DrugBank) | 2400 | 30–1093 | 99 | 90 | 8 | 78 |
| Natural Products (DNP) | 1200 | 92–2020 | 99 | 84 | 10 | 81 |
| Toxic Chemicals (TSCA) | 1200 | 56–2170 | 98 | 87 | 8 | 78 |
| Unknowns taken from Wiley+NIST | 1200 | 150–1536 | - | - | 78 | 65 |
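
The ±3 ppm mass accuracy assumed in the caption corresponds to an absolute tolerance that grows with mass. As a rough illustration, the following minimal sketch converts a ppm tolerance into a mass window in Da and checks whether a measured mass falls inside it. The function names `ppm_window` and `within_ppm` are hypothetical helpers introduced here for illustration and are not taken from the paper or its software.

```python
def ppm_window(mass_da, ppm=3.0):
    """Return the (low, high) mass window in Da for a given ppm tolerance."""
    # 1 ppm is one millionth of the reference mass.
    delta = mass_da * ppm / 1e6
    return mass_da - delta, mass_da + delta


def within_ppm(measured_da, theoretical_da, ppm=3.0):
    """Return True if the measured mass lies within +/- ppm of the theoretical mass."""
    error_ppm = abs(measured_da - theoretical_da) / theoretical_da * 1e6
    return error_ppm <= ppm


if __name__ == "__main__":
    low, high = ppm_window(500.0)
    print(f"500 Da at 3 ppm: {low:.4f} to {high:.4f} Da")
    print(within_ppm(500.001, 500.0))  # error of 2 ppm
    print(within_ppm(500.002, 500.0))  # error of 4 ppm
```

Under these assumptions, a 500 Da ion has a window of about ±0.0015 Da, while an ion near 2000 Da, close to the upper end of the mass ranges in the table, has a window of about ±0.006 Da.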