Table 1 Summary statistics for the results of automated classification of chemical entities in the test set

From: Self-organizing ontology of biochemically relevant small molecules

| Tested Category | Total Inferences | Direct Inferences, n (%) | Correct Inferences, n (%) |
| --- | --- | --- | --- |
| Acetamides | 13 | 4 (30.8) | 13 (100.0) |
| Acetic Anhydrides | 11 | 5 (45.5) | 9 (81.8) |
| Acrylamides | 20 | 4 (20.0) | 19 (95.0) |
| Alcohols | 9 | 4 (44.4) | 9 (100.0) |
| Aldehydes | 6 | 2 (33.3) | 6 (100.0) |
| Alkadienes | 8 | 5 (62.5) | 8 (100.0) |
| Amides | 2 | 1 (50.0) | 2 (100.0) |
| Amines | 9 | 0 (0.0) | 9 (100.0) |
| Amino Alcohols | 7 | 3 (42.9) | 7 (100.0) |
| Aminopyridines | 12 | 3 (25.0) | 10 (83.3) |
| Anhydrides | 5 | 3 (60.0) | 5 (100.0) |
| Aza Compounds | 3 | 1 (33.3) | 3 (100.0) |
| Benzaldehydes | 11 | 4 (36.4) | 11 (100.0) |
| Benzyl Alcohols | 10 | 4 (40.0) | 10 (100.0) |
| Benzylamines | 10 | 4 (40.0) | 10 (100.0) |
| Boron Compounds | 8 | 5 (62.5) | 7 (87.5) |
| Butylamines | 11 | 5 (45.5) | 11 (100.0) |
| Carbodiimides | 2 | 1 (50.0) | 2 (100.0) |
| Carboxylic Acids | 9 | 0 (0.0) | 4 (44.4) |
| Chlorohydrins | 11 | 5 (45.5) | 10 (90.9) |
| Cyanates | 4 | 3 (75.0) | 4 (100.0) |
| Cyclohexylamines | 6 | 3 (50.0) | 6 (100.0) |
| Diazonium Compounds | 15 | 5 (33.3) | 10 (66.7) |
| Ethers | 6 | 5 (83.3) | 6 (100.0) |
| Ethylamines | 4 | 2 (50.0) | 4 (100.0) |
| Fatty Alcohols | 2 | 2 (100.0) | 2 (100.0) |
| Formamides | 7 | 5 (71.4) | 7 (100.0) |
| Glycols | 6 | 2 (33.3) | 3 (50.0) |
| Guanidines | 15 | 5 (33.3) | 15 (100.0) |
| Hydrazines | 11 | 3 (27.3) | 11 (100.0) |
| Hydroxylamines | 14 | 5 (35.7) | 13 (92.9) |
| Imides | 21 | 5 (23.8) | 21 (100.0) |
| Imines | 7 | 2 (28.6) | 7 (100.0) |
| Isocyanates | 10 | 5 (50.0) | 7 (70.0) |
| Ketones | 6 | 5 (83.3) | 6 (100.0) |
| Lactams | 17 | 4 (23.5) | 17 (100.0) |
| Lactones | 10 | 5 (50.0) | 10 (100.0) |
| Methylamines | 7 | 3 (42.9) | 7 (100.0) |
| Nitrates | 6 | 2 (33.3) | 6 (100.0) |
| Nitriles | 8 | 5 (62.5) | 8 (100.0) |
| Nitrites | 5 | 5 (100.0) | 5 (100.0) |
| Nitro Compounds | 22 | 5 (22.7) | 17 (77.3) |
| Nitroso Compounds | 13 | 4 (30.8) | 11 (84.6) |
| Organic Compounds | 2 | 1 (50.0) | 2 (100.0) |
| Organophosphorus Compounds | 18 | 5 (27.8) | 16 (88.9) |
| Organoselenium Compounds | 6 | 3 (50.0) | 6 (100.0) |
| Organosilicon Compounds | 6 | 5 (83.3) | 6 (100.0) |
| Organothiophosphorus Compounds | 8 | 5 (62.5) | 8 (100.0) |
| Peroxides | 5 | 5 (100.0) | 5 (100.0) |
| Phenols | 8 | 5 (62.5) | 8 (100.0) |
| Total | 452 | 182 (40.3) | 419 (92.7) |
  1. Direct inferences are class inferences that are identical to the annotations in the test set. Correct inferences comprise the direct inferences plus the inferences that a curator deemed correct. A lack of direct inferences does not indicate an error; it indicates only that another class had a definition that matched a given molecule more closely than its original class did. More than one inference was possible for a given molecule. For each category, the values in parentheses are percentages of that category's total inferences.
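
As an illustration of how the percentages relate to the counts, the following minimal Python sketch recomputes them for a few rows copied from Table 1. The names `ROWS`, `pct`, and `summarize` are hypothetical helpers introduced here, not part of the original work. The "curator-only" figure (correct minus direct) is one reading of the note above and assumes every direct inference is also counted as correct.

```python
# Each row: (category, total inferences, direct inferences, correct inferences).
# Counts are copied from a few rows of Table 1; the structure is illustrative.
ROWS = [
    ("Acetamides", 13, 4, 13),
    ("Carboxylic Acids", 9, 0, 4),
    ("Diazonium Compounds", 15, 5, 10),
    ("Total", 452, 182, 419),
]


def pct(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


def summarize(rows):
    """Map each category to (direct %, correct %, curator-only count)."""
    summary = {}
    for category, total, direct, correct in rows:
        # Assumption: correct inferences include the direct ones, so the
        # difference is the number accepted by the curator alone.
        curator_only = correct - direct
        summary[category] = (pct(direct, total), pct(correct, total), curator_only)
    return summary


SUMMARY = summarize(ROWS)

for category, (direct_pct, correct_pct, curator_only) in SUMMARY.items():
    print(f"{category}: direct {direct_pct}%, correct {correct_pct}%, "
          f"curator-only {curator_only}")
```

Under this reading, a category such as Carboxylic Acids, with no direct inferences, still reaches 44.4% correct because all four of its correct inferences were accepted by the curator.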