Figure 1 | BMC Bioinformatics

Figure 1

From: Natural product-likeness score revisited: an open-source, open-data implementation

Molecule curation and atom signature generation workflow. This workflow takes a set of compounds as input, curates each compound structure, and generates atom signatures for it. The Iterative SDfile Reader reads the input compounds (Query file) in Structure Data Format (SDF). The input can be a single SDF file or a list of files. The number of compounds read and passed down the workflow in each iteration is set through the Iterations port. As soon as the compounds are read, the Tag Molecules With UUID worker tags every compound with a UUID, which makes it possible to track each compound through to the end of the scoring process. As the first curation step, the Molecule Connectivity Checker checks whether the atoms in the compound structure are connected; this step removes counter ions and other small disconnected fragments. The Remove Sugar groups worker then removes linear and ring sugars from the compound structures. Finally, the Curate Strange Elements worker checks the compound structures for elements other than non-metals and discards any structure that contains them. The Generate Atom Signatures worker consumes the curated molecules and generates an atom signature for every atom in each compound structure. The generated atom signatures are written to a text file (Signature) for re-use. The curated and discarded structures can be written to an SDF file at any step of the process. In this workflow, the initially tagged compounds (tagged structures) and the fully curated compounds (Curated Structures) are written to SDF files. This workflow is available for free download at
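The curation steps described in the caption can be sketched roughly as follows. This is a minimal illustration in plain Python, not the workflow's actual workers: the dictionary-based molecule representation, the function names, the `ALLOWED_ELEMENTS` set, and the choice to keep only the largest connected fragment are all assumptions made for the example. Sugar removal and atom signature generation are left out because they need a full cheminformatics toolkit.

```python
import uuid
from collections import defaultdict

# Assumption: an illustrative set of accepted non-metal elements.
ALLOWED_ELEMENTS = {"H", "B", "C", "N", "O", "F", "Si", "P", "S",
                    "Cl", "As", "Se", "Br", "I"}


def tag_with_uuid(mol):
    """Return a copy of the molecule with a fresh UUID attached for tracking."""
    tagged = dict(mol)
    tagged["uuid"] = str(uuid.uuid4())
    return tagged


def largest_fragment(mol):
    """Keep the largest connected component, dropping counter ions and
    other small disconnected fragments (assumed behaviour)."""
    atoms = mol["atoms"]
    neighbours = defaultdict(set)
    for a, b in mol["bonds"]:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, best = set(), []
    for start in range(len(atoms)):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if len(component) > len(best):
            best = component

    keep = sorted(best)
    index = {old: new for new, old in enumerate(keep)}
    result = dict(mol)
    result["atoms"] = [atoms[i] for i in keep]
    result["bonds"] = [(index[a], index[b]) for a, b in mol["bonds"]
                       if a in index and b in index]
    return result


def has_only_allowed_elements(mol):
    """True if every atom is one of the accepted non-metal elements."""
    return all(element in ALLOWED_ELEMENTS for element in mol["atoms"])


def curate(molecules):
    """Tag, strip small fragments, then split into curated and discarded."""
    curated, discarded = [], []
    for mol in molecules:
        mol = largest_fragment(tag_with_uuid(mol))
        if has_only_allowed_elements(mol):
            curated.append(mol)
        else:
            discarded.append(mol)
    return curated, discarded
```

For example, a sodium acetate entry given as `{"atoms": ["C", "C", "O", "O", "Na"], "bonds": [(0, 1), (1, 2), (1, 3)]}` would lose its unbonded sodium atom and land in the curated list, while a structure with a bonded metal atom would land in the discarded list.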
