Fig. 1 | BMC Bioinformatics

Fig. 1

From: CASTELO: clustered atom subtypes aided lead optimization—a combined machine learning and molecular modeling method

The general pipeline for CASTELO. The starting point is the generation of MD trajectories with tools such as GROMACS. In one route, RMSD clustering is performed, for example with VMD. In the other route, Python scripts process the MD trajectories to obtain contact matrices, and atom subtype information is used to aggregate them. Dynamism tensors carrying temporal information are then generated from the contact matrices, again with Python scripts. A CVAE model encodes the dynamism data, and clusters are then calculated with tools such as HDBSCAN. Finally, the two routes converge: clusters from conventional RMSD clustering and from CVAE clustering are compared using the proposed comparison metrics. The atom subtypes are then ranked, and this ranking is the final output of CASTELO. Using domain knowledge, we suggest modifications for the lowest-ranked atoms. Methods such as free energy perturbation calculations can be used to verify CASTELO’s suggestions.
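
As a rough illustration of the contact-matrix and subtype-aggregation steps named in the caption, the following is a minimal sketch for a single trajectory frame. It is not the authors' script: the function names, the 4.5 Å distance cutoff, the binary contact definition, and the subtype labels are all assumptions chosen for the example.

```python
import numpy as np


def ligand_protein_contacts(ligand_xyz, protein_xyz, cutoff=4.5):
    """Binary contact matrix for one frame (hypothetical helper).

    ligand_xyz:  (n_ligand, 3) coordinates, assumed to be in angstroms
    protein_xyz: (n_protein, 3) coordinates, same units
    cutoff:      assumed contact distance; not a value taken from the paper
    """
    # Pairwise displacement vectors via broadcasting: (n_ligand, n_protein, 3)
    diff = ligand_xyz[:, None, :] - protein_xyz[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return (dist < cutoff).astype(float)


def aggregate_by_subtype(contacts, ligand_subtypes):
    """Sum ligand-atom rows that share a subtype label (hypothetical helper).

    Returns the sorted unique labels and an (n_subtypes, n_protein) matrix.
    """
    labels = sorted(set(ligand_subtypes))
    index = {label: i for i, label in enumerate(labels)}
    aggregated = np.zeros((len(labels), contacts.shape[1]))
    for row, label in zip(contacts, ligand_subtypes):
        aggregated[index[label]] += row
    return labels, aggregated


# Toy frame: two ligand atoms of the same (made-up) subtype, three protein atoms.
ligand = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
protein = np.array([[1.0, 0.0, 0.0], [11.0, 0.0, 0.0], [50.0, 0.0, 0.0]])
contacts = ligand_protein_contacts(ligand, protein)
labels, aggregated = aggregate_by_subtype(contacts, ["C.ar", "C.ar"])
```

Repeating this over every frame and stacking the results would give a time series of aggregated contact matrices, which is the kind of input the later dynamism and CVAE stages of the pipeline operate on.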
