Cross-validation performance of the ML model as a function of interaction features. (Left) %Top10: percentage of targets with at least one positive molecule among the 10 top-ranked molecules. The feature groups (SM-5, SM-11, SM-5_11, SM-M2M, SM-M2T, SM-Cross, SM-Intra and SeqMact) are described in the text. Error bars are estimated from 30 independent cross-validation experiments. (Right) Prediction of active mutants at least as specific as wild-type I-CreI. Top10: average number of active proteins at least as specific as I-CreI among the 10 top-ranked molecules; α: trade-off parameter between the predicted specificity and activity of candidate proteins; Seq: machine learning model trained on protein/target sequences; Fx: FoldX score.
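The %Top10 metric in the left panel can be illustrated with a minimal sketch. It assumes that each target comes with a list of scored candidate molecules and that a higher score means a better rank. The function name `percent_top_k` and the data layout are hypothetical and are not taken from the original work.

```python
from typing import Dict, List, Tuple


def percent_top_k(
    ranked: Dict[str, List[Tuple[float, bool]]], k: int = 10
) -> float:
    """Percentage of targets with at least one positive molecule in the top k.

    `ranked` maps a target identifier to a list of (score, is_positive) pairs.
    Assumption: molecules are ranked by descending score.
    """
    if not ranked:
        raise ValueError("at least one target is required")

    hits = 0
    for molecules in ranked.values():
        # Keep the k best-scoring molecules for this target.
        top = sorted(molecules, key=lambda m: m[0], reverse=True)[:k]
        # A target counts as a hit if any of them is a positive.
        if any(is_positive for _, is_positive in top):
            hits += 1
    return 100.0 * hits / len(ranked)


# Toy data (invented for illustration only), evaluated with k = 2.
toy = {
    "target_A": [(0.9, False), (0.8, True), (0.1, False)],
    "target_B": [(0.9, False), (0.5, False), (0.4, True)],
}
toy_score = percent_top_k(toy, k=2)
```

In the toy data, only `target_A` has a positive among its two best-scoring molecules, so one of the two targets counts as a hit.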