C, sequence conservation; D, distance to the GCM; A, residue type; K, closeness. A check mark in a column indicates that the feature was included in training. The Matthews correlation coefficient (MCC), expressed as a percentage, was calculated on the validation set after the training procedure had converged. Networks were trained either with “All residues” or only with residues that had a conservation value of 3.5 or more. In both cases, the MCC was calculated over the entire original set of residues, regardless of how preselection changed the overall counts.
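
As an illustration of how an MCC percentage over the entire original residue set might be obtained when preselection is used, the following minimal sketch may help. It rests on an assumption not stated in the legend: residues removed by the conservation filter are scored as predicted negatives. The function names, the `predict` callable, and the example values are hypothetical and are not taken from the study.

```python
import math


def mcc_percent(tp, tn, fp, fn):
    """Matthews correlation coefficient, expressed as a percentage."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when a row or column of the
        # confusion matrix is empty and the MCC is undefined.
        return 0.0
    return 100.0 * (tp * tn - fp * fn) / denom


def confusion_counts(labels, predictions):
    """Count TP, TN, FP, FN for two equal-length boolean sequences."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(labels, predictions):
        if truth and pred:
            tp += 1
        elif not truth and not pred:
            tn += 1
        elif pred:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def mcc_with_preselection(labels, conservation, predict, threshold=3.5):
    """MCC (%) over ALL residues when only conserved ones reach the network.

    Assumption (hypothetical): residues with conservation below
    `threshold` are never passed to `predict` and count as predicted
    negatives, so the totals still cover the full original set.
    """
    predictions = [
        predict(i) if c >= threshold else False
        for i, c in enumerate(conservation)
    ]
    return mcc_percent(*confusion_counts(labels, predictions))


# Hypothetical toy data: four residues, two of them true positives.
labels = [True, False, True, False]
conservation = [4.0, 2.0, 3.0, 5.0]
network_output = [True, True, True, False]
score = mcc_with_preselection(labels, conservation, lambda i: network_output[i])
```

Under this assumption, a true positive that falls below the conservation cutoff becomes a false negative, so preselection can lower the MCC even when the network itself is accurate on the residues it sees.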