From: Automated Alphabet Reduction for Protein Datasets
Orig. Alphabet
ACDEFGHIKLMNPQRSTVWXY
Encoding
001100001001111110010
Meaning of the encoding
Group 1 (bit 0): ACFGHILMVWY
Group 2 (bit 1): DEKNPQRSTX
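
The mapping above can be read position by position: each letter of the original alphabet is paired with the bit at the same index in the encoding, and letters that share a bit fall into the same group. The following is a minimal sketch of that reading, not the paper's implementation; the function names decode_groups and reduce_sequence are hypothetical and chosen only for illustration.

```python
def decode_groups(alphabet: str, encoding: str) -> dict:
    """Collect the letters of `alphabet` by the label at the same position in `encoding`."""
    if len(alphabet) != len(encoding):
        raise ValueError("alphabet and encoding must have the same length")
    groups = {}
    for letter, label in zip(alphabet, encoding):
        # Append the letter to the group named by its label, preserving alphabet order.
        groups[label] = groups.get(label, "") + letter
    return groups


def reduce_sequence(sequence: str, alphabet: str, encoding: str) -> str:
    """Rewrite a sequence over the original alphabet using the group labels."""
    # str.maketrans maps each character of `alphabet` to the character
    # at the same index in `encoding`.
    return sequence.translate(str.maketrans(alphabet, encoding))


ALPHABET = "ACDEFGHIKLMNPQRSTVWXY"
ENCODING = "001100001001111110010"

groups = decode_groups(ALPHABET, ENCODING)
print(groups["0"])  # letters labeled 0 (Group 1 above)
print(groups["1"])  # letters labeled 1 (Group 2 above)
```

Applying reduce_sequence to a protein sequence replaces every residue with its group label, which is one simple way to view the sequence in the reduced two-letter alphabet.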