Training dataset | Sources | Initial number of sequences | Sequences >120 AA | Size after redundancy removal |
---|---|---|---|---|
ntm | PDB-REPRDB [32] | 3159 | 2290 | 1763 |
ahtm | Sanger all-alpha membrane datasets A, B and C [33] | 189 | 166 | 132 |
bbtm | TC-DB [35], UniProt [34] and PDB [5] | 1126 | 1107 | 196 |
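
To make the effect of the two filtering steps easier to compare across datasets, the counts in the table can be converted to retained fractions. The following is a minimal sketch; the names `counts` and `retention` are illustrative and do not come from the original work, and the numbers are copied directly from the table.

```python
# Counts copied from the table: (initial, longer than 120 AA, after redundancy removal).
counts = {
    "ntm": (3159, 2290, 1763),
    "ahtm": (189, 166, 132),
    "bbtm": (1126, 1107, 196),
}


def retention(initial, after_length, after_redundancy):
    """Return the fractions of the initial set kept after each filtering step."""
    return after_length / initial, after_redundancy / initial


for name, row in counts.items():
    kept_length, kept_final = retention(*row)
    print(f"{name}: {kept_length:.1%} pass the length filter, "
          f"{kept_final:.1%} remain after redundancy removal")
```

Expressing both steps relative to the initial count shows directly how much each dataset shrinks overall, in particular how strongly redundancy removal reduces the bbtm set compared with the other two.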