From: Efficient use of unlabeled data for protein sequence classification: a comparative study
Method              mismatch(5,1)   mismatch(5,2)   triple(1,3)
full seq.           12,084          13,593          153
region              2,624           3,195           73
region+clustering   2,412           2,998           64