Table 8. Number of neighbors (mean/median/maximum) and number of observed features, with and without clustering, for the remote fold recognition task

From: Efficient use of unlabeled data for protein sequence classification: a comparative study

| Method | # neighbors (without clustering) | # features (without clustering) | # neighbors (with clustering) | # features (with clustering) |
|---|---|---|---|---|
| full seq. | 135/99/490 | 192,378,952 | 64/41/356 | 120,990,413 |
| region | 64/41/356 | 34,807,209 | 50/26/352 | 28,738,521 |
| no tails (full seq.) | 75/17/402 | 57,575,176 | 23/11/325 | 29,649,870 |
| max. length (full seq.) | 70/16/431 | 39,915,003 | 22/12/279 | 14,634,511 |
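
One way to read the feature columns is as a relative reduction per method. The sketch below is only an illustration, not part of the original study: the feature counts are copied from the table, while the `FEATURES` dictionary and the `relative_reduction` helper are hypothetical names introduced here.

```python
# Feature counts copied from Table 8: (without clustering, with clustering).
FEATURES = {
    "full seq.": (192_378_952, 120_990_413),
    "region": (34_807_209, 28_738_521),
    "no tails (full seq.)": (57_575_176, 29_649_870),
    "max. length (full seq.)": (39_915_003, 14_634_511),
}


def relative_reduction(before: int, after: int) -> float:
    """Fraction of observed features removed by clustering (hypothetical helper)."""
    return (before - after) / before


# Relative reduction in observed features for each method.
reductions = {
    method: relative_reduction(before, after)
    for method, (before, after) in FEATURES.items()
}

for method, fraction in reductions.items():
    print(f"{method}: {fraction:.1%} fewer observed features with clustering")
```

The same calculation could be applied to the mean, median, or maximum neighbor counts by parsing the slash-separated values in the neighbor columns.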