Figure 1 | BMC Bioinformatics

Figure 1

From: SEQOPTICS: a protein sequence clustering system

SEQOPTICS overview. The figure depicts the four steps of the system. First, data sets are extracted from three data sources (Pfam, Swiss-Prot and NCBI; mostly protein databases), then mixed and randomized. Second, the pairwise distance between every two proteins is computed using a normalized Smith-Waterman score; other measures, such as BLAST or FASTA, may be chosen instead. Third, the OPTICS algorithm performs the clustering, and the resulting clustering structure is presented graphically. Finally, the clustering results are analyzed and compared with those of other methods using the Jaccard coefficient, precision, and recall.
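The caption does not say how the three evaluation measures in the final step are computed. A common convention, assumed here rather than taken from the article, is pair counting: every pair of proteins is checked for whether it shares a cluster in the predicted clustering and a family in the reference labelling. The following is a minimal Python sketch under that assumption; the function name `pair_counting_scores` and the toy labels are hypothetical.

```python
from itertools import combinations


def pair_counting_scores(predicted, reference):
    """Return (jaccard, precision, recall) computed over pairs of items.

    Assumption (not from the article): a pair is a true positive when
    both items share a predicted cluster AND a reference family.
    """
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")

    tp = fp = fn = 0
    for i, j in combinations(range(len(predicted)), 2):
        same_pred = predicted[i] == predicted[j]
        same_ref = reference[i] == reference[j]
        if same_pred and same_ref:
            tp += 1  # grouped together in both labellings
        elif same_pred:
            fp += 1  # grouped by the clustering only
        elif same_ref:
            fn += 1  # grouped by the reference only

    def ratio(numerator, denominator):
        # Guard against empty denominators (e.g. all singleton clusters).
        return numerator / denominator if denominator else 0.0

    jaccard = ratio(tp, tp + fp + fn)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return jaccard, precision, recall


# Toy example: five proteins, hypothetical cluster ids and family names.
predicted = [0, 0, 1, 1, 1]
reference = ["a", "a", "a", "b", "b"]
jaccard, precision, recall = pair_counting_scores(predicted, reference)
print(jaccard, precision, recall)
```

In the toy example two pairs are grouped together in both labellings, two only by the clustering, and two only by the reference, so precision and recall are both 0.5 and the Jaccard coefficient is 1/3. The loop is quadratic in the number of proteins, which is acceptable for a sketch but may be too slow for large data sets.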
