Fig. 1 | BMC Bioinformatics

Fig. 1

From: Density Peak clustering of protein sequences associated to a Pfam clan reveals clear similarities and interesting differences with respect to manual family annotation

Schematic representation of the Pfam Ground Truth Architecture (GTA) assignment to a generic protein region \({\mathcal {S}}_i\). In this example, the full-length protein \(s_i\) has a three-family architecture, PFAAAAA + PFBBBBB + PFCCCCC. The aligned region of the search sequence, \({\mathcal {S}}_i\), covers (partially) only PFAAAAA and PFBBBBB, so the Pfam ground truth of \({\mathcal {S}}_i\) is \(p_i\) = PFAAAAA_PFBBBBB. Note that a 1-residue overlap of \({\mathcal {S}}_i\) with a family is enough for that family to be included in the GTA. Shown in orange is \({\mathcal {P}}_i\), the full region covered by the GTA families on sequence \(s_i\), including the residues between them.
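
The assignment rule in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The function name `assign_gta`, the 1-based inclusive coordinates, the left-to-right ordering of families in the label, the empty result for a region that overlaps no family, and all residue positions in the example are assumptions made here for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (Pfam accession, start, end), 1-based inclusive.
Family = Tuple[str, int, int]


def assign_gta(
    region: Tuple[int, int], families: List[Family]
) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Return the GTA label p_i and the span P_i for an aligned region S_i."""
    r_start, r_end = region
    # A family belongs to the GTA if it shares at least one residue with S_i.
    hits = [
        fam
        for fam in sorted(families, key=lambda fam: fam[1])
        if fam[1] <= r_end and fam[2] >= r_start
    ]
    if not hits:
        # Assumption: no overlapping family means no label and no span.
        return "", None
    label = "_".join(acc for acc, _, _ in hits)
    # P_i runs from the first residue of the first GTA family to the last
    # residue of the last one, so residues between families are included.
    span = (min(start for _, start, _ in hits), max(end for _, _, end in hits))
    return label, span


# Made-up coordinates mirroring the three-family example in the caption.
families = [("PFAAAAA", 10, 80), ("PFBBBBB", 100, 180), ("PFCCCCC", 200, 260)]
label, span = assign_gta((50, 100), families)
print(label, span)
```

With these made-up coordinates the region touches PFBBBBB at a single residue (position 100), which is enough for that family to enter the label, while PFCCCCC is excluded because the region ends before it starts.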
