Fig. 1 | BMC Bioinformatics

Fig. 1

From: Protein secondary structure prediction using a small training set (compact model) combined with a Complex-valued neural network approach

Representation of features. A target residue, t, in the input sequence is represented as a 27-dimensional feature vector. The input sequence is read in a sliding window (w) of 17 residues (grey). The central residue (t) and several of its neighbours to the left and right are shown. CATH templates were previously assigned secondary structure (SS) using DSSP. Target-to-template threading was done using w = 17, with the reference energy computed using the CABS algorithm. The SS are read from the best-fit template sequences, i.e. those with the lowest energy for the central 9 residues within w. Since multiple SS assignments from templates will be available for t and its neighbours, the probability of each SS state is computed using a hydrophobic cluster similarity score. P(H), P(E) and P(C) denote the probabilities of t and its four neighbours to the left and right adopting Helix, Sheet and Coil structures, respectively. CATH templates are homology-removed and independent with respect to the CB513 dataset.
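
The feature layout described in the caption can be sketched in a few lines of Python. This is only an illustrative reading, not the authors' code: it assumes the 27 dimensions come from 9 residues (t plus four neighbours on each side) times three probabilities, and that the values are concatenated residue by residue in the order P(H), P(E), P(C). The function names `windows` and `feature_vector` are hypothetical, and the per-residue probabilities are taken as given, since the caption does not specify how the hydrophobic cluster similarity score is turned into probabilities.

```python
from typing import Iterator, List, Sequence, Tuple

WINDOW = 17  # sliding-window length w stated in the caption
CORE = 9     # central residues: t plus four neighbours on each side

# One (P(H), P(E), P(C)) triple per residue; assumed to be precomputed.
Probs = Tuple[float, float, float]


def windows(sequence: str, w: int = WINDOW) -> Iterator[Tuple[int, str]]:
    """Yield (index of central residue t, window string) for each full window."""
    half = w // 2
    for t in range(half, len(sequence) - half):
        yield t, sequence[t - half : t + half + 1]


def feature_vector(probs: Sequence[Probs], t: int, core: int = CORE) -> List[float]:
    """Concatenate the probability triples of t and its neighbours.

    The residue-by-residue ordering is an assumption made for illustration.
    """
    half = core // 2
    if t - half < 0 or t + half >= len(probs):
        raise ValueError("residue t is too close to the end of the sequence")
    vec: List[float] = []
    for i in range(t - half, t + half + 1):
        vec.extend(probs[i])
    return vec


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFS"              # made-up 20-residue sequence
    placeholder = [(0.2, 0.3, 0.5)] * len(seq)  # placeholder probabilities
    for t, win in windows(seq):
        print(t, win, len(feature_vector(placeholder, t)))
```

With w = 17, only residues that have eight neighbours on each side receive a full window; how the original method treats residues near the termini is not stated in the caption, so this sketch simply skips them.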
