Table 1 The 16 features used for benchmarking DeepQA

From: DeepQA: improving the estimation of single protein model quality with deep belief networks

Feature Name Feature descriptions
(1). Surface score (SU) The total area of exposed nonpolar residues divided by the total area of all residues
(2). Exposed mass score (EM) The percentage of mass in the exposed area, equal to the total mass of the exposed area divided by the total mass of the whole area
(3). Exposed surface score (ES) The total exposed area divided by the total area
(4). Solvent accessibility score (SA) The difference between the solvent accessibility predicted by SSpro4 [1] from the protein sequence and that of the model parsed by DSSP [2]
(5). RF_CB_SRS_OD score [3] A novel distance dependent residue-level potential energy score.
(6). DFIRE2 score [4] A distance-scaled all atom energy score.
(7). DOPE score [5] A statistical potential score, the discrete optimized protein energy (DOPE).
(8). GOAP score [6] A generalized orientation-dependent, all-atom statistical potential score.
(9). OPUS score [7] A knowledge-based potential score.
(10). ProQ2 score [8] A single-model quality assessment method by machine learning techniques.
(11). RWplus score [9] An energy score using a pairwise distance-dependent atomic statistical potential function and a side-chain orientation-dependent energy term.
(12). ModelEvaluator score [10] A single-model quality assessment score based on structural features using support vector machine.
(13). Secondary structure similarity score (SS) The difference between the secondary structure predicted by Spine X [11] from the protein sequence and that of the model parsed by DSSP [2]
(14). Secondary structure penalty score (SP) Calculated from how well the predicted alpha-helix and beta-sheet assignments match those parsed by DSSP.
(15). Euclidean compact score (EC) The pairwise Euclidean distance of all residues divided by the maximum Euclidean distance (3.8) of all residues.
(16). Qprob [12] A single-model quality assessment score that utilizes 11 structural and physicochemical features by feature-based probability density functions.
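
The three ratio features at the top of the table (SU, EM and ES) can be illustrated with a minimal Python sketch. It is not the DeepQA implementation. It assumes that exposure is a per-residue yes/no flag already derived elsewhere (for example from DSSP output with some cutoff), that each residue carries a single area and mass value, and that the nonpolar set is the one hard-coded below. The `Residue` type and the function names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumption: one common choice of nonpolar amino acids (one-letter codes).
# The table does not say which residues DeepQA treats as nonpolar.
NONPOLAR = frozenset("AVLIMFWPG")


@dataclass(frozen=True)
class Residue:
    name: str      # one-letter amino acid code
    area: float    # surface area attributed to the residue
    mass: float    # residue mass
    exposed: bool  # assumed precomputed exposure flag


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, rejecting an empty or zero-valued model."""
    if denominator <= 0:
        raise ValueError("total must be positive")
    return numerator / denominator


def surface_score(residues: Sequence[Residue]) -> float:
    """SU: area of exposed nonpolar residues over area of all residues."""
    exposed_nonpolar = sum(
        r.area for r in residues if r.exposed and r.name in NONPOLAR
    )
    return _ratio(exposed_nonpolar, sum(r.area for r in residues))


def exposed_mass_score(residues: Sequence[Residue]) -> float:
    """EM: mass of exposed residues over mass of all residues."""
    exposed_mass = sum(r.mass for r in residues if r.exposed)
    return _ratio(exposed_mass, sum(r.mass for r in residues))


def exposed_surface_score(residues: Sequence[Residue]) -> float:
    """ES: area of exposed residues over area of all residues."""
    exposed_area = sum(r.area for r in residues if r.exposed)
    return _ratio(exposed_area, sum(r.area for r in residues))


# Toy model with made-up numbers, for illustration only.
toy_model = [
    Residue("A", area=100.0, mass=71.0, exposed=True),
    Residue("K", area=200.0, mass=128.0, exposed=True),
    Residue("L", area=100.0, mass=113.0, exposed=False),
]
su = surface_score(toy_model)          # 100 / 400
em = exposed_mass_score(toy_model)     # 199 / 312
es = exposed_surface_score(toy_model)  # 300 / 400
```

All three features are plain ratios in the range 0 to 1, so they differ only in what is summed in the numerator. If EM is wanted as a percentage, as the description suggests, multiply the ratio by 100.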