Table 14 Summary of base features

From: A meta-learning approach for B-cell conformational epitope prediction

| Base feature | Description | Reference |
| --- | --- | --- |
| Propensity score | Derived from a scoring function that sums the log-odds ratios of the amino acids in the spatial neighborhood (defined in [9]) around each residue in a given protein. | [9] |
| Residue accessibility | Calculated with NACCESS for the whole molecule submitted as a PDB file. NACCESS calculates the atomic accessible surface defined by rolling a probe around the van der Waals surface. The residue accessibilities are categorized into four classes: all-polar, nonpolar, total-side, and main-chain. | [31] |
| Secondary structure | Secondary structure refers to highly regular local sub-structures defined by patterns of hydrogen bonds between the main-chain peptide groups. In such cases, the chain of amino acids folds into regular repeating structures, such as α helix, β structure, and coil. | [26] |
| Accessible surface area | Calculated using Gerstein et al.’s calc-surface program. The accessible surface of an atom is the area of a sphere around it, on each point of which the center of a solvent molecule can be placed in contact with the atom without penetrating any other atom of the molecule. | [38],[39] |
| Atom volume | Calculated using Gerstein et al.’s calc-volume program, which applies a geometric construction called Voronoi polyhedra to divide the total volume among the atoms in a protein model. | [37] |
| B factor | The B factor is also known as the Debye-Waller factor or the temperature factor. It describes the attenuation of X-ray scattering or coherent neutron scattering caused by thermal motion. Two B factors of a protein were considered in this study: the side-chain B factor and the main-chain B factor. | [32],[33] |
| Solvent excluded surface | Calculated using Sanner et al.’s MSMS program, which builds the solvent excluded surface based on the reduced surface. | [34] |
| Solvent accessible surface | Calculated using Sanner et al.’s MSMS program, which builds the solvent accessible surface based on the reduced surface. | [34] |
| PSSM | PSI-BLAST is used to search the non-redundant protein database, and the information content derived from the resulting position-specific scoring matrix (PSSM) serves as the base feature. | [36] |
| Side chain polarity | The 20 amino acids were divided into four categories: polar, nonpolar, acidic polar, and basic polar. | [40] |
| Hydropathy index | Kyte and Doolittle devised the hydropathy index by applying a sliding-window strategy that continuously determined the average hydropathy in a window as it advanced through the sequence. | [41] |
| Antigenic propensity | Kolaskar and Tongaonkar analyzed 156 antigenic determinants (<20 residues per determinant) in 34 different proteins to obtain the antigenic propensities of amino acid residues. | [42],[43] |
| Flexibility | Karplus and Schulz developed the flexibility scale based on the mobility of protein segments in 31 proteins with known structures. | [35] |
| Hydrophilic scale | Parker et al. developed the hydrophilic scale based on high-performance liquid chromatography (HPLC) peptide retention data. | [26] |
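
The sliding-window strategy described for the hydropathy index can be illustrated with a minimal Python sketch. It uses the published Kyte and Doolittle per-residue hydropathy values; the function name `windowed_hydropathy` and the default window length of 9 are assumptions made for illustration, since the table does not state which window size or implementation was used in the study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(sequence, window=9):
    """Return the mean hydropathy of every full window along the sequence.

    The window length is a free parameter here (hypothetical default of 9);
    the source table does not specify the value used.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")

    # Map each residue to its hydropathy value (KeyError on non-standard codes).
    values = [KD_SCALE[aa] for aa in sequence.upper()]

    # Keep a running sum so each step only adds one value and drops one.
    running = sum(values[:window])
    averages = [running / window]
    for i in range(window, len(values)):
        running += values[i] - values[i - window]
        averages.append(running / window)
    return averages


if __name__ == "__main__":
    # Example with an arbitrary illustrative peptide, not data from the study.
    for score in windowed_hydropathy("MKTAYIAKQRQISFVKSHFSRQ", window=9):
        print(round(score, 2))
```

Each returned value corresponds to one window position, so a sequence of length n yields n - window + 1 averages. Assigning each average to the central residue of its window is one common way to turn these into per-residue features, but that mapping is also an assumption here.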