Table 2 Initial set of features extracted from amino acid sequences

From: Predictability of gene ontology slim-terms from primary structure information in Embryophyta plant proteins

| Nature | Description | Number |
| --- | --- | --- |
| Physical-chemical | Sequence length | 1 |
| | Molecular weight | 1 |
| | Positively charged residues (%) | 1 |
| | Negatively charged residues (%) | 1 |
| | Isoelectric point | 1 |
| | GRAVY | 1 |
| Primary structure statistics | Amino acid frequencies | 20 |
| | Amino acid dimer frequencies | 400 |
| Secondary structure statistics | Structure frequencies | 3 |
| | Structural dimer frequencies | 9 |
| Total | | 438 |

  1. Features are divided into three broad categories: physical-chemical features, primary structure composition statistics and secondary structure composition statistics.
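
The feature counts in the table follow from the alphabet sizes: 20 amino acids give 20 frequencies and 20 × 20 = 400 dimer frequencies, and a three-state secondary structure alphabet gives 3 frequencies and 3 × 3 = 9 dimer frequencies. The sketch below illustrates how such composition features could be computed. It is not the authors' implementation, and several details are assumptions not stated in the table: the function names `composition_features` and `sequence_features` are hypothetical, the secondary structure states are written as `H`, `E` and `C`, dimers are counted as overlapping adjacent pairs, and charged residues are taken to be K and R (positive) and D and E (negative). Molecular weight, isoelectric point and GRAVY are left out because they need external residue tables.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
SS_STATES = "HEC"  # assumed 3-state alphabet: helix, strand, coil


def composition_features(seq, alphabet, prefix):
    """Relative frequencies of single symbols and adjacent pairs (dimers)."""
    n = len(seq)
    features = {}
    for symbol in alphabet:
        features[f"{prefix}_{symbol}"] = seq.count(symbol) / n if n else 0.0

    # Assumption: dimers are overlapping adjacent pairs, so a sequence of
    # length n contains n - 1 of them.
    n_pairs = max(n - 1, 0)
    pair_counts = {a + b: 0 for a, b in product(alphabet, repeat=2)}
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in pair_counts:  # skip pairs with non-standard symbols
            pair_counts[pair] += 1
    for pair, count in pair_counts.items():
        features[f"{prefix}_{pair}"] = count / n_pairs if n_pairs else 0.0
    return features


def sequence_features(aa_seq, ss_seq):
    """Collect 435 of the 438 features (no molecular weight, pI or GRAVY)."""
    if not aa_seq:
        raise ValueError("amino acid sequence must not be empty")
    n = len(aa_seq)
    features = {"length": n}
    # Assumption: K and R count as positive, D and E as negative.
    features["pct_positive"] = 100 * sum(aa_seq.count(r) for r in "KR") / n
    features["pct_negative"] = 100 * sum(aa_seq.count(r) for r in "DE") / n
    features.update(composition_features(aa_seq, AMINO_ACIDS, "aa"))  # 20 + 400
    features.update(composition_features(ss_seq, SS_STATES, "ss"))    # 3 + 9
    return features
```

Calling `sequence_features("MKDE", "CHHE")` returns a dictionary with 435 entries; adding the three omitted physical-chemical features would give the total of 438 listed in the table.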