Table 1 Parameters used for Indel annotation with their defined values and the source of the data

From: A comprehensive study of small non-frameshift insertions/deletions in proteins and prediction of their phenotypic effects by a machine learning method (KD4i)

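| Class | Parameters | In final method? | Values | Source |
|---|---|---|---|---|
| Conservation | Conserved residue | Yes | Yes/No | MACSIMS via SM2PH |
| Conservation | Block | Yes | Yes/No | MACSIMS via SM2PH |
| Functional | Pfam domain | Yes | Yes/No | MACSIMS via SM2PH |
| Functional | Prosite motif | No | Yes/No | MACSIMS via SM2PH |
| Functional | Uniprot domain | Yes | Yes/No | MACSIMS via SM2PH |
| Physico-chemical properties (average) | Volume | No | See Table 2 | In-house |
| Physico-chemical properties (average) | Hydrophobicity | No | See Table 2 | In-house |
| Physico-chemical properties (average) | Polarity | Yes | See Table 2 | In-house |
| Physico-chemical properties (average) | Charge | No | See Table 2 | In-house |
| Physico-chemical properties (total) | Volume | Yes | See Table 2 | In-house |
| Physico-chemical properties (total) | Hydrophobicity | Yes | See Table 2 | In-house |
| Physico-chemical properties (total) | Polarity | No | See Table 2 | In-house |
| Physico-chemical properties (total) | Charge | No | See Table 2 | In-house |
| Local perturbation in site (average) | Volume | Yes | −2 to +2* | In-house |
| Local perturbation in site (average) | Hydrophobicity | Yes | −2 to +2* | In-house |
| Local perturbation in site (average) | Polarity | No | −2 to +2* | In-house |
| Local perturbation in site (average) | Charge | No | −2 to +2* | In-house |
| Local perturbation in environment (average) | Volume | No | −2 to +2* | In-house |
| Local perturbation in environment (average) | Hydrophobicity | No | −2 to +2* | In-house |
| Local perturbation in environment (average) | Polarity | No | −2 to +2* | In-house |
| Local perturbation in environment (average) | Charge | No | −2 to +2* | In-house |
| Local perturbation in region (average) | Volume | No | −2 to +2* | In-house |
| Local perturbation in region (average) | Hydrophobicity | No | −2 to +2* | In-house |
| Local perturbation in region (average) | Polarity | No | −2 to +2* | In-house |
| Local perturbation in region (average) | Charge | No | −2 to +2* | In-house |
| Local perturbation in site (total) | Volume | Yes | −2 to +2* | In-house |
| Local perturbation in site (total) | Hydrophobicity | Yes | −2 to +2* | In-house |
| Local perturbation in site (total) | Polarity | No | −2 to +2* | In-house |
| Local perturbation in site (total) | Charge | No | −2 to +2* | In-house |
| Local perturbation in environment (total) | Volume | No | −2 to +2* | In-house |
| Local perturbation in environment (total) | Hydrophobicity | No | −2 to +2* | In-house |
| Local perturbation in environment (total) | Polarity | No | −2 to +2* | In-house |
| Local perturbation in environment (total) | Charge | No | −2 to +2* | In-house |
| Local perturbation in region (total) | Volume | No | −2 to +2* | In-house |
| Local perturbation in region (total) | Hydrophobicity | No | −2 to +2* | In-house |
| Local perturbation in region (total) | Polarity | Yes | −2 to +2* | In-house |
| Local perturbation in region (total) | Charge | No | −2 to +2* | In-house |
| Structural | Disorder | Yes | Structured (probability of disorder P < 0.4); Semi-disorder (0.4 < P < 0.7); Disorder (P > 0.7) | Spine-D |
| Structural | RSA | Yes | Fully buried (RSA value Rv < 30); Buried (30 < Rv < 60); Intermediate (60 < Rv < 90); Exposed (90 < Rv < 120); Fully exposed (Rv > 120) | Spine-D |
| Structural | Secondary structure | Yes | Coil; Helix; Strand; Two (if NFS-Indel is in the transition zone between a strand/helix and coil) | Spine-D |
| Others | Relative Indel Position | Yes | N-terminal; Middle; C-terminal | In-house |
| Others | Indel length | Yes | One; More than one | In-house |
| Others | Presence of Proline | No | Yes/No | In-house |
| Others | Presence of Glycine | No | Yes/No | In-house |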

The column ‘In final method?’ indicates whether the parameter is used in the final ILP rule set for prediction of deleterious NFS-Indels.
*The numerical values range from −5 to +5 but, in order to reduce computational cost, we have regrouped values beyond ±2 into the semantic categories ‘two or more’ and ‘two or less’.
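
The regrouping described in the footnote and the category thresholds listed in the Values column can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, the clamp to ±2 is one reading of "two or more/two or less", and the handling of values that fall exactly on a threshold (for example P = 0.4 or Rv = 30) is an assumption, since the table gives only strict inequalities.

```python
def regroup_perturbation(value: int) -> int:
    """Regroup a perturbation score from the -5..+5 range into -2..+2.

    Assumption: scores beyond +/-2 collapse onto +2 ("two or more")
    or -2 ("two or less"); scores within the range are kept as-is.
    """
    if not -5 <= value <= 5:
        raise ValueError("perturbation score must lie between -5 and +5")
    return max(-2, min(2, value))


def disorder_class(p: float) -> str:
    """Map a disorder probability P to the three categories in the table.

    Assumption: a value exactly on a threshold goes to the upper category.
    """
    if p < 0.4:
        return "Structured"
    if p < 0.7:
        return "Semi-disorder"
    return "Disorder"


def rsa_class(rv: float) -> str:
    """Map an RSA value Rv to the five burial categories in the table.

    Assumption: a value exactly on a threshold goes to the upper category.
    """
    upper_bounds = [
        (30, "Fully buried"),
        (60, "Buried"),
        (90, "Intermediate"),
        (120, "Exposed"),
    ]
    for upper, label in upper_bounds:
        if rv < upper:
            return label
    return "Fully exposed"
```

For instance, under these assumptions `regroup_perturbation(4)` and `regroup_perturbation(2)` fall into the same category, which is the reduction in distinct values that the footnote refers to.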