Table 7 Rule-sets obtained by BioHEL for CN and RSA predictions using the full AA alphabet.

From: Automated Alphabet Reduction for Protein Datasets

| Rules for CN prediction | Rules for RSA prediction |
| --- | --- |
| 1: If AA-4 ∉ {E,L,M,N,R,X}, AA-3 ∉ {D,E,N,H,R,F,W,Y,X}, AA-2 ∉ {E,F,W,N,S,P}, AA-1 ∉ {D,E,F,G,H,K,N,Q}, AA ∉ {C,I,L,M,V}, AA1 ∉ {D,E,K,R,N,Q,S,P}, AA2 ∉ {H,R,M,P,T,N,W,X}, AA3 ∉ {A,C,I,L,M,V,F,G,H,X}, AA4 ∉ {A,C,L,M,G,H,F,W} then CN is high | 1: If AA-4 ∉ {G,I,L,V,X,F,Y}, AA-3 ∉ {G,Q,F,W}, AA-2 ∉ {C,N,P}, AA-1 ∉ {A,I,V,Q,Y}, AA ∈ {K}, AA1 ∉ {F,I,L,M,V,N,T,P}, AA2 ∉ {N,Q,S,P}, AA3 ∉ {C,I,L,R,W}, AA4 ∉ {A,C,I,L,R,S} then RSA is high |
| 2: If AA-4 ∉ {E,H,K,R,N,Q,P,W,X}, AA-3 ∉ {D,E,K,R,M,N,T,P,Y}, AA-2 ∉ {D,N,S}, AA-1 ∉ {D,E,G,K,N,P}, AA ∈ {A,C,I,L,M,W}, AA1 ∉ {D,E,G,K,P,N,Q,S,T}, AA2 ∉ {C,I,D,G,P,S,X,Y}, AA3 ∉ {D,E,G,K,R,N,Q,S,P,X}, AA4 ∈ {A,C,I,L,M,V,F,G,T} then CN is high | 2: If AA-4 ∉ {A,I,L,V,G,W,F}, AA-3 ∉ {C,I,M,V,G,P,S,T,Y,F}, AA-2 ∉ {C,H,R,F,W}, AA-1 ∉ {F,H,I}, AA ∈ {E,K}, AA1 ∉ {I,M,V,N,S}, AA2 ∉ {C,D,H,N,S}, AA3 ∉ {A,C,I,L,V,H,N,W,Y,F}, AA4 ∉ {G,H,I,L,M,P,F,W,Y} then RSA is high |
| ... | ... |
| 32: If AA-4 ∉ {E,F,P,K,R,S,X}, AA-3 ∈ {A,C,I,L,V,G,F,W,X,Y}, AA-2 ∉ {F,H,I,M,P,N,Q,X}, AA-1 ∉ {C,I,D,E,G,P,K,R,N,S}, AA ∈ {A,C,I,L,M,V,F,W,Y}, AA1 ∉ {G,P,N,Q,T,X}, AA2 ∉ {D,G,N,Q,S}, AA3 ∉ {D,K,P,Q,W}, AA4 ∉ {A,I,M,R,X} then CN is high | 57: If AA-4 ∉ {Q}, AA-3 ∉ {G,H,I,V,P,Y}, AA-2 ∉ {G,H,T,M,V,W,Y,F}, AA-1 ∉ {G,I,M,V,X,Y}, AA ∈ {D,E,G,H,K,P,Q,S}, AA1 ∉ {E,F,W}, AA2 ∈ {D,E,G,H,K,N,S,T,P,X}, AA3 ∉ {G,K,F,W}, AA4 ∉ {L,M,R,W} then RSA is high |
| 33: Default class: CN is low | 58: Default class: RSA is low |

  1. The rule set at the left is for CN prediction. The rule set at the right is for RSA prediction. AA±n means the AA type of the residue at position ±n with respect to the target residue. X means end of chain, used when a residue of the window falls beyond either end of a chain.
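
To make the notation concrete, the following is a minimal Python sketch of how one such rule could be applied to a residue window. It is not BioHEL's implementation: the helper names (`window`, `matches`, `predict`, `isin`, `notin`) are hypothetical, and it assumes that rules are tried in order, that the first matching rule decides the class, and that the default class applies when no rule matches. Only RSA rule 1 is encoded, because most of the remaining rules are elided in the table.

```python
from typing import Dict, List, Tuple

# A condition is (is_member_test, residue_set): True encodes "∈", False encodes "∉".
Condition = Tuple[bool, frozenset]
# A rule maps window offsets (-4..4) to conditions and carries a predicted class.
Rule = Tuple[Dict[int, Condition], str]


def isin(residues: str) -> Condition:
    return (True, frozenset(residues))


def notin(residues: str) -> Condition:
    return (False, frozenset(residues))


def window(seq: str, i: int, half: int = 4) -> Dict[int, str]:
    """Map each offset in [-half, half] to a residue; 'X' marks positions past a chain end."""
    return {
        k: seq[i + k] if 0 <= i + k < len(seq) else "X"
        for k in range(-half, half + 1)
    }


def matches(conds: Dict[int, Condition], win: Dict[int, str]) -> bool:
    """A rule matches only when every positional condition holds."""
    return all((win[k] in residues) == is_in for k, (is_in, residues) in conds.items())


def predict(rules: List[Rule], default: str, seq: str, i: int) -> str:
    """Assumed first-match semantics: return the class of the first matching rule."""
    win = window(seq, i)
    for conds, label in rules:
        if matches(conds, win):
            return label
    return default


# RSA rule 1, transcribed from the table.
RSA_RULE_1: Rule = (
    {
        -4: notin("GILVXFY"), -3: notin("GQFW"), -2: notin("CNP"), -1: notin("AIVQY"),
        0: isin("K"),
        1: notin("FILMVNTP"), 2: notin("NQSP"), 3: notin("CILRW"), 4: notin("ACILRS"),
    },
    "high",
)

if __name__ == "__main__":
    # Toy sequence, not taken from the paper's dataset.
    print(predict([RSA_RULE_1], "low", "DDDDKDDDD", 4))
    print(predict([RSA_RULE_1], "low", "DDDDKDDDD", 0))
```

With the toy sequence above, the lysine at index 4 satisfies every condition of rule 1, while index 0 fails the central `AA ∈ {K}` test and falls through to the default class. Encoding each condition as a membership flag plus a set keeps the ∈ and ∉ cases in a single comparison.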