Table 2. With values of 1 and 25 for the aa-limit and check-limit parameters, respectively, our heuristic guarantees a minimum identity percentage of 92.55% for pairs of proteins classified as similar (Table 3).

From: GENPPI: standalone software for creating protein interaction networks from genomes

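| Amino acids | A | R | N | D | C | E | Q | G | H | I | L | K | M | F | P | S | T | W | Y | V |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A histogram | 12 | 2 | 8 | 6 | 4 | 10 | 1 | 9 | 2 | 11 | 6 | 13 | 3 | 2 | 4 | 14 | 5 | 1 | 5 | 10 |
| B histogram | 11 | 3 | 8 | 6 | 4 | 11 | 1 | 9 | 2 | 11 | 8 | 12 | 3 | 2 | 4 | 13 | 4 | 1 | 4 | 11 |
| abs(A-B) | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 |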

According to the GENPPI heuristic, proteins A and B are similar because, in the difference between their amino acid histograms, at least 25 of the 26 possible types show frequency differences less than or equal to 1. For the sake of illustration, this table presents only the 20 principal amino acids. For proteins A and B, shown below in FASTA format, the Needleman–Wunsch algorithm gives 94.5% identity (96.9% similarity). In the original table, amino acids that differ between the A and B sequences are set in bold.

>A Protein
MAYSKKVMDHYENPRNVGSFSNSDNNVGSGLVGAPACGDVMKLQIKVNEKGIIEDACFKTYGCGS
AIASSSLVTEWVKGKSITEAESIRNTTIVEELELPPVKIHCSILAEDAIKAAIADYKSKKYSN
>B Protein
MAYSKKVMDHYENPRNVGSFSNSDLNVGSGLVGAPACGDVMKLQIKVNEEGIIEDACFKTYGCGS
AIASSSLVTEWVKGKSIVEAESIRNTTIVEELELPPVKIHCSILAEDAIKAAISDYKRKKNLN
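The histogram comparison described in this footnote can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not GENPPI's actual implementation. The function names are hypothetical, and taking the "26 possible types" to be the uppercase letters A to Z is an assumption. The `aa_limit` and `check_limit` arguments mirror the aa-limit and check-limit parameters named in the caption.

```python
from collections import Counter
from string import ascii_uppercase

# Assumption: the "26 possible types" are the 26 uppercase letters A-Z
# (the 20 standard residues plus six non-standard or ambiguity codes).
ALPHABET = ascii_uppercase

# Proteins A and B from the table footnote.
SEQ_A = (
    "MAYSKKVMDHYENPRNVGSFSNSDNNVGSGLVGAPACGDVMKLQIKVNEKGIIEDACFKTYGCGS"
    "AIASSSLVTEWVKGKSITEAESIRNTTIVEELELPPVKIHCSILAEDAIKAAIADYKSKKYSN"
)
SEQ_B = (
    "MAYSKKVMDHYENPRNVGSFSNSDLNVGSGLVGAPACGDVMKLQIKVNEEGIIEDACFKTYGCGS"
    "AIASSSLVTEWVKGKSIVEAESIRNTTIVEELELPPVKIHCSILAEDAIKAAISDYKRKKNLN"
)


def count_close_types(seq_a, seq_b, aa_limit=1):
    """Count residue types whose frequencies differ by at most aa_limit."""
    hist_a, hist_b = Counter(seq_a), Counter(seq_b)
    # Counter returns 0 for residue types absent from a sequence.
    return sum(
        1 for aa in ALPHABET if abs(hist_a[aa] - hist_b[aa]) <= aa_limit
    )


def similar_by_histogram(seq_a, seq_b, aa_limit=1, check_limit=25):
    """Apply the rule: at least check_limit types must be within aa_limit."""
    return count_close_types(seq_a, seq_b, aa_limit) >= check_limit


if __name__ == "__main__":
    print(count_close_types(SEQ_A, SEQ_B))     # 25 (only L differs by 2)
    print(similar_by_histogram(SEQ_A, SEQ_B))  # True
```

With the values from the table, only leucine (L) exceeds the limit, so 25 of the 26 types pass and the pair is classified as similar. Raising `check_limit` to 26 would reject the same pair.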