
Table 2 Contact prediction performance on PFAM datasets

From: Correlated mutations via regularized multinomial regression

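| Nprot (a) | Accuracy (b), L/5 (d) | Accuracy (b), L/10 (d) | Xd (c), L/5 (d) | Xd (c), L/10 (d) |
| --- | --- | --- | --- | --- |
| 200-500 | 0.10 (0.09) | 0.11 (0.12) | 4.1 (4.2) | 4.6 (5.5) |
| 500-1000 | 0.16 (0.14) | 0.21 (0.18) | 6.3 (5.2) | 8.0 (6.9) |
| 1000-2000 | 0.24 (0.18) | 0.32 (0.24) | 9.3 (7.5) | 12.1 (9.3) |
| 2000-4000 | 0.25 (0.19) | 0.33 (0.26) | 8.8 (7.9) | 11.8 (10.1) |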

  1. a Nprot, number of protein sequences in the alignment.
  2. b Accuracy, fraction of predicted contacts that are correct according to the crystal structure. Contacts are defined according to the CASP criteria (Cβ atoms, or Cα for glycines, within a distance of 8 Å; only pairs of residues separated by at least 24 residues along the sequence were taken into account).
  3. c Xd, a measure of how the distribution of distances for predicted contact pairs differs from the distribution for all pairs of residues in the target domain structure.
  4. d Only the highest-ranked predicted contacts were assessed, taking either the top L/5 or the top L/10 predictions, where L is the length of the target sequence. Values for accuracy and Xd are averages, with standard deviations in parentheses.
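
The accuracy measure described in the notes above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `top_contact_accuracy`, the dictionary-based inputs, the 0-based residue indices, and the toy coordinates are all assumptions made for illustration, as is treating "within 8 Å" as an inclusive cutoff.

```python
import math


def top_contact_accuracy(scores, coords, length, fraction=5,
                         min_separation=24, cutoff=8.0):
    """Fraction of the top length/fraction predicted pairs that are true contacts.

    scores: {(i, j): score}, higher score = more confident prediction (assumed).
    coords: {i: (x, y, z)} of the C-beta atom (C-alpha for glycine) per residue.
    """
    # Keep only pairs far enough apart along the sequence.
    eligible = [(pair, s) for pair, s in scores.items()
                if abs(pair[0] - pair[1]) >= min_separation]
    if not eligible:
        raise ValueError("no residue pairs satisfy the sequence separation")

    # Rank by score and keep the top L/5 (fraction=5) or L/10 (fraction=10).
    eligible.sort(key=lambda item: item[1], reverse=True)
    n_top = max(1, length // fraction)
    top = eligible[:n_top]

    # A prediction is correct if the two atoms lie within the distance cutoff
    # (inclusive here, which is an assumption).
    correct = sum(1 for (i, j), _ in top
                  if math.dist(coords[i], coords[j]) <= cutoff)
    return correct / len(top)


# Toy data (hypothetical): 50 residues on a line, 10 Å apart,
# with residue 30 moved close to residues 0 and 1.
toy_coords = {i: (10.0 * i, 0.0, 0.0) for i in range(50)}
toy_coords[30] = (5.0, 0.0, 0.0)
toy_scores = {
    (0, 5): 0.99,   # ignored: sequence separation below 24
    (0, 30): 0.9,   # true contact (5 Å)
    (1, 30): 0.8,   # true contact (5 Å)
    (2, 40): 0.7,
    (3, 45): 0.6,
    (4, 48): 0.5,
    (5, 49): 0.4,
}
print(top_contact_accuracy(toy_scores, toy_coords, length=50, fraction=10))
```

With `fraction=10` the toy example evaluates the top 50/10 = 5 eligible pairs, two of which are contacts. The values reported in the table are averages of this kind of per-domain accuracy over many domains.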