Table 8 Mapping of the datasets to Pfam domains

From: Representativeness of variation benchmark datasets

| Dataset | Unique Pfam domains | Variants with a Pfam domain | % of all variants in dataset with a Pfam domain | Variants mapped to a UniProt sequence | % of UniProt-mapped variants with a Pfam domain | KS statistic^a |
|---|---|---|---|---|---|---|
| DS1 | 5213 | 148,681 | 33.34 | 378,706 | 39.26 | 0.25 (< 10⁻⁴) |
| DS2 | 2065 | 7307 | 30.87 | 18,660 | 39.16 | 0.64 (< 10⁻⁴) |
| DS3 | 794 | 14,228 | 73.59 | 19,318 | 73.65 | 0.86 (< 10⁻⁴) |
| DS4 | 1954 | 6589 | 33.86 | 15,880 | 41.49 | 0.66 (< 10⁻⁴) |
| DS5 | 742 | 10,997 | 75.27 | 14,597 | 75.34 | 0.87 (< 10⁻⁴) |
| DS6 | 1898 | 5293 | 30.03 | 13,811 | 38.32 | 0.67 (< 10⁻⁴) |
| DS7 | 775 | 12,842 | 73.28 | 17,514 | 73.32 | 0.86 (< 10⁻⁴) |
| DS8 | 1810 | 4833 | 33.00 | 11,847 | 40.80 | 0.68 (< 10⁻⁴) |
| DS9 | 727 | 9796 | 74.80 | 13,096 | 74.80 | 0.87 (< 10⁻⁴) |
| DS10 | 1632 | 4396 | 33.65 | 10,882 | 40.40 | 0.72 (< 10⁻⁴) |
| DS11 | 668 | 9641 | 76.61 | 12,584 | 76.61 | 0.88 (< 10⁻⁴) |
| DS12 | 147 | 579 | 36.07 | 1288 | 44.95 | 0.97 (< 10⁻⁴) |
| DS13 | 80 | 897 | 68.95 | 1301 | 68.95 | 0.99 (< 10⁻⁴) |
| DS14 | 1197 | 2656 | 30.66 | 7185 | 36.97 | 0.79 (< 10⁻⁴) |
| DS15 | 551 | 5619 | 78.85 | 7151 | 80.31 | 0.90 (< 10⁻⁴) |
| DS16 | 116 | 354 | 33.62 | 848 | 42.22 | 0.98 (< 10⁻⁴) |
| DS17 | 64 | 526 | 70.04 | 751 | 70.04 | 0.99 (< 10⁻⁴) |
| DS18 | 1265 | 7190 | 44.66 | 12,056 | 59.64 | 0.78 (< 10⁻⁴) |
| DS19 | 1172 | 4859 | 47.33 | 10,154 | 47.85 | 0.80 (< 10⁻⁴) |
| DS20 | 1046 | 4818 | 54.44 | 8662 | 55.62 | 0.82 (< 10⁻⁴) |
| DS21 | 2301 | 20,415 | 50.55 | 39,735 | 51.38 | 0.60 (< 10⁻⁴) |
| DS22 | 2090 | 7727 | 36.53 | 21,151 | 36.53 | 0.64 (< 10⁻⁴) |
| DS23 | 1073 | 16,309 | 73.48 | 22,196 | 73.48 | 0.81 (< 10⁻⁴) |
| DS24 | 3325 | 41,997 | 55.94 | 75,042 | 55.96 | 0.61 (< 10⁻⁴) |

^a p-value in parentheses
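
The percentage of UniProt-mapped variants with a Pfam domain appears to be the ratio of the two count columns. The short Python sketch below recomputes that column for the first three rows as a sanity check. The function name `pfam_coverage_pct` and the `rows` structure are illustrative assumptions, not part of the original analysis.

```python
# Sketch: recompute "% of UniProt-mapped variants with a Pfam domain"
# from the two count columns of the table. Names are hypothetical.

def pfam_coverage_pct(n_with_pfam: int, n_mapped: int) -> float:
    """Return the share of mapped variants that fall in a Pfam domain, in percent."""
    if n_mapped <= 0:
        raise ValueError("n_mapped must be positive")
    return 100.0 * n_with_pfam / n_mapped

# (dataset, variants with a Pfam domain, variants mapped to UniProt),
# copied from the first three rows of the table.
rows = [
    ("DS1", 148_681, 378_706),
    ("DS2", 7_307, 18_660),
    ("DS3", 14_228, 19_318),
]

for name, n_pfam, n_mapped in rows:
    print(f"{name}: {pfam_coverage_pct(n_pfam, n_mapped):.2f}")
```

For these three rows the recomputed values, rounded to two decimals, agree with the table (39.26, 39.16 and 73.65). The percentage of the total dataset cannot be recomputed the same way, because the table does not list the total number of variants per dataset.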