
Table 5 Mapping of the datasets to PDB structures and CATH domains

From: Representativeness of variation benchmark datasets

| dataset | no. of variants mapped to PDB | no. of variants mapped to CATH domain | % mapped to CATH domain (of mapped to PDB) | no. of variants with a CATH classification | % with a CATH classification (of mapped to PDB) | no. of unique CATH superfamilies |
|---|---:|---:|---:|---:|---:|---:|
| DS1 | 39,081 | 23,303 | 59.63 | 21,853 | 55.92 | 700 |
| DS2 | 2358 | 1387 | 58.82 | 1319 | 55.94 | 319 |
| DS3 | 10,242 | 6580 | 64.25 | 6396 | 62.45 | 239 |
| DS4 | 2245 | 1325 | 59.02 | 1262 | 56.21 | 306 |
| DS5 | 7261 | 4687 | 64.55 | 4556 | 62.75 | 227 |
| DS6 | 1743 | 991 | 56.86 | 941 | 53.99 | 277 |
| DS7 | 9519 | 6100 | 64.08 | 5920 | 62.19 | 234 |
| DS8 | 1706 | 973 | 57.03 | 928 | 54.40 | 269 |
| DS9 | 6652 | 4301 | 64.66 | 4170 | 62.69 | 223 |
| DS10 | 1731 | 865 | 49.97 | 826 | 47.72 | 253 |
| DS11 | 6420 | 4350 | 67.76 | 4212 | 65.61 | 220 |
| DS12 | 150 | 66 | 44.00 | 62 | 41.33 | 32 |
| DS13 | 481 | 142 | 29.52 | 135 | 28.07 | 18 |
| DS14 | 953 | 478 | 50.16 | 454 | 47.64 | 186 |
| DS15 | 3728 | 2557 | 68.59 | 2486 | 66.68 | 188 |
| DS16 | 82 | 38 | 46.34 | 36 | 43.90 | 21 |
| DS17 | 272 | 78 | 28.68 | 73 | 26.84 | 12 |
| DS18 | 4494 | 2980 | 66.31 | 2862 | 63.68 | 274 |
| DS19 | 3418 | 2081 | 60.88 | 2035 | 59.54 | 210 |
| DS20 | 2985 | 2086 | 69.88 | 2031 | 68.04 | 235 |
| DS21 | 10,990 | 7051 | 64.16 | 6786 | 61.75 | 402 |
| DS22 | 2169 | 1301 | 59.98 | 1217 | 56.11 | 291 |
| DS23 | 10,290 | 6566 | 63.81 | 6353 | 61.74 | 307 |
| DS24 | 12,749 | 7828 | 61.40 | 7499 | 58.82 | 347 |
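
Both percentage columns in Table 5 are stated as shares of the variants mapped to PDB, so each can be recomputed from the counts in the same row. The following is a minimal sketch of that arithmetic, not the authors' own pipeline; the function name `pct_of_pdb_mapped` and the dictionary keys are hypothetical, and rounding to two decimal places is an assumption inferred from the precision shown in the table.

```python
def pct_of_pdb_mapped(count: int, mapped_to_pdb: int) -> float:
    """Return `count` as a percentage of the PDB-mapped variants.

    Hypothetical helper: rounds to two decimals to match the table's precision.
    """
    if mapped_to_pdb <= 0:
        raise ValueError("mapped_to_pdb must be a positive count")
    return round(100 * count / mapped_to_pdb, 2)


# Counts copied from the DS1 row of Table 5 (key names are illustrative).
ds1 = {"pdb": 39081, "cath_domain": 23303, "cath_classified": 21853}

print(pct_of_pdb_mapped(ds1["cath_domain"], ds1["pdb"]))      # 59.63
print(pct_of_pdb_mapped(ds1["cath_classified"], ds1["pdb"]))  # 55.92
```

Because the denominator is the PDB-mapped count rather than the full dataset size, these percentages describe structural coverage only among variants that already have a PDB mapping.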