
Table 4 Number of shifts and features of the data set per atom super class, evaluated on the ‘raw’ BMRB data set

From: NightShift: NMR shift inference by general hybrid model training - a framework for NMR chemical shift prediction

 

|  | N | CA | CB | C | H | HA | HB | HD | HEHZ | HG |
|---|---|---|---|---|---|---|---|---|---|---|
| Orig num shifts | 65,440 | 66,870 | 60,761 | 48,686 | 68,496 | 71,243 | 62,116 | 37,523 | 21,548 | 43,227 |
| Orig num features | 111 | 111 | 111 | 111 | 111 | 111 | 111 | 111 | 111 | 111 |
| Final num shifts train | 39,147 | 39,947 | 36,211 | 29,065 | 41,076 | 42,639 | 37,263 | 22,508 | 12,919 | 25,932 |
| Final num shifts test | 26,099 | 26,632 | 24,142 | 19,377 | 27,385 | 28,427 | 24,843 | 15,006 | 8,613 | 17,289 |
| Final num features | 44 | 40 | 39 | 44 | 44 | 44 | 38 | 43 | 38 | 38 |

  1. The maximum identity between proteins in the test and training data sets was below 10%. The first two rows show the numbers for the raw data set before the training procedure was applied; rows three and four show the final number of shifts in the training and test sets; and the last row shows the number of features used by the models (numbers for RefDB are similar).
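The relationship between the rows can be checked with a little arithmetic. The following is a minimal sketch, not part of the original work: it transcribes the values of Table 4 and derives, per atom super class, the share of raw shifts that remain after processing, the share of the final shifts assigned to training, and the number of features dropped. The names `summarize`, `summary`, and the dictionary keys are hypothetical and chosen only for illustration.

```python
# Values transcribed from Table 4 (raw BMRB data set).
ATOM_CLASSES = ["N", "CA", "CB", "C", "H", "HA", "HB", "HD", "HEHZ", "HG"]
ORIG_SHIFTS = [65440, 66870, 60761, 48686, 68496, 71243, 62116, 37523, 21548, 43227]
TRAIN_SHIFTS = [39147, 39947, 36211, 29065, 41076, 42639, 37263, 22508, 12919, 25932]
TEST_SHIFTS = [26099, 26632, 24142, 19377, 27385, 28427, 24843, 15006, 8613, 17289]
ORIG_FEATURES = 111  # identical for every atom super class
FINAL_FEATURES = [44, 40, 39, 44, 44, 44, 38, 43, 38, 38]


def summarize():
    """Derive per-class ratios from the table rows (illustrative helper)."""
    result = {}
    rows = zip(ATOM_CLASSES, ORIG_SHIFTS, TRAIN_SHIFTS, TEST_SHIFTS, FINAL_FEATURES)
    for name, orig, train, test, feats in rows:
        final_total = train + test
        result[name] = {
            "final_total": final_total,
            # Share of the raw shifts that survive into train + test.
            "retained_fraction": final_total / orig,
            # Share of the surviving shifts that end up in the training set.
            "train_fraction": train / final_total,
            # Features removed between the raw and the final feature set.
            "features_dropped": ORIG_FEATURES - feats,
        }
    return result


summary = summarize()
for name, stats in summary.items():
    print(
        f"{name:5s} retained={stats['retained_fraction']:.3f} "
        f"train={stats['train_fraction']:.3f} "
        f"dropped_features={stats['features_dropped']}"
    )
```

With the values above, more than 99% of the raw shifts are retained in every class, and the training share comes out at about 60% throughout. The split ratio is not stated in the table or its footnote; it is only what the counts imply.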