Table 4 Number of shifts and features of the data set per atom super class, evaluated on the ‘raw’ BMRB data set

From: NightShift: NMR shift inference by general hybrid model training - a framework for NMR chemical shift prediction

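| | N | CA | CB | C | H | HA | HB | HD | HEHZ | HG |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Orig num shifts | 65,440 | 66,870 | 60,761 | 48,686 | 68,496 | 71,243 | 62,116 | 37,523 | 21,548 | 43,227 |
| Orig num features | 111 | 111 | 111 | 111 | 111 | 111 | 111 | 111 | 111 | 111 |
| Final num shifts train | 39,147 | 39,947 | 36,211 | 29,065 | 41,076 | 42,639 | 37,263 | 22,508 | 12,919 | 25,932 |
| Final num shifts test | 26,099 | 26,632 | 24,142 | 19,377 | 27,385 | 28,427 | 24,843 | 15,006 | 8,613 | 17,289 |
| Final num features | 44 | 40 | 39 | 44 | 44 | 44 | 38 | 43 | 38 | 38 |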
  1. The maximum identity between proteins in the test and training data sets was below 10%. The first two rows show the numbers for the raw data set before the training procedure was applied; rows three and four show the number of shifts in the final training and test sets; and the last row shows the number of features used by the models (numbers for the RefDB are similar).
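
The relationship between the original and final shift counts in Table 4 can be checked with a little arithmetic. The following is a minimal sketch, not part of the NightShift framework: the names `SHIFT_COUNTS` and `split_summary` are hypothetical, and the counts are simply copied from the table. For each atom super class it computes the fraction of the original shifts that appear in the final training and test sets, and the share of those final shifts that fall into the training set.

```python
# Counts copied from Table 4 ('raw' BMRB data set), keyed by atom super class.
# Tuple layout (chosen for this sketch only): (original, final train, final test).
SHIFT_COUNTS = {
    "N": (65440, 39147, 26099),
    "CA": (66870, 39947, 26632),
    "CB": (60761, 36211, 24142),
    "C": (48686, 29065, 19377),
    "H": (68496, 41076, 27385),
    "HA": (71243, 42639, 28427),
    "HB": (62116, 37263, 24843),
    "HD": (37523, 22508, 15006),
    "HEHZ": (21548, 12919, 8613),
    "HG": (43227, 25932, 17289),
}


def split_summary(original, train, test):
    """Return the retained fraction and the training share for one atom class."""
    final = train + test
    return {
        "final_total": final,
        "retained_fraction": final / original,  # final shifts relative to original
        "train_fraction": train / final,        # training share of the final shifts
    }


# Per-class summary derived purely from the tabulated counts.
summary = {atom: split_summary(*counts) for atom, counts in SHIFT_COUNTS.items()}

for atom, stats in summary.items():
    print(
        f"{atom:5s} retained={stats['retained_fraction']:.3f} "
        f"train={stats['train_fraction']:.3f}"
    )
```

With the counts listed above, the training share comes out at about 0.60 for every atom super class, and more than 99% of the original shifts are present in the final training and test sets combined. The table itself does not state the split ratio; this is only what the tabulated numbers imply.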