Table 1 Data statistics of positive and negative sequences (with window size 21) in training and testing datasets

From: Investigation and identification of protein carbonylation sites based on position-specific amino acid composition and physicochemical features

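| Dataset | Residues | Number of proteins | Number of positive sequences | Number of negative sequences |
| --- | --- | --- | --- | --- |
| Training dataset | K | 155 | 206 | 1166 |
| Training dataset | R | 90 | 101 | 504 |
| Training dataset | T | 81 | 96 | 488 |
| Training dataset | P | 77 | 94 | 412 |
| Independent testing dataset | K | 67 | 78 | 301 |
| Independent testing dataset | R | 65 | 67 | 276 |
| Independent testing dataset | T | 50 | 53 | 124 |
| Independent testing dataset | P | 71 | 82 | 304 |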
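
The counts in Table 1 show that negative sequences outnumber positive sequences for every residue type. As a minimal sketch, the ratio of negative to positive sequences can be computed directly from the table. The names `TABLE_1` and `negative_to_positive_ratio` are illustrative and not taken from the article, and the article may handle any imbalance differently.

```python
# Counts copied from Table 1.
# Key: (dataset, residue); value: (proteins, positive sequences, negative sequences).
# The dictionary name and layout are illustrative assumptions, not from the article.
TABLE_1 = {
    ("training", "K"): (155, 206, 1166),
    ("training", "R"): (90, 101, 504),
    ("training", "T"): (81, 96, 488),
    ("training", "P"): (77, 94, 412),
    ("testing", "K"): (67, 78, 301),
    ("testing", "R"): (65, 67, 276),
    ("testing", "T"): (50, 53, 124),
    ("testing", "P"): (71, 82, 304),
}


def negative_to_positive_ratio(positives: int, negatives: int) -> float:
    """Return the number of negative sequences per positive sequence."""
    return negatives / positives


if __name__ == "__main__":
    for (dataset, residue), (_, positives, negatives) in TABLE_1.items():
        ratio = negative_to_positive_ratio(positives, negatives)
        print(f"{dataset:8s} {residue}: {ratio:.2f} negatives per positive")
```

A ratio well above 1 for a residue means a classifier trained on those sequences sees far more negative than positive examples, which is worth keeping in mind when reading accuracy figures.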