
Table 1 Datasets used to train and test a MoRF predictor

From: Discovering MoRFs by trisecting intrinsically disordered protein sequence into terminals and middle regions

| Set type | Data set | No. of sequences | Total residues | No. of MoRF residues | No. of non-MoRF residues |
|---|---|---|---|---|---|
| training set | TRAIN | 421 | 245,984 | 5396 | 240,588 |
| test sets | TEST | 419 | 258,829 | 5153 | 253,676 |
| | NEW | 45 | 37,533 | 626 | 36,907 |
| | TEST464 | 464 | 296,362 | 5779 | 290,583 |
| | TEST266 | 266 | 154,399 | 3305 | 151,094 |
| validation set | EXP53 | 53 | 25,186 | 2432 | 22,754 |
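The counts in Table 1 can be cross-checked with a little arithmetic. The sketch below is illustrative only and not part of the original article: it copies the figures from the table, recomputes the non-MoRF count as total minus MoRF residues, and reports the fraction of MoRF residues in each set. The names `DATASETS`, `non_morf_residues`, and `morf_fraction` are hypothetical helpers introduced here.

```python
# Figures copied from Table 1: (sequences, total residues, MoRF residues).
# The dictionary and helper names are illustrative assumptions.
DATASETS = {
    "TRAIN": (421, 245_984, 5_396),
    "TEST": (419, 258_829, 5_153),
    "NEW": (45, 37_533, 626),
    "TEST464": (464, 296_362, 5_779),
    "TEST266": (266, 154_399, 3_305),
    "EXP53": (53, 25_186, 2_432),
}


def non_morf_residues(total: int, morf: int) -> int:
    """Non-MoRF residues are whatever is left after removing MoRF residues."""
    return total - morf


def morf_fraction(total: int, morf: int) -> float:
    """Share of residues annotated as MoRF (the positive class)."""
    return morf / total


if __name__ == "__main__":
    for name, (n_seq, total, morf) in DATASETS.items():
        print(
            f"{name:8s} sequences={n_seq:4d} "
            f"non-MoRF={non_morf_residues(total, morf):7d} "
            f"MoRF fraction={morf_fraction(total, morf):.4f}"
        )
```

Run on these figures, MoRF residues make up roughly 2% of TRAIN, so the classes are heavily imbalanced. The TEST464 counts also equal the sum of the TEST and NEW rows, which suggests that TEST464 is the union of those two sets.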