
Table 3. Feature statistics for different datasets (GM = generic model; BM = biological model). Note that the feature list used in the BM is longer than that of the GM due to the additional binary biological features (has-protein, has-two-proteins, etc.).

From: Identification of transcription factor contexts in literature using machine learning approaches

  

| Statistic | Model | TF data | PPI data | NonPF data |
|---|---|---|---|---|
| total # features | GM | 1327 | 1188 | 1780 |
| | BM | 803 | 760 | 1306 |
| # features per sentence | GM | 9.70 | 14.44 | 11.43 |
| | BM | 12.87 | 17.73 | 9.78 |