
Table 2 Recognition rate of Decision Tree models

From: Developing and validating predictive decision tree models from mining chemical structural fingerprints and high–throughput screening data in PubChem

  

Column groups: Active compounds = TP, FN, Sensitivity; Inactive compounds = TN, FP, Specificity.

| Bioassay | PubChem Assay ID | TP | FN | Sensitivity | TN | FP | Specificity | Overall accuracy | MCC | Model Complexity (Number of Nodes/Number of Leaves/Number of Features) |
|---|---|---|---|---|---|---|---|---|---|---|
| 5HT1a agonist | 567 | 362 | 4 | 98.9% | 64,394 | 146 | 99.8% | 99.8% | 0.84 | (321/161/149) |
| 5HT1a antagonist | 612 | 360 | 56 | 86.5% | 60,909 | 281 | 99.5% | 99.5% | 0.70 | (1135/568/261) |
| HIV-1 RT RNase H inhibitor | 565 | 1,128 | 122 | 90.2% | 63,070 | 896 | 98.6% | 98.4% | 0.70 | (3003/1502/412) |
| HIV-1 RT RNase H inhibitor | 372 | 640 | 130 | 83.1% | 98,463 | 535 | 99.5% | 99.3% | 0.67 | (2511/1256/370) |

  1. TP = true positives, the number of correctly recognized active compounds;
  2. FN = false negatives, the number of active compounds that the model fails to recognize;
  3. TN = true negatives, the number of inactive compounds that are correctly recognized by the model;
  4. FP = false positives, the number of inactive compounds that the model fails to recognize as inactive.