
Table 1 Data split and number of classes for the two tasks analyzed

From: Deep active learning for classifying cancer pathology reports

| Dataset | Initial training | Validation | Testing | Holdout | Classes |
|---|---|---|---|---|---|
| Large | 15,000 | 18,032 | 20,036 | 147,284 | 525 (Histology), 317 (Subsite) |
| Small | 1,000 | 18,032 | 20,036 | 161,284 | 525 (Histology), 317 (Subsite) |