VarSight: prioritizing clinically reported variants with binary classification algorithms

BMC Bioinformatics

Table 2 Classifier performance statistics

Classifier	CV10 Acc.	AUROC	AUPRC
RandomForest(sklearn)	0.84 ± 0.13	0.9282	0.1961
LogisticRegression(sklearn)	0.84 ± 0.13	0.9300	0.2458
BalancedRandomForest(imblearn)	0.86 ± 0.11	0.9313	0.2015
EasyEnsembleClassifier(imblearn)	0.85 ± 0.08	0.9303	0.1918

For each tuned classifier, we show performance measures commonly used for classifiers (from left to right): 10-fold cross-validation balanced accuracy (CV10 Acc.), area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC). The CV10 Acc. was gathered during hyperparameter tuning by calculating the mean and standard deviation of the balanced accuracy across the 10 cross-validation folds. AUROC and AUPRC were evaluated on the testing set after hyperparameter tuning and fitting to the full training set.
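
The three measures in Table 2 can be illustrated with a minimal pure-Python sketch. It is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical. AUPRC is estimated here as average precision, which is one common estimator and may differ from the one used for the table. The fold summary uses the population standard deviation, which is also an assumption. In practice, `balanced_accuracy_score`, `roc_auc_score`, and `average_precision_score` in `sklearn.metrics` compute comparable quantities.

```python
from statistics import mean, pstdev


def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on positives and the recall on negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Tied scores count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Average of the precision at the rank of each positive.

    Simplification: assumes no tied scores among the ranked items.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


def summarize_folds(fold_scores):
    """Mean and (population) standard deviation across CV folds."""
    return mean(fold_scores), pstdev(fold_scores)


# Hypothetical, imbalanced toy data: 2 positives among 10 items.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.65, 0.7, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("AUROC:", auroc(y_true, scores))
print("average precision:", average_precision(y_true, scores))
```

AUROC and average precision respond differently to false positives. AUROC normalizes by the number of negatives, while precision is computed relative to the number of items ranked above each positive, so with rare positives a few high-scoring negatives reduce average precision far more than AUROC. This is consistent with the high AUROC and low AUPRC values reported in the table.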

ISSN: 1471-2105