Table 5 Performance of all feature descriptors with various machine learning algorithms on the independent dataset

From: Analysis and prediction of single-stranded and double-stranded DNA binding proteins based on protein sequences

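| Method | Features | ACC | SN | SP | AUC | MCC | F1 |
|---|---|---|---|---|---|---|---|
| Random forest | OAAC | 0.697 | 0.734 | 0.585 | 0.660 | 0.290 | 0.785 |
| Random forest | Dipeptide = 0 | 0.570 | 0.557 | 0.610 | 0.583 | 0.144 | 0.660 |
| Random forest | Dipeptide = 1 | 0.696 | 0.731 | 0.590 | 0.687 | 0.292 | 0.784 |
| Random forest | Dipeptide = 2 | 0.546 | 0.516 | 0.634 | 0.575 | 0.130 | 0.631 |
| Random forest | AAindex | 0.703 | 0.798 | 0.415 | 0.607 | 0.211 | 0.802 |
| Random forest | PSSM | 0.703 | 0.774 | 0.488 | 0.631 | 0.249 | 0.797 |
| Random forest | All features | 0.727 | 0.807 | 0.488 | 0.647 | 0.288 | 0.816 |
| SVM | All features | 0.642 | 0.613 | 0.732 | 0.672 | 0.298 | 0.720 |
| Baseline | All features | 0.509 | 0.492 | 0.558 | 0.526 | 0.044 | 0.591 |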
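
For readers unfamiliar with the column abbreviations, the following is a minimal sketch of how ACC (accuracy), SN (sensitivity), SP (specificity), MCC (Matthews correlation coefficient) and F1 are conventionally computed from a binary confusion matrix. It uses the standard textbook definitions and is not taken from the article's code. The function name `binary_metrics` and the example counts are hypothetical, and the table does not state which binding class was treated as positive. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn/tn/fp are true positives, false negatives, true negatives and
    false positives. Which class counts as "positive" is an assumption
    left to the caller; the table does not specify it.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    acc = ratio(tp + tn, tp + fn + tn + fp)   # ACC: fraction of correct calls
    sn = ratio(tp, tp + fn)                   # SN: recall on the positive class
    sp = ratio(tn, tn + fp)                   # SP: recall on the negative class
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sn, precision + sn)
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc, "F1": f1}


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not data from the article.
    for name, value in binary_metrics(tp=80, fn=20, tn=30, fp=10).items():
        print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it can stay low even when ACC and F1 look high on a dataset dominated by one class, which is one reason to report it alongside the other columns.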