Table 2 The prediction results of integrative feature representation for H. sapiens via 10-fold cross-validation by SVM, ELM and XGBoost

From: Comparative analysis and prediction of nucleosome positioning using integrative feature representation and machine learning algorithms

| Method | Feature | Parameter | ACC | Sn | Sp | MCC | AUC |
|---|---|---|---|---|---|---|---|
| SVM | FCGR + DAC | K = 4, lag = 2 | 0.8708 | 0.8896 | 0.8522 | 0.7425 | 0.9315 |
| | FCGR + TAC | K = 4, lag = 2 | 0.8679 | 0.8878 | 0.8483 | 0.7369 | 0.9288 |
| | FCGR + DACC | K = 4, lag = 2 | 0.8537 | 0.8531 | 0.8544 | 0.7079 | 0.9208 |
| | FCGR + TACC | K = 4, lag = 2 | 0.8415 | 0.8319 | 0.8509 | 0.6837 | 0.9113 |
| | FCGR + PCPseDNC | K = 4, λ = 8, w = 0.5 | 0.8673 | 0.8936 | 0.8413 | 0.7359 | 0.9286 |
| | FCGR + PCPseTNC | K = 4, λ = 8, w = 0.5 | 0.8708 | 0.8966 | 0.8452 | 0.7429 | 0.9273 |
| | All features | | 0.8137 | 0.7518 | 0.8748 | 0.6322 | 0.8996 |
| ELM | FCGR + DAC | K = 4, lag = 2 | 0.8292 | 0.8539 | 0.8048 | 0.6598 | 0.9007 |
| | FCGR + TAC | K = 4, lag = 2 | 0.8297 | 0.8531 | 0.8065 | 0.6604 | 0.8977 |
| | FCGR + DACC | K = 4, lag = 2 | 0.8336 | 0.8627 | 0.8048 | 0.6689 | 0.9009 |
| | FCGR + TACC | K = 4, lag = 2 | 0.8325 | 0.8632 | 0.8022 | 0.6668 | 0.8983 |
| | FCGR + PCPseDNC | K = 4, λ = 8, w = 0.5 | 0.8314 | 0.8658 | 0.7974 | 0.6648 | 0.8985 |
| | FCGR + PCPseTNC | K = 4, λ = 8, w = 0.5 | 0.8248 | 0.8544 | 0.7957 | 0.6516 | 0.8947 |
| | All features | | 0.8356 | 0.8632 | 0.8083 | 0.6735 | 0.9013 |
| XGBoost | FCGR | K = 1 + 2 + 4 | 0.8585 | 0.89309 | 0.8244 | 0.71934 | 0.9197 |
| | FCGR + DAC | K = 4, lag = 2 | 0.8450 | 0.87503 | 0.8152 | 0.69182 | 0.9160 |
| | FCGR + TAC | K = 4, lag = 2 | 0.8402 | 0.8733 | 0.8074 | 0.68221 | 0.9136 |
| | FCGR + DACC | K = 4, lag = 2 | 0.8423 | 0.86583 | 0.8191 | 0.68588 | 0.9127 |
| | FCGR + TACC | K = 4, lag = 2 | 0.8391 | 0.87287 | 0.8057 | 0.68059 | 0.9115 |
| | FCGR + PCPseDNC | K = 4, λ = 8, w = 0.5 | 0.8559 | 0.88913 | 0.8230 | 0.71396 | 0.9207 |
| | FCGR + PCPseTNC | K = 4, λ = 8, w = 0.5 | 0.8498 | 0.88254 | 0.8174 | 0.70168 | 0.9183 |
| | All features | | 0.8472 | 0.87374 | 0.8209 | 0.69581 | 0.9170 |

  1. "All features" means the feature vector FCGR + DACC + TACC + PC-PseDNC + PC-pseTAC, with each component using the same parameters as the corresponding individual feature. K is the K-nucleotide (K-mer) length used in FCGR; lag is the lag distance along the sequence; λ is the highest counted rank (or tier) of the correlation along a DNA sequence; w is the weight factor, ranging from 0 to 1
  2. Best values are in bold
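
The table reports ACC, Sn, Sp and MCC without stating their formulas. The sketch below assumes the standard binary-classification definitions (accuracy, sensitivity, specificity and Matthews correlation coefficient), computed from confusion-matrix counts. The function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper. AUC is left out because it needs per-sample scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return ACC, Sn, Sp and MCC from confusion-matrix counts.

    Assumes the standard definitions; tp/fn count positive samples,
    tn/fp count negative samples.
    """
    acc = (tp + tn) / (tp + fn + tn + fp)   # overall accuracy
    sn = tp / (tp + fn)                     # sensitivity (recall on positives)
    sp = tn / (tn + fp)                     # specificity (recall on negatives)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; report 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In a k-fold setting these counts would typically be accumulated over the held-out folds, or the metrics averaged per fold. Which of the two was used for the table is not stated here.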