
Table 4 The prediction results of integrative feature representation for D. melanogaster via 10-fold cross-validation by SVM, ELM and XGBoost

From: Comparative analysis and prediction of nucleosome positioning using integrative feature representation and machine learning algorithms

| Method | Feature | Parameter | ACC | Sn | Sp | MCC | AUC |
|---|---|---|---|---|---|---|---|
| SVM | FCGR + DAC | K = 4, lag = 2 | 0.8047 | 0.7862 | 0.8235 | 0.6103 | 0.8762 |
| | FCGR + TAC | K = 4, lag = 2 | 0.8089 | 0.7835 | 0.8347 | 0.6190 | 0.8747 |
| | FCGR + DACC | K = 4, lag = 2 | 0.7753 | 0.7772 | 0.7733 | 0.5509 | 0.8295 |
| | FCGR + TACC | K = 4, lag = 2 | 0.7560 | 0.6772 | 0.8361 | 0.5199 | 0.8247 |
| | FCGR + PCPseDNC | K = 4, λ = 8, w = 0.5 | 0.8073 | 0.7797 | 0.8354 | 0.6162 | 0.8803 |
| | FCGR + PCPseTNC | K = 4, λ = 8, w = 0.5 | 0.8057 | 0.7835 | 0.8284 | 0.6129 | 0.8769 |
| | All features | | 0.7510 | 0.6828 | 0.8204 | 0.5078 | 0.7987 |
| ELM | FCGR + DAC | K = 2, lag = 2 | 0.7920 | 0.7779 | 0.8063 | 0.5847 | 0.8644 |
| | FCGR + TAC | K = 2, lag = 2 | 0.7917 | 0.7807 | 0.8028 | 0.5839 | 0.8651 |
| | FCGR + DACC | K = 2, lag = 2 | 0.7769 | 0.7617 | 0.7923 | 0.5544 | 0.8503 |
| | FCGR + TACC | K = 2, lag = 2 | 0.7694 | 0.7735 | 0.7653 | 0.5391 | 0.8460 |
| | FCGR + PCPseDNC | K = 2, λ = 8, w = 0.5 | 0.7896 | 0.7631 | 0.8165 | 0.5806 | 0.8651 |
| | FCGR + PCPseTNC | K = 2, λ = 8, w = 0.5 | 0.7595 | 0.7341 | 0.7853 | 0.5206 | 0.8400 |
| | All features | | 0.7847 | 0.7810 | 0.7884 | 0.5700 | 0.8576 |
| XGBoost | FCGR | K = 1 + 2 + 4 | 0.7976 | 0.7797 | 0.8158 | 0.5959 | 0.8725 |
| | FCGR + DAC | K = 4, lag = 2 | 0.7873 | 0.7717 | 0.8032 | 0.5751 | 0.8613 |
| | FCGR + TAC | K = 4, lag = 2 | 0.7877 | 0.7624 | 0.8133 | 0.5768 | 0.8647 |
| | FCGR + DACC | K = 4, lag = 2 | 0.7724 | 0.7814 | 0.7632 | 0.5450 | 0.8532 |
| | FCGR + TACC | K = 4, lag = 2 | 0.7824 | 0.7693 | 0.7958 | 0.5658 | 0.8542 |
| | FCGR + PCPseDNC | K = 4, λ = 8, w = 0.5 | 0.7997 | 0.7824 | 0.8172 | 0.6001 | 0.8725 |
| | FCGR + PCPseTNC | K = 4, λ = 8, w = 0.5 | 0.7988 | 0.7790 | 0.8190 | 0.5989 | 0.8775 |
| | All features | | 0.7951 | 0.7793 | 0.8112 | 0.5909 | 0.8718 |

  1. "All features" means the feature vector FCGR + DACC + TACC + PC-PseDNC + PC-PseTNC, with each component using the same parameters as the corresponding individual feature. K is the K-nucleotide length used in FCGR; lag is the lag distance along the sequence; λ is the highest counted rank (or tier) of the correlation along a DNA sequence; w is the weight factor, ranging from 0 to 1
  2. Best values are in bold
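
For readers less familiar with the column headers, the threshold-based metrics in the table (ACC, Sn, Sp, MCC) can be sketched from confusion-matrix counts as below. This is a minimal illustration using the standard textbook definitions, not the authors' evaluation code; the function name `binary_metrics` and the example counts are hypothetical. AUC is not included because it needs the classifier's continuous scores rather than counts.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive samples predicted correctly / incorrectly
    tn/fp: negative samples predicted correctly / incorrectly
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total            # accuracy (ACC)
    sn = tp / (tp + fn)                # sensitivity (Sn), true positive rate
    sp = tn / (tn + fp)                # specificity (Sp), true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Matthews correlation coefficient; set to 0.0 when the denominator is zero
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}

# Hypothetical counts, for illustration only (not taken from the paper)
metrics = binary_metrics(tp=80, tn=90, fp=10, fn=20)
print({name: round(value, 4) for name, value in metrics.items()})
```

In a 10-fold cross-validation setting, one common choice is to compute these values per fold and average them, though the table does not state which aggregation was used.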