
Table 3 The prediction results of integrative feature representation for C. elegans via 10-fold cross-validation by SVM, ELM and XGBoost

From: Comparative analysis and prediction of nucleosome positioning using integrative feature representation and machine learning algorithms

| Method | Feature | Parameter | ACC | Sn | Sp | MCC | AUC |
|---|---|---|---|---|---|---|---|
| SVM | FCGR + DAC | K = 4, lag = 2 | 0.8574 | 0.8863 | 0.8290 | 0.7164 | 0.9283 |
| | FCGR + TAC | K = 4, lag = 2 | 0.8561 | 0.8824 | 0.8302 | 0.7137 | 0.9272 |
| | FCGR + DACC | K = 4, lag = 2 | 0.8471 | 0.8777 | 0.8171 | 0.6961 | 0.9122 |
| | FCGR + TACC | K = 4, lag = 2 | 0.8470 | 0.8641 | 0.8301 | 0.6949 | 0.9179 |
| | FCGR + PCPseDNC | K = 4, λ = 8, w = 0.5 | 0.8576 | 0.8921 | 0.8236 | 0.7176 | 0.9275 |
| | FCGR + PCPseTNC | K = 4, λ = 8, w = 0.5 | 0.8539 | 0.8839 | 0.8244 | 0.7096 | 0.9275 |
| | All features | | 0.8431 | 0.8461 | 0.8401 | 0.6867 | 0.9139 |
| ELM | FCGR + DAC | K = 4, lag = 2 | 0.8707 | 0.8863 | 0.8555 | 0.7421 | 0.9355 |
| | FCGR + TAC | K = 4, lag = 2 | 0.8696 | 0.8890 | 0.8505 | 0.7400 | 0.9359 |
| | FCGR + DACC | K = 4, lag = 2 | 0.8684 | 0.8831 | 0.8539 | 0.7376 | 0.9358 |
| | FCGR + TACC | K = 4, lag = 2 | 0.8680 | 0.8917 | 0.8447 | 0.7371 | 0.9329 |
| | FCGR + PCPseDNC | K = 4, λ = 8, w = 0.5 | 0.8624 | 0.8847 | 0.8405 | 0.7258 | 0.9318 |
| | FCGR + PCPseTNC | K = 4, λ = 8, w = 0.5 | 0.8557 | 0.8847 | 0.8271 | 0.7132 | 0.9262 |
| | All features | | 0.8597 | 0.8863 | 0.8336 | 0.7210 | 0.9271 |
| XGBoost | FCGR | K = 1 + 2 + 4 | 0.8487 | 0.8797 | 0.8182 | 0.6995 | 0.9202 |
| | FCGR + DAC | K = 4, lag = 2 | 0.8416 | 0.8652 | 0.8182 | 0.6842 | 0.9165 |
| | FCGR + TAC | K = 4, lag = 2 | 0.8433 | 0.8707 | 0.8163 | 0.6882 | 0.9169 |
| | FCGR + DACC | K = 4, lag = 2 | 0.8462 | 0.8703 | 0.8225 | 0.6938 | 0.9170 |
| | FCGR + TACC | K = 4, lag = 2 | 0.8417 | 0.8676 | 0.8163 | 0.6848 | 0.9162 |
| | FCGR + PCPseDNC | K = 4, λ = 8, w = 0.5 | 0.8450 | 0.8749 | 0.8156 | 0.6917 | 0.9199 |
| | FCGR + PCPseTNC | K = 4, λ = 8, w = 0.5 | 0.8493 | 0.8789 | 0.8202 | 0.7004 | 0.9178 |
| | All features | | 0.8481 | 0.8695 | 0.8271 | 0.6973 | 0.9195 |

  1. "All features" means the feature vector FCGR + DACC + TACC + PC-PseDNC + PC-pseTAC, with each feature using the same parameters as in its corresponding row. K gives the k-nucleotide length(s) used in FCGR; lag is the lag distance along the sequence; λ is the highest counted rank (or tier) of the correlation along a DNA sequence; w is the weight factor, ranging from 0 to 1
  2. Best values are in bold
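
For readers less familiar with the column headings, the following is a minimal sketch of how ACC, Sn (sensitivity), Sp (specificity) and MCC are conventionally computed from the counts of a binary confusion matrix. It uses the standard textbook definitions, not code from the article; the function name `binary_metrics` and the example counts are hypothetical and do not correspond to any row of the table. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return ACC, Sn, Sp and MCC from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    acc = (tp + tn) / (tp + fn + tn + fp)   # overall accuracy
    sn = tp / (tp + fn)                     # sensitivity: true positive rate
    sp = tn / (tn + fp)                     # specificity: true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}


# Hypothetical counts, for illustration only (not data from the article)
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name} = {value:.4f}")
```

In a 10-fold cross-validation setting, such metrics are typically either averaged over the folds or computed once from the pooled predictions; which of the two the table reports is not stated here.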