
Table 4 Classification using topo-relevant cohort

From: Gene expression data classification using topology and machine learning models

| EXPR | Metric | Decision tree, FULL | Decision tree, H1+H2 | Decision tree, H2 | Naive Bayes classifier, FULL | Naive Bayes classifier, H1+H2 | Naive Bayes classifier, H2 |
|---|---|---|---|---|---|---|---|
| Droso breeding | # | 131 | 116 | 101 | 131 | 116 | 101 |
| | Accuracy | 0.714125 | 0.751768 | 0.793434 | 0.398146 | 0.412121 | 0.422444 |
| | Precision | 0.745417 | 0.815000 | 0.835000 | 0.389111 | 0.431756 | 0.451673 |
| | Recall | 0.712500 | 0.754167 | 0.795833 | 0.400000 | 0.416667 | 0.434478 |
| Droso parasitoid | # | 89 | 85 | 51 | | | |
| | Accuracy | 0.792778 | 0.796667 | 0.811667 | – | – | – |
| | Precision | 0.817381 | 0.823571 | 0.859167 | – | – | – |
| | Recall | 0.792500 | 0.797500 | 0.825000 | – | – | – |
| Mouse prion | # | 321 | 292 | 168 | 321 | 292 | 146 |
| | Accuracy | 0.562310 | 0.616240 | 0.586843 | 0.555112 | 0.578489 | 0.576131 |
| | Precision | 0.562716 | 0.591471 | 0.543743 | 0.383462 | 0.378556 | 0.384572 |
| | Recall | 0.539712 | 0.564394 | 0.558267 | 0.415855 | 0.422354 | 0.423651 |
| Mouse liver cancer | # | 242 | 229 | 190 | 242 | 229 | 190 |
| | Accuracy | 0.682761 | 0.698934 | 0.729545 | 0.723232 | 0.723232 | 0.721404 |
| | Precision | 0.590716 | 0.579833 | 0.656051 | 0.444761 | 0.444761 | 0.412018 |
| | Recall | 0.573319 | 0.602582 | 0.641168 | 0.499837 | 0.499837 | 0.506429 |
| Mouse E.Coli | # | 226 | 206 | 166 | 226 | 206 | 166 |
| | Accuracy | 0.880731 | 0.851794 | 0.892900 | 0.592770 | 0.592105 | 0.592105 |
| | Precision | 0.880541 | 0.853406 | 0.901481 | 0.604010 | 0.651101 | 0.652203 |
| | Recall | 0.868052 | 0.842963 | 0.891786 | 0.509841 | 0.511111 | 0.511111 |
| Human bowel disease | # | 1745 | 101 | 101 | | | |
| | Accuracy | 0.499698 | 0.510987 | 0.510987 | – | – | – |
| | Precision | 0.493808 | 0.509147 | 0.509147 | – | – | – |
| | Recall | 0.491258 | 0.501173 | 0.501173 | – | – | – |

  1. Each dataset is described in the Dataset section. The # symbol indicates the size of each dataset. A ‘–’ in the table means the statistics were too low, i.e. the relevant classifier was unable to classify the given data. The column ‘FULL’ represents training on the full dataset, \({H}_1+{H}_2\) represents the union of the \(n'\) topo-relevant cohorts obtained from the dominant cycles in either \({H}_1\) or \({H}_2\), and \(H_2\) represents the cohorts obtained from the dominant cycles in \(H_2\) only.
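
As a reading aid for the Accuracy, Precision and Recall rows, the sketch below shows one common way to compute these three scores from true and predicted class labels. It is a minimal illustration, not the authors' evaluation code: the table does not state how precision and recall were averaged across classes, so macro averaging (an unweighted mean over classes) is an assumption here, and the function name `macro_scores` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for two label lists.

    Macro averaging is an assumption; the source table does not say which
    averaging scheme was used.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = set(y_true) | set(y_pred)
    true_count = Counter(y_true)   # samples per true class
    pred_count = Counter(y_pred)   # samples per predicted class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class

    accuracy = sum(hits.values()) / len(y_true)
    # A class that is never predicted (or never occurs) contributes 0 to the mean.
    precision = sum(
        hits[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels
    ) / len(labels)
    recall = sum(
        hits[c] / true_count[c] if true_count[c] else 0.0 for c in labels
    ) / len(labels)
    return accuracy, precision, recall


# Toy example with two classes (hypothetical labels, not data from the table).
acc, prec, rec = macro_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Under this assumed scheme, a classifier that predicts mostly one class in a two-class problem gets a macro recall near 0.5 whatever its accuracy, which is one reason the three rows of a column can differ noticeably.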