Table 7 Performance validation of MP4 on pathogenic protein dataset

From: MP4: a machine learning based classification tool for prediction and functional annotation of pathogenic proteins from metagenomic and genomic datasets

| Strains | Class 1 | Class 2 | Class 3 | Total sequences | Pathogenicity index | References |
|---|---|---|---|---|---|---|
| Bacillus anthracis A2012 uid54101 | 42 | 96 | 159 | 297 | 0.859 | https://www.ncbi.nlm.nih.gov/bioproject?cmd=Retrieve&dopt=Overview&list_uids=299 |
| Prevotella melaninogenica ATCC 25845 uid51377 | 344 | 288 | 1661 | 2293 | 0.85 | http://hmp.jcvi.org/jumpstart/hmp013/index.shtml |
| Chlamydophila psittaci 6BC uid63621 | 154 | 139 | 682 | 975 | 0.842 | https://doi.org/10.1128/mBio.00604-12 |
| Chlamydophila pneumoniae TW 183 uid57997 | 178 | 160 | 775 | 1113 | 0.84 | PMID: 26420648 |
| Helicobacter pylori B8 uid49873 | 314 | 197 | 1196 | 1707 | 0.816 | PMID: 21896079 |
| Helicobacter pylori SouthAfrica20 uid216150 | 320 | 218 | 1164 | 1702 | 0.812 | PMID: 21081026 |
| Shigella dysenteriae 1617 uid229875 | 1224 | 2520 | 2665 | 6409 | 0.809 | |
| Providencia stuartii MRSN 2154 uid162193 | 900 | 920 | 2279 | 4099 | 0.78 | |
| Francisella tularensis holarctica F92 uid181998 | 407 | 600 | 835 | 1842 | 0.779 | PMC3569339 |
| Escherichia coli CFT073 uid57915 | 1196 | 1519 | 2649 | 5364 | 0.777 | PMID: 12471157 |
| Proteus mirabilis HI4320 uid61599 | 817 | 786 | 2059 | 3662 | 0.777 | PMID: 18375554 |
| Klebsiella pneumoniae 342 uid59145 | 1302 | 1649 | 2815 | 5766 | 0.774 | https://doi.org/10.1371/journal.pgen.1000141 |
| Capnocytophaga ochracea DSM 7271 uid59197 | 493 | 616 | 1062 | 2171 | 0.773 | PMID: 21304645 |
| Citrobacter koseri ATCC BAA 895 uid58143 | 1153 | 1397 | 2456 | 5006 | 0.77 | PMID: 12751719 |
| Escherichia coli clone D i14 uid162049 | 1138 | 1342 | 2438 | 4918 | 0.769 | https://doi.org/10.1371/journal.ppat.1006525 |
| Mycoplasma pneumoniae 309 uid85495 | 164 | 107 | 436 | 707 | 0.768 | PMID: 18754792 |
| Enterobacter aerogenes KCTC 2190 uid68103 | 1171 | 1330 | 2411 | 4912 | 0.762 | PMID: 22493190 |
| Shigella sonnei 53G uid84383 | 1303 | 1586 | 2521 | 5410 | 0.759 | https://www.ncbi.nlm.nih.gov/genome/?term=Shigella+sonnei+53G+uid84383 |
| Treponema pallidum DAL 1 uid87065 | 256 | 404 | 396 | 1056 | 0.758 | PMID: 23449808 |
| Enterobacter cloacae SCF1 uid59969 | 1067 | 1249 | 2083 | 4399 | 0.757 | PMC3236048 |
| Moraxella catarrhalis BBH18 uid48809 | 460 | 452 | 974 | 1886 | 0.756 | PMID: 20453089 |
| Capnocytophaga canimorsus Cc5 uid70727 | 590 | 607 | 1207 | 2404 | 0.755 | https://doi.org/10.1371/journal.ppat.1000164 |
| Shigella flexneri 2002017 uid159233 | 1160 | 1239 | 2304 | 4703 | 0.753 | PMID: 19955273 |
| Nocardia brasiliensis ATCC 700358 uid86913 | 2081 | 3731 | 2602 | 8414 | 0.753 | PMC3347167 |
| Salmonella typhimurium DT104 uid223287 | 1159 | 1280 | 2153 | 4592 | 0.748 | PMID: 9752592 |

Class 1: non-pathogenic proteins; Class 2: antibiotic resistance and toxic proteins; Class 3: secretory and capsular proteins.
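
The table does not define the pathogenicity index. However, the tabulated values are consistent with the fraction of a strain's sequences assigned to Class 2 or Class 3, that is, (Class 2 + Class 3) / Total sequences. The sketch below checks this reading against three rows. The function name `pathogenicity_index` is hypothetical and is not part of MP4, and the formula is inferred from the numbers rather than quoted from the paper.

```python
def pathogenicity_index(class1: int, class2: int, class3: int) -> float:
    """Fraction of sequences in Class 2 or Class 3.

    Assumption: this formula is inferred from the tabulated values,
    not taken from the MP4 paper or its source code.
    """
    total = class1 + class2 + class3
    if total == 0:
        raise ValueError("strain has no classified sequences")
    return (class2 + class3) / total


# (Class 1, Class 2, Class 3) counts copied from three table rows.
rows = {
    "Bacillus anthracis A2012": (42, 96, 159),
    "Helicobacter pylori B8": (314, 197, 1196),
    "Salmonella typhimurium DT104": (1159, 1280, 2153),
}

for strain, counts in rows.items():
    print(f"{strain}: {pathogenicity_index(*counts):.3f}")
```

Rounded to three decimals, these three rows give 0.859, 0.816 and 0.748, matching the tabulated indices.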