Table 1 Total number of pathogen proteins taken from UniProt and the number of proteins remaining after redundancy filtering at three thresholds (90%, 50%, and 30%).

From: A Support Vector Machine based method to distinguish proteobacterial proteins from eukaryotic plant proteins

| Positive dataset (Pathogen) | Total number of proteins (reviewed) | 90% redundancy | 50% redundancy | 30% redundancy |
|---|---|---|---|---|
| Agrobacterium tumefaciens (Rhizobium radiobacter) | 104 | 103 | 103 | 103 |
| Burkholderia phymatum | 333 | 333 | 333 | 333 |
| Pseudomonas aeruginosa (ATCC) | 1217 | 1216 | 1211 | 1211 |
| Xanthomonas oryzae pv. oryzae | 411 | 410 | 410 | 410 |
| Ralstonia solanacearum | 601 | 601 | 599 | 599 |
| Rhizobium etli (ATCC) | 424 | 421 | 421 | 421 |
| Rhizobium meliloti | 48 | 47 | 47 | 47 |
| Methylobacterium nodulans | 213 | 213 | 213 | 213 |
| Desulfobacterales autotrophicum (ATCC) | 157 | 157 | 157 | 157 |
| Total | 3508 | 3501 | 3494 | 3494 |
| Total after blastclust on cumulative data | - | 3408 | 3230 | 3203 |
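
The per-organism counts in Table 1 can be checked against the reported totals with a few lines of Python. The sketch below is only an arithmetic check on the numbers transcribed from the table; the variable names are illustrative and not taken from the paper. It also computes how many additional proteins were dropped when blastclust was run on the cumulative data rather than per organism.

```python
# Counts transcribed from Table 1, one tuple per organism:
# (reviewed total, 90% redundancy, 50% redundancy, 30% redundancy)
counts = {
    "Agrobacterium tumefaciens": (104, 103, 103, 103),
    "Burkholderia phymatum": (333, 333, 333, 333),
    "Pseudomonas aeruginosa": (1217, 1216, 1211, 1211),
    "Xanthomonas oryzae pv. oryzae": (411, 410, 410, 410),
    "Ralstonia solanacearum": (601, 601, 599, 599),
    "Rhizobium etli": (424, 421, 421, 421),
    "Rhizobium meliloti": (48, 47, 47, 47),
    "Methylobacterium nodulans": (213, 213, 213, 213),
    "Desulfobacterales autotrophicum": (157, 157, 157, 157),
}

# "Total" row and "Total after blastclust on cumulative data" row as reported.
reported_totals = (3508, 3501, 3494, 3494)
cumulative_totals = (3408, 3230, 3203)  # 90%, 50%, 30% (no value for the raw column)

# Sum each column across organisms and compare with the reported "Total" row.
column_sums = tuple(sum(col) for col in zip(*counts.values()))
assert column_sums == reported_totals

# Proteins removed only when clustering the cumulative data, per threshold.
extra_removed = [
    per_organism - cumulative
    for per_organism, cumulative in zip(column_sums[1:], cumulative_totals)
]
print(extra_removed)  # [93, 264, 291]
```

The column sums match the "Total" row. Clustering the cumulative data removes a further 93, 264, and 291 proteins at the 90%, 50%, and 30% thresholds. One plausible reading, not stated in the table itself, is that these are proteins similar to sequences from a different organism in the set.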