Table 7 Performance of mGOASVM with different inputs and different numbers of homologous proteins for (a) the virus dataset and (b) the plant dataset

From: mGOASVM: Multi-label protein subcellular localization based on gene ontology and support vector machines

(a) Performance on the viral protein dataset

| Input data type | #homo | N_d (GO) | Locative accuracy | Actual accuracy |
|---|---|---|---|---|
| AC | 0 | 331 | 244/252 = 96.8% | 191/207 = 92.3% |
| S | 1 | 310 | 244/252 = 96.8% | 184/207 = 88.9% |
| S | 2 | 455 | 235/252 = 93.3% | 178/207 = 86.0% |
| S | 4 | 664 | 221/252 = 87.7% | 160/207 = 77.3% |
| S | 8 | 1134 | 202/252 = 80.2% | 130/207 = 62.8% |
| S + AC | 1 | 334 | 242/252 = 96.0% | 188/207 = 90.8% |
| S + AC | 2 | 460 | 238/252 = 94.4% | 179/207 = 86.5% |
| S + AC | 4 | 664 | 230/252 = 91.3% | 169/207 = 81.6% |
| S + AC | 8 | 1134 | 216/252 = 85.7% | 145/207 = 70.1% |

(b) Performance on the plant protein dataset

| Input data type | #homo | N_d (GO) | Locative accuracy | Actual accuracy |
|---|---|---|---|---|
| AC | 0 | 1532 | 1023/1055 = 97.0% | 863/978 = 88.2% |
| S | 1 | 1541 | 1015/1055 = 96.2% | 855/978 = 87.4% |
| S | 2 | 1906 | 907/1055 = 85.8% | 617/978 = 63.1% |
| S + AC | 1 | 1541 | 1010/1055 = 95.7% | 859/978 = 87.8% |
| S + AC | 2 | 1906 | 949/1055 = 90.0% | 684/978 = 70.0% |

  1. S: sequence; AC: accession number; #homo: number of homologs used in the experiments; N_d (GO): number of distinct GO terms. #homo = 0 means that only the true accession number is used.
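
The two accuracy columns are reported as count ratios. This page does not define them, so the sketch below rests on an assumption, taken from how these measures are usually defined for multi-label localization: locative accuracy counts correctly predicted (protein, location) pairs over all true pairs, and actual accuracy counts proteins whose predicted label set matches the true set exactly. The function names and the toy data are hypothetical and are not taken from the mGOASVM implementation.

```python
from typing import List, Set, Tuple


def locative_accuracy(true_sets: List[Set[str]],
                      pred_sets: List[Set[str]]) -> Tuple[int, int]:
    """Return (hits, total) over (protein, location) pairs.

    Assumed definition: a hit is a true location that was also predicted.
    Extra (wrong) predicted locations are not penalized by this measure.
    """
    hits = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    total = sum(len(t) for t in true_sets)
    return hits, total


def actual_accuracy(true_sets: List[Set[str]],
                    pred_sets: List[Set[str]]) -> Tuple[int, int]:
    """Return (exact_matches, n_proteins).

    Assumed definition: a protein counts only if its predicted label set
    is identical to its true label set.
    """
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if t == p)
    return exact, len(true_sets)


def as_table_cell(numerator: int, denominator: int) -> str:
    """Format a ratio the way the table does, e.g. '3/4 = 75.0%'."""
    return f"{numerator}/{denominator} = {100 * numerator / denominator:.1f}%"


# Toy data (hypothetical): three proteins, four true locations in total.
true_labels = [{"nucleus"}, {"cytoplasm", "membrane"}, {"capsid"}]
pred_labels = [{"nucleus"}, {"cytoplasm"}, {"capsid", "nucleus"}]

loc = locative_accuracy(true_labels, pred_labels)
act = actual_accuracy(true_labels, pred_labels)
print("Locative accuracy:", as_table_cell(*loc))
print("Actual accuracy:  ", as_table_cell(*act))
```

Under these assumed definitions the denominators differ: locative accuracy divides by the number of true (protein, location) pairs and actual accuracy by the number of distinct proteins. That matches the pattern in the table, where the two columns use 252 and 207 for the virus dataset and 1055 and 978 for the plant dataset.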