Table 7 Performance of mGOASVM with different inputs and different numbers of homologous proteins for (a) the virus dataset and (b) the plant dataset

From: mGOASVM: Multi-label protein subcellular localization based on gene ontology and support vector machines

(a) Performance on the viral protein dataset

| Input data type | #homo | N_d(GO) | Locative accuracy | Actual accuracy |
|---|---|---|---|---|
| AC | 0 | 331 | 244/252 = 96.8% | 191/207 = 92.3% |
| S | 1 | 310 | 244/252 = 96.8% | 184/207 = 88.9% |
| S | 2 | 455 | 235/252 = 93.3% | 178/207 = 86.0% |
| S | 4 | 664 | 221/252 = 87.7% | 160/207 = 77.3% |
| S | 8 | 1134 | 202/252 = 80.2% | 130/207 = 62.8% |
| S + AC | 1 | 334 | 242/252 = 96.0% | 188/207 = 90.8% |
| S + AC | 2 | 460 | 238/252 = 94.4% | 179/207 = 86.5% |
| S + AC | 4 | 664 | 230/252 = 91.3% | 169/207 = 81.6% |
| S + AC | 8 | 1134 | 216/252 = 85.7% | 145/207 = 70.1% |

(b) Performance on the plant protein dataset

| Input data type | #homo | N_d(GO) | Locative accuracy | Actual accuracy |
|---|---|---|---|---|
| AC | 0 | 1532 | 1023/1055 = 97.0% | 863/978 = 88.2% |
| S | 1 | 1541 | 1015/1055 = 96.2% | 855/978 = 87.4% |
| S | 2 | 1906 | 907/1055 = 85.8% | 617/978 = 63.1% |
| S + AC | 1 | 1541 | 1010/1055 = 95.7% | 859/978 = 87.8% |
| S + AC | 2 | 1906 | 949/1055 = 90.0% | 684/978 = 70.0% |

S: Sequence; AC: Accession Number; #homo: Number of homologs used in the experiments; N_d(GO): Number of distinct GO terms. #homo = 0 means only the true accession number is used.
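
Each percentage in the table is the listed count divided by the listed total. As a minimal sketch, the sequence-only (S) rows of part (a) can be recomputed from those counts to make the decline with increasing #homo explicit. The counts are copied from the table; the names `pct` and `accuracy_by_homologs` are illustrative and not part of mGOASVM, and the sketch does not implement the accuracy definitions themselves, only the final division.

```python
# Counts copied from Table 7(a), sequence-only (S) rows.
# Keys are #homo; values are (count, total) pairs as printed in the table.
locative = {1: (244, 252), 2: (235, 252), 4: (221, 252), 8: (202, 252)}
actual = {1: (184, 207), 2: (178, 207), 4: (160, 207), 8: (130, 207)}


def pct(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


def accuracy_by_homologs(counts):
    """Map each #homo value to its percentage accuracy."""
    return {n_homo: pct(c, t) for n_homo, (c, t) in counts.items()}


locative_pct = accuracy_by_homologs(locative)
actual_pct = accuracy_by_homologs(actual)

for n_homo in sorted(locative_pct):
    print(f"#homo={n_homo}: locative={locative_pct[n_homo]}%, "
          f"actual={actual_pct[n_homo]}%")
```

For these rows, both measures fall as more homologs are added, and actual accuracy falls faster than locative accuracy.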