Table 1 Summary results of different methods on the four simulated datasets

From: A new method to measure the semantic similarity from query phenotypic abnormalities to diseases based on the human phenotype ontology

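Dataset 1 (Noise: -, Imprecision: -)

|        | Resnik | Lin  | JC   | Rel  | IC   | GraphIC | Wang | RBP  |
|--------|--------|------|------|------|------|---------|------|------|
| Top 1  | 1027   | 1016 | 1029 | 1018 | 1021 | 1029    | 1023 | 1031 |
| Top 5  | 1087   | 1071 | 1082 | 1071 | 1075 | 1079    | 1078 | 1091 |
| Top 10 | 1089   | 1077 | 1088 | 1077 | 1079 | 1081    | 1081 | 1095 |
| Top 20 | 1092   | 1078 | 1092 | 1078 | 1080 | 1083    | 1081 | 1096 |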

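Dataset 2 (Noise: +, Imprecision: -)

|        | Resnik | Lin  | JC   | Rel  | IC   | GraphIC | Wang | RBP  |
|--------|--------|------|------|------|------|---------|------|------|
| Top 1  | 992    | 997  | 1036 | 996  | 1006 | 1031    | 1001 | 1030 |
| Top 5  | 1074   | 1059 | 1081 | 1063 | 1070 | 1077    | 1071 | 1089 |
| Top 10 | 1081   | 1069 | 1086 | 1071 | 1077 | 1080    | 1078 | 1094 |
| Top 20 | 1087   | 1074 | 1089 | 1076 | 1078 | 1083    | 1079 | 1095 |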

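Dataset 3 (Noise: -, Imprecision: +)

|        | Resnik | Lin | JC  | Rel | IC  | GraphIC | Wang | RBP |
|--------|--------|-----|-----|-----|-----|---------|------|-----|
| Top 1  | 434    | 243 | 104 | 302 | 336 | 120     | 172  | 438 |
| Top 5  | 767    | 502 | 261 | 583 | 603 | 341     | 446  | 765 |
| Top 10 | 866    | 613 | 342 | 685 | 707 | 482     | 604  | 863 |
| Top 20 | 926    | 714 | 440 | 785 | 797 | 620     | 725  | 926 |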

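Dataset 4 (Noise: +, Imprecision: +)

|        | Resnik | Lin | JC  | Rel | IC  | GraphIC | Wang | RBP |
|--------|--------|-----|-----|-----|-----|---------|------|-----|
| Top 1  | 183    | 130 | 97  | 143 | 162 | 73      | 77   | 370 |
| Top 5  | 453    | 327 | 239 | 383 | 406 | 252     | 263  | 694 |
| Top 10 | 579    | 452 | 319 | 509 | 533 | 393     | 384  | 786 |
| Top 20 | 703    | 570 | 420 | 640 | 657 | 540     | 535  | 860 |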

  1. Abbreviations: Resnik, the Resnik measure; Lin, the Lin measure; JC, the Jiang-Conrath measure; Rel, the Relevance measure; IC, the information coefficient measure; GraphIC, the graph IC measure; Wang, the Wang measure; RBP, the RelativeBestPair method
  2. The seven existing measures are all implemented with a one-sided search algorithm. Each entry is the number of patients, out of 1100 cases, whose true disease is ranked within the top 1, top 5, top 10 or top 20
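
The entries in the table are top-k counts, and a minimal sketch of how such counts could be derived from per-patient ranks is given here. This is an illustration only, not the authors' code: the function name `top_k_counts`, the input format (one 1-based rank of the true disease per simulated patient) and the toy ranks are all assumptions made for the example.

```python
from typing import Dict, Iterable, Sequence


def top_k_counts(
    true_disease_ranks: Iterable[int],
    cutoffs: Sequence[int] = (1, 5, 10, 20),
) -> Dict[int, int]:
    """Count patients whose true disease is ranked within each cutoff.

    Hypothetical helper: `true_disease_ranks` holds, for every patient,
    the 1-based position of the true disease in the ranked disease list
    produced by a similarity measure.
    """
    ranks = list(true_disease_ranks)
    return {k: sum(1 for r in ranks if r <= k) for k in cutoffs}


# Toy ranks for six made-up patients (not data from the paper).
toy_ranks = [1, 3, 7, 12, 25, 1]
print(top_k_counts(toy_ranks))  # {1: 2, 5: 3, 10: 4, 20: 5}
```

Because the cutoffs are nested, the counts for one method can never decrease from Top 1 to Top 20, which matches the pattern in every column of the table.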