Table 4 Comparison of the fractions of reads with homologs at different evolutionary distances that are detected by different similarity search tools.

From: RAPSearch: a fast protein similarity search tool for short reads

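| Evolutionary distance | Apply E-value cutoff? (c) | BLAST | RAPSearch | BLAT |
| --- | --- | --- | --- | --- |
| family (a) | No | 0.95 | 0.88 | 0.77 |
| | E-value = 0.1 | 0.91 | 0.87 | 0.76 |
| | E-value = 0.001 | 0.86 | 0.84 | 0.71 |
| phylum (b) | No | 0.79 | 0.51 | 0.21 |
| | E-value = 0.1 | 0.68 | 0.48 | 0.16 |
| | E-value = 0.001 | 0.48 | 0.39 | 0.09 |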

a: Short reads of ~100 bp simulated from the gene sequences of Salmonella typhi were searched against the proteins of Escherichia coli K12 (Salmonella typhi and Escherichia coli belong to the same family, but to different genera).

b: Short reads of ~100 bp simulated from the gene sequences of Desulfococcus oleovorans Hxd3 were searched against the proteins of Escherichia coli K12 (Desulfococcus oleovorans and Escherichia coli belong to the same phylum, but to different subphyla).

For both a and b, only genes of at least 90 bp (encoding 30 aa) were included in the statistics, and two genes with 40% or higher amino acid identity spanning at least 50% of the length of one gene were considered homologs.

c: "No" indicates that no E-value cutoff was applied to filter the similarity hits for the short reads; "E-value = 0.1" indicates that only similarity hits with E-value ≤ 0.1 were included in the statistics, and "E-value = 0.001" likewise for hits with E-value ≤ 0.001.
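The homolog criterion and the E-value filtering described in the footnotes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names, the input layout (a dictionary mapping each read to the E-value of its best hit), and the reading of "50% of the length of one gene" as coverage of the shorter gene are all hypothetical choices made for the example.

```python
# Thresholds taken from the table footnotes.
MIN_GENE_LEN_BP = 90   # genes shorter than 90 bp (30 aa) are excluded
MIN_IDENTITY = 0.40    # at least 40% amino acid identity
MIN_COVERAGE = 0.50    # alignment spans at least 50% of one gene


def is_homolog_pair(identity, aligned_aa, len_a_aa, len_b_aa):
    """Return True if two genes satisfy the homolog criterion.

    Assumption: "50% of the length of one gene" is read as covering
    at least half of either gene, which is equivalent to covering
    at least half of the shorter one.
    """
    shorter = min(len_a_aa, len_b_aa)
    if shorter * 3 < MIN_GENE_LEN_BP:
        return False  # one of the genes is below the 90 bp minimum
    coverage = aligned_aa / shorter
    return identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE


def fraction_detected(best_evalue_by_read, reads_with_homologs, cutoff=None):
    """Fraction of reads with homologs that have a hit passing the cutoff.

    best_evalue_by_read: hypothetical dict of read id -> lowest E-value hit.
    cutoff=None mirrors the "No" rows; otherwise hits need E-value <= cutoff.
    Assumption: any hit passing the cutoff counts as a detection.
    """
    if not reads_with_homologs:
        return 0.0
    detected = 0
    for read_id in reads_with_homologs:
        evalue = best_evalue_by_read.get(read_id)
        if evalue is None:
            continue  # the tool reported no hit for this read
        if cutoff is None or evalue <= cutoff:
            detected += 1
    return detected / len(reads_with_homologs)
```

With made-up inputs such as `{"r1": 1e-5, "r2": 0.05, "r3": 0.5}` for four reads that have homologs, the function would be called once per row of the table (`cutoff=None`, `cutoff=0.1`, `cutoff=0.001`). The numbers in Table 4 come from the paper's own experiments and are not reproduced by this sketch.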