Table 1 Analysis of random and low-repeat \(\ell\)-mers

From: Meta-aligner: long-read alignment based on genome statistics

(a) Number of disjoint random \(\ell\)-mers within a read, for L=400 and L=1000

| \((d,\ell)\) | L=400: 0 | L=400: 1 | L=400: between 1 and ≈\(L/\ell\) | L=400: \(L/\ell\) | L=1000: 0 | L=1000: 1 | L=1000: between 1 and ≈\(L/\ell\) | L=1000: \(L/\ell\) |
|---|---|---|---|---|---|---|---|---|
| (0,20) | 0.9% | 0.7% | 69.1% | 29% | 0.6% | 0.19% | 85.6% | 13.5% |
| (0,40) | 0.7% | 0.4% | 39.4% | 59.1% | 0.54% | 0.17% | 62.6% | 36.68% |
| (3,40) | 6.3% | 8% | 84.8% | 0.8% | 2.26% | 1.49% | 96.2% | 0.04% |
| (0,80) | 0.7% | 0.3% | 4.2% | 94.8% | 0.49% | 0.17% | 8.2% | 91.1% |

(b) Number of low-repeat \(\ell\)-mers within a read, for d=0 and d=3

| \(\mathcal{L}_{s,1}\) | d=0: 0 | d=0: between 1 and 80 | d=0: between 81 and ≈\(L/\ell\) | d=3: 0 | d=3: between 1 and 80 | d=3: between 81 and ≈\(L/\ell\) |
|---|---|---|---|---|---|---|
| 5 | 56.73% | 11.94% | 30.98% | 32.01% | 30.97% | 36.88% |
| 10 | 54.46% | 4.9% | 4.9% | 26.21% | 19.81% | 53.83% |
| 20 | 52.75% | 0.08% | 46.81% | 21.90% | 4.06% | 73.89% |
| 40 | 52.75% | 0.08% | 46.81% | 21.69% | 0.07% | 78.09% |

(c) Number of low-repeat \(\ell\)-mers within a read, for d=0 and d=3

| \(\mathcal{L}_{s,1}\) | d=0: 0 | d=0: between 1 and 80 | d=0: between 81 and ≈\(L/\ell\) | d=3: 0 | d=3: between 1 and 80 | d=3: between 81 and ≈\(L/\ell\) |
|---|---|---|---|---|---|---|
| 5 | 52.31% | 12.28% | 35.08% | 17.18% | 20.15% | 62.59% |
| 10 | 50.22% | 4.91% | 44.54% | 14.07% | 11.67% | 74.18% |
| 20 | 48.64% | 0.1% | 50.93% | 11.75% | 2.24% | 85.93% |
| 40 | 48.64% | 0.1% | 50.93% | 11.64% | 0.04% | 88.23% |
(a) Percentage of disjoint random \(\ell\)-mers within reads of lengths L=400 and L=1000 from chromosome 19 of hg19. (b) and (c) Fraction of the reads remaining after the first step, broken down by their number of low-repeat \(\ell\)-mers, for list sizes \(\mathcal{L}_{s,1} \in \{5,10,20,40\}\) and \(\ell=40\). In (b), all \(\ell\)-mers are used at the first step; in (c), only non-overlapping \(\ell\)-mers are used.
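
The kind of count reported in panel (a) can be sketched as follows. This is a minimal illustration, not the authors' procedure. It assumes that a "random" \(\ell\)-mer is one that occurs exactly once in the reference, and it handles only d=0 (exact matches); the excerpt does not define the term. The function names `ell_mer_counts`, `disjoint_unique_ell_mers` and `bucket`, and the toy reference string, are hypothetical.

```python
from collections import Counter


def ell_mer_counts(reference: str, ell: int) -> Counter:
    """Count every overlapping ell-mer of the reference string."""
    return Counter(reference[i:i + ell] for i in range(len(reference) - ell + 1))


def disjoint_unique_ell_mers(read: str, counts: Counter, ell: int) -> int:
    """Count disjoint ell-mers of the read that occur exactly once in the reference.

    Assumption: "occurs exactly once, d=0" stands in for the paper's notion
    of a random ell-mer, which this excerpt does not define.
    """
    return sum(
        1
        for start in range(0, len(read) - ell + 1, ell)  # non-overlapping windows
        if counts[read[start:start + ell]] == 1
    )


def bucket(n: int, max_disjoint: int) -> str:
    """Map a count to one of the four column groups of panel (a)."""
    if n == 0:
        return "0"
    if n == 1:
        return "1"
    if n == max_disjoint:
        return "all"  # corresponds to the L/ell column
    return "between"  # strictly between 1 and L/ell


# Toy data only; real use would load chromosome 19 of hg19 and sampled reads.
reference = "AAAACCCCGGGGAAAA"
ell = 4
counts = ell_mer_counts(reference, ell)
read = "AAAACCCCGGGG"
n = disjoint_unique_ell_mers(read, counts, ell)
print(n, bucket(n, len(read) // ell))
```

Tallying `bucket` over many sampled reads and normalising by the number of reads would give percentages in the shape of one row of panel (a). Allowing d>0 mismatches would need an approximate-matching index rather than the exact `Counter` lookup used here.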