
Table 1 Analysis of random and low-repeat ℓ-mers

From: Meta-aligner: long-read alignment based on genome statistics

 

(a) Number of disjoint random ℓ-mers within a read, for L=400 and L=1000

| (d,ℓ) | L=400: 0 | L=400: 1 | L=400: between 1 and ≈L/ℓ | L=400: ≈L/ℓ | L=1000: 0 | L=1000: 1 | L=1000: between 1 and ≈L/ℓ | L=1000: ≈L/ℓ |
|---|---|---|---|---|---|---|---|---|
| (0,20) | 0.9% | 0.7% | 69.1% | 29% | 0.6% | 0.19% | 85.6% | 13.5% |
| (0,40) | 0.7% | 0.4% | 39.4% | 59.1% | 0.54% | 0.17% | 62.6% | 36.68% |
| (3,40) | 6.3% | 8% | 84.8% | 0.8% | 2.26% | 1.49% | 96.2% | 0.04% |
| (0,80) | 0.7% | 0.3% | 4.2% | 94.8% | 0.49% | 0.17% | 8.2% | 91.1% |

 

(b) Number of low-repeat ℓ-mers within a read, for d=0 and d=3

| \(\mathcal{L}_{s,1}\) | d=0: 0 | d=0: between 1 and 80 | d=0: between 81 and ≈L/ℓ | d=3: 0 | d=3: between 1 and 80 | d=3: between 81 and ≈L/ℓ |
|---|---|---|---|---|---|---|
| 5 | 56.73% | 11.94% | 30.98% | 32.01% | 30.97% | 36.88% |
| 10 | 54.46% | 4.9% | 4.9% | 26.21% | 19.81% | 53.83% |
| 20 | 52.75% | 0.08% | 46.81% | 21.90% | 4.06% | 73.89% |
| 40 | 52.75% | 0.08% | 46.81% | 21.69% | 0.07% | 78.09% |

 
 

(c) Number of low-repeat ℓ-mers within a read, for d=0 and d=3

| \(\mathcal{L}_{s,1}\) | d=0: 0 | d=0: between 1 and 80 | d=0: between 81 and ≈L/ℓ | d=3: 0 | d=3: between 1 and 80 | d=3: between 81 and ≈L/ℓ |
|---|---|---|---|---|---|---|
| 5 | 52.31% | 12.28% | 35.08% | 17.18% | 20.15% | 62.59% |
| 10 | 50.22% | 4.91% | 44.54% | 14.07% | 11.67% | 74.18% |
| 20 | 48.64% | 0.1% | 50.93% | 11.75% | 2.24% | 85.93% |
| 40 | 48.64% | 0.1% | 50.93% | 11.64% | 0.04% | 88.23% |

 
  1. (a) Percentage of reads of lengths L=400 and L=1000, drawn from ch19 of hg19, containing the given number of disjoint random ℓ-mers. (b) and (c) Fraction of the reads remaining after the first step, grouped by their number of low-repeat ℓ-mers, for list sizes \(\mathcal{L}_{s,1}\in\{5,10,20,40\}\) and ℓ=40. In (b) we assume that all ℓ-mers are used at the first step; in (c), that only non-overlapping ℓ-mers are used.
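
As a rough illustration of the column labelled ≈L/ℓ in panel (a): a read of length L can hold at most ⌊L/ℓ⌋ disjoint (non-overlapping) ℓ-mers, so that column covers reads in which nearly every disjoint window is a random ℓ-mer. The sketch below only computes this upper bound for the (L, ℓ) pairs of panel (a). The function name `max_disjoint_lmers` is hypothetical and not taken from the paper, and the code does not reproduce how the percentages themselves were obtained.

```python
def max_disjoint_lmers(read_length: int, l: int) -> int:
    """Upper bound on the number of non-overlapping l-mers in a read.

    Hypothetical helper for illustration only: it counts how many
    windows of length l fit side by side in read_length bases.
    """
    if l <= 0:
        raise ValueError("l must be positive")
    return read_length // l


# (L, l) combinations that appear in panel (a) of the table.
for read_length in (400, 1000):
    for l in (20, 40, 80):
        bound = max_disjoint_lmers(read_length, l)
        print(f"L={read_length}, l={l}: at most {bound} disjoint l-mers")
```

For L=400 the bound is 20, 10, and 5 for ℓ=20, 40, and 80; for L=1000 it is 50, 25, and 12.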