Meta-aligner: long-read alignment based on genome statistics

Background: Sequencing technologies are moving towards longer and noisier reads. Accurate alignment of these reads plays an important role in any downstream analysis, and the time consumed by the aligner contributes to the overall cost of sequencing. The tradeoff between accuracy and speed is the main challenge in designing long-read aligners. Results: We propose Meta-aligner, which aligns long and very long reads to the reference genome efficiently and accurately. Meta-aligner incorporates available short/long aligners as subcomponents and uses statistics from the reference genome to increase performance. It estimates these statistics from the reads and the reference genome automatically. Meta-aligner is implemented in C++ and runs on popular POSIX-like operating systems such as Linux. Conclusions: Meta-aligner achieves high recall rates and precision, especially for long reads and high error rates. It also improves alignment performance for PacBio long reads in comparison with traditional schemes. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-017-1518-y) contains supplementary material, which is available to authorized users.


Analysis: Meta-aligner Alignment Stage
In this analysis, variations between the target and reference genomes and sequencing errors at individual base locations are assumed to follow i.i.d. models with rates ν and ε, respectively. Here, we only consider mismatch errors, as the indel rate is assumed to be low enough that the chance of observing an indel within a fragment of length ℓ_1 is sufficiently low. The probability P_a(r, i) of anchoring a given read r with i disjoint random ℓ_1-mers at the first stage of the Meta-aligner algorithm is given by equation (1), where P_t(ℓ_1; b_j) represents the anchoring probability of an ℓ_1-mer to its true position given that the nearest repeat position is b_j bases away from the starting position of r (see Supplementary Figure 1). Note that, since the Meta-aligner algorithm needs at least two disjoint random ℓ_1-mers to anchor a given read, P_a(r, 0) = P_a(r, 1) = 0.
In order to determine P_t(ℓ_1; b_j), we let A_t(r) and F(r) denote the events of correct and incorrect alignment, respectively. P{A_t(r)} is the probability that a fragment of length ℓ_1 is aligned to its true position with a maximum edit distance of d_max. Given the i.i.d. model for random intervals, in which each base is generated uniformly and independently, a fragment r of length ℓ_1 within a random interval is aligned to an incorrect position on the genome with maximum edit distance d_max with probability P{F(r)}, which is bounded using the union bound in terms of P_w(ℓ_1), the probability of incorrect alignment of a fragment of length ℓ_1. Therefore, if P_w(ℓ_1) scales as O(1/G), P{F(r)} tends to zero and the fragment r is aligned uniquely to its true position with probability P{A_t(r)}. It can be easily verified that if ℓ_1 = 1.5 log G and d_max ≤ 3, P_w(ℓ_1) scales as O(1/G), and therefore all anchored reads are aligned correctly.
The next key question is what value of ℓ is proper for the human genome, given its highly repetitive structure. To answer this question, Supplementary Figure 2 evaluates the fraction of random ℓ-mers for chr19 of hg19 in the case of d = 0, as ℓ changes from 20 to 200. The key result of this figure is that a fragment length of ℓ = 80 is almost sufficient to ensure that structures which appear repetitive at small fragment lengths fully appear as random intervals. Also, as shown in Supplementary Figure 3, even for short reads of length L = 100, close to 98% of reads contain random ℓ-mers if ℓ is chosen equal to 40. Since Meta-aligner uses two ℓ-mers for anchoring, these simulations confirm that ℓ = 40 (≈ 1.5 log G with G = 3 × 10^9) is a proper choice for the human genome.
In order to take into account the effect of repeat intervals, the anchoring probabilities of fragments close to these intervals (at a distance of at most ℓ_1 bases from the nearest repeat interval) are multiplied by the correction term 1 − P_w(b_j). It is worth mentioning that the correction term becomes almost one for values of b_j > 10.
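The notion of a random ℓ-mer for d = 0 can be illustrated with a small sketch. It assumes that an ℓ-mer counts as random when it occurs exactly once in the sequence under exact matching, and it ignores the reverse strand; the function name and the toy sequence are hypothetical and are not taken from the Meta-aligner implementation.

```python
from collections import Counter


def unique_kmer_fraction(seq: str, ell: int) -> float:
    """Fraction of ell-mers in seq that occur exactly once (exact match, d = 0).

    Assumption: only the forward strand is considered.
    """
    n = len(seq) - ell + 1
    if n <= 0:
        return 0.0
    # Count every ell-mer, then count start positions whose ell-mer is unique.
    counts = Counter(seq[i:i + ell] for i in range(n))
    unique = sum(1 for i in range(n) if counts[seq[i:i + ell]] == 1)
    return unique / n


# Toy sequence (hypothetical, for illustration only).
toy = "ACGTACGTTTGACCA"
for ell in (2, 4, 6):
    print(ell, round(unique_kmer_fraction(toy, ell), 3))
```

On a real chromosome the same count would normally be obtained from an index rather than a Python dictionary; the sketch only shows why the fraction grows as ℓ increases.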
We generated synthetic reads from chr19 of hg19 with a mismatch sequencing error rate of 10%. Then, using the first stage of Meta-aligner, we aligned each read to chr19 with ℓ_1 = 40 and d_max = 3. Based on the proposed approach, we also determined the random and repeat intervals of chr19 in hg19 with (ℓ_1, d) = (40, 0), and then evaluated (1) for this simulation. Supplementary Figure 4 compares our analysis in (1) with the simulation results. The figure also presents the results of our analysis and simulation when a sliding window of length S_1 = ℓ_1/2 is used.
The proposed mosaic model of the genome for given (ℓ, d) is shown in Supplementary Figure 5. Based on this model, reads can be classified into three classes, denoted by C_1, C_2, and C_3. The classification is based on two factors: the fraction of random and repeat ℓ-mers in a read, and the number of copies of an ℓ-mer. If most of the ℓ-mers within a read belong to random intervals, the read is classified as C_1. If most of the ℓ-mers of a read belong to repeat intervals with low (resp. high) copy number, it is classified as C_2 (resp. C_3). Supplementary Figure 6 shows the types of reads belonging to these classes. Most reads taken from the human genome belong to C_1, and a very small fraction belong to C_3. Consequently, a significant improvement in speed and accuracy is achieved if an alignment algorithm aligns reads according to their classes. Meta-aligner is designed on this concept: its alignment stage handles the reads in C_1 very rapidly and accurately, and its assignment stage handles most of the reads in C_2. For the third class, Meta-aligner applies a second round of the assignment stage with Bowtie2 to handle both high copy number and bad-quality reads with a controlled level of additional complexity. In this way, Meta-aligner efficiently adapts itself to the reference genome structure and the quality of the reads.
Supplementary Figure 5: The mosaic model. The proposed mosaic model for the genome structure consists of two kinds of intervals, random and repeat intervals (the figure shows reads r1, r2 and r3 of length L placed over random and repeat regions). Based on the definition of random and repeat intervals, strings that start inside random intervals (reads r1 and r2) are uniquely mapped to the genome.

Supplementary Figure 6: Classification of reads. Reads are classified into three classes (the figure shows a read r from each of C_1, C_2 and C_3 over random regions, repeat regions with low copy number, and repeat regions with high copy number). Reads whose majority of ℓ-mers reside in random intervals belong to C_1, and reads with a majority of ℓ-mers in low copy repeat intervals belong to C_2. Reads whose majority of bases are located within high copy number repeat intervals belong to C_3.
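The classification rule can be sketched as follows. This is only an illustration under stated assumptions: "most" is taken to mean a strict majority, and the boundary between low and high copy number (`high_copy_threshold`) is a hypothetical value, since the text does not give one. The input is assumed to be the genome copy number of each ℓ-mer of the read, with every count at least 1.

```python
def classify_read(copy_counts, high_copy_threshold=10):
    """Classify a read as 'C1', 'C2' or 'C3' from the copy number of its l-mers.

    copy_counts[i] is the number of genome positions matching the i-th l-mer
    (assumed >= 1); a count of 1 means the l-mer lies in a random interval.
    high_copy_threshold is a hypothetical cut-off, not a value from the paper.
    """
    if not copy_counts:
        raise ValueError("read has no l-mers")
    n = len(copy_counts)
    n_random = sum(1 for c in copy_counts if c == 1)
    if n_random > n / 2:
        return "C1"  # majority of l-mers are unique in the genome
    repeats = [c for c in copy_counts if c > 1]
    n_high = sum(1 for c in repeats if c >= high_copy_threshold)
    # Among the repeat l-mers, decide between low and high copy number.
    return "C3" if n_high > len(repeats) / 2 else "C2"


print(classify_read([1, 1, 1, 5]))
print(classify_read([1, 3, 2, 4]))
print(classify_read([1, 50, 80, 3]))
```

In Meta-aligner itself the classes are not computed up front; reads effectively sort themselves by the stage at which they are successfully handled.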

Details: Meta-aligner Overall Diagram and Algorithm
The details of Meta-aligner are presented through the following algorithms. Supplementary Algorithms 1 and 2 describe the alignment and assignment procedures, respectively. The path finder subroutine is presented in Supplementary Algorithm 3. The overall block diagram of Meta-aligner and the block diagrams of the alignment and assignment stages are shown in Supplementary Figures 7-9, respectively. The Parameter Estimation (PE) algorithm of Meta-aligner, which is used for estimating ℓ, d and the slide number, is presented in the subsequent section.
Supplementary Algorithm 1 Meta-aligner: Alignment stage
Output: The set of aligned DNA reads, denoted by A_1.
Initiate:
1: for i = 1 to N do
2: for k = 1 to K_max do
3: Align ℓ_1 bases starting from the {(k − 1)ℓ_1 + hS_1 + 1}-th base of all reads in R with maximum distance d_max using SRA. Set P_{r_i,k} = P_i, the anchored position on the genome.
7: if r_i is aligned to the forward strand then
8: Set F_{r_i,k} = 1.
if F_{r_i,k} = 1 then
15: Add r_i with anchored position P_{r_i,k} − (k − 1)ℓ_1 − hS_1 to A_1.
Add r_i with anchored position P_{r_i,k} + kℓ_1 + hS_1 − L_i to A_1.
Locally align all reads in A_1 to their anchored positions.
27: Update K_max = max_{r_i ∈ R} K_{r_i}.
28: end for

Supplementary Algorithm 2 Meta-aligner: Assignment stage
Input: The set of N DNA reads, denoted by R. A reference genome of length G bases.
Input Parameters: ℓ_1, ℓ_2 (with ℓ_2 > ℓ_1), d_max, G_1, G_2, L_{s,1}, L_{s,2}, S_th, S_1, S_2 and #step ∈ {1, 2, 3}.
Output: The set of aligned DNA reads, denoted by A_2.

3: for i = 1 to |R| do
4: Add all K_i fragments of length ℓ_c of the i-th read in R to A.
Align all fragments in A with distance d_max and maximum list size L_{s,1} with Bowtie.
for i = 1 to |R| do
12: for j = 1 to K_i do
13: Put the list of alignment positions of the j-th fragment in T_{i,j}(., p).
14: Put the list of alignment scores of the j-th fragment in T_{i,j}(., s).
15: Put the list of alignment flags of the j-th fragment in T_{i,j}(., f).
Run Path Finder (r_i, {T_{i,1}, ..., T_{i,K_i}}; L_i).
18: Find the mean score (S_m) and maximum score (S_max) of the local alignment scores.
Filter all paths in L_i whose local alignment score is lower than 2S_m − S_max.
22: if |L_i| ≠ 0 then
23: Add all paths in L_i to A_2.
24: Remove r_i from R.
28: Update K_max = max_{i ∈ {1, ..., |R|}} K_i.
29: end for

Supplementary Algorithm 3 Path finder
end for
12: end for
13: Set S_max = max_n T_{i,K_i}(n, s).
14: Sort T_{i,K_i} in decreasing order of score.
15: Filter all T_{i,K_i}'s with score lower than S_max/2.
16: Filter all paths in the T_{i,K_i}'s with at most one index.
17: for n = 1 to |T_{i,K_i}| do

Supplementary Figure 8: Alignment procedure. All reads (r_i for i ∈ {1, ..., N}) are divided into small fragments of length ℓ_1 (top block), and the positions of uniquely mapped fragments are stored in an array for each read (bottom block). At each stage, one set of fragments is loaded into the aligner and the positions of the uniquely mapped fragments are stored in the array. The decision is made based on two confirming locations in the array. A read whose location is confirmed in this way is then anchored and removed from the set of reads.
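The anchoring decision based on two confirming locations can be sketched as below. The fragment start offset and the implied read start follow the expressions in the alignment stage for the forward strand; the confirmation test (implied read starts differing by at most a tolerance playing the role of G_1) is an assumption about how two locations confirm each other, and all function names are hypothetical.

```python
def fragment_start(k: int, ell: int, h: int = 0, slide: int = 0) -> int:
    """1-based start of the k-th fragment in a read: (k - 1) * ell + h * slide + 1."""
    return (k - 1) * ell + h * slide + 1


def implied_read_start(pos: int, k: int, ell: int, h: int = 0, slide: int = 0) -> int:
    """Read start on the genome implied by fragment k anchored at pos (forward strand)."""
    return pos - (k - 1) * ell - h * slide


def anchor_read(hits, ell, max_gap):
    """Return a read start confirmed by two fragments, or None.

    hits: list of (k, pos), one entry per uniquely mapped forward-strand fragment.
    Assumption: two hits confirm each other when their implied read starts
    differ by at most max_gap.
    """
    starts = sorted(implied_read_start(pos, k, ell) for k, pos in hits)
    # After sorting, any confirming pair shows up as a close adjacent pair.
    for s1, s2 in zip(starts, starts[1:]):
        if s2 - s1 <= max_gap:
            return s1
    return None


ell = 40
gap = int(0.1 * ell)  # mirrors the documented default G_1 = 0.1 * l_1
print(anchor_read([(1, 1000), (3, 1082), (5, 5000)], ell, gap))
print(anchor_read([(1, 1000), (2, 9000)], ell, gap))
```

The first call has two fragments that agree on the read start to within the tolerance, so the read is anchored; in the second call the two unique hits disagree and the read is left for the assignment stage.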

Supplementary Figure 9: Assignment procedure. Reads that remain unaligned after the alignment procedure are divided into fragments of length ℓ (ℓ is ℓ_1 and ℓ_2 at the first and second steps of the assignment, respectively). All fragments are sent to the aligner (Bowtie or Bowtie2, depending on the step of the assignment), which reports a table of positions for each fragment. All positions which confirm each other are connected and the corresponding paths are extracted. Paths may have different lengths and scores. After the paths are filtered (path selection and report), the output of the assignment stage is the list of best selected paths for a given read.
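The score-based path filter of the assignment stage, which discards paths whose local alignment score is below 2S_m − S_max, can be sketched as follows. The data layout (a list of path identifier and score pairs) is a hypothetical choice made for the illustration.

```python
def filter_paths(paths):
    """Keep paths whose local alignment score is at least 2 * mean - max.

    paths: list of (path_id, score) pairs for one read (hypothetical layout).
    """
    if not paths:
        return []
    scores = [score for _, score in paths]
    s_mean = sum(scores) / len(scores)
    s_max = max(scores)
    threshold = 2 * s_mean - s_max  # the maximum score mirrored about the mean
    return [(pid, score) for pid, score in paths if score >= threshold]


print(filter_paths([("a", 100), ("b", 90), ("c", 20)]))
```

Because the threshold is the maximum score reflected about the mean, a single path or a set of equally scored paths always survives, while a low-scoring outlier among good paths is removed.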

Description: Parameter Estimation Algorithm of Meta-aligner
Adjusting the parameters of Meta-aligner, as with other aligners, affects the efficiency and quality of its performance. To select the parameters automatically, we propose a Parameter Estimation (PE) algorithm. This feature is of significant importance in practice, especially when no information about the input read set is available. The parameters estimated by the PE algorithm are ℓ, d, the slide value, and the mismatch/indel sequencing error rates used for local alignment adjustment. The PE algorithm is presented in Supplementary Algorithm 4.

Supplementary Algorithm 4 Parameter Estimation Algorithm
Input Data: The set of N DNA reads of lengths {L_1, ..., L_N}, denoted by R. A reference genome of length G bases.
Output: Estimated parameters: ε̂_M, ε̂_G, ℓ_1, d.
Initiate: Add the i-th read (r_i) from R to A.
1: Run the first stage of Meta-aligner for the read set A with Bowtie and ℓ = 25, d = 2, and without any slide.
2: Estimate the mismatch and indel rates within reads, ε̂_M and ε̂_G, from the local alignment reports.
Set d = ⌊ℓ ε̂_M + 0.5⌋.
6: Run the first stage of Meta-aligner (without local alignment) with ℓ and d.
7: Save the recall rate and anchoring time as R(ℓ) and T(ℓ).
Run the first stage of Meta-aligner with ℓ_op, d_op, and slide value ⌊ℓ_op/(i + 1)⌋.
37: Save the recall and time as R_i and T_i.

Details of PE algorithm
We use the output of the alignment stage of Meta-aligner to estimate ℓ, d, and the slide value (if necessary). In the first step, we obtain an estimate of the mismatch and indel rates using the initial values ℓ = 25 and d = 2. We select the first N_t reads of the input read set such that the aggregate of their lengths is greater than 1,000,000 bases, and run the first stage of Meta-aligner on the selected reads with Bowtie. For the anchored reads, the local alignment reports the numbers of mismatches and indels, denoted by n_M and n_G, respectively, from which we compute the estimates ε̂_M and ε̂_G. In our implementation, ℓ is selected from the set {15, 20, 25, 30, 35, 40, 45, 50}. Given ε̂_M, for any given value of ℓ we choose d = ⌊ℓ ε̂_M + 0.5⌋ for the alignment stage. For high indel scenarios, when recall rate is significant, if ε̂_G ≥ 2 we change d to d + 1 (subject to d ≤ 3). The best pair (ℓ, d) is selected based on its recall rate and alignment speed. Heuristically, among all pairs whose recall rate is within 0.02 of the best recall rate, the one with the smallest anchoring execution time is selected.
To estimate the slide value, we first compute the average length of the selected reads. If this average is greater than 2,000 bps, no slide is used. Otherwise, we run Meta-aligner with different slide numbers starting from 1 (if the slide number is i, the length of the slide window is ⌊ℓ/(i + 1)⌋). At the first instance where the improvement in recall rate divided by the running time is less than a constant set at α = 0.3, we stop and report the slide value.
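The heuristic for choosing the pair (ℓ, d) can be sketched as follows: among all candidates whose recall is within 0.02 of the best recall, take the one with the smallest anchoring time. The record layout and the numbers in the usage example are hypothetical and serve only to illustrate the rule.

```python
def select_pair(results, recall_tolerance=0.02):
    """Pick the fastest (ell, d) pair whose recall is within tolerance of the best.

    results: list of dicts with keys 'ell', 'd', 'recall', 'time'
    (a hypothetical layout for the per-pair measurements).
    """
    if not results:
        raise ValueError("no candidate pairs")
    best_recall = max(r["recall"] for r in results)
    close = [r for r in results if best_recall - r["recall"] <= recall_tolerance]
    winner = min(close, key=lambda r: r["time"])
    return winner["ell"], winner["d"]


# Made-up measurements, for illustration only.
candidates = [
    {"ell": 25, "d": 1, "recall": 0.95, "time": 120.0},
    {"ell": 40, "d": 2, "recall": 0.96, "time": 300.0},
    {"ell": 15, "d": 0, "recall": 0.80, "time": 60.0},
]
print(select_pair(candidates))
```

The rule trades a small loss in recall for speed: the fastest candidate overall is rejected here because its recall is far from the best, while the second-best recall wins because it is much cheaper to obtain.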

Results: Meta-aligner Incorporation of Short-Read Aligners
In the current implementation of Meta-aligner, three short-read aligners, namely Bowtie, SOAP2 and mrsFast, are used in the first stage of the algorithm. To compare these aligners, we generated N = 1,000,000 simulated reads with ε = {2; 5; 10; 15}% and lengths L = {300; 1,000} bps from chr19 of hg19. Results are presented in Supplementary Table 2. For all simulations, the following quantities are reported: execution time T_1 (sec), recall rate R_1, and the number of reads mapped incorrectly. Since the number of incorrect mappings is very low in most cases, we report it instead of precision. Once embedded in Meta-aligner, mrsFast yields the lowest number of unaligned reads, except at the high error rate ε = 10%. This is because mrsFast was originally designed to align as many reads as possible. However, its performance deteriorates at higher sequencing error rates: for instance, with L = 1,000 bps and ε = 10%, mrsFast produces ≈ 10,000 errors, while Bowtie and SOAP2 produce < 100 errors. Hence, we suggest using mrsFast when the goal is a high recall rate and the sequencing error rate is low.
SOAP2 performs almost as well as Bowtie at low sequencing error rates, with slightly lower precision. As the sequencing error rate increases, however, Bowtie is preferable in the first stage of our algorithm, maintaining a high recall rate, an acceptable running time, and very high precision. In addition, since Bowtie is also used in the second stage of Meta-aligner, choosing it for the alignment stage is preferred where possible. Taking these observations into account, we have chosen Bowtie as the default short-read aligner for the first stage of our algorithm.
Supplementary Table 2 also presents results for high indel rates. In this scenario, L = 1,000 bps and ε = {10; 15}%, with a 10% indel rate and a {0 or 5}% mismatch rate. The results show very high recall and very low error in these scenarios, and both Bowtie and SOAP2 may be adopted in such setups at the first stage of Meta-aligner. As the simulation results demonstrate, most input reads can be handled at the first stage of Meta-aligner. Therefore, applications that need only robust and unique alignment and a high recall rate can use the first stage alone. For high error rate scenarios, we specify the exact error rates and use a normalized cutting distance (defined as the number of columns and rows used in the local alignment table relative to each read length) of 0.8 in the local alignment process.
In this part, we present Meta-aligner simulation results for the human genome (hg19) using Bowtie at the first stage of the algorithm. As in the previous section, N = 1,000,000 simulated reads are generated from the reference genome with ε = {2; 5; 10; 15; 20}% and lengths L = {300; 500; 1,000} bps. If not specified explicitly, the sequencing errors are 90% mismatches and 10% indels. Supplementary Tables 4-5 report the following quantities for the first stage of Meta-aligner only: execution time T_1 (sec), recall rate R_1, and the number of reads mapped incorrectly. For high error rate scenarios, we specify the exact error rates and use a normalized cutting distance of 0.8 in the local alignment. Results of the overall algorithm (the first and the second steps) are presented in Supplementary Tables 3 and 6, which report the time of the second step (T_2) and overall time (T_tot), the recall rate of the second step (R_2) and overall recall rate (R_tot), and the precision of the second step (P_2) and overall precision (P_tot).
The simulation results reveal that the first stage of the algorithm can handle most of the reads very rapidly and with very high precision, especially for longer reads (e.g. L ≥ 1,000 bps), even at high error rates. In applications where repeat intervals are not the main concern, running only the first stage of the algorithm achieves an acceptable recall rate with very high precision.

In the assignment stage, when Bowtie with a list size of 10 is used, many reads within low copy number repeat intervals are handled very fast. For this step, we use the same parameters as in the alignment stage (except for the slide number, whose default value is 1, i.e. the length of the sliding window is ℓ_1/2). Results show that these two steps complement each other. However, if a higher recall rate is needed (and execution time is not the highest priority), Meta-aligner has the option of running the third step, i.e. the second step of the assignment stage, with Bowtie2 using a default sub-fragment length and list size of 150 and 40, respectively. As mentioned earlier, this step is specifically designed for handling long repeat intervals with high copy number and very bad quality reads. Note that for high indel rates, we set the normalized cutting distance of the local alignment table to 5 × the indel rate.

Selecting Bowtie2 score parameter
In order to achieve the best recall performance, the parameters of Bowtie2 need to be set appropriately. We first find an appropriate value for the score. Let p_M and p_G be the mismatch and indel error rates, respectively. A fragment of length ℓ has ℓ p_M mismatches and ℓ p_G indel bases on average. We set the score in such a way that, probabilistically, most fragments can be handled by Bowtie2. The variances of the numbers of mismatches and indels within a fragment of length ℓ are ℓ(1 − p_M)p_M and ℓ(1 − p_G)p_G, respectively, and the score parameter is set from these quantities, where S_M is the match score in the aligner used, S_MM and S_G denote the mismatch and gap penalties, respectively, and n_M and n_G are computed from the averages and variances of the numbers of mismatches and indels, respectively. The parameter t is a constant that is set to 3 at low error rates and to 1 or 2 at high error rates.
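The binomial quantities used in this argument can be computed with a short sketch. The mean and variance follow directly from the i.i.d. error model. The way the constant t enters (a mean plus t standard deviations allowance) is an assumption made for illustration only; it is not the score formula of Meta-aligner, and the function names are hypothetical.

```python
import math


def error_count_stats(ell: int, p: float):
    """Mean and standard deviation of the number of errors in a fragment of
    length ell when each base is in error independently with probability p."""
    mean = ell * p
    std = math.sqrt(ell * p * (1.0 - p))
    return mean, std


def error_allowance(ell: int, p: float, t: float) -> float:
    """Hypothetical allowance: mean plus t standard deviations.

    Assumption: this is one plausible way for t to enter; the source formula
    is not reproduced here.
    """
    mean, std = error_count_stats(ell, p)
    return mean + t * std


mean, std = error_count_stats(150, 0.10)
print(round(mean, 2), round(std, 2), round(error_allowance(150, 0.10, 3), 2))
```

A larger t tolerates more errors per fragment and therefore lets more fragments through the aligner, at the price of more candidate locations to examine.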
In this section, we compare the number of reads reported as unique by Meta-aligner and other aligners in different scenarios. We consider L = {300; 500; 1,000} bps with ε = {2; 5; 10}% sequencing error rates. In addition, we consider the ε = 10% indel-only case for L = 1,000 bps. Results are shown in Supplementary Figures 10-11, which present simulation results for: (1) Bowtie2 in different scenarios (default mode; fixed score option with default list size; fast mode with list size 10; sensitive mode with list size 10; fixed score option with list sizes 2, 10 and 40), (2) Seqalto in default mode, (3) BWA-SW with z = 2, 10, and 40, and (4) Meta-aligner.
Results show that Meta-aligner reports a very large subset of the mapped reads as uniquely mapped to the reference genome. Note that since Bowtie2 in default mode and BWA-SW with z = 10 report only the best location for each read, they report a high percentage of uniquely mapped reads (more than Meta-aligner). However, Meta-aligner demonstrates better performance in terms of both execution time and precision than these two aligners at the aforementioned settings (Supplementary Sections 5-6). Also, in most cases, its recall rate is quite close to that of Bowtie2.
Simulation results show that only the first 6,000 bases of each read are enough for the anchoring step, as shown in Supplementary Figure 12; in the following, we also explain why this read length is theoretically sufficient for anchoring. The figure shows that if all bases of all reads are used, the mapping percentage increases by only 2.54%, while the anchoring time increases by 1440 sec and the overall time of the first stage increases by 2528 sec. Using only the first i bases of each read for anchoring in Meta-aligner can be controlled with the -tr i command.
We now describe why 6 Kbps of each read is enough for anchoring. First, suppose noise is added to PacBio reads through an i.i.d. model. Note that Bowtie cannot handle indels within a read; only a single indel at the first or last base of a read can be handled by Bowtie. Let the numbers of mismatches and indels within a sub-fragment of length ℓ be denoted by N_M and N_G, respectively. Then a sub-fragment of length ℓ with d = 1 is aligned to its true position with probability
P_a = P{N_M = 0, N_G ≤ 1 at one of the two end bases} + P{N_M = 1, N_G = 0} = (1 − ε_M − ε_G)^ℓ + 2 ε_G (1 − ε_M − ε_G)^(ℓ−1) + ℓ ε_M (1 − ε_M − ε_G)^(ℓ−1), (5)
where ε_M and ε_G are the mismatch and indel error rates, respectively. If we take ℓ = 25, ε_M = ε̂_M (with ε̂_M = 0.028) and ε_G = ε̂_G (with ε̂_G = 0.138), then P_a = 0.0232. If a read has N disjoint random ℓ-mers for (ℓ, d) = (25, 1), then the probability that the first stage of Meta-aligner aligns this read to its true location, which requires at least two anchored ℓ-mers, is
P_t = 1 − (1 − P_a)^N − N P_a (1 − P_a)^(N−1).
For N ≥ 240 (or L ≥ 6 Kbps), P_t ≈ 0.976. Thus, if a read has at least 240 disjoint random 25-mers, it is aligned to its true location with high probability. Also, note that at a length of 6 Kbps only a very small fraction of repeats in the genome remain unbridged.
We now compare the mapping rates for d = 1 and d = 2 with ℓ = 25. Using d = 2 and only the first 6,000 bases of reads, the mapping percentage increases by 3.23%, while the anchoring time increases by 280 sec and the overall time of the first stage increases by 3,092 sec. When d = 2 is used, fragments of length ℓ extracted from reads that contain indels within their first or last bases can be aligned by Bowtie. This is one reason the mapping percentage increases in this case.
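The quoted values P_a = 0.0232 and P_t ≈ 0.976 can be checked numerically. The sketch below assumes the following closed forms, which are consistent with the quoted numbers: each base is error-free with probability 1 − ε_M − ε_G, a fragment anchors if it has no error, exactly one mismatch, or a single indel at one of its two end bases, and a read anchors if at least two of its N disjoint fragments anchor independently. The function names are hypothetical.

```python
def p_anchor_fragment(ell: int, eps_m: float, eps_g: float) -> float:
    """Probability that a fragment of length ell is anchored with d = 1."""
    q = 1.0 - eps_m - eps_g                        # a base is error-free
    no_error = q ** ell
    one_mismatch = ell * eps_m * q ** (ell - 1)    # exactly one mismatch, no indel
    end_indel = 2 * eps_g * q ** (ell - 1)         # one indel at the first or last base
    return no_error + one_mismatch + end_indel


def p_anchor_read(n: int, p_a: float) -> float:
    """Probability that at least two of n independent fragments are anchored."""
    return 1.0 - (1.0 - p_a) ** n - n * p_a * (1.0 - p_a) ** (n - 1)


p_a = p_anchor_fragment(25, 0.028, 0.138)
p_t = p_anchor_read(240, p_a)
print(f"P_a = {p_a:.4f}, P_t = {p_t:.3f}")
```

Although a single 25-mer is anchored only about 2% of the time at these error rates, 240 independent attempts make two successes very likely, which is why roughly 6 Kbps of each read suffices.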
We also compare results when using a slide window for ℓ = 25 and d = 1 or 2. The mapping percentage and anchoring time are compared in Supplementary Table 12. For maximum mapping percentage, we suggest ℓ = 25, d = 2 and a slide value of 5 (using the whole read length or L_max = 6 Kbps), which gives a mapping percentage of ≈ 85-87%. The overall Meta-aligner results for the PacBio input reads are shown in Supplementary Table 13.

Main arguments
Meta-aligner main arguments and their descriptions are as follows.
-x <name>: The base name of the indexes for the reference genome used at each step of Meta-aligner. Note that for any aligner used at the alignment or assignment stages, its indexes must exist with this name. For the SOAP2 and mrsFast aligners, which are used at the first stage of Meta-aligner, this name must be used without any suffix.
-fa <name>: The reference genome used for the local alignment (assumed to be in Fasta format).
-r <name>: The base name of the input read set (assumed to be in FastQ format).
-o <name>: File to write SAM alignments to. By default, alignments are written to "output.sam".

Options
Meta-aligner options are as follows,

Estimation option
Users can enable the PE algorithm for estimating the required parameters as follows.
-est <1 or 2>: With this command, estimation is enabled and ℓ_1 (fragment length), d (maximum Hamming distance), the mismatch and indel error rates, and the normalized cutting distance (-ed command) are estimated by Meta-aligner. Meta-aligner proposes parameters for two scenarios: 1) recall rate is the priority, or 2) time is the priority. The scenario is selected by setting the -sig command. By default, estimation is off.

Input option
Users can choose the input format as follows.
-FA: Reads are Fasta files. Fasta files usually have the extension .fa, .fasta, .mfa or .fna.
-pg: Percent of gap within the input read set. This value is used by the -ed option. The default value is 0.01. This parameter can be estimated when the PE algorithm is used.

Alignment options
Users can change the alignment parameters to achieve better performance at the first stage (and/or the first step of the second stage) of Meta-aligner.
-l1 <int>: The fragment size (ℓ_1) used at the alignment stage and the first step of the assignment stage. The default value is 40. This parameter can be learned when the PE algorithm is used.
-sl1 <int>: The length of the sliding window. This parameter is used at the alignment stage and the first step of the assignment stage of Meta-aligner. The default value is 0, i.e., no sliding is used. This parameter can be learned when the PE algorithm is used.
-cfd1 <int>: The consecutive distance between two anchored fragments (G_1), used for confirming two fragments of a read and anchoring the read. This parameter is used at the alignment stage and the first step of the assignment stage of Meta-aligner. The default value is G_1 = 0.1 ℓ_1.
-d <int>: Edit distance between fragments and the reference genome used for alignment. This parameter is used at the alignment stage and the first step of the assignment stage of Meta-aligner. Setting this value to zero means that only exact matches are desired. The default value is 2. This parameter can be learned when the PE algorithm is used.
• For Bowtie: this command works as -v (an integer from 0 through 3) and determines only the number of mismatches.
• For mrsFast: this command works as -e.
• For SOAP2: this command works as -v.
-tr <int>: Reads are trimmed so that only the first <int> bases of each read are used for anchoring at the alignment stage and the first step of the assignment stage of Meta-aligner. The remaining bases of each read are used in the local alignment. By default, this option is not used.

Assignment options
Users can change these parameters to achieve better performance at the second stage of Meta-aligner.
-l2 <int>: The fragment size (ℓ_2) used at the second step of the assignment stage. The default value is 150.
-sl2 <int>: The length of the sliding window for the second step of the assignment stage. The default value is 50.
-cfd2 <int>: The consecutive distance between two anchored fragments (G_2), used for confirming two fragments of a read and anchoring the read at the second step of the assignment stage. The default value is G_2 = 0.1 ℓ_2.
-seedmm2 <int>: Number of mismatches allowed in a seed alignment at the second step of the assignment stage. The default value is 1.
-seedlen2 <int>: Length of the seed substrings to align at the second step of the assignment stage. The default value is 20.
-ls1 <int>: List size of the assignment stage when Bowtie is used (at the second step of Meta-aligner, i.e. the first step of the assignment stage). The default value is 10.
-ls2 <int>: List size of the assignment stage when Bowtie2 is used (at the third step of Meta-aligner, i.e. the second step of the assignment stage). The default value is 40.
-thrsc <double>: Threshold of the path selection step at the assignment stage of Meta-aligner (for both Bowtie and Bowtie2). Paths are filtered by their scores. The default value is 0.3.

Scoring options
User can change the desired score for local alignment step of Meta-aligner.

Reporting options
Users can choose the desired report of Meta-aligner.
-dis: This option discards the local alignment of the anchored reads. With this option, only the reads, their flags and their positions on the reference genome are reported. This parameter is only used at the first stage of Meta-aligner.
-disHeader: This option suppresses the header of the output SAM file.

Other options
Meta-aligner provides some options such that users can match Meta-aligner with their platform.
-step <{1 or 2 or 3}>: Meta-aligner is run up to the selected step. "1": run only the alignment stage; "2": run the alignment stage and the first step of the assignment stage; "3": run all steps of Meta-aligner. The default value is "2".
-dir <address>: If this parameter is used, Meta-aligner creates a new directory at the given address, and all steps are executed there. The default address is "./results".
-p <int>: Number of threads used for running Meta-aligner (both stages). The default value is 1.
-ed <double>: This parameter controls the normalized cutting length of the local alignment table in the Smith-Waterman algorithm (relative to each read length). With this parameter, only {ed/2 × read length} cells adjacent to the main diagonal of the local alignment table are used in the local alignment procedure. This parameter must be between 0 (consider only the main diagonal cells of the dynamic table) and 2 (consider all cells of the dynamic table). The default value is 5 × pg. This parameter can be estimated from the indel rate when the PE algorithm is used.
-ram <double>: Sets the RAM available to Meta-aligner, so that it can be run on any platform regardless of RAM restrictions. With this option, Meta-aligner adjusts the number of threads (-p) and the length of reads processed at the local alignment step. If some reads cannot be processed within this amount of RAM (even with one thread), Meta-aligner reports them in a file (named "NotEnoughRAM.txt") containing the reads in Fastq format, with their flags and anchored positions written in their header section, separated by underscores.
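The effect of the -ed option on the local alignment table can be sketched as follows. The sketch assumes that ed/2 × read length is the number of cells kept on each side of the main diagonal, which is consistent with ed = 0 keeping only the diagonal and ed = 2 keeping the whole table; the rounding and the cap at 2 in the default are assumptions, and the function names are hypothetical.

```python
def band_half_width(read_length: int, ed: float) -> int:
    """Cells kept on each side of the main diagonal: ed / 2 * read length."""
    if not 0.0 <= ed <= 2.0:
        raise ValueError("ed must lie in [0, 2]")
    return round(ed / 2.0 * read_length)  # rounding is an assumption


def in_band(i: int, j: int, read_length: int, ed: float) -> bool:
    """True if cell (i, j) of the alignment table lies within the band."""
    return abs(i - j) <= band_half_width(read_length, ed)


def default_ed(pg: float) -> float:
    """Documented default 5 * pg; the cap at 2 is an assumption."""
    return min(2.0, 5.0 * pg)


print(band_half_width(1000, 0.5))
print(in_band(10, 200, 1000, 0.5), in_band(10, 300, 1000, 0.5))
```

Restricting the dynamic programming table to such a band reduces the work per read from quadratic in the read length to roughly the read length times the band width, which is why a small ed suffices when the indel rate is low.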