Next-generation sequencing technology has led to the frequent use of deep sequencing projects and of systems that require a large number of oligonucleotide primers for PCR. Designing a large number of primers is a logistical challenge, both in achieving good coverage of target regions and in cost. Although a number of primer design programs are available, using them for high-throughput design can be difficult and expensive. We aimed to produce a system in which a large number of primers could be designed cost-effectively by using the fewest primers necessary, and hence the lowest cost, reusing primers at multiple priming sites where possible, whilst maintaining an acceptable level of coverage and avoiding degeneracy in amplicon targets that overlap in the same regions.

In designing our program we compared our approach and performance with several other available programs: Primer3 [1], BatchPrimer3 [2], Primer-BLAST [3] and PAMPS [4]. The core algorithm of the first three is that of Primer3. The Primer3 algorithm takes into account primer size, melting temperature (T_{m}), GC content, the concentrations of monovalent and divalent cations in the PCR reaction mixture, a selection of salt correction formulae, and different parameters for simulating the thermodynamics of primer hybridization. Potential primers are then checked against a mispriming repeat library from the human, rodent or *Drosophila* genomes, allowing interspersed repeats or other sequence regions to be avoided as primer annealing locations. Primer-BLAST combines the Primer3 algorithm with the Basic Local Alignment Search Tool (BLAST) [5] to ensure that only unique primer pairs are selected, thus preventing primers from being designed around undesired targets such as introns. These two programs output a range of candidate primer pairs for single DNA sequences, but they are not designed for high-throughput primer design. The need for high-throughput primer design was recognized by You *et al*. [2], who produced BatchPrimer3, in which multiple sequences can be input for primer selection but only one primer pair per input sequence is produced.
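The kind of per-primer screening described above can be illustrated with a minimal sketch. It is not the Primer3 algorithm: Primer3 uses thermodynamic models with salt correction, whereas this sketch substitutes the much cruder Wallace rule (T_{m} ≈ 2(A+T) + 4(G+C)), which ignores cation concentrations entirely. The function names and the threshold values are hypothetical and are not Primer3 defaults.

```python
def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C (Wallace rule).

    Assumption: this stands in for Primer3's thermodynamic
    calculation and is only a coarse estimate for short oligos.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def passes_basic_filters(primer: str,
                         min_len: int = 18, max_len: int = 25,
                         min_gc: float = 0.40, max_gc: float = 0.60,
                         min_tm: int = 50, max_tm: int = 65) -> bool:
    """Apply size, GC and Tm windows (illustrative thresholds only)."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_content(primer) <= max_gc
            and min_tm <= wallace_tm(primer) <= max_tm)
```

For example, the 20-mer `ATGCATGCATGCATGCATGC` has a GC content of 0.5 and a Wallace-rule T_{m} of 60, so it passes these illustrative filters, whereas a 20-mer of alternating A and T fails on GC content.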

Minimizing the cost of a primer design can be achieved by (i) designing degenerate primers able to anneal to a number of related target sequences and (ii) implementing primer reuse, utilizing primers that bind to conserved loci that are repeated. Although degeneracy allows a greater number of related sequences to be amplified, the more degenerate the primers are, the less specific the amplicons will be. Achieving an optimal degree of degeneracy is therefore important for obtaining a suitable trade-off between the number of related sequences amplified and the specificity of these amplicons. A number of variants of this problem have been formulated to balance specificity and sensitivity. The Maximum Coverage Degenerate Primer Design (MCDPD) approach, as used in HYDEN [6], tries to identify a primer of length *l* and maximum degeneracy *d*_{max} that covers a maximum number of sequences, each of length *l*. The Minimum Degeneracy Degenerate Primer Design (MDDPD) approach attempts to find a primer of length *l* and minimum degeneracy *d*_{min} that can cover all input sequences, each of length equal to or greater than *l*. The Minimum Primers Degenerate Primer Design (MPDPD) approach attempts to find the fewest primers of length *l* and maximum degeneracy *d*_{max} for a set of sequences, so that each sequence is covered by at least one primer. This approach has the constraint that all input sequences have the same length as the primers, which may be inadequate in practice. In contrast, the Multiple Degenerate Primer Design (MDPD) approach allows the input sequences to have different lengths greater than *l*_{min} and attempts to identify primers of length at least *l*_{min} and degeneracy at most *d*_{max}, such that each sequence is covered by at least one primer. This was the approach taken by MIPS [7] and PAMPS [4]. PAMPS is a heuristic, high-throughput algorithm that designs degenerate primers through a process of consecutive *ad hoc* pairwise alignments [4]. It has been shown to outperform other degenerate primer design systems, such as HYDEN and MIPS, in terms of computational time.
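The two quantities traded off in these problem variants, degeneracy and coverage, can be made concrete with a short sketch. It assumes the standard IUPAC nucleotide ambiguity codes, takes the degeneracy of a primer to be the number of distinct non-degenerate sequences it represents, and treats a target as covered when the primer matches some window of it exactly. Mismatch tolerance and reverse complements are deliberately ignored, and the function names are hypothetical rather than taken from any of the cited programs.

```python
from math import prod
from typing import Iterable

# Standard IUPAC ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer."""
    return prod(len(IUPAC[code]) for code in primer.upper())


def covers(primer: str, target: str) -> bool:
    """True if the primer matches some window of the target exactly."""
    primer, target = primer.upper(), target.upper()
    width = len(primer)
    for start in range(len(target) - width + 1):
        window = target[start:start + width]
        if all(base in IUPAC[code] for code, base in zip(primer, window)):
            return True
    return False


def coverage(primer: str, targets: Iterable[str]) -> int:
    """How many target sequences the primer covers."""
    return sum(covers(primer, target) for target in targets)
```

Under these assumptions, `AYG` has degeneracy 2 and covers both `TTACGTT` and `GGATGCC` but not `TTAAGTT`. Replacing `Y` with `N` would raise the degeneracy to 4 and also cover the third target, which is the loss of specificity in exchange for coverage that a bound such as *d*_{max} is meant to control.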

Algorithms for implementing primer reuse have also been developed. Doi and Imai described a heuristic greedy algorithm for primer design in multiplex PCR, which attempts to minimize the cost of the primers required for multiplex PCR and SNP genotyping [8]. MuPlex is another heuristic algorithm designed for multiplex PCR, which uses a graph-based approach to assign the largest number of non-conflicting primers to the fewest ‘cliques’ that can be assigned to multiplex PCR tubes [9]. Lui and Carson used a simulated annealing optimization to maximize primer reuse, which exhaustively searches primer space and aims to converge upon the optimal-cost solution [10]. Although optimizing primer reuse and automating the design of degenerate primers each reduce cost individually, combining the two techniques is likely to offer additional cost advantages.

Optimizing primer design to make use of both degeneracy and multiplexing has been referred to as the Multiple Degenerate Primer Selection Problem (MDPSP), and variants have been shown to be NP-complete. Previous approaches to MDPSP, such as that of Balla *et al*. [11], have shown that primer coverage and cost can be improved through approximate (heuristic) greedy algorithms. Jabado *et al*. provide a heuristic algorithm for degeneracy, Greene SCPrimer [12]. In this method, phylogenetic trees are constructed from multiple sequence alignments to identify candidate primers, which are then used by a greedy set covering problem (SCP) solving algorithm to determine the minimum set of degenerate primers that may amplify all members of the alignment, thereby combining degeneracy with primer reuse. Although heuristic approaches generally outperform global optimizations in computation time, the reverse can be true for quality of output. Given that optimization of large multiplexed primer designs is not generally time-critical, a global optimization approach seems appropriate.
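The greedy set-covering step mentioned above can be sketched generically. This is the textbook greedy heuristic for set cover, not the Greene SCPrimer implementation: it assumes the candidate primers and the set of targets each one amplifies have already been computed, and the names `candidates` and `greedy_set_cover` are hypothetical.

```python
from typing import Dict, List, Set


def greedy_set_cover(candidates: Dict[str, Set[str]],
                     targets: Set[str]) -> List[str]:
    """Pick primers one at a time, each covering the most uncovered targets.

    candidates maps a primer to the set of target IDs it amplifies.
    Returns the chosen primers in selection order. Targets that no
    candidate covers are simply left uncovered.
    """
    uncovered = set(targets)
    chosen: List[str] = []
    while uncovered:
        # Ties are broken by dictionary insertion order.
        best = max(candidates, key=lambda p: len(candidates[p] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            break  # no remaining candidate helps
        chosen.append(best)
        uncovered -= gained
    return chosen


# Illustrative data only: four candidate primers over five targets.
example = {
    "P1": {"s1", "s2", "s3"},
    "P2": {"s3", "s4"},
    "P3": {"s4", "s5"},
    "P4": {"s1"},
}
selection = greedy_set_cover(example, {"s1", "s2", "s3", "s4", "s5"})
```

Here the heuristic selects `P1` and then `P3`, which happens to be optimal. In general a greedy choice made early cannot be undone, which is one reason a global optimization may return a cheaper primer set when running time is not the limiting factor.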

To improve on greedy approaches to MDPSP, we present here an algorithm that takes a Markov Chain Monte Carlo (MCMC) approach, which samples primer parameter space using a probability distribution for accepting successive primer designs. Primers are weighted according to their degree of reuse, provided their degeneracy is kept below a user-defined threshold. We have implemented a Metropolis-Hastings algorithm, in which a new proposal (e.g. a primer design with a given cost) is accepted if it provides a better solution than the current state; if the new proposal is more costly, the system tends, probabilistically, to revert to the current state. We call the algorithm the Markov Chain Monte Carlo Optimized Degenerate Primer Reuse (MCMC-ODPR) algorithm. We show that the MCMC-ODPR program outperforms Primer3, Primer-BLAST, BatchPrimer3 and PAMPS in terms of both cost and sequence coverage.
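The acceptance rule can be illustrated with a toy sketch. It is not the MCMC-ODPR implementation, whose proposal distribution, weighting and cost function are not specified in this section. The sketch assumes that each target has a fixed list of candidate primers, that the cost of a design is simply the number of distinct primers used (so reuse lowers cost), that proposals are symmetric (in which case Metropolis-Hastings reduces to the Metropolis rule), and that a worse proposal is accepted with probability exp(-Δcost / temperature). All names and parameter values are hypothetical.

```python
import math
import random
from typing import Dict, List


def cost(assignment: Dict[str, str]) -> int:
    """Toy cost: number of distinct primers in the design."""
    return len(set(assignment.values()))


def metropolis_primer_reuse(options: Dict[str, List[str]],
                            steps: int = 2000,
                            temperature: float = 0.5,
                            seed: int = 0) -> Dict[str, str]:
    """Sample primer assignments and return the cheapest one visited."""
    rng = random.Random(seed)
    targets = list(options)
    current = {t: rng.choice(options[t]) for t in targets}
    best = dict(current)
    for _ in range(steps):
        # Symmetric proposal: re-draw the primer for one random target.
        proposal = dict(current)
        target = rng.choice(targets)
        proposal[target] = rng.choice(options[target])
        delta = cost(proposal) - cost(current)
        # Always accept improvements; accept worse designs with
        # probability exp(-delta / temperature), otherwise keep current.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = proposal
            if cost(current) < cost(best):
                best = dict(current)
    return best


# Illustrative data: primer "B" can serve all three targets.
toy_options = {"t1": ["A", "B"], "t2": ["B", "C"], "t3": ["B", "D"]}
design = metropolis_primer_reuse(toy_options)
```

Because worse designs are occasionally accepted, the chain can leave a locally good design that a greedy method would be stuck with. On this small example the sampler settles on reusing primer `B` for every target.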