Computing the PLF and the tree likelihood for candidate trees constitutes the computational kernel of these tools. Because multiple tools share a common likelihood computation, Ayres et al. implemented a finely tuned version of it as a general-purpose library called BEAGLE [2]. BEAGLE supports CPU- and GPU-based architectures but does not yet support Field Programmable Gate Array (FPGA)-based architectures such as the Convey HC-1 [3]. In this paper, we describe our effort to add FPGA support to BEAGLE and report the resulting performance.

### Pseudocode 1: BEAGLE Kernel

*// for each nucleotide in the sequence perform the PLF to complete the four-column*

*// conditional likelihood table*

*h = 0;*

*for (k = 0; k < nsites; k++) {*

*State is {AA, AC, AG, AT, CA, CC, CG, CT, GA, GC, GG, GT, TA, TC, TG, TT};*

*Base is {A, C, G, T};*

*State = State.first;*

*for (i = 0; i < 4; i++) {*

*Base = Base.first;*

*sopL = sopR = 0;*

*for (j = 0; j < 4; j++) {*

*sopL = sopL + tipL[State] * clL[Base];*

*sopR = sopR + tipR[State] * clR[Base];*

*State = State.next;*

*Base = Base.next;*

*}*

*clP[h + i] = sopL * sopR;*

*}*

*// find the maximum of the previously computed values (scaler value)*

*scaler = 0;*

*for (i = 0; i < 4; i++)*

*if (clP[h + i] > scaler) scaler = clP[h + i];*

*// normalize the previously computed values*

*for (i = 0; i < 4; i++)*

*clP[h + i] = clP[h + i] / scaler;*

*// take the log of the scaler value and store it in the scP array*

*scP[k] = log(scaler);*

*// update the lnScaler array*

*lnScaler[k] = scP[k] + lnScaler[k];*

*// accumulate the log-likelihood value*

*condLike = 0;*

*for (i = 0; i < 4; i++)*

*condLike = condLike + bs[i] * clP[h + i];*

*lnL = lnL + numSites[k] * (lnScaler[k] + log(condLike));*

*// increment counters*

*h = h + 4; clL = clL + 4; clR = clR + 4;*

*}*

*double log (double x) {*

*// initialize binary search*

*log_comp = -16;*

*coeff_set = 0;*

*coeff_incr = 8;*

*log_comp_incr = 8;*

*// perform a logarithmic binary search*

*for (i = 0; i < 4; i++) {*

*if (x < 10^log_comp) {*

*log_comp = log_comp - log_comp_incr;*

*} else {*

*log_comp = log_comp + log_comp_incr;*

*coeff_set = coeff_set + coeff_incr;*

*}*

*coeff_incr = coeff_incr / 2;*

*log_comp_incr = log_comp_incr / 2;*

*}*

*// compute the polynomial approximation*

*return_val = 0;*

*pow_x = 1.0;*

*for (i = 0; i < 5; i++) {*

*return_val = return_val + coeff[coeff_set][i] * pow_x;*

*pow_x = pow_x * x;*

*}*

*return return_val;*

*}*

The natural log approximation is implemented as an order-5 polynomial where the coefficients are divided into 16 segments whose range scales logarithmically. The coefficients are computed using a Chebyshev approximation [10]. In all of our experiments, we verify that the results computed from our design, including the log approximation, are accurate to within 1% of the results delivered by BEAGLE.