
Table 1 Overview of the new features incorporated into MicroMerge v2 and guidelines for when to use them

From: Merging microsatellite data: enhanced methodology and software to combine genotype data for linkage and association analysis

New Feature

Purpose

Guidelines and Usage

Feature 1: One-to-one alignment format

Creates flexible output files and provides a more suitable alignment format for most data sets. Both one-to-one and lumped alignment formats are available in v2.

The lumped format may be preferred for data sets that will be analyzed with Mendel and have 1) few rare bins (<5–10% of bins with <6 instances), 2) discrepant numbers of unique bins between data sets (most markers differ by 2–3 bins), 3) genotyping that was done on platforms with different resolving power, or 4) other situations where bin frequencies don't match well. Otherwise, use the default one-to-one format.

Feature 2: Controlling one-to-one alignment translation

Allows more than one one-to-one translation from each lumped alignment.

This feature was not useful for our test data sets but can potentially increase alignment posterior probability for markers that have many one-to-one translations with competitive likelihoods. The default value is 1 one-to-one alignment translation per lumped alignment.

Feature 3: Re-merging markers with low posterior probabilities

Improves alignment of markers that have low posterior probabilities and rare bins by zeroing these bins and re-merging the data. Results in a second set of merged data files.

Three parameters control which markers are selected for re-merging: 1) the alignment posterior probability (< 0.425) and, for the marker's bin pair(s), 2) low overlap (< 0.85) and 3) low theoretical allele frequency (< 0.015). The user can make re-merging more or less frequent by changing these three parameters from the default values above in the control file.

Feature 4: Adjusting the prior on the theoretical allele number

Controls emphasis on alignments that have fewer theoretical alleles.

Allows τ to range from 0.05 to 0.3 (default 0.2), with values across this range corresponding to decreasing emphasis on alignments with fewer alleles. Smaller τ values are useful when one or more data sets are several times larger than the others.

Feature 5: Using population allele frequencies to align data

Improves alignment of data sets with unreliable allele frequencies.

Enables alignment of small data sets and data sets from different ethnic groups. If reliable population allele frequencies are available, this feature should be used to improve alignments.

Feature 6: Aligning multiple data sets

Allows simultaneous alignment of more than two data sets.

All data sets should be merged simultaneously.

Feature 7: Likelihood ratio test (LRT) for assessing alignment quality

Provides another measure of alignment quality that is more general than the posterior probability.

Applicable to lumped alignments only. More than 90% of lumped alignments should reject the LRT null hypothesis (LRT = 1).
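
The re-merging criteria listed under Feature 3 can be illustrated with a minimal Python sketch. It is not MicroMerge code or control-file syntax. The class names, field names, and marker names are hypothetical, and the sketch assumes that a marker is re-merged only when its posterior probability is below the threshold and at least one bin pair falls below both the overlap and the theoretical allele frequency thresholds. Only the three default values come from the table.

```python
from dataclasses import dataclass, field
from typing import List

# Default thresholds quoted in Table 1 (Feature 3).
MAX_POSTERIOR = 0.425    # alignment posterior probability
MAX_OVERLAP = 0.85       # bin-pair overlap
MAX_ALLELE_FREQ = 0.015  # theoretical allele frequency


@dataclass
class BinPair:
    """Hypothetical summary of one aligned bin pair."""
    overlap: float
    theoretical_freq: float


@dataclass
class Marker:
    """Hypothetical summary of one merged marker."""
    name: str
    posterior: float
    bin_pairs: List[BinPair] = field(default_factory=list)


def bin_pairs_to_zero(marker, max_posterior=MAX_POSTERIOR,
                      max_overlap=MAX_OVERLAP, max_freq=MAX_ALLELE_FREQ):
    """Return the bin pairs that meet all three criteria (assumed to combine with AND)."""
    if marker.posterior >= max_posterior:
        return []  # posterior is high enough, so nothing is re-merged
    return [bp for bp in marker.bin_pairs
            if bp.overlap < max_overlap and bp.theoretical_freq < max_freq]


def needs_remerge(marker, **thresholds):
    """A marker is selected when at least one bin pair qualifies."""
    return bool(bin_pairs_to_zero(marker, **thresholds))


# Illustrative values only; these are not results from the paper.
marker_a = Marker("marker_A", 0.30, [BinPair(0.80, 0.010), BinPair(0.95, 0.010)])
marker_b = Marker("marker_B", 0.90, [BinPair(0.50, 0.001)])
selected = [m.name for m in (marker_a, marker_b) if needs_remerge(m)]
```

Under these assumptions, passing a lower `max_posterior` selects fewer markers, which mirrors the statement that the user controls the frequency of re-merging through the three parameters.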