SNPEVG: a graphical tool for GWAS graphing with mouse clicks
BMC Bioinformatics volume 13, Article number: 319 (2012)
Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers generate large quantities of test results. Global and local graphical viewing of the test results is an effective approach to digest and interpret GWAS results.
SNPEVG is a set of graphical tools for instant global and local viewing and graphing of GWAS results for all chromosomes and for each trait. The current version includes three programs, SNPEVG1, SNPEVG2 and SNPEVG3. SNPEVG1 is a graphical tool for SNP effect viewing of P-values allowing multiple traits. The total number of graphs that can be generated by one ‘Run’ is n(c + 2), where n is the number of ‘traits’ with 0 < n ≤ 100, and c is the number of chromosomes. SNP effect viewing and graphing is accomplished through a user-friendly graphical user interface (GUI) that provides a wide range of options for the user to choose. The GUI can produce the Manhattan plot, the Q-Q plot of all SNP effects, and graphs for SNP effects by chromosome by clicking one command. Any or all of the graphs can be saved at publication quality by clicking one command. SNPEVG2 is for the viewing and graphing of multiple traits on the same graph, with options to graph any or all of the traits, customizable colors and a user-specified Y1 or Y2 axis for each trait. The SNPEVG3 program uses the output file of single-locus test results from the epiSNP computer package as the input file. Each chromosome figure can display three genetic effects (genotypic, additive and dominance effects) and the number of observations.
The SNPEVG package is a versatile, flexible and efficient graphical tool for rapid digestion of large quantities of GWAS results with mouse clicks.
GWAS analysis generally yields large quantities of test results. Global and local graphical viewing of the test results is an effective approach and often is a necessary step for interpreting GWAS results. A widely used graphical viewing of GWAS results is the Manhattan plot, which provides a global graphic view of GWAS results of all chromosomes for a trait on one graph to quickly identify genome locations with the most significant SNP effects [1, 2]. Following this global view, detailed graphical examination of each chromosome is helpful for further understanding the GWAS results, and more graphical work often is needed for effective presentation of the GWAS results. The purpose of the SNPEVG package is to provide a graphical tool for rapid digestion of GWAS results and to accomplish large quantities of graphical tasks of GWAS analysis in a seamless fashion.
The SNPEVG computer package is implemented in the C++ programming language. The object orientation of C++ enables an efficient software development cycle through easy reuse of modules for different applications with similar features. The SNPEVG computer package uses the Qt library under the terms of the GNU Lesser General Public License (LGPL) version 2.1.
Results and discussion
SNPEVG Version 3.2 includes three graphical programs: SNPEVG1, SNPEVG2 and SNPEVG3. SNPEVG1 is for graphing effects of one trait per graph for up to 100 traits, SNPEVG2 is for graphing multiple traits on the same graph, and SNPEVG3 directly uses an output file of EPISNP or EPISNPmpi as the input file. Both SNPEVG1 and SNPEVG2 use the same input file format, which contains the name, chromosome number and chromosome position of each SNP marker, and P-values of statistical tests from any method. Each program has a scalable GUI that allows efficient and flexible use of the computer screen and the production of graphical images with user-defined vertical/horizontal ratios. Each program can be launched multiple times by clicking the executable program so that the user can compare the graphical effects of different graph options simultaneously. SNPEVG 3.2 is available from Additional files 1 and 2 or from the website at http://animalgene.umn.edu. Full features of the SNPEVG package are described in the SNPEVG user manual (Additional file 3).
The SNPEVG1 program
SNPEVG1 supports a maximum of 100 traits. The GUI (Figure 1A) has numerous graphical options for Manhattan plots, including user-customized colors (Figure 1B), shading of P-values below the threshold P-value line (Figure 1C), scalable pixel size proportional to P-values (Figure 1A-1C), and display of P-values above the specified cut-off P-value (Figure 1D). Each Manhattan plot uses the true chromosome size defined by the starting and ending SNP marker positions of the chromosome. P-values for the unknown chromosome are displayed in sequential order of SNP markers rather than by chromosome position. Manhattan plots and Q-Q plots (Figure 1E) provide a global view of test results for each trait. In addition to global viewing, the GUI produces graphs for each chromosome and each trait. For each chromosome, P-values can be presented as connected lines (Figure 2A) or separate symbols (Figure 2B). The total number of graphs that can be generated is n(c + 2), where n is the number of ‘traits’ with 0 < n ≤ 100, and c is the number of chromosomes. Assuming 30 traits and 30 chromosomes per trait, the program produces 960 graphs for interactive viewing by one click of ‘Run’, including 30 Manhattan plots, 30 Q-Q plots and 900 chromosome graphs. The upper-right window of the GUI (Figure 1A) is the ‘Graph list’ by trait, showing a list of graphs produced by the ‘Run’ button. The user can turn off Manhattan and Q-Q plots, scroll the chromosome graphs of each trait using the up or down arrow key, and switch between traits using the left or right arrow key. Any selected graph, the graphs for selected traits, or all graphs can be saved as publication-quality graphical images by clicking a button on the GUI. SNPEVG1 requires a simple text input file with the following columns: CHR, POSITION, SNP, and P-VALUE, where CHR = chromosome number, POSITION = chromosomal position of the SNP marker, SNP = name of the SNP marker, and P-VALUE = the P-value for a trait.
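The graph count n(c + 2) can be checked with a short calculation. The following Python sketch is illustrative only and is not part of SNPEVG; the function name `graph_count` and its return layout are assumptions made for this example.

```python
def graph_count(n_traits: int, n_chromosomes: int) -> dict:
    """Count the graphs produced by one 'Run' under the n(c + 2) formula.

    Each trait contributes one Manhattan plot, one Q-Q plot and one
    graph per chromosome. (Hypothetical helper, not an SNPEVG function.)
    """
    if not 0 < n_traits <= 100:
        raise ValueError("the number of traits must satisfy 0 < n <= 100")
    if n_chromosomes < 1:
        raise ValueError("the number of chromosomes must be at least 1")
    return {
        "manhattan": n_traits,
        "qq": n_traits,
        "chromosome": n_traits * n_chromosomes,
        "total": n_traits * (n_chromosomes + 2),
    }


if __name__ == "__main__":
    # 30 traits and 30 chromosomes, as in the example in the text.
    print(graph_count(30, 30))
```

With 30 traits and 30 chromosomes the three components are 30, 30 and 900, which sum to the 960 graphs stated above.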
The SNPEVG2 program
SNPEVG2 is designed to display P-values of multiple traits on the same graph. Each chromosome figure can display P-values in log scale or the original values of a variable on either the Y1 or Y2 axis (up to 100 traits) (Figure 3A). The Y2 axis can be used to display a variable unrelated to P-values, such as minor allele frequency or the allele frequency difference between the best and worst individuals, allowing the production of more flexible and informative graphs than using the Y1 axis to present P-values only. The chromosome graphs can be crowded and difficult to view if the number of traits is large. This problem can be solved by the options to select the traits to display, to customize the color of each trait, or to switch the Y1 and Y2 axes using the ‘Setting’ button on the GUI (Figure 3B). Each Y axis, Y1 or Y2, can have its own threshold P-value or cut-off P-value (Figures 3A and C). SNPEVG2 requires a simple text input file with the same format as for SNPEVG1, i.e., CHR, POSITION, SNP, and P-VALUE columns, where CHR = chromosome number, POSITION = chromosomal position of the SNP marker, SNP = name of the SNP marker, and P-VALUE = the P-value for a trait.
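As an illustration of the input columns and the log-scale display, the Python sketch below reads records with CHR, POSITION, SNP and P-VALUE fields and computes -log10(P), the transformation conventionally used for plotting P-values. It is not SNPEVG code. The whitespace delimiter, the single header line, the integer positions and all names (`SnpRecord`, `read_records`, `neg_log10`) are assumptions made for this example; the SNPEVG user manual defines the actual file layout.

```python
import math
from typing import List, NamedTuple


class SnpRecord(NamedTuple):
    chrom: str      # CHR: chromosome number (kept as text)
    position: int   # POSITION: chromosomal position (assumed integer)
    snp: str        # SNP: marker name
    p_value: float  # P-VALUE: P-value for one trait


def read_records(text: str) -> List[SnpRecord]:
    """Parse whitespace-delimited rows, skipping an assumed header line."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    records = []
    for ln in lines[1:]:
        chrom, position, snp, p_value = ln.split()[:4]
        records.append(SnpRecord(chrom, int(position), snp, float(p_value)))
    return records


def neg_log10(p: float) -> float:
    """Return -log10(p) for a P-value in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("a P-value must lie in (0, 1]")
    return -math.log10(p)


if __name__ == "__main__":
    sample = "CHR POSITION SNP P-VALUE\n1 1000 rs1 0.5\n1 2500 rs2 1e-8\n"
    for rec in read_records(sample):
        print(rec.chrom, rec.position, rec.snp, neg_log10(rec.p_value))
```

A variable plotted on the Y2 axis in its original scale, such as minor allele frequency, would skip the `neg_log10` step.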
The SNPEVGconvert program
The SNPEVGconvert program is designed to convert an output file from any GWAS analysis software to the input format of SNPEVG1 and SNPEVG2. With this format conversion program, output from virtually any GWAS software can be used with SNPEVG1 and SNPEVG2. To use this program, the user only needs to specify the number of columns in the original file and identify the column numbers to be printed in the input file for SNPEVG1 and SNPEVG2.
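The column selection described above can be sketched in a few lines of Python. This is an illustrative stand-in, not the SNPEVGconvert implementation: the function name `convert_lines`, the whitespace delimiter, the 1-based column numbering and the space-separated output are all assumptions made for this example.

```python
from typing import Iterable, Iterator, Sequence


def convert_lines(
    lines: Iterable[str], n_columns: int, keep: Sequence[int]
) -> Iterator[str]:
    """Yield each input row reduced to the chosen columns, in the chosen order.

    n_columns is the column count of the original file; keep lists the
    1-based column numbers to print (for example CHR, POSITION, SNP, P-VALUE).
    """
    if any(not 1 <= k <= n_columns for k in keep):
        raise ValueError("column numbers must lie between 1 and n_columns")
    for line_no, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != n_columns:
            raise ValueError(
                f"line {line_no}: expected {n_columns} columns, found {len(fields)}"
            )
        yield " ".join(fields[k - 1] for k in keep)


if __name__ == "__main__":
    # Hypothetical source layout: SNP, CHR, POSITION, effect estimate, P-value.
    rows = ["rs1 1 1000 0.25 0.01", "rs2 1 2500 -0.10 0.40"]
    for out in convert_lines(rows, n_columns=5, keep=[2, 3, 1, 5]):
        print(out)
```

Checking the column count on every row is a design choice of this sketch; it catches a mis-specified column number early rather than silently writing misaligned output.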
The SNPEVG3 program
SNPEVG3 was developed for graphical analysis of GWAS results using the output file of single-locus test results of EPISNP or EPISNPmpi as the input file for drawing figures. SNPEVG3 has GUI features similar to those of SNPEVG1, but it does not have the limit of 100 traits. This program draws graphs for P-values of additive, dominance and genotypic effects on the Y1 axis and draws sample size on the Y2 axis. The P-values can be displayed with lines connecting adjacent data points (Figure 3D) or with symbols without connecting lines (Figure 3E). The user has an option to draw a figure by a sorted effect, such as the additive or dominance effect.
Evaluation of sample size limitations
Currently, SNPEVG1, SNPEVG2 and SNPEVG3 have a Microsoft Windows 32-bit version and a 64-bit version for Mac OS X 10.6 or newer. A Windows 64-bit version is expected to become available at a later time. For practical purposes, either the 32-bit or the 64-bit version would be powerful enough for real GWAS data sets. For a single trait, the 32-bit version could process 10 million markers per trait in about 30 seconds but failed for 12 million markers, and the 64-bit version could process 30 million markers in 80.62 seconds (Table 1). For multiple traits, the number of markers that can be processed per trait is approximately the numbers in Table 1 divided by the number of traits (Table 2).
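The multiple-trait rule of thumb above amounts to a simple division. The Python sketch below is only an approximation under that stated rule; the function name `approx_markers_per_trait` is hypothetical, and the single-trait figures used are the 10 million and 30 million markers quoted in the text.

```python
def approx_markers_per_trait(single_trait_markers: int, n_traits: int) -> int:
    """Approximate the markers per trait that can be processed with n traits.

    Applies the rule of thumb from the text: the single-trait capacity
    divided by the number of traits. (Hypothetical helper.)
    """
    if n_traits < 1:
        raise ValueError("the number of traits must be at least 1")
    return single_trait_markers // n_traits


if __name__ == "__main__":
    # Single-trait figures quoted in the text for the 32-bit and 64-bit versions.
    print(approx_markers_per_trait(10_000_000, 10))
    print(approx_markers_per_trait(30_000_000, 10))
```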
The SNPEVG package is a versatile and efficient graphical tool for rapid digestion of large quantities of test results from GWAS and can be customized for graphical viewing and drawing of non-GWAS information such as allele frequency differences.
Availability and requirements
Project name: SNPEVG
Project homepage: http://animalgene.umn.edu/
Operating system(s): Microsoft Windows 7, Mac OS X 10.6 or newer
Other requirements: none.
Any restrictions to use by non-academics: none.
GWAS: Genome-wide association study
SNP: Single nucleotide polymorphism.
Zhao JH: Gap: Genetic analysis package. J Stat Softw 2007, 23(i08). http://www.jstatsoft.org/v23/i08/paper
Ma L, Runesha HB, Dvorkin D, Garbe JR, Da Y: Parallel and serial computing tools for testing single-locus and epistatic SNP effects of quantitative traits in genome-wide association studies. BMC Bioinformatics 2008, 9(1):315. 10.1186/1471-2105-9-315
GNU Lesser General Public License, version 2.1. 1999. http://www.gnu.org/licenses/old-licenses/lgpl-2.1.html
Wang S, Dvorkin D, Da Y: A graphical tool for SNP effect viewing and graphing, version 3.2. 2012. http://animalgene.umn.edu
Cole JB: Data Structures and Visualization. ADSA/ASAS Joint Annual Meeting. 2011. http://aipl.arsusda.gov/publish/presentations/ADSA11/ADSA11_jbc_files/frame.htm
This research is supported by USDA National Institute of Food and Agriculture Grant no. 2011-67015-30333 and by project MN-16-043 of the Agricultural Experiment Station at the University of Minnesota.
The authors declare that they have no competing interests.
SW is the author of SNPEVG1, SNPEVG2, and SNPEVG3. DD is the author of the EPISNPPLOT program that is partially used in SNPEVG3. YD designed most functions of the computing tools, and is the lead writer of the manuscript. All authors read and approved this manuscript.