VennDiagram: a package for the generation of highly-customizable Venn and Euler diagrams in R
Hanbo Chen^{1} and Paul C Boutros^{1}
https://doi.org/10.1186/1471-2105-12-35
© Chen and Boutros; licensee BioMed Central Ltd. 2011
Received: 22 July 2010
Accepted: 26 January 2011
Published: 26 January 2011
Abstract
Background
Visualization of orthogonal (disjoint) or overlapping datasets is a common task in bioinformatics. Few tools exist to automate the generation of extensively-customizable, high-resolution Venn and Euler diagrams in the R statistical environment. To fill this gap we introduce VennDiagram, an R package that enables the automated generation of highly-customizable, high-resolution Venn diagrams with up to four sets and Euler diagrams with up to three sets.
Results
The VennDiagram package offers the user the ability to customize essentially all aspects of the generated diagrams, including font sizes, label styles and locations, and the overall rotation of the diagram. We have implemented scaled Venn and Euler diagrams, which increase graphical accuracy and visual appeal. Diagrams are generated as high-definition TIFF files, simplifying the process of creating publication-quality figures and easing integration with established analysis pipelines.
Conclusions
The VennDiagram package allows the creation of high quality Venn and Euler diagrams in the R statistical environment.
Keywords
Background
The visualization of complex datasets is an increasingly important part of biology. Many experiments involve the integration of multiple datasets to understand complementary aspects of biology. These overlapping results can be visualized in a number of ways, including textual tables (e.g. two-way tables), network diagrams [1, 2] and in some cases heatmaps [3, 4]. Venn diagrams have seen increasing use due to their familiarity, ease of interpretation, and graphical simplicity. For the purpose of this publication, Venn diagrams can be defined as diagrams that use simple geometrical shapes such as circles and ellipses to display all 2^{n} − 1 possible areas created by the interaction of n sets. The use of simple geometrical shapes reduces figure complexity and size relative to space-consuming tables or network layouts.
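To make the count of 2^{n} − 1 areas concrete, the following minimal sketch enumerates every exclusive area for a set of input lists. It is written in Python for illustration only and is not taken from the VennDiagram package; the function name `venn_regions` and the example data are hypothetical.

```python
from itertools import combinations

def venn_regions(sets):
    """Count the elements in each of the 2^n - 1 exclusive areas of n named sets."""
    names = sorted(sets)
    regions = {}
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            # Elements shared by every set in this combination ...
            inside = set.intersection(*(set(sets[name]) for name in combo))
            # ... minus anything that also belongs to a set outside it.
            outside = set().union(*(set(sets[name]) for name in names if name not in combo))
            regions[combo] = len(inside - outside)
    return regions

# Hypothetical example data: three overlapping lists of identifiers.
example = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 5, 6, 7}}
regions = venn_regions(example)

for combo, count in regions.items():
    print(" & ".join(combo), count)
```

With three sets the dictionary has 2^{3} − 1 = 7 entries, and the counts sum to the size of the union because every element falls in exactly one area.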
However, despite this popularity, there are currently few packages for generating Venn diagrams in the widely-used R statistical environment. These packages are limited in their ability to generate high-resolution, publication-quality Venn diagrams in that they allow little customization of colours, line types, label placement, and label font. Numerous special cases are handled inappropriately, and the output is not usually in the format of high-resolution, publication-quality TIFF files. Other, non-R-based local or web-based software capable of generating Venn diagrams exists, such as Venny [5], BioVenn [6], ConSet [7], and VennMaster [8]. All of these suffer from some of the weaknesses listed above. Further, integration of these tools into standard R-based statistical/computational pipelines, such as the widely used Bioconductor libraries of the R statistical environment [9], is viable but not technically trivial.
Additionally, if some intersecting or non-intersecting areas in a Venn diagram do not exist, another class of diagrams called Euler diagrams may be more desirable. Euler diagrams are equivalent to Venn diagrams when all intersecting and non-intersecting areas exist. However, areas containing zero elements are shown on Venn diagrams (by definition), whereas Euler diagrams show only non-zero areas. In many cases, Euler diagrams further reduce figure complexity, increase graphical accuracy and improve overall readability relative to Venn diagrams. Unfortunately, almost no existing package can generate publication-quality Euler diagrams in R, although VennEuler does generate Euler diagrams.
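The distinction can be stated in a few lines of code. The sketch below, in Python and purely illustrative, takes per-area counts and keeps only the areas an Euler diagram would show; the function name `euler_regions` and the counts are hypothetical and do not come from the package.

```python
def euler_regions(region_counts):
    """Keep only the areas an Euler diagram would draw (non-zero counts)."""
    return {region: n for region, n in region_counts.items() if n > 0}

# Hypothetical two-set example in which B lies entirely inside A,
# so the "B only" area is empty.
venn_counts = {("A",): 5, ("B",): 0, ("A", "B"): 3}
euler_counts = euler_regions(venn_counts)

# The two diagram types coincide only when no area is empty.
is_same_as_venn = len(euler_counts) == len(venn_counts)
print(euler_counts, is_same_as_venn)
```

Here the Venn diagram would still draw an empty "B only" area, whereas the Euler diagram would draw circle B inside circle A.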
To address these issues we introduce VennDiagram, an R package for generating highly customizable, high-resolution Venn diagrams with up to four sets and Euler diagrams of two or three sets in the R statistical environment.
Implementation
The VennDiagram package has been developed in and designed for the R statistical environment. The R environment is open-source and available online under the GNU General Public License (GPLv2). R was chosen because of its open-source nature, versatile functions, and general preference within the bioinformatics community. The use of R should facilitate integration with existing data-analysis pipelines. All code was designed and tested using version 2.12.1 (32-bit and 64-bit versions) of R. The VennDiagram package is available as Additional Files 1 (Linux .tar.gz file) and 2 (Windows .zip file).
VennDiagram uses the grid package for graphics. The grid package is a base (standard) package available in all installations, and offers more manoeuvrability than default R graphics in terms of graphical options and the existence of modifiable grid objects. VennDiagram uses these features to dynamically stretch/compress diagrams to fit the dimensions of the output file and to offer a vast number of graphical options.
Results
Discussion
During development of the VennDiagram package, we found that it is impossible to draw accurate, scaled Venn diagrams with three sets using circles. This conundrum is illustrated in the following scenario. In a system of two circles A and B, the distance between the centres of the circles, d_{AB}, can be determined as long as the areas of the circles (A_{A} and A_{B} respectively) and the intersection area (A_{A} ∩ A_{B}) are known. This is possible because in a two-circle system a given A_{A} ∩ A_{B} corresponds to a unique value of d_{AB}. Therefore, in a system of three circles A, B, and C, the distances d_{AB}, d_{BC}, and d_{AC} can be calculated as long as A_{A}, A_{B}, A_{C}, A_{A} ∩ A_{B}, A_{A} ∩ A_{C}, and A_{B} ∩ A_{C} are all known. However, d_{AB}, d_{BC}, and d_{AC} define a unique triangle, implying that the Venn diagram can be drawn without ever knowing the overall intersection A_{A} ∩ A_{B} ∩ A_{C}. In other words, the size of the overlap between all three circles does not alter the presentation of scaled Venn diagrams: the drawn area is unchanged even if one system has zero overall intersection (i.e. A_{A} ∩ A_{B} ∩ A_{C} = 0)! This conundrum results from the (arbitrary) choice of circles to represent set size, which reduces the degrees of freedom by one. Unique solutions can be identified by using ellipses or polygons to draw Venn diagrams, but the resulting diagrams would lose the instant recognisability and familiarity associated with circular Venn diagrams, defeating the point of a convenient display of information. Non-circular diagrams would also require iterative algorithms to compute the positions and sizes of the shapes, greatly increasing computational burdens, as has been discussed by others [10]. Consequently, scaling of three-set Venn diagrams is disabled in the VennDiagram package. Similarly, Venn diagrams containing more than four sets [11, 12] were not implemented in the VennDiagram package because they become too complex for intuitive visualization.
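The uniqueness argument above can be checked numerically. The following sketch is written in Python rather than R and is not taken from the VennDiagram source; it uses the standard circle-circle intersection (lens) area formula and bisection to recover the centre distance from two circle areas and their overlap. The function names `radius`, `lens_area` and `centre_distance`, and all example areas, are hypothetical.

```python
import math

def radius(area):
    """Radius of a circle with the given area."""
    return math.sqrt(area / math.pi)

def _clamp(x):
    # Guard acos against rounding slightly outside [-1, 1].
    return max(-1.0, min(1.0, x))

def lens_area(r1, r2, d):
    """Overlap area of two circles with radii r1, r2 and centre distance d."""
    if d >= r1 + r2:
        return 0.0                            # circles are disjoint
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2     # one circle inside the other
    a1 = r1 * r1 * math.acos(_clamp((d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    a2 = r2 * r2 * math.acos(_clamp((d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    tri = 0.5 * math.sqrt(max(0.0, (-d + r1 + r2) * (d + r1 - r2)
                                   * (d - r1 + r2) * (d + r1 + r2)))
    return a1 + a2 - tri

def centre_distance(area_a, area_b, overlap, tol=1e-10):
    """Bisection for d: the overlap shrinks monotonically as d grows."""
    r1, r2 = radius(area_a), radius(area_b)
    lo, hi = abs(r1 - r2), r1 + r2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if lens_area(r1, r2, mid) > overlap:
            lo = mid      # too much overlap: move the centres apart
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical three-set example: only pairwise overlaps are supplied.
areas = {"A": 10.0, "B": 8.0, "C": 6.0}
pairwise = {("A", "B"): 3.0, ("A", "C"): 2.0, ("B", "C"): 1.5}
distances = {pair: centre_distance(areas[pair[0]], areas[pair[1]], overlap)
             for pair, overlap in pairwise.items()}
print(distances)
```

Note that the three-way intersection never appears as an input: the three distances, and hence the whole layout, are fixed by the pairwise overlaps alone, which is the loss of one degree of freedom described above.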
The VennDiagram package handles all two-set Euler diagrams and the majority of all conceivable three-set Euler diagrams. Three-set Euler diagrams that cannot be drawn using circles or ellipses are not supported. For example, an Euler diagram for the case where two non-intersecting sets together comprise the third set cannot be drawn using circles and ellipses, though it may be drawn using polygons. This type of figure lacks a ready analytical layout and would require iterative fitting; no polygon-requiring Euler diagrams are available, but standard Venn diagrams are available for these few unsupported cases.
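The unsupported example given above is easy to recognise from the input lists. The sketch below, in Python and for illustration only, tests for that one configuration; the function name `is_disjoint_cover` is hypothetical, and the check does not claim to cover every case the package leaves unsupported.

```python
def is_disjoint_cover(a, b, c):
    """True when non-empty sets a and b do not intersect and together make up exactly c."""
    a, b, c = set(a), set(b), set(c)
    return bool(a) and bool(b) and a.isdisjoint(b) and (a | b) == c

# Hypothetical data: C is split exactly into A and B.
print(is_disjoint_cover({1, 2}, {3}, {1, 2, 3}))     # the polygon-requiring case
print(is_disjoint_cover({1, 2}, {2, 3}, {1, 2, 3}))  # A and B overlap, so not this case
```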
Compared with other programs capable of generating Venn diagrams (Table 1), the advantages of the VennDiagram package include:
- Drawing Euler diagrams using circles and/or ellipses with two or three sets
- Offering greater customizability to generate more elegant diagrams
- Availability in the widely-used R statistical environment
- Generating high-resolution TIFF files that are standard in publications
Table 1. A comparison of the features of various programs capable of generating Venn diagrams.
DrawVenn  Venny  gplots::venn  venneuler  limma::vennDiagram  Google Chart  GeneVenn  VennMaster  BioVenn  VennDiagram  

Shape-fill
Colour  X  X  X  X  X  X  
Shape-line
Style  X  
Width  X  X  
Colour  X  
Caption labels  
Content  X  X  X  X  
Colour  X  X  
Font  X  X  X  
Size  X  X  X  
Style  X  
Location  X  X (SVG only)  X  
Position  X  X (SVG only)  X  
Distance  X  X (SVG only)  X  
Justification  X  
Area labels  
Colour  X  X  X  
Font  X  X  X  X  
Size  X  X  X  X  X  
Style  X  
Titles  
Main title  X  X  X  X  
Subtitle  X  X  
Position  X (SVG only)  X  
Colour  X  X  X  
Font  X  X  
Size  X  X  X  
Style  X  
Justification  X  
Background-fill
Colour  X  X  
Style  X  
File options  
Output type  None  PNG  R graphics  R graphics  R graphics  PNG/GIF  PNG  SVG/JPEG  SVG/PNG  TIFF/PNG/JPEG/BMP/others 
Figure resolution  X  X  X  
Data processing  
Built-in gene ID recognition  X  X
Figure from file(s)  X  X  X  
Specific optimizations  Gene Ontology  
General  
Environment  Java  Web  R  R  R  Web  Web  Java  Web  R 
Input format  Direct (slider)  Lists  Lists  Partial areas  R object  Partial areas  Lists  Lists/GoMiner output  Lists  Lists 
Maximum sets  3  4  5  3  3  3  3  >5  3  4 
Shapes used  Circles/Rectangles  Circles/Ellipses  Circles/Ellipses  Circles  Circles  Circles  Circles  Polygons  Circles  Circles/Ellipses 
Scaling  X  X*  X*  X (iterative)  X*  X (2-set only)
Euler diagrams  X  X  X  X  
Margin size  X  X  X  
Rotation  X  
Two-set external lines  X
Other set-specific parameters  X  X
Conclusions
The VennDiagram package advances both the ease of use and the degree of customizability in the generation of Venn diagrams in a bioinformatics context. While other tools offer much of the functionality presented here, the implementation of all features together in the widely-used R statistical environment will promote the usage of automatically generated Venn diagrams within computational pipelines.
Availability and Requirements
Declarations
Acknowledgements
The authors thank all members of the Boutros lab for support, and especially Dr. Kenneth Chu and Daryl Waggott for help in generating the Windows-compatible version of this package. This study was conducted with the support of the Ontario Institute for Cancer Research to PCB through funding provided by the Government of Ontario. This work was financially supported by grant number MOP-57903 from the Canadian Institutes of Health Research (to PCB and Dr. Allan B. Okey).
Authors’ Affiliations
References
1. Brown KR, Otasek D, Ali M, McGuffin MJ, Xie W, Devani B, Toch IL, Jurisica I: NAViGaTOR: Network Analysis, Visualization and Graphing Toronto. Bioinformatics 2009, 25(24):3327–3329. 10.1093/bioinformatics/btp595
2. Merico D, Gfeller D, Bader GD: How to visually interpret biological data using networks. Nat Biotechnol 2009, 27(10):921–924. 10.1038/nbt.1567
3. Boutros PC, Okey AB: Unsupervised pattern recognition: an introduction to the whys and wherefores of clustering microarray data. Brief Bioinform 2005, 6(4):331–343. 10.1093/bib/6.4.331
4. Verhaak RG, Sanders MA, Bijl MA, Delwel R, Horsman S, Moorhouse MJ, van der Spek PJ, Lowenberg B, Valk PJ: HeatMapper: powerful combined visualization of gene expression profile correlations, genotypes, phenotypes and sample characteristics. BMC Bioinfo 2006, 7: 337. 10.1186/1471-2105-7-337
5. Oliveros J: Venny. An interactive tool for comparing lists with Venn Diagrams. [http://bioinfogp.cnb.csic.es/tools/venny/index.html]
6. Hulsen T, de Vlieg J, Alkema W: BioVenn - a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics 2008, 9: 488. 10.1186/1471-2164-9-488
7. Kim B, Lee B, Seo J: Visualizing Set Concordance with Permutation Matrices and Fan Diagrams. Interact Comput 2007, 19(5):630–643. 10.1016/j.intcom.2007.05.004
8. Kestler HA, Muller A, Kraus JM, Buchholz M, Gress TM, Liu H, Kane DW, Zeeberg BR, Weinstein JN: VennMaster: area-proportional Euler diagrams for functional GO analysis of microarrays. BMC Bioinfo 2008, 9: 67. 10.1186/1471-2105-9-67
9. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al.: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004, 5(10):R80. 10.1186/gb-2004-5-10-r80
10. Chow S, Rodgers P: Constructing Area-Proportional Venn and Euler Diagrams with Three Circles. In Euler Diagrams 2005: 2005. Paris, France; 2005.
11. Edwards A: Seven-set Venn diagrams with rotational and polar symmetry. Combinatorics, Probability and Computing 1998, 7(2):149–152. 10.1017/S0963548397003143
12. Schwenk A: Venn diagram for five sets. Mathematics Magazine 1984, 57(5):297–298. 10.2307/2689606
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.