Combing the hairball with BioFabric: a new approach for visualization of large networks
© Longabaugh; licensee BioMed Central Ltd. 2012
Received: 28 July 2012
Accepted: 17 October 2012
Published: 27 October 2012
The analysis of large, complex networks is an important aspect of ongoing biological research. Yet there is a need for entirely new, scalable approaches for network visualization that can provide more insight into the structure and function of these complex networks.
To address this need, we have developed a software tool named BioFabric, which uses a novel network visualization technique that depicts nodes as one-dimensional horizontal lines arranged in unique rows. This is in distinct contrast to the traditional approach that represents nodes as discrete symbols that behave essentially as zero-dimensional points. BioFabric then depicts each edge in the network using a vertical line assigned to its own unique column, which spans between the source and target rows, i.e. nodes. This method of displaying the network allows a full-scale view to be organized in a rational fashion; interesting network structures, such as sets of nodes with similar connectivity, can be quickly scanned and visually identified in the full network view, even in networks with well over 100,000 edges. This approach means that the network is being represented as a fundamentally linear, sequential entity, where the horizontal scroll bar provides the basic navigation tool for browsing the entire network.
BioFabric provides a novel and powerful way of looking at any size of network, including very large networks, using horizontal lines to represent nodes and vertical lines to represent edges. It is freely available as an open-source Java application.
Keywords: Visualization; Networks; Open-source; Graph layout
Traditional network visualization
Despite the increasing importance of analyzing and understanding very large networks of data, the traditional way of visualizing networks has difficulties scaling up, and typically ends up depicting these large networks as “hairballs”. This traditional approach does indeed have a deeply intuitive foundation: nodes are depicted with a shape such as a circle or square, which are then connected by lines or curves that represent the edges. However, although there are many different ways to apply this basic underlying idea, it needs to be revisited in light of current and emerging needs for understanding increasingly complex networks.
The traditional way of depicting networks has the following characteristics:
Though nodes are typically depicted as small two-dimensional glyphs, they are, in essence, zero-dimensional points positioned in two-dimensional space.
Edges are shown as lines or curves, i.e. essentially one-dimensional objects, positioned in the same shared two-dimensional space.
When there are many edges to or from a node, they are all converging on a single zero-dimensional point. Furthermore, since node locations are not constrained, overlapping zones of edge convergence result in unavoidable ambiguity, as do edges that may intersect intervening nodes between the two true endpoints.
Since edges are arbitrarily positioned, they can easily overlap each other, and invariably create a huge number of arbitrary, meaningless intersections that can completely obscure the paths of individual links.
The addition of each new edge to the network degrades the existing presentation, as the edge will typically overlap existing network features. This property means the traditional approach is inherently unscalable.
BioFabric visualization technique
The BioFabric approach has the following characteristics:
The key feature is that nodes are represented as one-dimensional horizontal line segments, one per row.
Edges are represented as one-dimensional vertical line segments, one per column, terminating at the two rows associated with the endpoint nodes.
Both ends of a link are represented as a tiny square. This provides sufficient contrast to make the ends of the link stand out even at large scales. In the case of directed edges, the appropriate end is tagged with an arrowhead.
Edges are unambiguously represented and never overlap. In networks that have multiple edges between the same nodes, i.e. representing different types of relationships, all edges show up clearly.
As nodes are represented as horizontal lines, there is no requirement that all edges converge upon a single point, allowing for complete flexibility in where a link is drawn. Links can originate, and terminate, anywhere along the length of the node segment. This flexibility introduces the powerful ability to create sets of links that share some semantic property and depict them as discrete groups arranged horizontally in the visualization.
The addition of a new edge just increases the width of the visualization, and does not degrade the existing presentation in any fashion. And increased width can be thought of as simply adding pages to a book; the network is being represented as a fundamentally linear, sequential entity, where the horizontal scroll bar provides the basic navigation tool for addressing the entire network.
Edges are drawn darker than nodes; this has the effect of emphasizing the links and making them appear to float in front of the nodes. So despite the existence of a vast number of orthogonal intersections, links and nodes are unambiguous.
The visualization technique produces a distinct edge wedge for each node, created by the close-set juxtaposition of the parallel links. The wedge provides clear visual cues about how the node is connected, and how it compares to other similar nodes.
A set of 32 colors is used, not randomly, but in a repeating cycle to render node and edge segments. Colors are not used to apply semantic meaning to network elements, but are crucial for providing a framework that allows the user to visually trace features over long distances. Also, the use of cycling ensures that antialiased rendering will produce larger-scale color patterns that provide useful visual cues even when individual links cannot be discerned.
Note that the traditional technique overloads the two-dimensional plane, using the same space to represent both nodes and edges. BioFabric effectively segregates the plane into two separate one-dimensional spaces, and assigns each space to either nodes or edges; the imposition of orthogonality and the use of judicious rendering allow the user to visually distinguish the two. Thus, BioFabric can provide additional clarity of the network structure while using the same underlying two-dimensional resource.
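The row and column scheme described above can be illustrated with a small text-mode sketch. This is not BioFabric's Java2D renderer; the function name `render_fabric` and the choice to draw each node line from its first to its last incident edge column are assumptions made for illustration only.

```python
def render_fabric(node_order, edge_order):
    """Return a text sketch with one row per node and one column per edge.

    node_order: list of node names, top row first.
    edge_order: list of (node_a, node_b) pairs, leftmost column first.
    """
    row_of = {name: r for r, name in enumerate(node_order)}
    n_cols = len(edge_order)
    # Start with an empty grid of spaces.
    grid = [[" "] * n_cols for _ in node_order]

    # Assumption: a node's horizontal line spans from its first to its
    # last incident edge column.
    for name, r in row_of.items():
        cols = [c for c, (a, b) in enumerate(edge_order) if name in (a, b)]
        if cols:
            for c in range(min(cols), max(cols) + 1):
                grid[r][c] = "-"

    # Each edge is a vertical line in its own column between its two rows.
    for c, (a, b) in enumerate(edge_order):
        top, bottom = sorted((row_of[a], row_of[b]))
        for r in range(top, bottom + 1):
            grid[r][c] = "|"
        grid[top][c] = grid[bottom][c] = "o"  # endpoint glyphs

    width = max(len(name) for name in node_order)
    return "\n".join(
        f"{name:<{width}} {''.join(row)}".rstrip()
        for name, row in zip(node_order, grid)
    )


print(render_fabric(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")]))
```

In the printed triangle network, the edge between A and C passes through row B as a plain `|` with no endpoint glyph, so it cannot be mistaken for an edge incident on B.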
Using lines to depict nodes has appeared previously in the literature. McAllister  used the technique to illustrate an algorithm for the linear arrangement problem (LAP), which finds an ordering of nodes arranged along a line that minimizes the sum of the edge lengths in the graph. In this instance, it is a natural representation that allows the edges to be clearly shown despite the one-dimensional nature of the problem. Another common use where nodes have a linear representation is in Unified Modeling Language (UML) sequence diagrams, where objects have an associated vertical lifeline. However, in that context, the lines are specifically being used to represent the objects over the passage of time.
Contrast to adjacency matrix
It is also useful to contrast BioFabric with another common method of visually representing a network: an adjacency matrix. For a network of n nodes, the matrix is laid out as an n x n grid of points, symbols, or cells. In general, each node m is assigned to both row m and column m. Each edge in the network between node r and node c is then depicted with a symbol in row r and column c. Though this approach has the powerful advantage of being unambiguous, it still suffers from some critical shortcomings:
The area of the representation increases as n².
Many large networks are sparse; a network with 10⁴ nodes has over 10⁸ possible edges, and thus 10⁵ edges would only have one edge cell filled for every thousand available spots. The depiction of the network is mostly empty space.
The representation of edges as essentially zero-dimensional points gives them much less visual impact than one-dimensional lines, yet the edges in a network are arguably the essential aspect that needs to be conveyed to the viewer.
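The sparsity figure quoted in the list above can be checked with a few lines of arithmetic. The helper name `matrix_fill_fraction` is hypothetical, and the count follows the text's simplified accounting of one cell per edge in an n x n grid.

```python
def matrix_fill_fraction(n_nodes, n_edges):
    """Fraction of adjacency-matrix cells occupied, counting one cell per edge."""
    cells = n_nodes ** 2  # the grid area grows as n squared
    return n_edges / cells


n_nodes, n_edges = 10**4, 10**5
cells = n_nodes ** 2
fraction = matrix_fill_fraction(n_nodes, n_edges)
print(cells)             # number of cells in the grid
print(round(1 / fraction))  # available cells per filled cell
```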
Contrast to power graph analysis
Various other techniques have been employed to try to handle the scalability problem; one such technique is Power Graph Analysis. The method explicitly identifies recurring network motifs (e.g. cliques) and uses simplified graphical representations for these structures that implicitly represent a large number of edges without needing to render them. This clever method can result in a significant edge reduction, but still has the same limitations as the traditional method for the remaining edges that still need to be drawn. Note that BioFabric can use some of these same simplifications, such as symbolic representations of cliques, as well. One planned future enhancement for the software will allow cliques to be represented compactly as multiple endpoint glyphs on a single vertical segment. Variations on this technique could also be used to depict hyperedges.
Platform and libraries
BioFabric was quickly built using the pre-existing Java code base that has been developed for BioTapestry [9, 10], a Java application for modelling and visualizing genetic regulatory networks. Thus, it uses many of the same core Java libraries that BioTapestry is built upon: Java Swing, Java2D, and Java ImageIO.
The Java2D library proved to be an excellent platform for BioFabric development, particularly due to its antialiasing support. This is important because the BioFabric approach is prone to aliasing artifacts: it involves rendering many very tightly spaced parallel lines, which are being drawn with a repeated cycle of colors. In fact, with large networks and full-network zoom levels, there are multiple lines (e.g. tens, hundreds, or more) being rendered through each pixel. Yet it was not necessary to spend any development time working on specialized low-level, resolution-dependent pixel coloring code to handle this; the standard Java2D draw() method was sufficient, in combination with setting the corresponding Java2D RenderingHint to VALUE_ANTIALIAS_ON. The only caveat that has cropped up so far is a requirement to use Java 1.6 on Apple Macs to get the desired network display. With Java 1.5 on the Mac, the BioFabric networks appear too light compared to all other platforms (e.g. Windows and Linux), yet this problem disappears using Java 1.6.
BioFabric is intended to provide useful visualization of a network with 10⁵ or even 10⁶ edges. In order to keep rendering times down for the large-scale zoom levels, BioFabric starts rendering the network to image buffers in memory as soon as the network is loaded from a file. With the exception of the single top-level zoom image, a grid of image tiles is used to render each zoom value above the level where the program can get adequate performance using immediate mode rendering. After the first two zoom levels are cached, the file load is completed and control passes to the user. From then on, subsequent user pans and zooms are handled using tiles from the image cache. If a needed tile has not yet been generated, a low-resolution tile is created immediately from an available large-scale existing image tile, while the needed final high-resolution tile is queued up for creation on a background thread. Those results are then swapped in as they become available. This approach allows the program to remain responsive even when dealing with large numbers of links and edges, yet the user experience is familiar to users of online resources such as Google Maps.
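The caching strategy described above might be sketched as follows. This is an illustrative Python outline, not BioFabric's Java implementation: the class `TileCache`, the `render_tile` callback, and the assumption that each zoom level doubles the tile grid (so the parent of tile (x, y) is (x // 2, y // 2)) are all hypothetical. Zoom level 0 is taken here to be the full-network view.

```python
from concurrent.futures import ThreadPoolExecutor


class TileCache:
    """Serve finished tiles, or a stand-in built from a coarser tile."""

    def __init__(self, render_tile, workers=1):
        self._render = render_tile  # hypothetical callback: (zoom, x, y) -> image
        self._tiles = {}            # finished tiles keyed by (zoom, x, y)
        self._pending = {}          # tiles queued on the background pool
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def preload(self, zoom, x, y):
        """Render synchronously, as done for the first zoom levels at load time."""
        self._tiles[(zoom, x, y)] = self._render(zoom, x, y)

    def get(self, zoom, x, y):
        """Return (image, is_final); never blocks on rendering."""
        key = (zoom, x, y)
        future = self._pending.get(key)
        if future is not None and future.done():
            # Swap in the background result now that it is available.
            self._tiles[key] = future.result()
            del self._pending[key]
        if key in self._tiles:
            return self._tiles[key], True
        if future is None:
            # Queue the high-resolution tile on a background thread.
            self._pending[key] = self._pool.submit(self._render, zoom, x, y)
        return self._placeholder(zoom, x, y), False

    def _placeholder(self, zoom, x, y):
        """Walk up to coarser zoom levels until a cached tile is found."""
        while zoom > 0:
            zoom, x, y = zoom - 1, x // 2, y // 2
            if (zoom, x, y) in self._tiles:
                return ("upscaled", self._tiles[(zoom, x, y)])
        return None

    def wait_all(self):
        """Block until every queued tile has been rendered."""
        for future in list(self._pending.values()):
            future.result()

    def close(self):
        self._pool.shutdown(wait=True)
```

A caller would request a tile on every repaint: the first request returns the upscaled stand-in immediately, and a later request returns the final tile once the background render has completed.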
“Shadow Links” can improve the user’s understanding of the network
BioFabric has two different modes for rendering network edges. In the standard mode, each edge appears only once in the network. This has the advantage of being clean and compact, as well as consistent with the traditional way that networks are depicted: one line is drawn per edge. However, the addition of a shadow link mode provides a powerful alternative visualization technique.
Link grouping is a BioFabric feature that leverages both the wide flexibility for assigning columns to network edges, as well as the advantage of edge wedges for highlighting differences in node connectivity. If the user has assigned unique suffix tags to the link relation descriptors that partition the edges into distinct sets, BioFabric can use these tags to order and lay out the edges incident on each node according to this scheme. As Case Study III will illustrate below, this allows the user to unambiguously and directly compare how the connectivity of a node, or a set of nodes, varies across multiple networks.
A network layout for BioFabric is very simple, and just consists of: 1) the linear ordering of the n nodes, assigned to rows 1 to n, and 2) the linear ordering of the e edges, assigned to columns 1 to e. But this simple framework still provides a variety of different, powerful ways to organize a complex network.
The default layout was designed to provide a fast technique for organizing the network in an understandable and useful fashion. It is simply a breadth-first traversal of the network starting from the most connected node, where the neighboring nodes are visited in the order determined by their degree. The network shown previously in Figure 1 has been laid out using this technique. Some general principles are:
The algorithm works in two passes, where the node rows are assigned first, followed by the edge columns.
All edges are treated as undirected, even with directed networks.
Duplicate edges (i.e. with identical endpoints but different relation labels) are ignored when calculating node degree.
Ties are broken using lexicographic ordering of node names.
For the base case (no shadow links or link groups) the algorithm proceeds as follows.
Node row assignment:
1. Set row 1 as the next available row.
2. Find the highest degree node not yet processed, and assign it to the next available row. Make that row the current row; increment the next available row.
3. Take the node assigned to the current row and order its neighbors based upon their degree, highest degree first.
4. Traversing the neighbor nodes using that order, if the node has not yet been assigned, assign it to the next available row and increment the next available row.
5. Increment the current row. If a node has been assigned to that row, go to step 3. If not, go to step 2.
Edge column assignment:
1. Set column 1 as the next available column. Make row 1 the current row c.
2. For current row c, get all the unassigned edges for the node in that row. Note that since we are not dealing with shadow links, all unassigned edges must connect to rows ≥ c.
3. For each row r ≥ c, create a set S of edges incident on c and r. Order these sets by increasing row number r, so that edges will be assigned in order of increasing length.
4. Iterating through the ordered list of sets, for each set S, order those edges in S based on lexicographic ordering of the link relation description, and assign them to the next available columns in this order; increment next available column appropriately. If there is a pair of directed edges with the same link relation description, downward links are assigned before upward links.
5. Increment the current row, and go to step 2.
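The two passes of the default layout can be sketched in Python as shown here. This is a reading of the steps listed above rather than BioFabric's own code: the function name `default_layout` and the (source, relation, target) triple format are assumptions, and the edge pass is expressed as a single sort whose key reproduces the per-row ordering instead of an explicit loop over rows. Self-loops are not given any special treatment.

```python
from collections import defaultdict


def default_layout(edges):
    """Return (node_rows, edge_columns) for a list of (source, relation, target).

    node_rows maps each node to a 1-based row; edge_columns lists the edges
    in column order (list index i is column i + 1).
    """
    # Neighbor sets treat edges as undirected and ignore duplicate edges.
    neighbors = defaultdict(set)
    for src, _rel, tgt in edges:
        neighbors[src].add(tgt)
        neighbors[tgt].add(src)

    def rank(node):
        # Highest degree first; ties broken by lexicographic node name.
        return (-len(neighbors[node]), node)

    # Pass 1: breadth-first assignment of nodes to rows.
    order, placed, current = [], set(), 0
    for start in sorted(neighbors, key=rank):
        if start in placed:
            continue
        order.append(start)  # highest-degree unprocessed node
        placed.add(start)
        while current < len(order):
            for nb in sorted(neighbors[order[current]], key=rank):
                if nb not in placed:
                    order.append(nb)
                    placed.add(nb)
            current += 1
    node_rows = {node: i + 1 for i, node in enumerate(order)}

    # Pass 2: order edges by upper row, then lower row (shorter edges first),
    # then relation name, with downward directed links before upward ones.
    def column_key(edge):
        src, rel, tgt = edge
        top, bottom = sorted((node_rows[src], node_rows[tgt]))
        upward = node_rows[src] > node_rows[tgt]
        return (top, bottom, rel, upward)

    return node_rows, sorted(edges, key=column_key)


rows, columns = default_layout([
    ("D", "pp", "E"), ("B", "pp", "C"), ("C", "pp", "A"),
    ("A", "pp", "B"), ("D", "pp", "A"),
])
```

In this small example node A has the highest degree and is placed in row 1, its neighbors B, C, and D follow in rows 2 to 4, and E is reached last through D.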
For a network of n nodes and e edges, the algorithm first tags each edge with a coefficient that represents the similarity between the connectivities of the two endpoint nodes. Two methods are available: cosine similarity or Jaccard similarity. Note that in both cases, directed edges are treated as undirected, so the similarity coefficients are symmetric.
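As a minimal sketch of the two coefficients, assume that the connectivity of a node is simply its set of neighbors, treated as a binary vector. Whether a node counts as part of its own neighborhood is not stated in the text, so it is left out here; the function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def jaccard(a, b):
    """Jaccard similarity of two neighbor sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(a, b):
    """Cosine similarity of two neighbor sets seen as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def tag_edges(edges, measure=jaccard):
    """Map each (source, target) edge to the similarity of its endpoints."""
    neighbors = defaultdict(set)
    for src, tgt in edges:
        # Directed edges are treated as undirected.
        neighbors[src].add(tgt)
        neighbors[tgt].add(src)
    return {(s, t): measure(neighbors[s], neighbors[t]) for s, t in edges}
```

Both measures depend only on the sizes of the two sets and of their intersection, which is why swapping the endpoints gives the same coefficient.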
Nodes are brought into the set of placed nodes P one at a time, only considering nodes from the front F, which is the subset of nodes in the set of unplaced nodes U that have at least one edge to a node in P. A simple approach would be to select a node from F with the highest similarity coefficient of all the edges from P to F. But if the algorithm is in the process of “mining” a region of the network that is richly interconnected, the simple approach would tend to easily abandon this growing chain of similarly connected nodes if a slightly higher similarity coefficient appears anywhere else along the front. To create longer runs of similar nodes, it is preferable to make the algorithm “sticky”.
To achieve this, the algorithm maintains an ordered chain of the r most recently used nodes, as well as a threshold fraction 0.0 ≤ t ≤ 1.0; both these values r and t are user-configurable. If the highest coefficient S_b to the front is assigned to a link from node A, but there is a coefficient S_d assigned to an edge from node C in the chain to the front, such that S_d > S_b × t, the node connected to C would be added to the placed set P. Otherwise, if the node in the front connected to A wins and is placed in P, the algorithm empties the current chain. Regardless, the connected node (A or C) in P is either added in the first slot, or (if C) moved up to the first slot of the chain, and the newly added node is inserted into the second slot in the chain, pushing all other elements back. If the new addition causes the chain to exceed the maximum size, the least recently accessed node is removed from the end of the chain.
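The sticky selection rule might be sketched as two small functions. This is one interpretation of the description above, not the program's actual code: the names `choose_next` and `update_chain`, the tuple format for candidate edges, and the tie-breaking behavior (the first maximum in input order wins) are assumptions.

```python
def choose_next(front_edges, chain, t):
    """Pick the next front node to place.

    front_edges: list of (placed_node, front_node, similarity) candidates.
    chain: recently used placed nodes, most recent first.
    t: threshold fraction between 0.0 and 1.0.
    Returns (placed_node, front_node, from_chain).
    """
    best = max(front_edges, key=lambda e: e[2])  # coefficient S_b
    chain_edges = [e for e in front_edges if e[0] in chain]
    if chain_edges:
        sticky = max(chain_edges, key=lambda e: e[2])  # coefficient S_d
        if sticky[2] > best[2] * t:
            return sticky[0], sticky[1], True
    return best[0], best[1], False


def update_chain(chain, anchor, new_node, from_chain, r):
    """Return the chain after placing new_node via the placed node anchor."""
    # The chain is emptied when the winner did not come from the chain.
    rest = [n for n in chain if n != anchor] if from_chain else []
    # Anchor takes the first slot, the new node the second; trim to size r.
    return ([anchor, new_node] + rest)[:r]


candidates = [("A", "X", 0.9), ("C", "Y", 0.8)]
print(choose_next(candidates, ["C", "B"], t=0.8))
print(choose_next(candidates, ["C", "B"], t=0.95))
```

With t = 0.8 the chain node C keeps the run going even though A offers a slightly higher coefficient; with t = 0.95 the higher coefficient from A wins and the chain is restarted.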
Interoperation with other software tools
Cytoscape  is a powerful and popular platform for analyzing networks, and the platform supports an extensive ecosystem of users and plug-in developers, so it is highly desirable to be able to leverage this platform. The Gaggle  is a software system that allows users to exchange data between heterogeneous, independent software tools, and the CyGoose plug-in allows Cytoscape to work with Gaggle. Since BioFabric is a tool that supports a unique way of visualizing, navigating, and exploring networks, but is not a tool for supporting computational analysis, it has been Gaggle-enabled to allow it to work with, and leverage the strengths of, these other analysis tools. Using Gaggle, networks and selections can be exchanged between BioFabric and other Gaggle-aware tools running on the user’s desktop. To support this, a Gaggle-aware version of BioFabric can be launched from the BioFabric web site using Java Web Start.
Results and discussion
The following four case studies highlight the advantages of using BioFabric to explore large networks. Some of these advantages are:
The ability to use a single, coherent, rational, unambiguous layout of an entire large network as a basis for navigating and exploring that network.
A means of quickly assessing the connectivity of nodes through the depicted edge wedges.
A superior way of unambiguously depicting the edge relationships in clustered networks.
A way of visually identifying differences in network connectivity between multiple conditions through the use of link grouping and the connectivity layout.
The ability to identify interesting network structures and properties at large scales through simple inspection.
Networks need to first be imported into BioFabric
The current incarnation of BioFabric is designed to be a network viewer, not an editor, and thus networks need to be first imported either as a Cytoscape tab-delimited .sif file, or using the Gaggle network import method described above. In order to retain the final chosen layout and display options, the network can then be saved and reloaded as a BioFabric .bif file, which is an XML-based format.
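For readers unfamiliar with the format, a tab-delimited .sif file lists one source node, a relation type, and one or more target nodes per line, with a lone node name denoting an unconnected node. The sketch below parses that general Cytoscape convention; it is not BioFabric's importer, the function name `parse_sif` is hypothetical, and which variants of the format BioFabric itself accepts is not specified here.

```python
def parse_sif(text):
    """Parse tab-delimited SIF text into (nodes, edges).

    edges is a list of (source, relation, target) triples in file order.
    """
    nodes, edges = set(), []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = [field.strip() for field in line.split("\t")]
        fields = [field for field in fields if field]
        if not fields:
            continue  # skip blank lines
        if len(fields) == 2:
            raise ValueError(f"line {line_number}: relation without a target")
        source = fields[0]
        nodes.add(source)
        if len(fields) >= 3:
            relation = fields[1]
            # A line may name several targets for the same source and relation.
            for target in fields[2:]:
                nodes.add(target)
                edges.append((source, relation, target))
    return nodes, edges


sif_nodes, sif_edges = parse_sif("A\tpp\tB\nA\tpd\tC\tD\nE\n")
```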
Case study I: Introduction to the BioFabric interface using a large network
Mouse Location: This thin bar is located immediately under the main network view, and reports the node row, link column, and node link zone currently under the mouse. In most cases, the node link zone can be thought of as the node associated with the edge wedge currently under the mouse.
Network Magnifier: This gives a magnified view of the network under the mouse, along with a listing of all the links that display an endpoint glyph in that magnified view. The magnification can be easily varied; at maximum magnification, detailed information about the visible link ends and nodes is shown on the view boundary. The magnifier is manipulated using the displayed key shortcuts, so it can be operated simultaneously alongside the mouse. When desired, the magnifier can be locked, thereby disconnecting it from the mouse, and panned and zoomed independently.
Network Overview: This panel always shows a fixed full-network view, while the current viewport, mouse location, and (possibly locked) magnifier location are shown in context.
Network Tour: This panel drives the network tour feature. The user can select a link endpoint, and then navigate orthogonally through the network. For example, buttons allow the user to jump along the current node row between adjacent link endpoints, or from one end of a link to the other. This tool allows the network features to be explored in a systematic, organized fashion.
Note that Figure 4 demonstrates that even zoomed out to the full network level, some features of the network stand out. For example, there are long, clearly visible stretches of similarly interacting proteins that turn out to be, for example, ribosomal proteins or RNA polymerase proteins.
Find interesting nodes, either by browsing or using the search tool. Select each node either by clicking on the node row, or the node name. If using search, the results are selected already.
Click on the Add First Neighbors to Selection button on the toolbar, which adds the neighboring nodes, as well as the connecting edges, to the current selection.
Click on the Send Selections to Subset View button on the toolbar.
The subset view appears in a separate window, which behaves just like the main window, except that only one level of subset view creation is currently supported.
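The effect of the Add First Neighbors to Selection step can be expressed as a short function. This is only a sketch of the described behavior, with a hypothetical function name and edge format, not the tool's implementation.

```python
def add_first_neighbors(selected, edges):
    """Extend a node selection with its first neighbors and connecting edges.

    selected: iterable of node names.
    edges: list of (source, relation, target) triples.
    Returns (nodes, picked_edges).
    """
    seed = set(selected)
    nodes = set(seed)
    picked_edges = []
    for src, rel, tgt in edges:
        # Test against the original seed so only first neighbors are added.
        if src in seed or tgt in seed:
            nodes.update((src, tgt))
            picked_edges.append((src, rel, tgt))
    return nodes, picked_edges


subset_nodes, subset_edges = add_first_neighbors(
    {"B"}, [("A", "pp", "B"), ("B", "pp", "C"), ("C", "pp", "D")]
)
```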
Case study II: Understanding clustered networks
The traditional network layout method is frequently used to depict the results of applying network clustering algorithms. While the proximity of clustered nodes provides a useful visualization, the edges are typically cluttered, so that the user cannot see the internal edge structure of the cluster, nor can she see where inter-cluster edges terminate. Furthermore, edges just passing through a cluster can be mistaken as representing a non-existent relationship between clusters.
BioFabric’s ability to segregate links into bundles of distinct functionality along the horizontal axis can instead create a clear and unambiguous representation of a clustered network. To illustrate this, we will use a network depicted in Figure 4 of , which presents clustering results for altered genes from The Cancer Genome Atlas (TCGA) data set applied to their underlying functional protein interaction network. A BioFabric version of this network is shown in Figure 8. To create this presentation, the required node and link orderings were generated and then specified in two files, which were imported using the Layout → Layout Using Node Attributes command followed by the Layout → Layout Using Link Attributes command. This is necessary because BioFabric does not yet have a built-in cluster layout algorithm. However, this layout was prepared externally by applying the default layout to each cluster separately, ordering the clusters by the cluster number used in the original analysis , and assigning the remaining inter-cluster edges to the appropriate interstices between each cluster. Two crucial aspects of using BioFabric for presenting clustered networks stand out:
Nodes and internal edges in a cluster can be assigned to contiguous sets of rows and columns, creating clear and concise depictions of each cluster as it stands as an independent sub-network.
The edges connecting clusters are shown as discrete bundles, completely separated from intra-cluster edges, and are assigned to target clusters in a logical, ordered fashion. Edge endpoints are not obscured, allowing any primary inter-cluster hubs in each cluster to appear clearly in the depiction. Additionally, there are no ambiguous inter-cluster edges that can create the false impression that two clusters are directly linked.
Case study III: Visualizing the differences between cancer subtypes
The Center for Systems Analysis of the Cancer Regulome (CSACR) website provides a wealth of TCGA cancer data, such as analyses of significant pairwise feature associations identified via standard statistical tests. These features are heterogeneous, and can include quantities such as gene expression, mutations, copy number variations, and clinical outcomes. By constructing networks of these associations, researchers can study how these heterogeneous features interact in the various cancer types.
One type of cancer studied is glioblastoma multiforme (GBM), of which there are four different subtypes: classical, mesenchymal, neural, and proneural. Separate CSACR pairwise feature association studies have been carried out for these four types, as well as a unified study that combines all four [28–32]. This case study will use these data to demonstrate how BioFabric can graphically compare the differences between a set of networks; i.e. the differences in associations between these GBM subtypes. This example also illustrates how the researcher can visualize and linearly browse a very large network. Of course, the best way to actually find a comprehensive list of these differences at this scale is not to browse this network, but to use computational tools that calculate and compare node degree across the subtypes.
Exploring this CD44 subset model, the edge wedge shapes help to spot differences between the subtypes, and the presence or absence of an association for each of the various subtypes can be quickly scanned left-to-right along any node line. For any association, right-clicking on a link endpoint allows the user to launch a web browser for a user-defined hyperlink that has been previously specified in the Edit Display Options dialog. (Note that this is in contrast to right-clicking on a node line, which launches a web browser for the associated node.)
In this particular example, a right-click launches a web application built on top of the CSACR Regulome Explorer data portal  that queries the TCGA database and displays scatterplots of the underlying data for the five different analyses. This particular association shown in the figure, between the gene expression levels of CD44 and MSN, actually only appears in the network for the classical and unified analyses; inspecting and comparing the different scatterplots provides insights into why this is the case.
Case study IV: Full-network shapes with the default layout
BioFabric lays out node rows and edge columns using a fixed, square grid. This feature means that the slopes of the upper and lower boundaries also provide visual clues about network structure. In particular, when the lower boundary is at a 45-degree angle, each newly added edge is adding one new node. But where the slope is zero degrees, each newly added edge is incident on a previously visited node. Thus, network B, which has the same number of nodes as edges, has a lower boundary slope that is unsurprisingly approaching the 45-degree limit.
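The relationship between boundary slope and node discovery can be made concrete with a small calculation. In this sketch, which uses hypothetical function names, each edge is given as the pair of rows it connects, listed in column order; the lower boundary at a column is taken to be the deepest row reached so far, and a slope of 1 row per column corresponds to 45 degrees on a square grid.

```python
def lower_boundary(edge_rows):
    """Deepest row reached at each column, for edges given in column order."""
    boundary, deepest = [], 0
    for row_a, row_b in edge_rows:
        deepest = max(deepest, row_a, row_b)
        boundary.append(deepest)
    return boundary


def mean_slope(boundary):
    """Average number of rows gained per column along the boundary."""
    if len(boundary) < 2:
        return 0.0
    return (boundary[-1] - boundary[0]) / (len(boundary) - 1)


# A star: every edge from the hub in row 1 reaches a new node.
star = lower_boundary([(1, 2), (1, 3), (1, 4)])
# A triangle: the last edge connects two nodes that were already visited.
triangle = lower_boundary([(1, 2), (1, 3), (2, 3)])
```

The star yields a slope of 1.0 (the 45-degree case), while the triangle flattens at its final column because that edge introduces no new node.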
Current limitations of BioFabric
BioFabric’s pervasive use of its fundamental underlying abstraction of nodes and edges as simple orthogonal lines has a significant advantage in being able to consistently represent a network at all scales. However, this approach does result in a very simple, abstract representation of the network, and so it currently lacks the expressive power that is available through the traditional method of representing networks when used on networks of medium size or smaller. For example, one area where these limitations are apparent is the representation of signalling and metabolic pathways, where rich symbol libraries for nodes and edges can succinctly convey significant amounts of information. The flexibility afforded by the traditional technique also means that important features such as information flow and paths (including parallel paths and cycles) can be given particular emphasis for clarity, so such features can be more difficult to identify in a BioFabric presentation.
Perhaps some or all of these limitations can be addressed through further extensions to BioFabric, including the additional development of new layout techniques and tools for interactively investigating and illustrating network structures such as paths. These limitations can also be sidestepped if BioFabric’s presentation technique were more tightly integrated as a complement to traditional techniques. Allowing the researcher to toggle between traditional and BioFabric visualizations inside a single tool such as Cytoscape could do this, for example.
Much work remains to be done to leverage the new visualization technique introduced by BioFabric, including improvements to the usability, scalability, and feature set of the first-generation implementation. Some particular directions to pursue include:
Introducing compact representations of network motifs such as cliques.
Investigating new layout algorithms, perhaps applying existing heuristic algorithms for the linear arrangement problem, bandwidth reduction, and profile reduction.
Extending the representation of nodes as lines in two dimensions into representing them as planes in three dimensions.
Implementing navigational features, such as bookmarks, that leverage BioFabric’s presentation of a network as an extended sequential representation.
Implementing metanodes to allow BioFabric to support more complex network models.
Providing additional layout methods and interactive tools to help the researcher better visualize network features such as paths (including parallel paths and cycles). Improving the network magnifier to give a more visual (as opposed to textual) sense of first neighbors will also help to provide a more intuitive sense of connectivity.
Porting the technique into browser-based technologies such as HTML5 Canvas may prove challenging given the demanding graphics requirements, but will allow the method to be used by the emerging class of purely browser-based web applications.
Finally, since the advantages of BioFabric can be complementary to the advantages provided by traditional network presentation techniques, a combination of the two should provide the most expressive power. The new Cytoscape version 3.0 is designed to support alternate renderers, and this provides an avenue for creating such a combined tool. It would also be fruitful to investigate how one could seamlessly move back and forth between the two types of representations.
BioFabric is a new network visualization software application that challenges the traditional underlying concept of how network nodes and edges are represented in two-dimensional space. In doing so, it gives researchers a powerful tool that provides an organized, comprehensible, scalable way of visualizing large and complex networks.
Availability and requirements
Project Name: BioFabric
Project Home Page: http://www.BioFabric.org/index.html
Programming Language: Java
Other Requirements: Minimum requirement is Java 5, although code outside of the Gaggle subsystem can be compiled using Java 1.4 if desired. The large network presented in Case Study III required the Java heap allocation to be set to 4 gigabytes to import and lay out, with the corresponding appropriate hardware. On Mac OS X, Java 6 is required to render the networks with the desired brightness.
License: LGPL V 2.1. Some of the toolbar image files are freely distributed under a separate license from Sun Microsystems, now Oracle. The launch4j wrapper  used to create the Windows executable is licensed under the BSD and MIT licenses. The author of the code forming the basis for browser launching  has declared it to be public domain. Per the LGPL license, the source code for Version 1.0.0 is provided in Additional file 6.
Any restrictions to use by non-academics: None
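As an illustration of the heap requirement noted above, the following is a minimal sketch of how one might confirm the heap available to a Java virtual machine before attempting to import a very large network. The class name HeapCheck and its method are hypothetical and are not part of BioFabric; the sketch relies only on the standard Runtime.maxMemory() call and assumes the heap limit is raised with the standard -Xmx option (for example, -Xmx4g).

```java
// Hypothetical helper, not part of BioFabric: reports whether the JVM heap
// limit meets a required size, such as the 4 gigabytes used for Case Study III.
public class HeapCheck {

    /** Number of bytes in one gigabyte (2^30). */
    public static final long GIB = 1024L * 1024L * 1024L;

    /** Returns true if maxHeapBytes is at least requiredGiB gigabytes. */
    public static boolean isSufficient(long maxHeapBytes, int requiredGiB) {
        return maxHeapBytes >= requiredGiB * GIB;
    }

    public static void main(String[] args) {
        // maxMemory() returns the most memory the JVM will attempt to use;
        // on some JVMs it can be slightly below the -Xmx value, so treat
        // the result as a rough guide rather than an exact figure.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Maximum heap: " + (maxHeap / (1024L * 1024L)) + " MiB");
        if (!isSufficient(maxHeap, 4)) {
            System.out.println("Heap is below 4 GiB; consider relaunching with -Xmx4g.");
        }
    }
}
```

The check could be run as `java -Xmx4g HeapCheck` after compiling; on a machine without enough physical memory, a 4-gigabyte heap setting alone would not be expected to give acceptable performance.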
Acknowledgements
The author was supported by National Institute of General Medical Sciences grant GM061005, and award number U24CA143835 from the National Cancer Institute. This content is solely the responsibility of the author and does not necessarily represent the official views of the National Institute of General Medical Sciences, National Cancer Institute, or the National Institutes of Health.
Thanks to Guanming Wu for providing the network analysis results used for Case Study II, and to Hamid Bolouri for the apt characterization of BioFabric used in the title of this article. Thanks also to Ilya Shmulevich, Hamid Bolouri, Hector Rovira, and Brady Bernard for reviewing and commenting on the manuscript.
References
- Lima M: Visual Complexity: Mapping Patterns of Information. New York: Princeton Architectural Press; 2011.
- Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003, 13(11):2498–2504. 10.1101/gr.1239303
- Download Cytoscape. http://www.cytoscape.org/download.html
- von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P: Comparative assessment of large-scale data sets of protein-protein interactions. Nature 2002, 417: 399–403.
- Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CR, Thompson CM, Simon I, Zeitlinger J, Jennings EG, Murray HL, Gordon DB, Ren B, Wyrick JJ, Tagne J, Volkert TL, Fraenkel E, Gifford DK, Young RA: Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 2002, 298: 799–804. 10.1126/science.1075090
- McAllister AJ: A new heuristic algorithm for the linear arrangement problem, Technical Report 99_126a. University of New Brunswick: Faculty of Computer Science; 1999.
- Rumbaugh J, Jacobson I, Booch G: The unified modeling language reference manual. Reading, MA: Addison-Wesley; 1999.
- Royer L, Reimann M, Andreopoulos B, Schroeder M: Unraveling protein networks with power graph analysis. PLoS Comput Biol 2008, 4(7):e1000108. 10.1371/journal.pcbi.1000108
- Longabaugh WJR, Davidson EH, Bolouri H: Computational representation of developmental genetic regulatory networks. Dev Biol 2005, 283: 1–16. 10.1016/j.ydbio.2005.04.023
- Longabaugh WJR, Davidson EH, Bolouri H: Visualization, documentation, analysis, and communication of large-scale gene regulatory networks. Biochim Biophys Acta 2009, 1789(4):363–374. 10.1016/j.bbagrm.2008.07.014
- Google Maps. http://maps.google.com/
- Cosine similarity - Wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Cosine_similarity
- Jaccard index - Wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Jaccard_index
- Shannon PT, Reiss DJ, Bonneau R, Baliga NS: The gaggle: an open-source software system for integrating bioinformatics software and data sources. BMC Bioinforma 2006, 7: 176. 10.1186/1471-2105-7-176
- Garrow A, Adeleye Y, Warner G: Data_Sets – Cytoscape Wiki. 2007. http://wiki.cytoscape.org/Data_Sets/
- Kerrien S, Aranda B, Breuza L, Bridge A, Broackes-Carter F, Chen C, Duesbury M, Dumousseau M, Feuermann M, Hinz U, Jandrasits C, Jimenez RC, Khadake J, Mahadevan U, Masson P, Pedruzzi I, Pfeiffenberger E, Porras P, Raghunath A, Roechert B, Orchard S, Hermjakob H: The IntAct molecular interaction database in 2012. Nucleic Acids Res 2011, 40(D1):D841-D846.
- Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D: The database of interacting proteins: 2004 update. Nucleic Acids Res 2004, 32(Database issue):D449-D451.
- Bader GD, Betel D, Hogue CW: BIND: the biomolecular interaction network database. Nucleic Acids Res 2003, 31(1):248–250. 10.1093/nar/gkg056
- Peri S, Navarro JD, Amanchy R, Kristiansen TZ, Jonnalagadda CK, Surendranath V, Niranjan V, Muthusamy B, Gandhi TK, Gronborg M, Ibarrola N, Deshpande N, Shanker K, Shivashankar HN, Rashmi BP, Ramya MA, Zhao Z, Chandrika KN, Padma N, Harsha HC, Yatish AJ, Kavitha MP, Menezes M, Choudhury DR, Suresh S, Ghosh N, Saravana R, Chandran S, Krishna S, Joy M, et al.: Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res 2003, 13: 2363–2371. 10.1101/gr.1680803
- Mishra G, Suresh M, Kumaran K, Kannabiran N, Suresh S, Bala P, Shivkumar K, Anuradha N, Reddy R, Raghavan TM, Menon S, Hanumanthu G, Gupta M, Upendran S, Gupta S, Mahesh M, Jacob B, Matthew P, Chatterjee P, Arun S, Sharma S, Chandrika KN, Deshpande N, Palvankar K, Raghavnath R, Krishnakanth K, Karathia H, Rekha B, Rashmi NS, Vishnupriya G, et al.: Human protein reference database - 2006 update. Nucleic Acids Res 2006, 34: D411-D414. 10.1093/nar/gkj141
- Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, Klitgord N, Simon C, Boxem M, Milstein S, Rosenberg J, Goldberg DS, Zhang LV, Wong SL, Franklin G, Li S, Albala JS, Lim J, Fraughton C, Llamosas E, Cevik S, Bex C, Lamesch P, Sikorski RS, Vandenhaute J, Zoghbi HY, et al.: Towards a proteome-scale map of the human protein-protein interaction network. Nature 2005, 437(7062):1173–1178. 10.1038/nature04209
- Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksöz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker EE: A human protein-protein interaction network: a resource for annotating the proteome. Cell 2005, 122(6):957–968. 10.1016/j.cell.2005.08.029
- Ramani AK, Bunescu RC, Mooney RJ, Marcotte EM: Consolidating the set of known human protein-protein interactions in preparation for large-scale mapping of the human interactome. Genome Biol 2005, 6(5):R40. 10.1186/gb-2005-6-5-r40
- Wu G, Feng X, Stein L: A human functional protein interaction network and its application to cancer data analysis. Genome Biol 2010, 11: R53. 10.1186/gb-2010-11-5-r53
- Cancer Regulome. http://www.cancerregulome.org/
- McLendon R, Friedman A, Bigner D, Van Meir EG, Brat DJ, Mastrogianakis GM, Olson JJ, Mikkelsen T, Lehman N, Aldape K, Yung WK, Bogler O, Weinstein JN, VandenBerg S, Berger M, Prados M, Muzny D, Morgan M, Scherer S, Sabo A, Nazareth L, Lewis L, Hall O, Zhu Y, Ren Y, Alvi O, Yao J, Hawes A, Jhangiani S, Fowler G, et al.: Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008, 455(7216):1061–1068. 10.1038/nature07385
- Verhaak RG, Hoadley KA, Purdom E, Wang V, Qi Y, Wilkerson MD, Miller CR, Ding L, Golub T, Mesirov JP, Alexe G, Lawrence M, O'Kelly M, Tamayo P, Weir BA, Gabriel S, Winckler W, Gupta S, Jakkula L, Feiler HS, Hodgson JG, James CD, Sarkaria JN, Brennan C, Kahn A, Spellman PT, Wilson RK, Speed TP, Gray JW, Meyerson M, et al.: Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 2010, 17(1):98–110. 10.1016/j.ccr.2009.12.020
- All Pairs Significance Explorer [GBM 06Feb classical]. http://explorer.cancerregulome.org/all_pairs/?dataset=gbm_06feb_class_pw
- All Pairs Significance Explorer [GBM 06Feb mesenchymal]. http://explorer.cancerregulome.org/all_pairs/?dataset=gbm_06feb_mesen_pw
- All Pairs Significance Explorer [GBM 06Feb neural]. http://explorer.cancerregulome.org/all_pairs/?dataset=gbm_06feb_neura_pw
- All Pairs Significance Explorer [GBM 06Feb proneural]. http://explorer.cancerregulome.org/all_pairs/?dataset=gbm_06feb_prone_pw
- All Pairs Significance Explorer [GBM 06Feb all]. http://explorer.cancerregulome.org/all_pairs/?dataset=gbm_06feb_pw
- Van Meir EG, Hadjipanayis CG, Norden AD, Shu HK, Wen PY, Olson JJ: Exciting new advances in neuro-oncology: the avenue to a cure for malignant glioma. CA Cancer J Clin 2010, 60(3):166–193. 10.3322/caac.20069
- Cancer Regulome Software. http://www.cancerregulome.org/software.html
- Csardi G, Nepusz T: The igraph software package for complex network research. Complex Systems: InterJournal; 2006:1695.
- Erdos P, Renyi A: On random graphs. Publicationes Mathematicae 1959, 6: 290–297.
- Barabasi A-L, Albert R: Emergence of scaling in random networks. Science 1999, 286: 509–512. 10.1126/science.286.5439.509
- Dong Y: Cytoscape_3/3D_Renderer. http://wiki.cytoscape.org/Cytoscape_3/3D_Renderer
- Kowal G: Launch4j – Cross-platform Java Executable Wrapper. http://launch4j.sourceforge.net/index.html
- Pilafian D: Bare Bones Browser Launch for Java · Use Default Browser to Open a Web Page from a Swing Application. http://www.centerkey.com/java/browser/
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.