High-throughput technologies have led to an accumulation of large amounts of data that can be used to advance scientific inquiry given the appropriate tools. However, our inability to effectively visualize or conceptualize these data, particularly multidimensional data, is one of the factors preventing their integration into the scientific process. One promising means of using these data is to develop, train, and validate computational models, preferably those with interactive visual interfaces. Advances in computational modeling platforms are beginning to allow simulation of biological systems from the single-cell biochemical level to more abstract multicellular environments, such as representative tissues, organs, or even organisms. These emerging computational tools are poised to put the power of bioinformatics and data interpretation back into the wet-bench biologist's hands by automatically combining data from such datasets with tools for visualization, experimentation, and data analysis.
Many high-throughput technologies collect large amounts of measurement data that are conducive to being stored in databases. For example, a database can easily house multi-scale gene expression data obtained from a single cell to a whole organism while also documenting the source and experimental methods associated with the data. Such repositories are well suited for data consisting of lists of gene and protein abundance, for example. However, new ontologies and formalisms are required for collecting and describing certain kinds of higher-order data. For instance, the outcome of experiments involving shape or morphology can be challenging to describe accurately, particularly in a way that others can search for or interpret computationally. This problem has been particularly challenging in areas of development and regeneration where a description of the organ, appendage, or organism is one of the key reported observations.
The planarian worm is a model organism in regenerative biology that perfectly illustrates the problem of storing shape-based experimental results in a formal database. These free-living flatworms have exceptional regenerative properties that have fascinated biologists for centuries. They are able to regenerate aged, damaged, or lost tissues with the help of a large adult stem cell population. Despite being complex organisms possessing bilateral symmetry, musculature, intestine, and a central nervous system including a true brain [3, 4], fragments smaller than 1/200th of the adult size can remodel and regenerate an intact worm. This astonishing regenerative ability has stimulated an effort to understand its underlying mechanisms, producing an extensive number of experiments based on amputations, drug-induced phenotypes [7, 8], and RNAi gene-knockdowns [9–13]. However, despite these important efforts, we still lack a comprehensive model that can explain more than one or two aspects of planarian regeneration.
Recently, the Levin lab has developed a new tool (Planform) to aid in the assimilation of these data using a graph-based formalism to describe anatomy and morphology along with a new ontology for describing experimental manipulations and observations [15, 16]. The flexible and extensible graph notation allows worm regions and organs to be described as nodes connected by linkages with associated angle and length parameters. Based on this approach, the Planform Database (PlanformDB) was designed and curated to include a complete description of the many planarian experiments and outcomes that exist throughout the literature. Such a resource not only makes it possible for scientists to search and compare worm morphologies, but also provides an extractable resource for bioinformatics applications.
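To make the graph notation concrete, the following Python sketch shows one way a morphology could be held in memory as regions (nodes) joined by links that carry a length and an angle. It is only an illustration under stated assumptions: the class names, field names, units, and the three-region wild-type worm are hypothetical and are not taken from Planform or PlanformDB.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:
    """A worm region or organ (a node in the graph). Names are hypothetical."""
    name: str  # unique identifier within one morphology, e.g. "head"
    kind: str  # anatomical type, e.g. "head", "trunk", "tail"


@dataclass(frozen=True)
class Link:
    """A linkage between two regions with its geometric parameters."""
    source: str
    target: str
    length: float     # arbitrary units (assumption)
    angle_deg: float  # angle relative to the source region (assumption)


@dataclass
class Morphology:
    """A morphology as a graph of regions connected by links."""
    regions: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_region(self, name, kind):
        self.regions[name] = Region(name, kind)

    def connect(self, source, target, length, angle_deg):
        # Refuse links that reference regions not yet in the graph.
        if source not in self.regions or target not in self.regions:
            raise KeyError("both regions must exist before linking")
        self.links.append(Link(source, target, length, angle_deg))

    def neighbors(self, name):
        """Return the sorted names of regions linked to `name`."""
        found = []
        for link in self.links:
            if link.source == name:
                found.append(link.target)
            elif link.target == name:
                found.append(link.source)
        return sorted(found)


def wild_type():
    """Build an illustrative head-trunk-tail worm."""
    worm = Morphology()
    worm.add_region("head", "head")
    worm.add_region("trunk", "trunk")
    worm.add_region("tail", "tail")
    worm.connect("trunk", "head", length=1.0, angle_deg=0.0)
    worm.connect("trunk", "tail", length=1.5, angle_deg=180.0)
    return worm
```

Under this representation, a double-headed phenotype would differ from the wild type only in the kind of one region, which is the sort of difference a graph comparison can quantify.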
We are currently combining Planform, agent-based modeling, and an evolutionary search engine to develop an automated system for searching and validating computational models of regeneration. Agent-based modeling, provided that the platform supports multi-scale or multicellular systems, holds promise for studying the emergent behavior and complex interactions of the signaling networks that direct regeneration. To this end, we are using a modeling platform (CellSim) where the central agents are autonomous cells containing many of the biological primitives necessary for simulating living systems. The current version of this software contains a number of useful features to support this endeavor, including a 3-D interface for visualization and tools for performing experimental manipulations within the client-server architecture. Developing, testing, and validating a complex model by hand can be a daunting task, particularly when many individual experimental outcomes are combined. To simplify this task, we have incorporated an evolutionary search engine that automates it using a genetic algorithm driven by fitness metrics informed by the PlanformDB. Our ultimate goal is for this integrated system to identify computational models that can account for many, if not all, of the available experimental outcomes related to planarian regeneration. We believe that this general approach holds the promise to spur biological discovery, develop novel insights into long-standing problems and biases, and elucidate previously unobserved biological behaviors.
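The role of the genetic algorithm in such a system can be sketched generically in Python. The example below is a minimal, textbook-style loop (truncation selection, one-point crossover, Gaussian mutation, and elitism), not the search engine described here. The function names, the genome encoding as a list of floats, and the toy fitness function are assumptions made purely for illustration. In the system described above, fitness would instead come from comparing simulation output with morphologies stored in the PlanformDB, which this sketch does not model.

```python
import random


def evolve(fitness, genome_length, population_size=20, generations=30,
           mutation_scale=0.1, seed=0):
    """Maximize `fitness` over lists of floats; returns (best, history).

    `history` holds the best fitness seen at each generation. Because the
    best genome is copied unchanged into the next generation (elitism),
    the history never decreases.
    """
    rng = random.Random(seed)
    population = [[rng.uniform(0.0, 1.0) for _ in range(genome_length)]
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: population_size // 2]  # truncation selection
        children = [ranked[0][:]]                 # elitism: keep the best
        while len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            # One-point crossover (no cut is possible for length-1 genomes).
            cut = rng.randrange(1, genome_length) if genome_length > 1 else 0
            child = mother[:cut] + father[cut:]
            # Gaussian mutation applied to every gene.
            child = [gene + rng.gauss(0.0, mutation_scale) for gene in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(genome):
    """Hypothetical stand-in fitness: highest when every gene equals 0.5."""
    return -sum((gene - 0.5) ** 2 for gene in genome)


best_genome, best_history = evolve(toy_fitness, genome_length=4)
```

Here a genome would stand for the parameters of a candidate model, and the only component that ties the search to biology is the fitness function.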
This paper presents a novel agent-based planarian model capable of simulating basic biological behavior. The model is suitable for automated and varied experimental manipulations akin to those traditionally performed by wet-bench biologists and represented in the PlanformDB. This model includes a reaction network that responds to manipulations by initiating appropriate head and tail regeneration. Importantly, we describe an algorithm that translates multicellular simulation output into a formal graph representation equivalent to that described by Lobo and colleagues [15, 16]. This real-time translation is central to the automation of model discovery because it enables use of a fitness metric based on a graph-edit distance calculation, which quantitatively compares simulation output with target morphologies stored in the PlanformDB. The combination of the model, translation algorithm, and fitness metric provides the basis for future automated model discovery in regeneration biology.
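As an illustration of how a graph-edit distance can serve as a fitness metric, the Python sketch below computes a unit-cost edit distance between two small labeled graphs by brute force over all node mappings. It is a simplified stand-in, not the metric used in this work: the cost model (one unit per node insertion, deletion, or relabeling and per edge insertion or deletion), the function names, and the example morphologies are all assumptions. Link lengths and angles are ignored, and the exhaustive search is practical only for graphs with a handful of regions.

```python
from itertools import combinations, permutations


def edge_set(*pairs):
    """Build a set of undirected edges from (u, v) pairs."""
    return {frozenset(pair) for pair in pairs}


def graph_edit_distance(nodes_a, edges_a, nodes_b, edges_b):
    """Unit-cost edit distance between two small labeled graphs.

    nodes_*: dict mapping node id -> label (e.g. "head").
    edges_*: set of frozensets, as produced by edge_set().
    """
    ids_a = list(nodes_a)
    ids_b = list(nodes_b)
    size = max(len(ids_a), len(ids_b))
    # Pad the smaller graph with None; mapping to None means insert/delete.
    ids_a += [None] * (size - len(ids_a))
    ids_b += [None] * (size - len(ids_b))

    def has_edge(edges, u, v):
        return u is not None and v is not None and frozenset((u, v)) in edges

    best = None
    for perm in permutations(ids_b):
        cost = 0
        # Node costs: insertion, deletion, or label substitution.
        for u, v in zip(ids_a, perm):
            if u is None or v is None:
                cost += 1
            elif nodes_a[u] != nodes_b[v]:
                cost += 1
        # Edge costs: an edge present in one graph but not in the other.
        for i, j in combinations(range(size), 2):
            in_a = has_edge(edges_a, ids_a[i], ids_a[j])
            in_b = has_edge(edges_b, perm[i], perm[j])
            if in_a != in_b:
                cost += 1
        if best is None or cost < best:
            best = cost
    return best


def fitness(distance):
    """Hypothetical fitness: 1.0 for a perfect match, falling toward 0."""
    return 1.0 / (1.0 + distance)


# Illustrative morphologies (not taken from the PlanformDB).
wild_nodes = {"h": "head", "t": "trunk", "l": "tail"}
wild_edges = edge_set(("t", "h"), ("t", "l"))
double_head_nodes = {"h1": "head", "t": "trunk", "h2": "head"}
double_head_edges = edge_set(("t", "h1"), ("t", "h2"))
```

With these definitions, a simulated double-headed worm lies one edit away from the wild type (the tail is relabeled as a head), so a search guided by this score would rank it close to, but below, a perfect match.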