A PATO-compliant zebrafish screening database (MODB): management of morpholino knockdown screen information
BMC Bioinformatics volume 9, Article number: 7 (2008)
The zebrafish is a powerful model vertebrate amenable to high throughput in vivo genetic analyses. Examples include reverse genetic screens using morpholino knockdown, expression-based screening using enhancer trapping and forward genetic screening using transposon insertional mutagenesis. We have created a database to facilitate web-based distribution of data from such genetic studies.
The MOrpholino DataBase is a MySQL relational database with an online PHP interface. Multiple quality control levels allow differential access to data in raw and finished formats. MODBv1 includes sequence information relating to almost 800 morpholinos and their targets, and phenotypic data regarding the dose effect of each morpholino (mortality, toxicity and defects). To improve the searchability of this database, we have incorporated a fixed-vocabulary defect ontology that allows morpholino effects to be organized by the anatomical structure affected and the defect produced. This also allows comparison between species using Phenotypic Attribute Trait Ontology (PATO) designated terminology. MODB is also cross-linked with ZFIN, allowing full searches between the two databases. MODB offers users the ability to retrieve morpholino data by sequence of morpholino or target, name of target, anatomical structure affected and defect produced.
MODB data can be used for functional genomic analysis of morpholino design to maximize efficacy and minimize toxicity. MODB also serves as a template for future sequence-based functional genetic screen databases, and it is currently being used as a model for the creation of a mutagenic insertional transposon database.
Advances in scientific technology and the availability of genomic sequence for an increasing number of species have led to a paradigm shift from identifying sequence to determining gene product function. Large functional genetic screens are becoming increasingly prevalent and require a new way of thinking about data to facilitate cross-species comparison and the deciphering of gene product function in different organisms. Zebrafish are particularly amenable to large screens because they produce large numbers of offspring, and their embryos finish the development of most organ rudiments by 2 days post-fertilization. Their semi-transparent embryos and the ready availability of transgenic lines facilitate functional annotation of the many genes required for numerous organ systems and pathways. While historically their use in genetic screening has been largely restricted to ENU mutagenesis screens [2–4] and retroviral mutagenesis, the availability of morpholinos (antisense knockdown oligonucleotides) and insertional mutagenic transposons has opened new avenues for functional genetic research.
Recently, several morpholino screens have been published in collaboration with the Ekker laboratory [8, 9]. The largest among them, the Secretome Screen [9, 10], identified likely secreted genes from available transcriptome databases using a bioinformatics approach. The Vertebrate Secretome DataBase (VSDB) was established to identify co-translationally translocated (CTT) proteins in several model organism species. Using a combination of signal sequence, transmembrane domain, and initiation site predictors as well as homology to known secreted proteins, a subset of zebrafish CTT genes was identified (0.3× genome coverage based on zebrafish loaded and predicted protein sequences in the CTTome database and Genescan gene prediction estimates of the zebrafish genome [4, 5]). From this subset, 150 gene targets were selected at random for morpholino design, which was carried out using an Assisted MOrpholino Design tool (AMOD) [11, 12]. Each gene was knocked down in zebrafish by morpholino injection, and the embryos were then screened for defects in general morphology and in several organ systems including ocular, renal, pigment, vascular, hematopoietic, cranio-facial, and motor. A second morpholino knockdown screen, the Hematopoietic Stem Cell Screen, was conducted using zebrafish homologues of human genes found by microarray analysis to be differentially expressed in hematopoietic stem cell differentiation.
Currently, the zebrafish community stores published and sequence-related results in several web-based databases. The principal database for zebrafish-related research is the Zebrafish Information Network (ZFIN) [13, 14]. Ensembl, likewise, stores sequences targeted by published morpholinos. Ensembl is limited in its ability to store and display experimental data, and while ZFIN contains a wealth of information on published results, it lacks specific information such as dosage and searchable phenotype data for published morpholinos, and phenotypic data for largely unpublished morpholinos.
Many model organism communities have developed online databases to access results from mutagenic, knock-down/out and over-expression studies. While mouse knockout mutant [15, 16] and C. elegans RNAi [17, 18] experimental result databases with phenotypic content exist, the only morpholino-specific database available is for a pilot Xenopus morpholino screen in which 202 genes were targeted and subsequent phenotypes displayed on a webpage [19, 20]. The Xenopus tropicalis Morpholino Screen database is a flat format website that allows for records to be accessed and viewed by gene targeted, synphenotype group or specific defect. Notably this database does not include search functions and does not offer a means for further input by outside sources. However, the intent of this database as stated by the authors is to display the results from their specific screen [19, 20].
Given our intent to display the large amount of data generated from our ongoing collaborative screens, we have developed a scalable, centralized and globally accessible MOrpholino DataBase (MODB) to store and facilitate retrieval of zebrafish morpholino screening results. Additional access has been provided by active links between MODB and ZFIN.
MODB is a scalable, searchable and readily adaptable online MySQL database with a PHP interface. MODB includes storage and retrieval of results generated by our screens and public access to published and unpublished information on morpholinos. MODB not only allows dissemination of target-specific information such as the effect of knockdown, but also facilitates analysis of morpholino-specific data such as toxicity, dosage, mortality and design. MODB is a new, online, publicly accessible database for data-mining zebrafish morpholino knockdown effects and morpholino non-specific effects.
Construction and Content
The MODB web project is currently running on an Apache web server with PHP on a Linux system (Intel Xeon, 2.8 GHz, CentOS release 4.4). The project architecture was developed according to the Model-View-Controller (MVC) design paradigm. The Model (database tables) is stored in a MySQL relational database. Major tables include Morpholino, Staff, Lab, Test, and Test_Result, which store all information pertinent to each morpholino and to users (Figure 1). The View was developed with the help of Smarty, a template engine in PHP that helps the designer make modifications in the layout without breaking PHP logic. PHP scripts control the connections between the Model and the View. The GD graphics library (a PHP module) is used for the graphic functions displayed in the summary page.
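As a rough illustration of the Model layer described above, the five named tables could be related along the following lines. This is only a sketch: it uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server, and although the table names come from the text, every column name and the example row values are assumptions made for illustration, not the actual MODB schema.

```python
import sqlite3

# Table names follow the text; every column below is a hypothetical choice.
SCHEMA = """
CREATE TABLE Lab (
    lab_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE Staff (
    staff_id INTEGER PRIMARY KEY,
    lab_id   INTEGER NOT NULL REFERENCES Lab(lab_id),
    name     TEXT NOT NULL
);
CREATE TABLE Morpholino (
    morpholino_id INTEGER PRIMARY KEY,
    sequence      TEXT NOT NULL UNIQUE,  -- one record per morpholino sequence
    name          TEXT,
    target_name   TEXT,
    staff_id      INTEGER REFERENCES Staff(staff_id)
);
CREATE TABLE Test (
    test_id       INTEGER PRIMARY KEY,
    morpholino_id INTEGER NOT NULL REFERENCES Morpholino(morpholino_id),
    dose          REAL NOT NULL,         -- unit left unspecified on purpose
    observation_point TEXT
);
CREATE TABLE Test_Result (
    result_id INTEGER PRIMARY KEY,
    test_id   INTEGER NOT NULL REFERENCES Test(test_id),
    entity    TEXT,                      -- anatomical structure affected
    quality   TEXT,                      -- defect term
    notes     TEXT                       -- free-text description
);
"""


def build_database():
    """Create an empty in-memory database with the sketched tables."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(SCHEMA)
    return connection


conn = build_database()
conn.execute("INSERT INTO Lab (name) VALUES ('Example Lab')")
conn.execute("INSERT INTO Staff (lab_id, name) VALUES (1, 'Example User')")
conn.execute(
    "INSERT INTO Morpholino (sequence, name, target_name, staff_id) "
    "VALUES ('AACCGGTAAACCGGTAAACCGGTAA', 'MO-example', 'example-target', 1)"
)

# List the tables that were created, in alphabetical order.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

Keeping tests and their results in separate tables, as assumed here, would let one test (a morpholino at a given dose and observation point) carry several defect annotations.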
NCBI BLAST 2.2.13 is invoked by PHP through system calls in the sequence search functions. The MODB BLAST target database is updated automatically when a new morpholino sequence is entered into the database. The ZFIN anatomy ontology was downloaded (release date 08/16/2007) from the ZFIN web site in the OBO format and transferred into the database. An anatomy browser was written in PHP and incorporated in MODB.
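To make the ontology import step more concrete, the sketch below reads `[Term]` stanzas from OBO-formatted text and extracts the `id`, `name` and `is_a` tags, which is roughly the information an anatomy browser needs to display a hierarchy. It is written in Python rather than the PHP used by MODB, the function names are hypothetical, and the term identifiers in the example are invented placeholders rather than real ZFIN anatomy terms.

```python
def parse_obo_terms(text):
    """Return {term_id: {"name": str, "is_a": [parent_ids]}} for [Term] stanzas."""
    terms = {}
    current = None  # dict for the stanza being read, or None outside a [Term]
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            current = {"is_a": []} if line == "[Term]" else None
            continue
        if current is None or not line or line.startswith("!"):
            continue  # header lines, blank lines, comments, other stanzas
        tag, _, value = line.partition(":")
        value = value.split("!")[0].strip()  # drop a trailing "! comment"
        if tag == "id":
            terms[value] = current
        elif tag == "name":
            current["name"] = value
        elif tag == "is_a":
            current["is_a"].append(value)
    return terms


def children_of(terms, parent_id):
    """Return the sorted ids of terms that list parent_id as an is_a parent."""
    return sorted(tid for tid, term in terms.items() if parent_id in term["is_a"])


# Invented example data; the EX: identifiers are placeholders.
EXAMPLE_OBO = """\
format-version: 1.2

[Term]
id: EX:0000001
name: anatomical structure

[Term]
id: EX:0000002
name: eye
is_a: EX:0000001 ! anatomical structure

[Typedef]
id: part_of
name: part of
"""

terms = parse_obo_terms(EXAMPLE_OBO)
eye_parents = terms["EX:0000002"]["is_a"]
top_level_children = children_of(terms, "EX:0000001")
```

A production importer would also need to handle relationship tags such as `part_of`, obsolete terms and escaped characters, all of which this sketch ignores.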
The unique identifier in MODB is the sequence of each morpholino. Many morpholino targets in our screens are derived from transcriptome sequence data from The Institute for Genomic Research (TIGR) and GenBank that are subject to change as new sequence data become available. For every unique morpholino entered into the database, there is an accompanying set of tables devoted to information pertaining to that morpholino such as name, designator, laboratory created in, contact person, target name, target sequence, gene targeted, reference images and number of tests created (Figure 2). This concise format allows key information regarding the morpholino to be viewed at once.
Early in 2007, MODB contained over 700 unique morpholino sequences generated from several screens, individual projects and control morpholinos. The first generation Secretome Screen is the largest overall contributor of morpholino sequences. From 150 candidate gene sequences, 209 morpholinos were designed. Each morpholino was subjected to a variety of morphological and molecular assays to determine the effect of down-regulating a particular gene product during the first 5 days of zebrafish development. This Secretome Screen identified 26/150 gene targets that produced reproducible effects on specific aspects of early development, including early morphology and a variety of organ systems (visual, renal, cardiovascular, cranio-facial, hematopoietic, and pigment). The more focused Hematopoietic Stem Cell Screen identified 16/64 targets as being necessary for proper blood development. The bulk of the data entered into MODB (124/150 target gene morpholinos in the Secretome Screen and 48/64 in the Hematopoietic Stem Cell Screen) consists of morpholinos that had no observed developmentally specific effect in our screening tests, caused toxic effects such as a high amount of cell death and/or mortality, or produced pleiotropic effects we could not decipher. This information is of use to the scientific community for optimizing morpholino design and for learning the dosage and overall effect of previously designed and tested morpholinos.
Since MODB is a living database, new data are regularly being submitted. In the coming year another 50 to 150 morpholinos will be added to the database for the continuing Hematopoietic Stem Cell and Secretome screens. To allow the release of data in defined formats and quality, MODB has three data sharing quality control levels. At the first level, users can access and modify data related to their own designated morpholinos. To facilitate collaborative efforts, a second ('shared') designation allows members of consortium laboratories to access data related to morpholinos designed by any user within laboratories that are connected by a relationship within MODB. The third designation ('public') allows morpholino data to be used to generate the summary reports available to the open scientific community via the MODB website. Nearly all morpholinos (779/784) in MODB, with the exception of non-zebrafish morpholinos, are designated as public in MODBv1.
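The three sharing levels can be read as a simple visibility rule, sketched below in Python. The label `PRIVATE` for the first level, the class and function names, and the representation of laboratory relationships as a set of linked pairs are all assumptions made for this illustration; the source names only the 'shared' and 'public' designations and does not describe how MODB implements the check.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional, Set


class Access(Enum):
    PRIVATE = "private"  # hypothetical label for the owner-only level
    SHARED = "shared"
    PUBLIC = "public"


@dataclass(frozen=True)
class User:
    name: str
    lab: str


@dataclass(frozen=True)
class MorpholinoRecord:
    name: str
    owner: User
    access: Access


def can_view(record: MorpholinoRecord,
             viewer: Optional[User],
             lab_links: Set[FrozenSet[str]]) -> bool:
    """Decide whether viewer (None for an anonymous visitor) may see record."""
    if record.access is Access.PUBLIC:
        return True
    if viewer is None:
        return False
    if viewer == record.owner:
        return True
    if record.access is Access.SHARED:
        # Shared records are visible within the owner's lab and in linked labs.
        same_lab = viewer.lab == record.owner.lab
        linked = frozenset({viewer.lab, record.owner.lab}) in lab_links
        return same_lab or linked
    return False


# Invented users and laboratories for the usage example.
alice = User("alice", "Lab A")
bob = User("bob", "Lab B")
carol = User("carol", "Lab C")
links = {frozenset({"Lab A", "Lab B"})}

shared_record = MorpholinoRecord("MO-example-1", alice, Access.SHARED)
private_record = MorpholinoRecord("MO-example-2", alice, Access.PRIVATE)
public_record = MorpholinoRecord("MO-example-3", alice, Access.PUBLIC)
```

In this sketch only the owner can see a private record, members of linked laboratories can additionally see shared records, and public records are visible to anyone, including visitors who are not logged in.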
A major function of MODB is data compilation of morpholino-specific information generated by primary research. There are two main ways to add data. Currently, administrators alone enter new listings that provide basic information on a morpholino. Authorized users can submit morpholino sequence and results from specific morphological or molecular experiments for a given observation point and dose. Researchers can apply for an authorized-user identification and password (see author contact information).
These results are called 'tests'. In early 2007, MODB contained over 9000 individual test entries. A free text notes field permits detailed description of defects observed in morpholino-injected embryos once a fixed vocabulary defect term has been selected. The defects ontology check box was established with common scientific terms used in our screening protocols (Figure 3). Specific in-laboratory scientific descriptions correspond to established defects ontology. The reduction of free text and the promotion of fixed vocabularies increase the searchability of MODB by keyword, anatomical structure or fixed defects term, and allow for future integration with other fixed-vocabulary phenotype databases such as WormBase and the Mouse Phenome database [15, 29]. To utilize the data entered into MODB in a meaningful manner, we developed a variety of search functions. In addition to the parameters already mentioned (laboratory designator, morpholino name, morpholino target name, gene targeted, laboratory created in, or keyword), MODB can be queried by a controlled defects vocabulary. This defects ontology was developed in conjunction with ZFIN to model the current status of the Phenotypic Attribute Trait Ontology (PATO) developed by the National Center for Biomedical Ontology (NCBO). To facilitate the use of this common language, MODB users assess phenotypes using defect descriptions linked to current PATO terminology (Figure 4). Initially, PATO was designed to describe a defect observed in an anatomical structure (the entity, e.g. tail) using a trait (e.g. relative length) and an attribute of that trait (e.g. short length). This system has subsequently been replaced with an entity and quality designation (e.g. tail, short length). MODB's defects ontology contains the latter version of PATO.
When no PATO term exists to adequately describe the defect observed, a proposed term is submitted to NCBO via ZFIN to assist in the development of a cross-species phenotypic ontology. The importance of a common phenotypic vocabulary resides in the ability to perform cross-species analysis of gene function and conserved pathways.
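The entity and quality model described above lends itself to a small data structure. The following Python sketch shows one way an older entity/trait/attribute description could be reduced to an entity-quality pair and how annotations could then be grouped so that morpholinos sharing a phenotype are found together. The class, the function names and the morpholino names are hypothetical; only the "tail" and "short length" example comes from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    entity: str   # anatomical structure, e.g. "tail"
    quality: str  # defect term, e.g. "short length"


def from_entity_trait_attribute(entity, trait, attribute):
    """Reduce an older entity/trait/attribute triple to entity plus quality.

    In the example given in the text the attribute ("short length") already
    carries the information of the trait ("relative length"), so this sketch
    simply keeps the attribute as the quality and drops the trait.
    """
    return Phenotype(entity=entity, quality=attribute)


def index_by_phenotype(annotations):
    """Group morpholino names by the phenotype they were annotated with."""
    index = defaultdict(set)
    for morpholino_name, phenotype in annotations:
        index[phenotype].add(morpholino_name)
    return index


short_tail = from_entity_trait_attribute("tail", "relative length", "short length")

# Invented annotations for illustration only.
annotations = [
    ("MO-example-1", short_tail),
    ("MO-example-2", Phenotype("tail", "short length")),
    ("MO-example-3", Phenotype("eye", "small size")),
]
index = index_by_phenotype(annotations)
short_tail_morpholinos = sorted(index[short_tail])
```

Because both fields are drawn from fixed vocabularies, two independently entered annotations compare equal exactly when they describe the same structure and defect, which is what makes this kind of grouping possible.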
The usefulness of the data stored within any database is subject to the retrieval methods available. MODB has been designed with numerous search and sort functions that allow for retrieval by morpholino sequence, morpholino name and gene targeted (Figure 5A). An advanced search feature in the public interface allows users to search by anatomy and phenotype to identify morpholinos with a given defect (Figure 5B). A sequence-based search is also available, using either the morpholino target sequence or the sequence of the morpholino itself (Figure 5C). By offering numerous search and sorting options, results for nearly 800 morpholinos and more than 9000 tests can be accessed and analyzed for further scientific insight into the knockdown effect of a specific morpholino target, genes affecting the same developmental pathways, and non-specific morpholino effects.
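Because a morpholino binds its target RNA by base pairing, its sequence is the reverse complement of the site it targets, so a sequence search has to consider both orientations. The sketch below illustrates that idea with exact substring matching in Python. It is a deliberate simplification: MODB uses BLAST, which also tolerates mismatches, and the function names and short example sequences here are invented for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def find_morpholinos(query, morpholinos):
    """Return names of morpholinos matching the query in either orientation.

    morpholinos maps a name to the morpholino sequence. A hit is reported when
    the query contains the morpholino sequence itself or the target site it
    would bind (its reverse complement).
    """
    query = query.upper()
    hits = []
    for name, mo_sequence in morpholinos.items():
        mo_sequence = mo_sequence.upper()
        if mo_sequence in query or reverse_complement(mo_sequence) in query:
            hits.append(name)
    return sorted(hits)


# Invented, unrealistically short sequences (real morpholinos are longer).
example_morpholinos = {
    "MO-a": "AACCGGTA",
    "MO-b": "TTTTTTTT",
}

target_hits = find_morpholinos("GGGTACCGGTTCCC", example_morpholinos)
direct_hits = find_morpholinos("aaccggta", example_morpholinos)
```

Here the first query is a target-like sequence that contains the binding site of `MO-a`, while the second query is the morpholino sequence itself; both retrieve the same record.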
MODB v1 Public Interface
To facilitate display and digestion of the large amount of data stored within MODB, a summary sheet for each morpholino was created (see Construction and Content) and is publicly available. Each morpholino summary contains sequence and target information as well as survival and normalcy (percent normal embryos observed overall or for specific screen types) curves based on dose (Figure 6). Several search functions can be used to access this information, including search by gene targeted, BLAST, and an advanced search function that finds morpholinos by PATO designation and anatomical structure affected (Figure 5B). MODB thus presents users with valuable information such as mortality and dose curves, effectiveness and morphant phenotype. Currently, 101 morpholinos contain hyperlinks to specific morpholino entries in ZFIN, and hyperlinks are provided for gene targets in MODBv1.
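The survival and normalcy curves on the summary sheet amount to percentages computed per dose. A minimal Python sketch of that calculation is given below. It assumes that each test records the number of embryos injected, alive and normal, and that both percentages use the number injected as the denominator; the source does not state how MODB computes these values, and the counts in the example are invented.

```python
from collections import defaultdict


def dose_summary(tests):
    """Pool tests by dose and return {dose: (percent_survival, percent_normal)}.

    tests is an iterable of (dose, n_injected, n_alive, n_normal) tuples.
    Both percentages are relative to the number of embryos injected.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for dose, injected, alive, normal in tests:
        counts = totals[dose]
        counts[0] += injected
        counts[1] += alive
        counts[2] += normal
    summary = {}
    for dose in sorted(totals):
        injected, alive, normal = totals[dose]
        if injected > 0:  # skip doses with no embryos to avoid dividing by zero
            summary[dose] = (100.0 * alive / injected, 100.0 * normal / injected)
    return summary


# Invented counts: two tests at a low dose and one at a high dose.
example_tests = [
    (1.5, 20, 18, 15),
    (1.5, 20, 16, 13),
    (6.0, 40, 20, 4),
]
curve = dose_summary(example_tests)
```

Plotting the two values against dose would give curves of the kind described for the summary sheet, with falling survival and normalcy at higher doses pointing to toxicity.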
With the advent of whole-genome sequencing technology, a greater emphasis on gene function is emerging. Functional genomic screening tools available in zebrafish such as ENU mutagenesis, GripNAs, RNAi, insertional mutagenesis and morpholinos allow for large functional gene screens that produce an enormous amount of biological information (reviewed in [2, 6, 7, 31–33]). Efficient data storage is critical to allow for retrieval of results in a meaningful fashion and to facilitate data mining. MODB is a non-proprietary, relational database with online accessibility that acts as a storehouse for critical information relating to sequence-specific morpholino knockdown screens. MODB is, to our knowledge, unique among the available databases in that it offers searchable morpholino-specific data and fixed-vocabulary defects ontological cataloging of morphants that allows for cross-species analysis using a common, PATO-derived ontology. Cross-linking to ZFIN adds further value to MODB for the research community.
Continued screening efforts in zebrafish using morpholinos and other sequence-specific knockdown or knockout technologies will expand our knowledge of gene function. MODB provides a central repository for morpholino information. This schema is potentially applicable to any sequence-based screening approach in zebrafish or related systems. For example, MODB is currently being adapted to the creation of a transposon database that includes insertional mutagenesis and expression analysis data.
By offering public access to our schema and the data contained within MODB, we hope to aid in the advancement of functional genetic studies and gene annotation as well as improvements in morpholino analysis and design. The dissemination of valuable knowledge gained via large screens such as the Secretome and Hematopoietic Stem Cell screens goes beyond published gene target characterization and provides a foundation for future research.
Availability and requirements
MODB is accessible via the web at http://www.secretomes.umn.edu/MODB/. To be added as an authorized user, contact one of the corresponding authors (M.K. or S.C.E.). Morpholinos are readily available for purchase through Gene Tools or Open Biosystems. The authors have no affiliation, financial or otherwise, with either corporate entity.
Kimmel CB, Ballard WW, Kimmel SR, Ullmann B, Schilling TF: Stages of embryonic development of the zebrafish. Dev Dyn 1995, 203(3):253–310.
Geisler R, Rauch GJ, Geiger-Rudolph S, Albrecht A, van Bebber F, Berger A, Busch-Nentwich E, Dahm R, Dekens MP, Dooley C, et al.: Large-scale mapping of mutations affecting zebrafish development. BMC Genomics 2007, 8(1):11. 10.1186/1471-2164-8-11
Draper BW, McCallum CM, Stout JL, Slade AJ, Moens CB: A high-throughput method for identifying N-ethyl-N-nitrosourea (ENU)-induced point mutations in zebrafish. Methods Cell Biol 2004, 77: 91–112.
Wienholds E, van Eeden F, Kosters M, Mudde J, Plasterk RH, Cuppen E: Efficient target-selected mutagenesis in zebrafish. Genome Res 2003, 13(12):2700–2707. 10.1101/gr.1725103
Amsterdam A, Burgess S, Golling G, Chen W, Sun Z, Townsend K, Farrington S, Haldi M, Hopkins N: A large-scale insertional mutagenesis screen in zebrafish. Genes Dev 1999, 13(20):2713–2724. 10.1101/gad.13.20.2713
Pickart MA, Sivasubbu S, Nielsen AL, Shriram S, King RA, Ekker SC: Functional genomics tools for the analysis of zebrafish pigment. Pigment Cell Res 2004, 17(5):461–470. 10.1111/j.1600-0749.2004.00189.x
Taylor MR, Kikkawa S, Diez-Juan A, Ramamurthy V, Kawakami K, Carmeliet P, Brockerhoff SE: The zebrafish pob gene encodes a novel protein required for survival of red cone photoreceptor cells. Genetics 2005, 170(1):263–273. 10.1534/genetics.104.036434
Eckfeldt CE, Mendenhall EM, Flynn CM, Wang TF, Pickart MA, Grindle SM, Ekker SC, Verfaillie CM: Functional analysis of human hematopoietic stem cell gene expression using zebrafish. PLoS Biol 2005, 3(8):e254. 10.1371/journal.pbio.0030254
Pickart MA, Klee EW, Nielsen AL, Sivasubbu S, Mendenhall EM, Bill BR, Chen E, Eckfeldt CE, Knowlton M, Robu ME, Larson JD, Deng Y, Schimmenti LA, Ellis LBM, Verfaillie CM, Hammerschmidt M, Farber SA, Ekker SC: Genome-wide reverse genetics framework to identify novel functions of the vertebrate secretome. PLoS One 2006, 1(1):e104. 10.1371/journal.pone.0000104
Klee EW, Ellis LBM: Evaluating eukaryotic secreted protein prediction. BMC Bioinformatics 2005, 6: 256. 10.1186/1471-2105-6-256
Sadl VS, Sing A, Mar L, Jin F, Cordes SP: Analysis of hindbrain patterning defects caused by the kreisler(enu) mutation reveals multiple roles of Kreisler in hindbrain segmentation. Dev Dyn 2003, 227(1):134–142. 10.1002/dvdy.10279
Klee EW, Shim KJ, Pickart MA, Ekker SC, Ellis LB: AMOD: a morpholino oligonucleotide selection tool. Nucleic Acids Res 2005, (33 Web Server):W506–511. 10.1093/nar/gki453
Miller-Bertoglio V, Carmany-Rampey A, Furthauer M, Gonzalez EM, Thisse C, Thisse B, Halpern ME, Solnica-Krezel L: Maternal and zygotic activity of the zebrafish ogon locus antagonizes BMP signaling. Dev Biol 1999, 214(1):72–86. 10.1006/dbio.1999.9384
Sprague J, Bayraktaroglu L, Clements D, Conlin T, Fashena D, Frazer K, Haendel M, Howe DG, Mani P, Ramachandran S, et al.: The Zebrafish Information Network: the zebrafish model organism database. Nucleic Acids Res 2006, (34 Database):D581–585. 10.1093/nar/gkj086
Mouse Phenome Database[http://phenome.jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/home]
Bogue MA, Grubb SC, Maddatu TP, Bult CJ: Mouse Phenome Database (MPD). Nucleic Acids Res 2007, (35 Database):D643–649. 10.1093/nar/gkl1049
Horn T, Arziman Z, Berger J, Boutros M: GenomeRNAi: a database for cell-based RNAi phenotypes. Nucleic Acids Res 2007, (35 Database):D492–497. 10.1093/nar/gkl906
Xenopus tropicalis Morpholino Screen[http://www.gurdon.cam.ac.uk/~smithlab/screens/Xenopus-morpholino-pilot/morpholino-screen-lists.html#synphenotypes]
Rana AA, Collart C, Gilchrist MJ, Smith JC: Defining synphenotype groups in Xenopus tropicalis by use of antisense morpholino oligonucleotides. PLoS Genet 2006, 2(11):e193. 10.1371/journal.pgen.0020193
Apache Web Server[http://www.apache.org]
Nasevicius A, Ekker SC: Effective targeted gene 'knockdown' in zebrafish. Nat Genet 2000, 26(2):216–220. 10.1038/79951
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215(3):403–410.
ZFIN Anatomy Ontology[http://www.geneontology.org/GO.format.shtml]
The Institute for Genomic Research (TIGR)[http://www.tigr.org]
National Center for Biomedical Ontology (NCBO)[http://www.bioontology.org/]
Amsterdam A: Insertional mutagenesis in zebrafish: genes for development, genes for disease. Brief Funct Genomic Proteomic 2006, 5(1):19–23. 10.1093/bfgp/ell008
Amsterdam A, Hopkins N: Mutagenesis strategies in zebrafish for identifying genes involved in development and disease. Trends Genet 2006, 22(9):473–478. 10.1016/j.tig.2006.06.011
Dahm R, Geisler R: Learning from small fry: the zebrafish as a genetic model organism for aquaculture fish species. Mar Biotechnol (NY) 2006, 8(4):329–345. 10.1007/s10126-006-5139-0
The authors would like to acknowledge the numerous collaborations that have contributed to the development of MODB such as the laboratories of Drs. Farber, Hammerschmidt, Pickart, Schimmenti, Sivasubbu, and Verfaillie. We would also like to acknowledge the bioinformaticians who have worked on this project including Dr. Eric Klee and Kyong Jin Shim. Funding for this project was obtained from the NIH to SCE (GM63904 and CA65493). This work was supported in part by the University of Minnesota Supercomputing Institute.
MK designed, tested and managed the database. TL and YR were responsible for the programming, debugging and informatics support. BB aided in the conception, design and testing of the database. EL and SE aided in the conception and design of the database and coordinated the project as a whole. All authors read and approved the final manuscript.
Cite this article
Knowlton, M.N., Li, T., Ren, Y. et al. A PATO-compliant zebrafish screening database (MODB): management of morpholino knockdown screen information. BMC Bioinformatics 9, 7 (2008). https://doi.org/10.1186/1471-2105-9-7