MicroGen: a MIAME compliant web system for microarray experiment information and workflow management
© Burgarella et al; licensee BioMed Central Ltd 2005
Published: 1 December 2005
Improvements in bio-nano-technologies and biomolecular techniques have led to increasing production of high-throughput experimental data. Spotted cDNA microarrays are among the most widespread of these technologies, used both in single research laboratories and in biotechnology service facilities. Although routinely performed, spotted microarray experiments are complex procedures entailing several experimental steps and actors with different technical skills and roles. During an experiment, the actors involved, who may be located at a distance from one another, need to access and share specific experiment information according to their roles. Furthermore, complete information describing all experimental steps must be collected in an orderly way to allow correct subsequent interpretation of experimental results.
We developed MicroGen, a web system for managing information and workflow in the production pipeline of spotted microarray experiments. It consists of a core multi-database system able to store all data completely characterizing different spotted microarray experiments according to the Minimum Information About a Microarray Experiment (MIAME) standard, and of an intuitive and user-friendly web interface able to support the collaborative work required among the multidisciplinary actors and roles involved in spotted microarray experiment production. MicroGen supports six types of user roles: the researcher who designs and requests the experiment, the spotting operator, the hybridisation operator, the image processing operator, the system administrator, and the generic public user who can access the unrestricted part of the system to get information about MicroGen services.
MicroGen represents a MIAME compliant information system that enables managing workflow and supporting collaborative work in spotted microarray experiment production.
Microarray systems are presently the most widespread high-throughput technology in the biomolecular field. Among them, spotted cDNA microarrays are widely used both in single research groups and in biotechnology service centres because of their flexibility and lower running costs. However, they inherently require a few different technical skills and involve several articulated experimental steps, with numerous critical experimental parameters that must be carefully respected in order to ensure reliable and comparable results. Thus, complete information describing all experimental steps must be collected in an orderly way to allow correct subsequent interpretation of experimental results. For these reasons, spotted microarray experiments tend to be produced in central facilities rather than in single laboratories. Furthermore, different actors, who may be located at a distance from one another, often take part in a microarray experiment, together providing all the required skills. In such cases, they need to access and share specific experimental information according to their skills and role in the experiment; they thus act in a typical collaborative work scenario.
To standardize the considerable amount of heterogeneous information and data produced in a microarray experiment, and so allow their portability and comparability, the Microarray Gene Expression Data (MGED) society proposed a standard called Minimum Information About a Microarray Experiment (MIAME) [1, 2]. It precisely defines the information that must be collected during microarray experiment production to completely define the experimental and array design, the experimental procedures, and the generated data results. The MIAME standard has greatly standardized the presentation of microarray results and allowed aggregation and comparison of results from different centres within common public repositories such as ArrayExpress, GEO, and SMD [3–5].
To facilitate management and local storage, according to the MIAME standard, of the great quantity of microarray data produced in single laboratories or research centres, a few software products are available [6–11]. However, they mainly focus on maintaining data integrity in a flexible and robust database environment directly compatible with production instrumentation platforms, and on facilitating data analysis. Very few of them consider, and only to a limited extent, a possible ideal production workflow, whereas to our knowledge none of them currently supports collaborative work in microarray experiment production. Focusing specifically on these last two aspects, we developed MicroGen, a MIAME compliant web-based information system for managing all the information completely characterizing spotted microarray experiments and the data they produce. Based on the experiment workflow, it supports distributed collaborative work in the production pipeline of spotted microarray experiments.
MicroGen is a MIAME compliant multi-database information system for the workflow management of the production pipeline of spotted microarray experiments.
Table 1. MicroGen core database tables containing information on microarray experiment production according to the MIAME standard.
• Experiment type
• Experimental factors
• Hybridisation design
• Number of performed hybridisations
• Type of used hybridisation references
• Quality control steps
• URL of websites with additional information related to the experiment
Biological Samples, Preparation Extraction and Labelling
• Extraction protocols
• Labelling protocols
• External controls
• Platform type
• Surface and coating specifications
• PCR amplification
• Array commercial availability
• Spotting protocols
• Additional treatments
Hybridisation Procedures and Parameters
• Hybridisation, blocking and washing protocols
• Hybridisation, blocking and washing conditions
Measurement Data and Specification
• Scanning hardware and software
• Image analysis software
• Type of image quantifications
• Measurement description
Other database tables have been implemented in order to collect data regarding all people, biologists or technicians, who take part in each experiment.
All data regarding clones available for spotting (i.e. type, name, identification code, and characteristics) are stored in an orderly way in additional databases, customisable according to the types of microarrays used (e.g. medium or high density microarrays).
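As a rough illustration of how the grouped items of Table 1 could drive a completeness check, the following minimal sketch maps each group to a few of its required items and reports what is still missing from a record. The section keys and field names are illustrative paraphrases of the table entries, not MicroGen's actual table or column names, and only a subset of the listed items is included.

```python
from typing import Dict, List

# Section and field names paraphrase Table 1; they are assumptions, not
# MicroGen's real database schema.
MIAME_SECTIONS: Dict[str, List[str]] = {
    "experimental_design": [
        "experiment_type", "experimental_factors", "hybridisation_design",
        "number_of_hybridisations", "reference_type", "quality_control_steps",
    ],
    "samples": ["extraction_protocol", "labelling_protocol", "external_controls"],
    "array_design": ["platform_type", "surface_and_coating", "spotting_protocol"],
    "hybridisation": ["protocols", "conditions"],
    "measurement": [
        "scanning_hardware_software", "image_analysis_software",
        "quantification_type", "measurement_description",
    ],
}


def missing_fields(record: Dict[str, Dict[str, str]]) -> Dict[str, List[str]]:
    """Return, per section, the required fields that are absent or empty."""
    gaps: Dict[str, List[str]] = {}
    for section, fields in MIAME_SECTIONS.items():
        entered = record.get(section, {})
        absent = [name for name in fields if not entered.get(name)]
        if absent:
            gaps[section] = absent
    return gaps
```

A record for which `missing_fields` returns an empty dictionary would contain every item this sketch treats as required.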
Actors and experimental workflow
the researcher who wants to perform the microarray experiment (a biologist or a medical doctor who asks specialized biotechnology technicians for microarray production and hybridisation);
the spotting operator (a technician specialized in microarray production and spotting);
the hybridisation operator (a technician specialized in microarray hybridisation);
the image processing operator (a technician specialized in analysis of generated microarray images and in production of quantification results);
the generic public user, who can read the presentation of the system's services and a tutorial on its use, and can download an example PDF document (see Additional file 1), automatically compiled by the system according to the MIAME standard, which describes a sample microarray experiment;
the web master, who manages users and registrations to the system, and has access to statistical data regarding the experiments managed by the system.
In order to access MicroGen system functionalities, the first four kinds of users must register with the system. A registered researcher then has three options: 1) define a new experiment, 2) verify the progress of requested experiments, and 3) consult the fully compiled MIAME description and the quantitative data of concluded experiments (for an example, see Additional file 1).
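The registration rule can be summarised in a few lines. This is a minimal sketch under stated assumptions: the `Role` enumeration, the `can_access` function, and the option labels are hypothetical names chosen for illustration. The text does not describe how the web master authenticates, so that role is left unconstrained here.

```python
from enum import Enum


class Role(Enum):
    RESEARCHER = "researcher"
    SPOTTING_OPERATOR = "spotting operator"
    HYBRIDISATION_OPERATOR = "hybridisation operator"
    PROCESSING_OPERATOR = "image processing operator"
    PUBLIC_USER = "generic public user"
    WEB_MASTER = "web master"


# Per the text, the first four kinds of users must register.
REGISTRATION_REQUIRED = frozenset({
    Role.RESEARCHER,
    Role.SPOTTING_OPERATOR,
    Role.HYBRIDISATION_OPERATOR,
    Role.PROCESSING_OPERATOR,
})

# Options offered to a registered researcher (labels paraphrased).
RESEARCHER_OPTIONS = (
    "define a new experiment",
    "verify the progress of requested experiments",
    "consult MIAME description and quantitative data of concluded experiments",
)


def can_access(role: Role, is_registered: bool) -> bool:
    """Roles that require registration are refused until registered."""
    return is_registered or role not in REGISTRATION_REQUIRED
```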
When the first option is selected, the researcher must define the general specifications of the experiment he/she wants to perform by entering the information required by the MIAME standard (Table 1). Then, the clone libraries available for spotting the required microarrays can be selected. All the information about the clones chosen to be spotted on the microarrays is saved in a labelling Excel file that is automatically generated by the system and can be easily downloaded by the researcher. Finally, the researcher must describe the required hybridisation design, in particular the origin and manipulation of the biological samples used.
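The generation of a labelling file from the selected clones might look like the sketch below. It assumes a tab-delimited text layout, which spreadsheet software can open, rather than the native Excel format mentioned in the text. The column names follow the clone attributes listed earlier (type, name, identification code, characteristics), but their order and spelling are assumptions, and `build_labelling_file` is a hypothetical helper.

```python
import csv
import io
from typing import Dict, Iterable

# Assumed column layout; not MicroGen's actual file format.
COLUMNS = ["identification_code", "name", "type", "characteristics"]


def build_labelling_file(clones: Iterable[Dict[str, str]]) -> str:
    """Serialise the selected clones as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for clone in clones:
        # Attributes that were not supplied are written as empty cells.
        writer.writerow({column: clone.get(column, "") for column in COLUMNS})
    return buffer.getvalue()
```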
A registered spotting operator also has three options, different from those of the researcher: 1) receive the request for a new experiment and start the creation of new microarrays according to their descriptions in the microarray labelling file previously generated by the researcher; 2) after spotting the required microarrays, complete his/her task by entering in the system all information about the performed experimental step according to the MIAME standard (Table 1); 3) consult the fully compiled MIAME description of concluded experiments (for an example, see Additional file 1).
upload the produced microarray images to the MicroGen web server. This last option also allows consulting the fully compiled MIAME description of concluded experiments (for an example, see Additional file 1).
A registered processing operator also has three options: 1) receive the acquired images of a new experiment that need to be processed and quantified; 2) after downloading, processing and quantifying the acquired microarray images and producing their quantification files, upload these files to a central system repository and complete the quantification step by entering, according to the MIAME standard, the specifications of the instrumentation and software used for the quantifications; 3) consult the fully compiled MIAME description of concluded experiments (for an example, see Additional file 1). When a processing operator completes all the steps of option 2, the whole experiment is completed and the system automatically sends an informative e-mail to the researcher who requested the experiment.
In order to manage the workflow of each experiment and to make the information that the involved actors need automatically available to them when they require it, MicroGen assigns a status code to each managed experiment. The value of the current status and the date of each previous status change are saved in the workflow log of each experiment, together with the identifiers of all actors that took part in it. Seven experimental statuses have been defined: 1 = experiment requested by a researcher but not started yet, 2 = experiment started by a spotting operator, 3 = microarray spotting completed, 4 = experiment taken by a hybridisation operator, 5 = hybridisation completed, 6 = experiment taken by a processing operator, 7 = experiment completed.
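The seven statuses and the workflow log can be read as a simple linear state machine. The following is a minimal sketch under stated assumptions: the class and member names are hypothetical, and the mapping from status to the role allowed to set it is inferred from the status descriptions rather than taken from MicroGen's code.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import IntEnum
from typing import List, Optional, Tuple


class Status(IntEnum):
    REQUESTED = 1                # requested by a researcher, not started yet
    SPOTTING_STARTED = 2         # started by a spotting operator
    SPOTTING_COMPLETED = 3
    HYBRIDISATION_STARTED = 4    # taken by a hybridisation operator
    HYBRIDISATION_COMPLETED = 5
    PROCESSING_STARTED = 6       # taken by a processing operator
    COMPLETED = 7


# Role allowed to move an experiment INTO each status (an inference).
ROLE_FOR_STATUS = {
    Status.SPOTTING_STARTED: "spotting operator",
    Status.SPOTTING_COMPLETED: "spotting operator",
    Status.HYBRIDISATION_STARTED: "hybridisation operator",
    Status.HYBRIDISATION_COMPLETED: "hybridisation operator",
    Status.PROCESSING_STARTED: "processing operator",
    Status.COMPLETED: "processing operator",
}


@dataclass
class Experiment:
    researcher_id: str
    status: Status = Status.REQUESTED
    # Log entries: (new status, time of the change, actor identifier).
    log: List[Tuple[Status, datetime, str]] = field(default_factory=list)

    def advance(self, actor_id: str, role: str,
                when: Optional[datetime] = None) -> Status:
        """Move to the next status if the actor's role is allowed to set it."""
        if self.status is Status.COMPLETED:
            raise ValueError("experiment is already completed")
        next_status = Status(self.status + 1)
        if ROLE_FOR_STATUS[next_status] != role:
            raise PermissionError(
                f"{role} cannot set status {next_status.value}"
            )
        self.status = next_status
        self.log.append((next_status, when or datetime.now(), actor_id))
        return next_status
```

In this reading, the transition into status 7 is the point at which the informative e-mail to the requesting researcher would be triggered; the sketch leaves the notification out.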
Three main issues affect information management and workflow support in the production of microarray experiments: the large amount of information produced, its heterogeneity, and the geographic distance that may exist among different actors working on the same experiment. Thus, a system designed for these tasks must guarantee flexibility, consistency and completeness in managing the experimental workflow, and correct storage of all produced information. Moreover, to support collaborative activities at the same time, it must be easily accessible to geographically distant users and provide them with the information they need when they require it. Such information management can be achieved with an appropriate architectural design of the system database, which also improves the performance of the whole database. For example, if data that are likely to be consulted at the same time are stored in the same table, fewer joins among database tables are necessary to extract the required data, and database query time is thus reduced. Furthermore, if all information about a performed microarray experiment is collected completely and in an orderly way, it can be used to improve the analysis of the experimental data produced and to compare data from different experiments.
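The point about joins can be made concrete with a toy example. The sketch below uses Python's built-in `sqlite3` module and hypothetical table names (`hybridisation`, `protocol`, `hybridisation_flat`) that are not part of MicroGen. It only shows that co-locating data removes a join from the query; it does not measure query times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Separate tables: the protocol name lives in its own table.
    CREATE TABLE protocol (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE hybridisation (id INTEGER PRIMARY KEY,
                                experiment_id INTEGER,
                                protocol_id INTEGER);
    -- Co-located layout: the protocol name is stored with the hybridisation.
    CREATE TABLE hybridisation_flat (id INTEGER PRIMARY KEY,
                                     experiment_id INTEGER,
                                     protocol_name TEXT);
    INSERT INTO protocol VALUES (1, 'example protocol');
    INSERT INTO hybridisation VALUES (1, 10, 1);
    INSERT INTO hybridisation_flat VALUES (1, 10, 'example protocol');
""")

# Data split across two tables: one join is needed.
with_join = conn.execute(
    "SELECT h.id, p.name FROM hybridisation AS h "
    "JOIN protocol AS p ON p.id = h.protocol_id "
    "WHERE h.experiment_id = ?",
    (10,),
).fetchall()

# Data consulted together stored together: no join is needed.
without_join = conn.execute(
    "SELECT id, protocol_name FROM hybridisation_flat WHERE experiment_id = ?",
    (10,),
).fetchall()
```

Both queries return the same rows. The trade-off is that the co-located layout repeats the protocol name in every hybridisation row, so this design choice suits data that are read together far more often than they are updated.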
Finally, the use of server-side web technologies, which allow centralization of data archiving and processing operations and easy deployment of suitable graphical user interfaces (GUIs), enables MicroGen to easily support collaborative work also among actors located at a distance from one another. This choice makes it faster and easier both to manage and maintain the system and to deploy the developed functionalities and GUIs to all its remote clients, while lowering system maintenance costs. In fact, the employed web technologies allow MicroGen to be used wherever Internet or intranet access is available, requiring only a common web browser without any additional plug-in. The system is also simple to run: it only needs to be installed on a server computer with an Internet Information Server as web server.
MicroGen system is freely available for academic and non-profit use at: http://www.bioinformatics.polimi.it/MicroGen/.
MicroGen facilitates workflow management of spotted microarray experiment production, provides an efficient way to gather complete experimental information, and supports collaborative work. In fact, thanks to its well-defined core database architecture, MicroGen facilitates collection and storage of all experimental information according to the MIAME standard. Ordered availability of such information allows subsequent efficient and effective analyses of experimental results.
In addition to storing all the information produced in an orderly way, MicroGen eases the process of information sharing and thus represents a valid support for collaborative work, even among research centres geographically distant from each other.
MicroGen also facilitates experimental data comparison. In fact, it allows saving quantitative results also in a standard text format, which increases the portability and compatibility of results. Identification of results from experiments with similar characteristics is also facilitated by the complete experimental information stored in an orderly way within the system.
The MicroGen graphical user interface is very simple and intuitive, providing an easy method for a biologist or a biotechnology technician to read or collect information about performed microarray experiments. The whole procedure of collecting experimental information is driven by the system, and all the steps to follow are simple and immediate. Forms with multiple choices or dynamic links are presented within web pages in order to give quick access to a great number of different data. User information, the current progress status of an experiment, its MIAME data, and the experimental results currently uploaded are all viewable at any time by all actors involved in the experiment.
- Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, Gaasterland T, Glenisson P, Holstege FC, Kim IF, Markowitz V, Matese JC, Parkinson H, Robinson A, Sarkans U, Schulze-Kremer S, Stewart J, Taylor R, Vilo J, Vingron M: Minimum Information About a Microarray Experiment (MIAME) – toward standards for microarray data. Nat Genet 2001, 29(4):365–371. doi:10.1038/ng1201-365
- The Microarray Gene Expression Data (MGED) society. The MIAME checklist [http://www.mged.org/Workgroups/MIAME/miame_checklist.html]
- EMBL-EBI European Bioinformatics Institute. The ArrayExpress Database [http://www.ebi.ac.uk/arrayexpress/Help/faq.html]
- Barrett T, Suzek T, Troup D, Wilhite S, Ngau W, Ledoux P, Rudnev D, Lash A, Fujibuchi W, Edgar R: NCBI GEO: mining millions of expression profiles-database and tools. Nucleic Acids Res 2005, 33: D562-D566. doi:10.1093/nar/gki022
- Ball CA, Awad IA, Demeter J, Gollub J, Hebert JM, Hernandez-Boussard T, Jin H, Matese JC, Nitzberg M, Wymore F, Zachariah ZK, Brown PO, Sherlock G: The Stanford Microarray Database accommodates additional microarray platforms and data formats. Nucleic Acids Res 2005, 33(1):D580-D582.
- Grant GR, Manduchi E, Pizarro A, Stoeckert CJ Jr: Maintaining data integrity in microarray data management. Biotechnol Bioeng 2003, 84(7):795–800. doi:10.1002/bit.10847
- Kokocinski F, Wrobel G, Hahn M, Lichter P: QuickLIMS, facilitating the data management for DNA-microarray fabrication. Bioinformatics 2003, 19: 283–284. doi:10.1093/bioinformatics/19.2.283
- Kapushesky M, Kemmeren P, Culhane AC, Durinck S, Ihmels J, Korner C, Kull M, Torrente A, Sarkans U, Vilo J, Brazma A: Expression Profiler: next generation – an online platform for analysis of microarray data. Nucleic Acids Res 2004, 32: W465-W470. doi:10.1093/nar/gkh191
- Saal LH, Troein C, Vallon-Christersson J, Gruvberger S, Borg Å, Peterson C: BioArray Software Environment (BASE): a platform for comprehensive management and analysis of microarray data. Genome Biol 2002, 3(8):1–0003. doi:10.1186/gb-2002-3-8-software0003
- Killion PJ, Sherlock G, Iyer VR: The Longhorn Array Database (LAD): an open-source, MIAME compliant implementation of the Stanford Microarray Database (SMD). BMC Bioinformatics 2003, 4: 32. doi:10.1186/1471-2105-4-32
- Gardiner-Garden M, Littlejohn TG: A comparison of microarray databases. Brief Bioinform 2001, 2(2):143–158. doi:10.1093/bib/2.2.143
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.