FuGEFlow: data model and markup language for flow cytometry

Background
Flow cytometry technology is widely used in both health care and research. The rapid expansion of flow cytometry applications has outpaced the development of data storage and analysis tools. Collaborative efforts to close this gap include building common vocabularies and ontologies, designing generic data models, and defining data exchange formats. The Minimum Information about a Flow Cytometry Experiment (MIFlowCyt) standard was recently adopted by the International Society for Advancement of Cytometry. This standard guides researchers on the information that should be included in peer reviewed publications, but it is insufficient for data exchange and integration between computational systems. The Functional Genomics Experiment (FuGE) formalizes common aspects of comprehensive and high throughput experiments across different biological technologies. We have extended the FuGE object model to accommodate flow cytometry data and metadata.

Methods
We used the MagicDraw modelling tool to design a UML model (Flow-OM) according to the FuGE extension guidelines and the AndroMDA toolkit to transform the model to a markup language (Flow-ML). We mapped each MIFlowCyt term to either an existing FuGE class or to a new FuGEFlow class. The development environment was validated by comparing the official FuGE XSD to the schema we generated from the FuGE object model using our configuration. After the Flow-OM model was completed, the final version of the Flow-ML was generated and validated against an example MIFlowCyt compliant experiment description.

Results
The extension of FuGE for flow cytometry has resulted in a generic FuGE-compliant data model (FuGEFlow), which accommodates and links together all information required by MIFlowCyt. The FuGEFlow model can be used to build software and databases using FuGE software toolkits to facilitate automated exchange and manipulation of potentially large flow cytometry experimental data sets. Additional project documentation, including reusable design patterns and a guide for setting up a development environment, was contributed back to the FuGE project.

Conclusion
We have shown that an extension of FuGE can be used to transform minimum information requirements in natural language to markup language in XML. Extending FuGE required significant effort, but in our experience the benefits outweighed the costs. FuGEFlow is expected to play a central role in describing flow cytometry experiments and ultimately in facilitating data exchange, including with public flow cytometry repositories currently under development.


Introduction
This document represents an example description of a flow cytometry experiment that is compliant with the Minimum Information about a Flow Cytometry Experiment (MIFlowCyt).
MIFlowCyt has been reviewed by members of the International Society for Analytical Cytology (ISAC) and other interested parties and has been endorsed by the ISAC President and ISAC Council as an ISAC Recommendation. It is a stable document and may be used as reference material. The purpose of MIFlowCyt is to establish criteria to record flow cytometry experiments in a way that provides enough detail to allow for correct interpretation of experimental details including samples, analysis and results. MIFlowCyt promotes consistent annotation of biological and technical issues surrounding a flow cytometry experiment by specifying the requirements for data content and by providing a structured framework for capturing information. This example is intended for the purpose of demonstrating MIFlowCyt only.
A MIFlowCyt-compliant flow cytometry experiment description shall include all relevant information specified in the MIFlowCyt standard. MIFlowCyt states the content of the provided information; it does not imply the format of the information or whether an item should be directly provided or referenced. Within this example we follow the MIFlowCyt structure in order to demonstrate MIFlowCyt as clearly as possible.

Purpose
The purpose of the experiment is to quantify the contribution of donor hematopoietic stem cells (HSC) to lymphocytes and myeloid cells by flow cytometric analysis of peripheral blood (PB) cells stained with appropriate combinations of donor-specific marker and lineage-specific antibodies.

Number of transplanted HSC cells
Decreasing numbers of donor HSCs (100, 20, and 5 cells) were injected at day 0 into three groups of recipient mice, respectively (3 mice per group). Kit+/Sca-1+/Lineage− (KSL) HSCs (500 cells) were injected into each mouse of the control group (3 mice in the control group).

Quality Control Measures
PB has been collected from positive as well as negative mice in order to provide a staining control. Mice from the 500 KSL group were considered the positive controls; mice that were not reconstituted served as the negative controls.
Unstained controls have not been used because they are not critical for the evaluation of results: in each sample, dull negative cells are compared to bright positive populations.

Other Relevant Experiment Information
The Competitive Repopulation Assay is used for the experiment. It is a quantitative assay for long-term repopulating stem cells with the potential for reconstituting all hematopoietic lineages. This assay has two key features. The first is the use of competitive repopulation conditions that ensure not only the detection of a very primitive class of hematopoietic stem cells but also the survival of lethally irradiated mice transplanted with very low numbers of such cells. The second is the use of a limiting-dilution experimental design to allow stem cell quantitation.
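
The quantitation step of a limiting-dilution design can be sketched as follows. This is a minimal sketch, assuming the single-hit Poisson model that is commonly applied to limiting-dilution data; this description does not state which statistical model was used. The doses and group sizes are those reported for this experiment (5, 20 and 100 cells; 3 mice per group), whereas the numbers of non-reconstituted mice, the search bounds and all function names are hypothetical placeholders.

```python
import math


def log_likelihood(freq, groups):
    """Log-likelihood of a stem cell frequency under a single-hit Poisson model.

    groups: iterable of (dose, n_mice, n_negative) tuples, where n_negative
    is the number of mice that were NOT reconstituted at that dose.
    """
    total = 0.0
    for dose, n_mice, n_negative in groups:
        p_negative = math.exp(-freq * dose)  # P(no repopulating cell in the graft)
        n_positive = n_mice - n_negative
        if n_negative:
            total += n_negative * math.log(p_negative)
        if n_positive:
            total += n_positive * math.log(1.0 - p_negative)
    return total


def estimate_frequency(groups, lo=1e-4, hi=1.0, steps=2000):
    """Maximum-likelihood frequency found by a log-spaced grid search."""
    grid = (lo * (hi / lo) ** (i / steps) for i in range(steps + 1))
    return max(grid, key=lambda f: log_likelihood(f, groups))


# Hypothetical outcome: (cells injected, mice in group, non-reconstituted mice).
HYPOTHETICAL_GROUPS = [(5, 3, 3), (20, 3, 2), (100, 3, 0)]
frequency = estimate_frequency(HYPOTHETICAL_GROUPS)
print(f"estimated frequency: about 1 repopulating cell in {1 / frequency:.0f} injected cells")
```

A grid search is used only to keep the sketch dependency-free; a dedicated limiting-dilution analysis tool would also report confidence intervals.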

Sample Characteristics
Expected/analyzed types of cells: red blood cells, lymphoid and myeloid properties, Ly-5.1 vs. Ly-5.2. After lysing red blood cells, samples are stained for lymphoid and myeloid properties in addition to the specific form (Ly-5.1 or Ly-5.2) of the alloantigen ptprc (protein tyrosine phosphatase receptor type c polypeptide) to detect donor and recipient respectively.

Sample Treatment Description
Single cell suspensions are prepared and the cell concentrations are adjusted to 10^7/ml. Cells have been incubated for 40 minutes at 4°C in a staining buffer (approx. 10^6 cells in 100 µl of staining buffer). The staining buffer contained a pre-titrated, optimal concentration (≤ 1 µg) of either a fluorescent monoclonal antibody specific for a receptor or an immunoglobulin (Ig) isotype-matched control (see below for details).
After the incubation, cells have been washed 1x with 2 ml of staining buffer and pelleted by centrifugation (250 × g for 5 min); supernatant has been removed.

Finally, cells have been resuspended in PBS/2% FCS (phosphate-buffered saline with 2% fetal calf serum) for flow cytometric analysis.
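
The cell numbers quoted for the staining step follow directly from the adjusted concentration. The short check below is only an illustration of that arithmetic; the function name is hypothetical.

```python
def cells_in_volume(concentration_per_ml, volume_ul):
    """Number of cells contained in a volume given in microlitres."""
    return concentration_per_ml * volume_ul / 1000.0


# 100 µl of a suspension adjusted to 10^7 cells/ml
cells_per_stain = cells_in_volume(1e7, 100)
print(cells_per_stain)  # 1000000.0, i.e. approx. 10^6 cells per staining reaction
```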

Fluorescence Reagent Description
Each sample has been stained using the following reagents:

Flow Cell and Fluidics
The instrument has not been altered; fixed-alignment cuvette flow cell.

Light Sources
The instrument has not been altered; three-laser base configuration with ACDU.

Excitation Optics Configuration
The instrument has not been altered.

Optical Filters
The instrument has not been altered; all filters are original and came with the instrument (February 15, 2007). See also the figure below.

Optical Paths
The instrument has not been altered. The following figure shows the filter and detector configuration:

List-mode Data Files
FCS data files can be obtained by contacting Dr. Clayton Smith after this work has been published.

Compensation Description
Compensation has been performed computationally post acquisition according to the following spillover matrix (values in %):
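
How a spillover matrix is applied computationally can be sketched as follows. This is a minimal sketch under stated assumptions: rows of the matrix are taken to be fluorochromes and columns detectors (the convention of the FCS $SPILLOVER keyword), the two-channel matrix values are hypothetical placeholders and not the values of this experiment, and the function names are illustrative only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("spillover matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def compensate(event, spillover_percent):
    """Recover per-fluorochrome signals from the observed detector values.

    Model: observed[j] = sum_i true[i] * S[i][j], with S given in percent.
    """
    n = len(event)
    s = [[v / 100.0 for v in row] for row in spillover_percent]
    s_transposed = [[s[i][j] for i in range(n)] for j in range(n)]
    return solve(s_transposed, event)


# Hypothetical two-channel spillover matrix (values in %).
HYPOTHETICAL_SPILLOVER = [
    [100.0, 10.0],  # fluorochrome 1 spills 10% into detector 2
    [5.0, 100.0],   # fluorochrome 2 spills 5% into detector 1
]
observed = [1010.0, 300.0]  # hypothetical uncompensated event
compensated = compensate(observed, HYPOTHETICAL_SPILLOVER)
print(compensated)
```

Solving the linear system per event is equivalent to multiplying the observed values by the inverse of the spillover matrix.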

Gating (Data Filtering) Details
The same gating strategy has been used for all data files (all recipient mice at different time points post transplant). All these gates would be reported in a real experiment description (e.g., attached as Gating-ML descriptions or as external image files). In order to keep this document as a clear and simple example we provide details only on gating of a single list mode data file and we include these as images within this document.

Gate Description
The gating strategy involves the following gates:
− FSC-SSC gate to define the leukocytes (Figure 3).
− FSC-DAPI gate applied on leukocytes to discriminate viable leukocytes (Figure 4).
− CD45.1-CD45.2 gates applied on the viable leukocytes for detecting the donor and the recipient percentage of engraftment respectively (Figure 5).
− Gr-1/CD11b-B220 gate applied on donor's viable leukocytes to determine the type of lineage reconstitution. After dividing the PE/APC plot into four quadrants, granulocytes/monocytes and lymphocytes (B cells) would appear in the upper left quadrant and lower right quadrant respectively (Figure 6).
− B220-CD4/CD8 gate applied on donor's viable leukocytes to determine the type of lineage reconstitution. After dividing the TxRed/APC plot into four quadrants, T-lymphocytes and B-lymphocytes would appear in the upper left quadrant and lower right quadrant respectively (Figure 7).

A different combination of two stains and two-color analysis is required to gate multi-color subsets of leukocytes in the same staining protocol.

In a real experiment description, gate statistics would be reported for all list mode data files on which gates have been applied. Again, we are focusing on a single data file only in order to keep this document as a clear and simple example.
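
The sequence of a scatter gate followed by a quadrant gate, and the per-quadrant statistics derived from it, can be sketched as follows. This is a minimal sketch, not the gating software used here: the channel names FL_X and FL_Y, the gate boundaries, the cut-offs and the five events are hypothetical placeholders, and which fluorochrome sits on which axis is deliberately left open.

```python
def rectangle_gate(events, x, y, x_range, y_range):
    """Keep the events whose (x, y) values fall inside a rectangle."""
    return [
        e for e in events
        if x_range[0] <= e[x] < x_range[1] and y_range[0] <= e[y] < y_range[1]
    ]


def quadrant_stats(events, x, y, x_cut, y_cut):
    """Percentage of events in each quadrant of a two-parameter plot."""
    counts = {"upper_left": 0, "upper_right": 0, "lower_left": 0, "lower_right": 0}
    for e in events:
        vertical = "upper" if e[y] >= y_cut else "lower"
        horizontal = "right" if e[x] >= x_cut else "left"
        counts[f"{vertical}_{horizontal}"] += 1
    total = len(events)
    return {q: (100.0 * c / total if total else 0.0) for q, c in counts.items()}


# Hypothetical events (arbitrary units).
events = [
    {"FSC": 300, "SSC": 200, "FL_X": 50, "FL_Y": 900},
    {"FSC": 320, "SSC": 180, "FL_X": 800, "FL_Y": 40},
    {"FSC": 310, "SSC": 220, "FL_X": 30, "FL_Y": 20},
    {"FSC": 20, "SSC": 10, "FL_X": 700, "FL_Y": 800},  # low scatter, gated out
    {"FSC": 305, "SSC": 210, "FL_X": 60, "FL_Y": 950},
]

# Scatter gate first, then the quadrant gate on the surviving events.
leukocytes = rectangle_gate(events, "FSC", "SSC", (100, 1000), (50, 1000))
stats = quadrant_stats(leukocytes, "FL_X", "FL_Y", x_cut=200, y_cut=200)
print(stats)
```

Each gate is applied to the output of the previous one, which mirrors the hierarchy in the list above; the viability and donor/recipient gates would be chained in the same way.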