Table 1 Microarray Workflow Listing

From: Workflows for microarray data processing in the Kepler environment

Workflow file name Goal
GFF file workflows
  Descriptive Statistics and File Information Group
DisplayRegion.xml Create a graphical display of the value field of a GFF file (like output provided by NimbleGen SignalMap)
GeneralHist.xml Create a histogram of a given column of a text file. Useful for microarray GFF files.
gffFreqPoly_python.xml Make several frequency polygons superimposed on one another for comparison.
gffFullDescription.xml Display information about the GFF file specified.
gffQuickLook.xml Display the first few lines of a GFF file.
gffStats_gffread_simple.xml Calculate min, max, mean, median, number of lines, and various percentiles of a specified field. (Python version)
gffStats_Rbased_simple.xml Calculate min, max, mean, median, number of lines, and various percentiles of a specified field. (R version)
ProbeSpacings.xml Make a histogram of the probe spacings of a GFF file.
  File Modification Group
AddComments.xml Add comments to the beginning of a GFF file.
gffMakeTiny.xml Greatly reduce the size of a GFF file so that loading and processing are much faster. Reduces file size by replacing the second, third, and last fields of the file with placeholders. Assumes that these fields are the same in all lines.
gffModThirdField.xml Modify the third field of a GFF file.
  File Processing Group (Sorting, Smoothing, Normalization, Subtraction, Splitting)
gffSmooth.xml Median smooth (length 3) the 6th column of GFF files.
gffSort.xml Sort a GFF file in chromosome + start point order (actually field 1 then field 4 order).
QuantNorm.xml Quantile normalize the 6th field (ratio field) of a series of GFF files.
gffQN_SM3_TINY.xml Quantile Normalize, Smooth, and Tiny-ize a set of GFF files. See gffMakeTiny.xml for explanation of Tiny-ize.
gffSubtract.xml Subtract one GFF file from another GFF file (result based on subtraction of values in field 6).
gffSplit.xml Split a GFF file containing the strings ‘tiled region’, ‘transcription_start_site’, and ‘primary_transcript’ into 3 separate files.
  Binding Site Detection
RunDetection.xml Find runs of ratios (6th field) that are greater than or equal to the specified percentile of that column. Can be used for binding site detection for ChIP-chip as in [26].
RunDetection_with_annotation.xml RunDetection workflow with added annotation of the resulting binding sites (e.g. nearest gene) using the R/Bioconductor ChIPpeakAnno package.
Affymetrix Analysis
AMDA.xml Perform Affymetrix gene expression microarray analysis.
AMDA_limmafinal.xml Variant of AMDA workflow using limma package [28] for differentially expressed gene determination.
PCR Primer Design
PrimerDesign.xml Pick sets of primers, given a chromosome range from user. Uses UCSC genome browser for outputs.
General Utilities
Regex_R.xml Simple example of finding a substring within a string using regular expressions in R.
kepler_cut.xml Clone of the UNIX ‘cut’ command.
kepler_paste.xml Clone of the UNIX ‘paste’ command.
kepler_sort.xml Clone of the UNIX ‘sort’ command.
  1. These workflows are further described in Additional file 2: Table S1. Each workflow is displayed in Additional file 1: Figures S1-S26.
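
To make the gffSmooth.xml and RunDetection.xml entries more concrete, here is a minimal Python sketch of the two operations they describe: a length-3 median smooth of the 6th GFF field, followed by detection of runs at or above a chosen percentile. It is an illustration, not the workflows' actual implementation. The function names, the tab-separated parsing, the choice to leave the two endpoint values unsmoothed, and the nearest-rank percentile definition are all assumptions.

```python
import math
import statistics


def read_scores(lines):
    """Extract the 6th (score/ratio) field from tab-separated GFF lines.

    Assumption: comment lines start with '#' and fields are tab-delimited.
    """
    scores = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        scores.append(float(fields[5]))
    return scores


def median_smooth3(values):
    """Running median with window length 3.

    Assumption: the first and last values are copied unchanged, since they
    do not have a full window of three neighbours.
    """
    if len(values) < 3:
        return list(values)
    smoothed = [values[0]]
    for i in range(1, len(values) - 1):
        smoothed.append(statistics.median(values[i - 1:i + 2]))
    smoothed.append(values[-1])
    return smoothed


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def find_runs(values, threshold, min_length=1):
    """Return inclusive (start, end) index pairs of consecutive values >= threshold."""
    runs = []
    start = None
    for i, value in enumerate(values):
        if value >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the data.
    if start is not None and len(values) - start >= min_length:
        runs.append((start, len(values) - 1))
    return runs


if __name__ == "__main__":
    ratios = [0.1, 0.2, 5.0, 0.3, 2.0, 2.5, 3.0, 0.1]
    smoothed = median_smooth3(ratios)
    cutoff = percentile_nearest_rank(smoothed, 75)
    print(smoothed, cutoff, find_runs(smoothed, cutoff))
```

In this toy example the isolated spike of 5.0 is removed by the smoothing step, so only the sustained stretch of elevated ratios is reported as a run. The `min_length` parameter is a hypothetical addition for discarding very short runs.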