From: Workflows for microarray data processing in the Kepler environment
Workflow file name | Goal |
---|---|
GFF file workflows | |
 | Descriptive Statistics and File Information Group |
DisplayRegion.xml | Create a graphical display of the value field of a GFF file (like output provided by NimbleGen SignalMap) |
GeneralHist.xml | Create a histogram of a given column of a text file. Useful for microarray GFF files. |
gffFreqPoly_python.xml | Make several frequency polygons superimposed on one another for comparison. |
gffFullDescription.xml | Display information about the GFF file specified. |
gffQuickLook.xml | Display the first few lines of a GFF file. |
gffStats_gffread_simple.xml | Calculate min, max, mean, median, number of lines, and various percentiles of a specified field. (Python version) |
gffStats_Rbased_simple.xml | Calculate min, max, mean, median, number of lines, and various percentiles of a specified field. (R version) |
ProbeSpacings.xml | Make a histogram of the probe spacings of a GFF file. |
 | File Modification Group |
AddComments.xml | Add comments to the beginning of a GFF file. |
gffMakeTiny.xml | Greatly reduce the size of a GFF file so that loading and processing are much faster. Reduces file size by replacing the second, third, and last fields of the file with placeholders. Assumes that these fields are the same in all lines. |
gffModThirdField.xml | Modify the third field of a GFF file. |
 | File Processing Group (Sorting, Smoothing, Normalization, Subtraction, Splitting) |
gffSmooth.xml | Median smooth (length 3) the 6th column of GFF files. |
gffSort.xml | Sort a GFF file in chromosome + start point order (actually field 1 then field 4 order). |
QuantNorm.xml | Quantile normalize the 6th field (ratio field) of a series of GFF files. |
gffQN_SM3_TINY.xml | Quantile Normalize, Smooth, and Tiny-ize a set of GFF files. See gffMakeTiny.xml for explanation of Tiny-ize. |
gffSubtract.xml | Subtract one GFF file from another GFF file (result based on subtraction of values in field 6). |
gffSplit.xml | Split a GFF file containing the strings ‘tiled region’, ‘transcription_start_site’, and ‘primary_transcript’ into 3 separate files. |
 | Binding Site Detection |
RunDetection.xml | Calculate runs of ratios (6th field) that are greater than or equal to the specified percentile of that column. Can be used for binding site detection for ChIP-chip as in [26]. |
RunDetection_with_annotation.xml | RunDetection workflow with added annotation of the resulting binding sites (e.g. nearest gene) using the R/Bioconductor ChIPpeakAnno package. |
Affymetrix Analysis | |
AMDA.xml | Perform Affymetrix gene expression microarray analysis. |
AMDA_limmafinal.xml | Variant of the AMDA workflow that uses the limma package [28] to determine differentially expressed genes. |
PCR Primer Design | |
PrimerDesign.xml | Pick sets of primers, given a chromosome range supplied by the user. Uses the UCSC genome browser for outputs. |
General Utilities | |
Regex_R.xml | Simple example of finding a substring within a string using regular expressions in R. |
kepler_cut.xml | Clone of the UNIX ‘cut’ command. |
kepler_paste.xml | Clone of the UNIX ‘paste’ command. |
kepler_sort.xml | Clone of the UNIX ‘sort’ command. |
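
The table describes several of the File Processing and Binding Site Detection workflows only in one line each, so a minimal Python sketch of the underlying operations may help: length-3 median smoothing (as in gffSmooth.xml), quantile normalization (as in QuantNorm.xml), and percentile-based run detection (as in RunDetection.xml), all applied to the 6th field of tab-delimited GFF lines. This is not the workflows' actual code. All function names are hypothetical, and the handling of endpoints, ties, the percentile definition, and the `min_len` parameter are assumptions made for illustration.

```python
import math
from statistics import median


def read_scores(lines):
    """Extract field 6 (index 5) of tab-delimited GFF lines as floats.

    Comment lines starting with '#' and blank lines are skipped.
    """
    scores = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        scores.append(float(line.rstrip("\n").split("\t")[5]))
    return scores


def median_smooth3(values):
    """Running median with window length 3.

    Assumption: the first and last values are left unchanged.
    """
    if len(values) < 3:
        return list(values)
    out = [values[0]]
    for i in range(1, len(values) - 1):
        out.append(median(values[i - 1:i + 2]))
    out.append(values[-1])
    return out


def quantile_normalize(columns):
    """Quantile normalize several equal-length lists of values.

    Each value is replaced by the mean, across all lists, of the values
    that share its rank. Assumption: ties are broken by position.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(sc[r] for sc in sorted_cols) / len(columns) for r in range(n)]
    result = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        result.append(normalized)
    return result


def percentile_threshold(values, pct):
    """Nearest-rank percentile (one of several common definitions)."""
    ordered = sorted(values)
    k = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[k]


def detect_runs(values, pct, min_len=1):
    """Return (start, end) index pairs of consecutive values >= the pct-th percentile.

    Assumption: min_len is a hypothetical minimum run length in probes.
    """
    threshold = percentile_threshold(values, pct)
    runs, start = [], None
    for i, v in enumerate(values):
        if v >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(values) - start >= min_len:
        runs.append((start, len(values) - 1))
    return runs
```

In this sketch a combined pipeline in the spirit of gffQN_SM3_TINY.xml would call `quantile_normalize` on the score lists from a set of files and then `median_smooth3` on each result; `detect_runs` returns probe index ranges that would still need to be mapped back to genomic coordinates (fields 1, 4, and 5).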