Table 2 The description and possible values of designGG arguments

From: designGG: an R-package and web tool for the optimal design of genetical genomics experiments

Arguments

Description

Possible value(s)

bTwoColorArray a

The type of platform

T(RUE) or F(ALSE) for the dual- or single-channel option, respectively. For example, F for one-color and T for two-color gene expression microarrays (the dual-channel option is also used for any other technology profiling pairs of samples)

genotype a

Genotype information

A matrix of marker genotypes for each marker and each strain. The values can be numeric: 1 and 0 for the two homozygous genotypes (optionally, 0.5 for a heterozygous genotype). They can also be characters: "A" and "B" for the two homozygous genotypes and "H" for a heterozygous genotype. Missing data are coded as NA. The column names are strain names, such as "Strain 1", "Strain 2", etc. The row names are marker names, such as "C1M1", "C2M2", etc.
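As a minimal sketch of the layout described above, the following builds a small genotype matrix with markers in rows and strains in columns. The matrix size, the marker and strain names, and the random genotype values are illustrative assumptions, not data shipped with the package.

```r
# Illustrative genotype matrix: 4 markers (rows) x 5 strains (columns).
# All names and values are made up for this sketch.
set.seed(1)
nMarkers <- 4
nStrains <- 5
genotype <- matrix(
  sample(c(0, 1), nMarkers * nStrains, replace = TRUE),
  nrow = nMarkers,
  ncol = nStrains,
  dimnames = list(
    paste0("C1M", seq_len(nMarkers)),   # marker names as row names
    paste("Strain", seq_len(nStrains))  # strain names as column names
  )
)
genotype[2, 3] <- NA  # missing data are coded as NA

# Equivalent character coding: "A"/"B" for the two homozygous genotypes.
# ifelse() keeps the matrix shape and dimnames; NA stays NA.
genotypeChar <- ifelse(genotype == 1, "A", "B")
```

Either coding carries the same information; the character form is shown only because the table lists it as an accepted alternative.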

nEnvFactors a

Number of environmental factors in the study

An integer between 1 and 3 giving the number of environmental factors to be studied. Experiments with more than three environmental factors are not recommended, since the power to estimate the high-order interactions is very limited for a realistic number of samples (several hundred).

nLevels a

Number of levels for each environmental factor

A numeric integer vector with one entry per environmental factor. For example, if each of the two environmental factors under study has two levels, use nLevels <- c(2, 2).

Level b

Level values for each environmental factor

A list which specifies the levels for each factor in the experiment. Each element is a vector describing all levels of one environmental factor. For example, if the temperature levels are 16 and 24 and the drug treatment levels are 5 and 10, then we use:

Level <- list(c(16, 24), c(5, 10))
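Because nEnvFactors, nLevels and Level describe the same design, they have to agree with one another. One way to keep them consistent, sketched here under the assumption that conditions form a full factorial of the factor levels, is to derive the first two from Level. The column labels F1 and F2 are chosen for this sketch only.

```r
# Example settings from the table: temperature (16, 24) and drug dose (5, 10)
Level <- list(c(16, 24), c(5, 10))

# Derive the dependent arguments instead of typing them by hand
nEnvFactors <- length(Level)                  # number of factors
nLevels <- vapply(Level, length, integer(1))  # levels per factor

# All combinations of factor levels, one row per condition
# (assumes a full factorial layout)
conditions <- expand.grid(F1 = Level[[1]], F2 = Level[[2]])
print(conditions)
```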

nSlides c

Total number of slides available for the experiment.

A numeric integer value

nTuple c

Average number of strains to be assigned to each condition

A numeric value which is larger than 1

region b

Genome region of biological interest

A numeric integer vector which indicates the markers of biological interest, for example those previously detected for phenotypic quantitative trait loci. The value is the marker index (i.e., the row number in the genotype data table), not the marker name.
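Since region expects row indices rather than marker names, a lookup step is usually needed. The sketch below uses base R's match() for this; the marker names are hypothetical stand-ins for rownames(genotype).

```r
# Hypothetical marker names standing in for rownames(genotype)
markerNames <- c("C1M1", "C1M2", "C2M1", "C2M2", "C3M1")

# Markers of interest, e.g. close to a previously detected QTL
markersOfInterest <- c("C2M1", "C3M1")

# match() returns the row position of each name, or NA if it is absent
region <- match(markersOfInterest, markerNames)
stopifnot(!anyNA(region))  # fail early if a marker name is misspelled
print(region)
```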

weight b

The weights for estimating genetic and environmental factors, and their interaction terms

A numeric vector which indicates the parameters of biological interest. Higher weights correspond to higher interest, and the optimization is adjusted so that parameters with higher weight are estimated more accurately. Prior knowledge about expected effect sizes of interesting factors can also be incorporated as weight parameters for the algorithm: if the same relative accuracy is intended, the weight is inversely proportional to the expected effect size of the corresponding parameter. When there is no environmental perturbation, weight is 1, as there is only one parameter of interest (genotype). When nEnvFactors = 1, weight = c(wQ, wF1, wQF1). When nEnvFactors = 2, weight = c(wQ, wF1, wF2, wQF1, wQF2, wF1F2, wQF1F2). When nEnvFactors = 3, weight = c(wQ, wF1, wF2, wF3, wQF1, wQF2, wQF3, wF1F2, wF1F3, wF2F3, wQF1F2, wQF1F3, wQF2F3, wQF1F2F3). Here wQ is the weight for the genotype effect, wF1 is the weight for the effect of environmental factor F1, wQF1 is the weight for the interaction between genotype and F1, and so on.
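The position of each entry in weight matters, so it can help to build the vector with names and check it before use. The following sketch covers the two-factor case in the order given above. The weight values are illustrative, and whether the function accepts a named vector is an assumption, hence the unname() step.

```r
# Weight order for two environmental factors, as listed in the table
weightNames <- c("wQ", "wF1", "wF2", "wQF1", "wQF2", "wF1F2", "wQF1F2")

# Start with equal interest in every parameter
weight <- setNames(rep(1, length(weightNames)), weightNames)

# Illustrative choice: emphasise genotype-by-environment interactions
weight[c("wQF1", "wQF2")] <- 2
print(weight)

# Plain numeric vector in the same order, in case names are not accepted
weightPlain <- unname(weight)
```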

nIterations b

Number of iterations of the simulated annealing method

A numeric integer value larger than 1. Default = 3000

directory b

Output file directory

The path where output files will be saved.

fileName b

Output file names

The name for output tables in CSV format to be produced.

  a Required input arguments from users
  b Optional input arguments
  c Alternative arguments: one of the two is required
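The constraints in the table can be checked before an optimization is started. The sketch below assembles a list of arguments and validates it with checkDesignArgs, a hypothetical helper written for this illustration that is not part of designGG. The genotype values and nSlides = 100 are made up, and the final call is left as a comment because it assumes the package is installed and exports a function named designGG.

```r
# Hypothetical helper: checks a list of arguments against the constraints
# stated in Table 2. It is not part of the designGG package.
checkDesignArgs <- function(a) {
  stopifnot(
    is.logical(a$bTwoColorArray),
    is.matrix(a$genotype),
    a$nEnvFactors %in% 1:3,
    length(a$nLevels) == a$nEnvFactors,
    # nSlides and nTuple are alternatives: at least one must be given
    !is.null(a$nSlides) || !is.null(a$nTuple)
  )
  if (!is.null(a$Level)) {
    # each factor must list as many values as nLevels declares
    stopifnot(all(lengths(a$Level) == a$nLevels))
  }
  if (!is.null(a$region)) {
    # region holds row indices into the genotype matrix
    stopifnot(all(a$region %in% seq_len(nrow(a$genotype))))
  }
  invisible(TRUE)
}

designArgs <- list(
  bTwoColorArray = TRUE,
  genotype = matrix(c(0, 1, 1, 0, 1, 0), nrow = 2,
                    dimnames = list(c("C1M1", "C1M2"),
                                    paste("Strain", 1:3))),
  nEnvFactors = 2,
  nLevels = c(2, 2),
  Level = list(c(16, 24), c(5, 10)),
  nSlides = 100,      # illustrative value
  region = 1,
  nIterations = 3000  # default stated in the table
)
checkDesignArgs(designArgs)

# With the package installed, the list could then be passed on, e.g.
# do.call(designGG::designGG, designArgs)
```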