Figure 1 | BMC Bioinformatics

Figure 1

From: G-NEST: a gene neighborhood scoring tool to identify co-conserved, co-expressed genes

Overview of G-NEST. The user’s gene expression data is first filtered to remove overlapping transcripts. Next, all possible gene neighborhoods are compiled based on the user-requested range of neighborhood sizes to test, specified both as a number of genes and as a width in base pairs. Based on the user’s gene expression data, the correlation of every gene’s expression profile with the expression profile of every other gene in the genome is computed and stored in a matrix. Non-expressing genes, which are identified using a user-supplied minimum gene expression level threshold or a minimum number of MAS5 detection calls, are assigned correlation values of 0. Given the genes within each potential neighborhood and the matrix of pairwise correlations, the Average Neighborhood Correlation (ANC) is computed for each neighborhood. The ANC is the average of all pairwise correlations of genes in the neighborhood. For example, the ANC of a neighborhood with genes A, B, and C would be equal to [corr(A,B) + corr(B,C) + corr(A,C)] / 3. The significance of the observed ANC is then determined by comparing it with the ANCs computed from genomes with randomized gene order. Given the genes within each potential neighborhood and syntenic blocks for organisms of interest, a Synteny Score (SS) is computed as the proportion of genomes in which the synteny of the neighborhood is maintained. Finally, a Total Neighborhood Score (TNS) is computed from the SS and the ANC.
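
The ANC, its permutation-based significance, and the SS described above can be illustrated with a minimal Python sketch. This is not the G-NEST implementation: the function names, the toy correlation matrix, and the p-value estimator (the share of randomized gene orders whose ANC is at least the observed one, with an add-one correction) are assumptions made for illustration. The caption does not state how the SS and ANC are combined into the TNS, so that step is left out.

```python
from itertools import combinations

import numpy as np


def average_neighborhood_correlation(corr, genes):
    """Mean of all pairwise correlations among the given gene indices."""
    pairs = list(combinations(genes, 2))
    return sum(corr[i, j] for i, j in pairs) / len(pairs)


def anc_p_value(corr, neighborhood, n_perm=1000, seed=0):
    """Estimate how often a randomized gene order gives an ANC >= the observed one.

    ASSUMPTION: gene order is randomized by shuffling all gene indices and
    re-reading the genes that land on the neighborhood's positions; the
    caption does not specify the exact randomization or estimator.
    """
    rng = np.random.default_rng(seed)
    positions = np.asarray(neighborhood)
    observed = average_neighborhood_correlation(corr, positions)
    hits = 0
    for _ in range(n_perm):
        order = rng.permutation(corr.shape[0])
        if average_neighborhood_correlation(corr, order[positions]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def synteny_score(maintained):
    """Proportion of compared genomes in which the neighborhood's synteny holds."""
    return sum(maintained) / len(maintained)


# Toy correlation matrix for six genes (hypothetical values).
# Genes 0, 1, 2 play the roles of A, B, C from the caption; genes 3 to 5
# have correlation 0 with every other gene, as for non-expressing genes.
corr = np.eye(6)
corr[0, 1] = corr[1, 0] = 0.9  # corr(A,B)
corr[1, 2] = corr[2, 1] = 0.6  # corr(B,C)
corr[0, 2] = corr[2, 0] = 0.3  # corr(A,C)

anc = average_neighborhood_correlation(corr, [0, 1, 2])  # (0.9 + 0.6 + 0.3) / 3
p = anc_p_value(corr, [0, 1, 2], n_perm=200)
ss = synteny_score([True, True, False, True])  # synteny kept in 3 of 4 genomes
```

Here `anc` follows the caption's three-gene formula directly, and `ss` is a plain proportion. The fixed `seed` only makes the permutation estimate reproducible; a real analysis would likely use many more permutations than the 200 used in this toy call.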
