
Table 3 The effect of truncating the sequence upstream of the TSS

From: Finding sequence motifs with Bayesian models incorporating positional information: an application to transcription factor binding sites

| Sequence range | TSS Tompa: without positional info | TSS Tompa: with positional info | TSS Tompa: p-value | TRANSFAC: without positional info | TRANSFAC: with positional info | TRANSFAC: p-value |
|---|---|---|---|---|---|---|
| [-2000, 0] | -0.008 | 0.101 | 0.002 | -0.009 | 0.027 | 10^-8 |
| [-1000, 0] | 0.086 | 0.098 | 0.583 | 0.050 | 0.066 | 0.112 |
| [-500, 0] | 0.125 | 0.133 | 0.338 | 0.077 | 0.078 | 0.070 |
| [-250, 0] | 0.139 | 0.139 | 0.054 | 0.094 | 0.076 | 0.603 |

The first column shows the sequence range upstream of the TSS given as input to A-GLAM. For the TSS Tompa and TRANSFAC datasets, the corresponding groups of three columns show the CCC obtained without and with positional information. The third column of each group gives a Wilcoxon p-value, which evaluates the difference between the CCCs in the previous two columns. Because not all TFBSs in our datasets are known, small improvements in the CCC correspond to true improvements of unknown magnitude. For example, in the [-250, 0] range for the TSS Tompa dataset, the two CCC values both round to 0.139, yet their unseen decimals differ enough to yield a p-value of 0.054. To view results for individual sites in the Tompa dataset, see Supplementary Table 7 [see Additional file 1].
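The point about rounded values can be illustrated with a small sketch. It assumes the p-values come from a paired Wilcoxon signed-rank test over per-dataset CCCs, which the note does not state explicitly. The function name `wilcoxon_signed_rank` and the eight pairs of values are hypothetical and are not taken from the Tompa or TRANSFAC data. The sketch computes an exact two-sided p-value by enumerating all sign assignments, which is only practical for small samples.

```python
from itertools import product
from statistics import mean


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples (small n only)."""
    # Paired differences; zero differences carry no sign information and are dropped.
    diffs = [b - a for a, b in zip(x, y) if b != a]
    n = len(diffs)

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # Test statistic: sum of the ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Exact null distribution: every difference is equally likely to be + or -.
    null_sums = [
        sum(r for r, s in zip(ranks, signs) if s)
        for signs in product((0, 1), repeat=n)
    ]
    p_upper = sum(s >= w_plus for s in null_sums) / len(null_sums)
    p_lower = sum(s <= w_plus for s in null_sums) / len(null_sums)
    return w_plus, min(1.0, 2 * min(p_upper, p_lower))


# Hypothetical per-dataset CCCs: each pair differs only in the fourth decimal.
without_pos = [0.101, 0.200, 0.050, 0.300, 0.150, 0.120, 0.080, 0.110]
with_pos = [0.1011, 0.2002, 0.0503, 0.3004, 0.1505, 0.1206, 0.0807, 0.1108]

w_plus, p_value = wilcoxon_signed_rank(without_pos, with_pos)
print("mean without:", round(mean(without_pos), 3))
print("mean with:   ", round(mean(with_pos), 3))
print("W+ =", w_plus, " p =", p_value)
```

In this made-up example both means round to 0.139, but all eight differences are positive, so the exact two-sided p-value is 2/2^8, about 0.008. The test depends on the signs and ranks of the paired differences rather than on the rounded averages, which is why identical rounded CCCs can still give a small p-value.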