Fig. 7 | BMC Bioinformatics

Fig. 7

From: Enhancer prediction in the human genome by probabilistic modelling of the chromatin feature patterns

Diagram of the PREPRINT steps. The PREPRINT procedure consists of five modules. In the preprocessing module, the genome-wide coverage signals for K chromatin features were first extracted from the short sequencing reads (Step a). In the second step (Step b), the training data coverage matrices and the aggregate patterns were obtained. In the statistical modelling module (Step c), the individual samples were assumed to follow a Poisson distribution with the scaled aggregate patterns as parameters. In the third module, maximum likelihood (ML) or Bayesian probabilistic scores were computed to quantify the fit of the samples to the aggregate patterns (Step d), and these scores were collected into the final training data matrix (Step e). In the fourth module, an SVM classifier was trained with a Gaussian kernel (Step f). Finally, in the fifth module, the probabilistic scores were computed along the whole genome (Step g) and classified by PREPRINT to obtain the enhancer predictions (Step h). The aligned reads and coverage signals were visualised with the UCSC Genome Browser [44].
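Steps b to d can be illustrated with a minimal sketch. It is not the PREPRINT implementation: the function names, the toy coverage values, and the choice of the scale factor as sum(sample) / sum(pattern) (the ML estimate of a single multiplicative scale under a Poisson model) are assumptions made here for illustration. The aggregate pattern is taken as the per-bin mean of the training samples, and the score is the Poisson log-likelihood of a sample given the scaled pattern.

```python
import math

def aggregate_pattern(samples):
    """Per-bin mean coverage across training samples (rows are samples)."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def poisson_ml_score(sample, pattern, eps=1e-9):
    """Poisson log-likelihood of `sample` given the scaled aggregate pattern."""
    # Assumption: one scale factor per sample, set to its ML estimate
    # under the Poisson model, sum(sample) / sum(pattern).
    scale = sum(sample) / max(sum(pattern), eps)
    score = 0.0
    for x, p in zip(sample, pattern):
        lam = max(scale * p, eps)  # clamp to keep log() defined
        score += x * math.log(lam) - lam - math.lgamma(x + 1)
    return score

# Toy coverage counts for one chromatin feature over five bins (hypothetical data).
training = [[1, 4, 9, 4, 1], [2, 6, 12, 5, 1], [0, 5, 10, 6, 2]]
pattern = aggregate_pattern(training)

peaked = [1, 5, 11, 5, 1]  # resembles the aggregate pattern
flat = [5, 5, 4, 5, 4]     # same total coverage, different shape

print(poisson_ml_score(peaked, pattern) > poisson_ml_score(flat, pattern))
```

In the procedure described above, one such score per chromatin feature would form a row of the training data matrix (Step e), which is then passed to the SVM classifier (Step f).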
