
Table 1 An overview of Bioconductor packages for analyzing high-throughput sequencing data.

From: ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data

| Package | Classification | Functionalities |
| --- | --- | --- |
| ShortRead | Input/Output; QA; Filtering | Supplies methods for reading, quality assessment (QA) and basic manipulation of high-throughput sequencing data. |
| Rolexa | Base calling; QA | Supports probabilistic base calling, quality checks and diagnostic plots for Solexa sequencing data. |
| IRanges | Infrastructure; Range-based algorithms | Provides infrastructure for representing and manipulating sets of integer ranges, and implements algorithms for range-based calculations such as intersect, union, disjoint, overlap and coverage. |
| BSgenome | Whole genome annotation data | Supplies infrastructure for efficiently representing, accessing and analyzing whole genomes. |
| Biostrings | String manipulation | Implements functions for pattern matching, sequence alignment and string manipulation. |
| rtracklayer | Visualization | Provides an interface between R and genome browsers and implements functions to import, create, export, and display track data by linking R with existing genome browsers. |
| GenomeGraphs | Visualization | Integrates Ensembl annotation obtained using the biomaRt package with the grid graphics package to facilitate visualization, plotting and analysis of diverse genomic datasets. |
| ChIPpeakAnno | Annotation; Plotting; Overlap test; Enrichment test | Implements a common annotation workflow for ChIP-seq data, such as finding nearest or overlapping features and obtaining enriched GO terms. In addition, it contains functions for determining the significance of the overlap and for visualizing the overlap among different datasets as a Venn diagram. |
| Genominator | Annotation; Summarization | Offers an interface for storing and retrieving genomic data in an SQLite database. |
| ChIPsim | Simulation of ChIP-seq experiments | Provides a framework for the simulation of ChIP-seq experiments such as nucleosome positioning and transcription factor binding sites. |
| chipseq* | Analysis of ChIP-seq data | Implements a basic workflow for analyzing ChIP-seq experiments, including functions to extend reads, calculate genomic coverage, and identify peaks. |
| CSAR* | Analysis of ChIP-seq data | Contributes methods to normalize the count data and detect protein-bound genomic regions with a controlled false discovery rate through random permutation. Models the sequence counts as a Poisson distribution. |
| BayesPeak* | Analysis of ChIP-seq data | Identifies peaks using hidden Markov models and Bayesian statistical methodology. Models the sequence counts as a negative binomial distribution. |
| ChIPseqR | Analysis of nucleosome ChIP-seq data | Furnishes functions to analyze nucleosome ChIP-seq data and may be adapted to handle other types of ChIP-seq experiments. |
| edgeR | Analysis of RNA-seq data | Provides statistical routines for determining differential expression in count-based expression data such as RNA-seq, SAGE and CAGE. The RNA-seq data are modelled with the negative binomial distribution, and an empirical Bayes procedure is applied. |
| DEGseq | Analysis of RNA-seq data | Implements functions for identifying differentially expressed genes from RNA-seq data by modelling the RNA-seq data as a binomial distribution. |
| baySeq | Analysis of RNA-seq data | Contains methods to determine differential expression in count-based expression data with more complex experimental designs using Bayesian methods. |
| DESeq* | Analysis of RNA-seq data | Provides functions for identifying differentially expressed genes from RNA-seq data by modelling the RNA-seq data as a negative binomial distribution. |
| goseq* | Enrichment testing of RNA-seq data | GO enrichment testing for RNA-seq data. |
*Available in BioC 2.6 in R 2.11.0.
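
The range-based calculations listed for IRanges (union, intersect, overlap and coverage) can be illustrated with a small, language-neutral sketch. The Python below is not the IRanges API and does not reflect how that package is implemented. The function names, the `Range` alias and the use of 1-based inclusive `(start, end)` tuples are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumption for this sketch: a range is (start, end), 1-based, both ends inclusive.
Range = Tuple[int, int]


def union(a: List[Range], b: List[Range]) -> List[Range]:
    """Merge all ranges from a and b into sorted, non-overlapping ranges."""
    merged: List[Range] = []
    for start, end in sorted(a + b):
        # Adjacent integer ranges such as (1, 3) and (4, 6) are merged too.
        if merged and start <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersect(a: List[Range], b: List[Range]) -> List[Range]:
    """Return the positions covered by both a and b as ranges."""
    result: List[Range] = []
    for s1, e1 in union(a, []):        # normalize each set first
        for s2, e2 in union(b, []):
            lo, hi = max(s1, s2), min(e1, e2)
            if lo <= hi:
                result.append((lo, hi))
    return result


def overlaps(query: List[Range], subject: List[Range]) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) where query[i] overlaps subject[j]."""
    return [
        (i, j)
        for i, (qs, qe) in enumerate(query)
        for j, (ss, se) in enumerate(subject)
        if qs <= se and ss <= qe
    ]


def coverage(ranges: List[Range], width: int) -> List[int]:
    """Count how many ranges cover each position 1..width.

    Assumes every range satisfies 1 <= start <= end <= width.
    """
    diff = [0] * (width + 2)
    for start, end in ranges:
        diff[start] += 1
        diff[end + 1] -= 1
    depth, running = [], 0
    for pos in range(1, width + 1):
        running += diff[pos]
        depth.append(running)
    return depth


if __name__ == "__main__":
    peaks = [(1, 5), (4, 10), (20, 25)]   # made-up example intervals
    genes = [(3, 6), (22, 30)]            # made-up example intervals
    print(union(peaks, genes))
    print(intersect(peaks, genes))
    print(overlaps(peaks, genes))
    print(coverage([(1, 3), (2, 4)], 5))
```

The pairwise loops keep the sketch short but scale quadratically. Genome-scale tools typically rely on sorted sweeps or interval trees for the same operations.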