
Table 1 An overview of Bioconductor packages for analyzing high-throughput sequencing data.

From: ChIPpeakAnno: a Bioconductor package to annotate ChIP-seq and ChIP-chip data

| Package | Classification | Functionalities |
| --- | --- | --- |
| ShortRead | Input/Output, QA, Filtering | Supplies methods for reading, quality assessment (QA) and basic manipulation of high-throughput sequencing data. |
| Rolexa | Base Calling, QA | Supports probabilistic base calling, quality checks and diagnostic plots for Solexa sequencing data. |
| IRanges | Infrastructure, Range-based algorithms | Provides infrastructure for representing and manipulating sets of integer ranges, and implements algorithms for range-based calculations such as intersect, union, disjoint, overlap and coverage. |
| BSgenome | Whole Genome Annotation Data | Supplies infrastructure for efficiently representing, accessing and analyzing whole genomes. |
| Biostrings | String manipulation | Implements functions for pattern matching, sequence alignment and string manipulation. |
| rtracklayer | Visualization | Provides an interface between R and genome browsers and implements functions to import, create, export, and display track data by linking R with existing genome browsers. |
| GenomeGraphs | | Integrates Ensembl annotation obtained using the biomaRt package and the grid graphics package to facilitate visualization, plotting and analysis of diverse genomic datasets. |
| ChIPpeakAnno | Annotation, Plotting, Overlap test, Enrichment test | Implements a common annotation workflow for ChIP-seq data such as finding nearest or overlapping features and obtaining enriched GO terms. In addition, it contains functions for determining the significance of the overlap and visualizing the overlap as a Venn diagram among different datasets. |
| Genominator | Annotation, Summarization | Offers an interface for storing and retrieving genomic data in an SQLite database. |
| ChIPsim | Simulation of ChIP-seq experiments | Provides a framework for the simulation of ChIP-seq experiments such as nucleosome positioning and transcription factor binding sites. |
| chipseq* | Analysis of ChIP-seq data | Implements a basic workflow for analyzing ChIP-seq experiments, including functions to extend reads, calculate genomic coverage, and identify peaks. |
| CSAR* | | Contributes methods to normalize the count data and detect protein-bound genomic regions with controlled false discovery rate through random permutation. Models the sequence counts with a Poisson distribution. |
| BayesPeak* | | Identifies peaks using hidden Markov models and Bayesian statistical methodology. Models the sequence counts with a negative binomial distribution. |
| ChIPseqR | Analysis of nucleosome ChIP-seq data | Furnishes functions to analyze nucleosome ChIP-seq data and may be adapted to handle other types of ChIP-seq experiments. |
| edgeR | Analysis of RNA-seq data | Provides statistical routines for determining differential expression in count-based expression data such as RNA-seq, SAGE and CAGE. The RNA-seq data are modelled with a negative binomial distribution, and an empirical Bayes procedure is applied. |
| DEGseq | | Implements functions for identifying differentially expressed genes from RNA-seq data by modelling the RNA-seq data with a binomial distribution. |
| baySeq | | Contains methods to determine differential expression in count-based expression data with more complex experimental designs using Bayesian methods. |
| DESeq* | | Provides functions for identifying differentially expressed genes from RNA-seq data by modelling the RNA-seq data with a negative binomial distribution. |
| goseq* | Enrichment testing of RNA-seq data | Performs GO enrichment testing for RNA-seq data. |

*Available in BioC 2.6 with R 2.11.0.