Table 1 Summary of PileLine functionalities

From: PileLine: a toolbox to handle genome position information in next-generation sequencing studies

| Tool | Description |
| --- | --- |
| **Processing and annotation** | |
| fastseek | Retrieves all lines within a specified genome range. |
| fastjoin | Joins two GP input files by genomic coordinate. It can also perform left and right outer joins, which additionally print orphan (unmatched) lines. |
| rfilter | Selects only those positions that fall inside at least one of a given set of intervals (.bed or .gff files). It also implements an annotation mode that reports all positions plus an extra column listing every interval that contains each position. |
| sort | Sorts a GP file by genomic coordinate. SAMtools-generated pileup files are usually already sorted. |
| pileup2sift | Generates a SIFT-compatible change column for each variant line in the GP file. |
| pileup2polyphen | Generates a PolyPhen-2-compatible change column for each variant line in the GP file. |
| pileup2firestar | Generates a firestar-compatible input for each variant line in the GP file. |
| **Analysis** | |
| 2smc | Compares two samples (e.g. case vs. control) by retrieving all positions where the genotype differs between the two samples. Each sample requires a variant GP file as well as the complete GP file (which includes the invariant positions). |
| nsmc | Compares n samples from two conditions (e.g. case vs. control). Taking one GP file per sample, it reports which samples contain each position and performs a Fisher's exact test to find reproducible and characteristic positions. |
| genotest | Performs a quality-control (QC) test on genotyping. Compares two genotypes (experimental vs. gold standard) and evaluates performance in detecting homozygous and heterozygous variants. It also generates data for plotting a ROC curve to estimate the best SNP quality threshold. See Additional File 1 for an example output. |
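
To make the two rfilter modes described in the table concrete, the following is a minimal Python sketch of interval filtering and interval annotation. It is not PileLine's implementation. It assumes that a GP (genome position) file is tab-delimited with the sequence name and the position in its first two columns, and that intervals are supplied as `(chrom, start, end, name)` tuples with 1-based inclusive coordinates rather than parsed from real .bed or .gff files. The function names `index_intervals`, `containing` and `rfilter_sketch` are hypothetical.

```python
from collections import defaultdict


def index_intervals(intervals):
    """Group (chrom, start, end, name) tuples by chromosome.

    Assumption: coordinates are 1-based and inclusive on both ends.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, name in intervals:
        by_chrom[chrom].append((start, end, name))
    return by_chrom


def containing(by_chrom, chrom, pos):
    """Return the names of all intervals on `chrom` that contain `pos`."""
    return [name for start, end, name in by_chrom.get(chrom, [])
            if start <= pos <= end]


def rfilter_sketch(gp_lines, intervals, annotate=False):
    """Filter or annotate GP lines against a set of intervals.

    annotate=False: keep only lines whose position lies in >= 1 interval.
    annotate=True:  keep every line and append a column with the
                    comma-separated names of the containing intervals
                    (empty when no interval contains the position).
    """
    by_chrom = index_intervals(intervals)
    out = []
    for line in gp_lines:
        fields = line.rstrip("\n").split("\t")
        # Assumed GP layout: column 1 = sequence, column 2 = position.
        chrom, pos = fields[0], int(fields[1])
        hits = containing(by_chrom, chrom, pos)
        if annotate:
            out.append("\t".join(fields + [",".join(hits)]))
        elif hits:
            out.append("\t".join(fields))
    return out


# Illustrative (made-up) data: three positions and three named intervals.
gp = ["chr1\t100\tA\tG", "chr1\t250\tC\tC", "chr2\t50\tT\tA"]
regions = [("chr1", 90, 120, "geneA"),
           ("chr1", 100, 300, "geneB"),
           ("chr2", 200, 400, "geneC")]

filtered = rfilter_sketch(gp, regions)
annotated = rfilter_sketch(gp, regions, annotate=True)
```

With this toy input, `filtered` keeps the two chr1 lines, while `annotated` keeps all three lines and adds the interval column. The linear scan over intervals keeps the sketch short; for large interval sets, a sorted or tree-based index would be the more usual choice.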