CloVR: A virtual machine for automated and portable sequence analysis from the desktop using cloud computing

BMC Bioinformatics

Table 1 Overview of CloVR analysis protocols

Track	Process	Tool	Input	Output
CloVR-Search	Database search	BLAST [60]	nt or pep FASTA	BLAST output
CloVR-Microbe [38]	Assembly	Celera assembler [61], Velvet [51]	Raw sequence data (SFF, nt.FASTA¹, nt.FASTQ¹)	nt.FASTA
	Gene prediction	Glimmer3 [62]		pep.FASTA
	tRNA prediction	tRNAscan-SE [63]		GBK, SQN
	rRNA prediction	RNAmmer [64]		GBK, SQN
	Functional annotation	BLASTX against UniRef100 [58] and COG [65], HMMER [66] search against Pfam [67] and TIGRfam [68]		Annotated GBK, SQN
CloVR-16S [39]	Quality checking	Mothur [17], Qiime [18]	nt.FASTA	nt.FASTA
	Taxonomic classification	RDP classifier [69]		raw output, summary reports
	Multiple sequence alignment	Mothur, Qiime (PyNAST)		nt.FASTA alignments
	OTU clustering	Mothur (distance matrix), Qiime (uclust [70])		OTU list/table
	α-diversity analysis	Mothur (collector's curves, rarefaction curves, diversity and richness estimators)		summary reports/diversity curves
	β-diversity analysis	Metastats [71], custom R scripts, Qiime		summary reports/figures
CloVR-Metagenomics [40]	Clustering and artificial replicate removal	UCLUST	nt.FASTA	nt.FASTA
	Functional classification	BLASTX against COG		raw output, summary reports
	Taxonomic classification	BLASTN against RefSeq [72]		raw output, summary reports
	Comparative analysis	Metastats, custom R scripts		summary reports/figures

Abbreviations: nt, nucleotide; pep, peptide; GBK, GenBank; SQN, Sequin (NCBI sequence submission table format).
The key bioinformatics tools used in each protocol are listed. Under Input, only the data that the user must supply for each analysis track are listed. Under Output, only the data saved from each step are listed.
¹ Inputs may require adapter and quality-control (QC) trimming prior to assembly.
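
In Table 1, a row with an empty Track cell belongs to the track named above it. As a minimal sketch of reading the table programmatically, the Python below groups process steps by track from a tab-separated copy of a few rows. The `TABLE_TSV` excerpt and the `steps_by_track` helper are hypothetical names introduced only for this illustration; they are not part of CloVR.

```python
import csv
import io

# Illustrative excerpt of Table 1 as tab-separated text (citations omitted).
TABLE_TSV = (
    "Track\tProcess\tTool\tInput\tOutput\n"
    "CloVR-Search\tDatabase search\tBLAST\tnt or pep FASTA\tBLAST output\n"
    "CloVR-Microbe\tAssembly\tCelera assembler, Velvet\tRaw sequence data\tnt.FASTA\n"
    "\tGene prediction\tGlimmer3\t\tpep.FASTA\n"
    "\trRNA prediction\tRNAmmer\t\tGBK, SQN\n"
)


def steps_by_track(tsv_text):
    """Group process names by track, carrying the track name down blank cells."""
    tracks = {}
    current = None
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if row["Track"]:
            # A non-empty Track cell starts a new track.
            current = row["Track"]
        tracks.setdefault(current, []).append(row["Process"])
    return tracks


if __name__ == "__main__":
    for track, steps in steps_by_track(TABLE_TSV).items():
        print(track, "->", ", ".join(steps))
```

The same forward-fill idea applies to the Input column: an empty Input cell means the step requires no additional input from the user beyond what the track's first step received.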

ISSN: 1471-2105