Table 1 Overview of CloVR analysis protocols

| Track | Process | Tool | Input | Output |
|---|---|---|---|---|
| CloVR-Search | Database search | BLAST [60] | nt or pep FASTA | BLAST output |
| CloVR-Microbe [38] | Assembly | Celera assembler [61], Velvet [51] | Raw sequence data (SFF, nt.FASTA¹, nt.FASTQ¹) | nt.FASTA |
| | Gene prediction | Glimmer3 [62] | | pep.FASTA |
| | tRNA prediction | tRNA-scan [63] | | GBK, SQN |
| | rRNA prediction | RNAmmer [64] | | GBK, SQN |
| | Functional annotation | BLASTX against UniRef100 [58] and COG [65]; HMMER [66] search against Pfam [67] and TIGRfam [68] | | Annotated GBK, SQN |
| CloVR-16S | Quality checking | Mothur [17], Qiime [18] | nt.FASTA | nt.FASTA |
| | Taxonomic classification | RDP classifier [69] | | raw output, summary reports |
| | Multiple sequence alignment | Mothur, Qiime (PyNAST) | | nt.FASTA alignments |
| | OTU clustering | Mothur (distance matrix), Qiime (uclust [70]) | | OTU list/table |
| | α-diversity analysis | Mothur (collectors curves, rarefaction curves, diversity and richness estimators) | | summary reports/diversity curves |
| | β-diversity analysis | Metastats [71], custom R scripts, Qiime | | summary reports/figures |
| CloVR-Metagenomics [40] | Clustering and artificial replicate removal | UCLUST | nt.FASTA | nt.FASTA |
| | Functional classification | BLASTX against COG | | raw output, summary reports |
| | Taxonomic classification | BLASTN against RefSeq [72] | | raw output, summary reports |
| | Comparative analysis | Metastats, custom R scripts | | summary reports/figures |
  Abbreviations: nt, nucleotide; pep, peptide; GBK, GenBank; SQN, Sequin (NCBI sequence submission table format).
  Key bioinformatics tools utilized in each protocol are listed. For inputs, only those required from the user for each analysis track are listed. For outputs, only the data saved from each step are listed.
  ¹ Inputs may require adapter and quality-control (QC) trimming prior to assembly.
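
For readers who want to work with this overview programmatically, the rows of one track can be encoded as an ordered list of steps. The following is only a minimal sketch: the `Step` class, the `METAGENOMICS` constant, and the `steps_using` helper are hypothetical names introduced for illustration and are not part of CloVR. The values are copied from the CloVR-Metagenomics rows of Table 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One row of Table 1 (hypothetical container, not a CloVR type)."""
    process: str
    tools: tuple
    output: str


# Rows copied from the CloVR-Metagenomics track of Table 1, in table order.
METAGENOMICS = (
    Step("Clustering and artificial replicate removal", ("UCLUST",), "nt.FASTA"),
    Step("Functional classification", ("BLASTX against COG",),
         "raw output, summary reports"),
    Step("Taxonomic classification", ("BLASTN against RefSeq",),
         "raw output, summary reports"),
    Step("Comparative analysis", ("Metastats", "custom R scripts"),
         "summary reports/figures"),
)


def steps_using(track, keyword):
    """Return the process names in `track` whose tool entries mention `keyword`."""
    key = keyword.lower()
    return [s.process for s in track if any(key in t.lower() for t in s.tools)]
```

Under these assumptions, `steps_using(METAGENOMICS, "blast")` selects the two classification steps, which are the BLAST-based steps of this track.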