
Table 2 Evaluation parameters used in TransFlow, described by their name, the software that calculates the parameter (FLN: Full-LengtherNext), a brief description of its meaning and the expected trend for such a parameter

From: TransFlow: a modular framework for assembling and assessing accurate de novo transcriptomes in non-model organisms

| Parameter name | Software | Description | Trend (a) |
| --- | --- | --- | --- |
| AllTransSize | FLN | The sum of every transcript length in nucleotides | |
| N50 | FLN | The shortest contig (or scaffold) length (in nucleotides) in the set needed to cover 50% of AllTransSize | |
| N90 | FLN | The shortest contig (or scaffold) length (in nucleotides) in the set needed to cover 90% of AllTransSize | |
| Contigs | FLN | Number of contigs mapping at least one pair of reads | |
| Contigs500 | FLN | Same as previous, but taking into account only contigs > 500 nt | |
| MeanContigLen | FLN | Mean sequence length (in nucleotides) across all useful contigs or scaffolds | |
| Ns | FLN | Number of Ns (indeterminations) in the contigs or scaffolds | |
| MeanGapLen | FLN | Mean indetermination length in nucleotides, where 1 indicates that gaps are randomly distributed, and greater values indicate real gaps | |
| DiffProts | FLN | Number of unique, different proteins | |
| DiffComplProts | FLN | Same as previous, but only considering those proteins that seem to be complete | |
| MissAssembl | FLN | Percentage of contigs where the annotating protein finds similarity in both plus and minus strands | |
| MeanContigCov | FLN | Fraction of the contig lengths (expressed as a percentage) covered by mapped reads. This fraction is calculated per contig and then averaged over the full assembly | |
| ComplOrtho | BUSCO | Percentage of OrthoDB orthologues from a lineage fully identified in one single contig | |
| FragOrtho | BUSCO | Percentage of OrthoDB orthologues from a lineage that are fragmented across several contigs | |
| DuplOrtho | BUSCO | Percentage of OrthoDB orthologues from a lineage that are repeated in several contigs | |
  1. All parameters are calculated for every assembly
  2. a The trend indicates either that the higher the value, the better the transcriptome, or that the value should be kept as low as possible in good transcriptomes
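
The length-based and gap-based definitions in the table can be illustrated with a short Python sketch. This is not the Full-LengtherNext implementation: the function names `nx`, `length_metrics` and `gap_metrics` are hypothetical, the input is assumed to be already restricted to the useful contigs (those mapping at least one pair of reads), and MeanGapLen is assumed here to be the mean length of runs of consecutive Ns.

```python
import re


def nx(lengths, percent):
    """Shortest contig length in the smallest set of longest contigs
    whose summed length covers `percent` % of the total length."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


def length_metrics(lengths):
    """Length-based parameters for a list of contig lengths (nucleotides).
    Assumption: `lengths` already holds only the useful contigs."""
    return {
        "AllTransSize": sum(lengths),
        "N50": nx(lengths, 50),
        "N90": nx(lengths, 90),
        "MeanContigLen": sum(lengths) / len(lengths),
        "Contigs500": sum(1 for n in lengths if n > 500),
    }


def gap_metrics(sequences):
    """Ns and MeanGapLen for a list of contig sequences.
    Assumption: a gap is a run of one or more consecutive N characters."""
    runs = [len(run) for seq in sequences for run in re.findall(r"N+", seq.upper())]
    total_ns = sum(runs)
    mean_gap = total_ns / len(runs) if runs else 0.0
    return {"Ns": total_ns, "MeanGapLen": mean_gap}
```

For example, `length_metrics([100, 200, 300, 400, 1000])` sums to an AllTransSize of 2000; the single 1000 nt contig already covers half of that, so N50 is 1000, while four contigs are needed to reach 90%, so N90 is 200. Under the assumption above, a MeanGapLen of exactly 1 means every N stands alone, which matches the table's reading of randomly distributed indeterminations.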