Table 2 Evaluation parameters used in TransFlow, described by their name, the software that calculates the parameter (FLN: Full-LengtherNext), a brief description of its meaning and the expected trend for such a parameter

From: TransFlow: a modular framework for assembling and assessing accurate de novo transcriptomes in non-model organisms

| Parameter name | Software | Description | Trend^a |
| --- | --- | --- | --- |
| AllTransSize | FLN | The sum of every transcript length in nucleotides | |
| N50 | FLN | The shortest contig (or scaffold) length (in nucleotides) in the set needed to cover 50% of AllTransSize | |
| N90 | FLN | The shortest contig (or scaffold) length (in nucleotides) in the set needed to cover 90% of AllTransSize | |
| Contigs | FLN | Number of contigs mapping at least one pair of reads | |
| Contigs500 | FLN | Same as previous, but taking into account only contigs > 500 nt | |
| MeanContigLen | FLN | Mean sequence length (in nucleotides) across all useful contigs or scaffolds | |
| Ns | FLN | Number of Ns (indeterminations) in the contigs or scaffolds | |
| MeanGapLen | FLN | Mean indetermination length in nucleotides, where 1 indicates that gaps are randomly distributed, and greater values indicate real gaps | |
| DiffProts | FLN | Number of unique, different proteins | |
| DiffComplProts | FLN | Same as previous, but only considering those proteins that seem to be complete | |
| MissAssembl | FLN | Percentage of contigs where the annotating protein finds similarity in both plus and minus strands | |
| MeanContigCov | FLN | Fraction of the contig lengths (expressed as percentage) covered by mapped reads. This fraction is calculated per contig and then averaged for the full assembly | |
| ComplOrtho | BUSCO | Percentage of OrthoDB orthologues from a lineage fully identified in one single contig | |
| FragOrtho | BUSCO | Percentage of OrthoDB orthologues from a lineage that are fragmented across several contigs | |
| DuplOrtho | BUSCO | Percentage of OrthoDB orthologues from a lineage that are repeated in several contigs | |

  1. All parameters are calculated for every assembly
  2. ^a The trend indicates either that the higher the value, the better the transcriptome, or that the value should be kept as low as possible in good transcriptomes
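
The length-based parameters (AllTransSize, N50, N90 and MeanContigLen) follow directly from the list of contig lengths. The Python sketch below illustrates the definitions given in the table; it is not taken from Full-LengtherNext, and the helper name `nx` and the example lengths are hypothetical. It assumes every contig in the list counts towards AllTransSize, which may differ from how FLN filters contigs.

```python
def nx(lengths, fraction):
    """Return the shortest contig length in the smallest set of longest
    contigs whose summed length covers `fraction` of the total length."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    raise ValueError("lengths must be a non-empty list of positive numbers")


# Hypothetical contig lengths in nucleotides (illustrative only).
contig_lengths = [100, 200, 300, 400, 500]

all_trans_size = sum(contig_lengths)                   # AllTransSize
n50 = nx(contig_lengths, 0.5)                          # N50
n90 = nx(contig_lengths, 0.9)                          # N90
mean_contig_len = all_trans_size / len(contig_lengths) # MeanContigLen

print(all_trans_size, n50, n90, mean_contig_len)  # 1500 400 200 300.0
```

In this example the two longest contigs (500 + 400 = 900 nt) already cover half of the 1500 nt total, so N50 is 400, while four contigs are needed to reach 90%, so N90 is 200.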