Table 2 Descriptive statistics of the P. longum transcriptomes assembled with transXpress using the Trinity and rnaSPADES assemblers

From: transXpress: a Snakemake pipeline for streamlined de novo transcriptome assembly and annotation

 

| | Trinity (v2.13.2) | rnaSPADES (v3.13.0) |
|---|---|---|
| Number of raw sequencing reads (input data) | 16,901,456 (leaf) + 22,900,035 (spike) + 27,496,748 (root) = 67,298,239 total reads | same input as Trinity |
| Number of assembled transcripts (isoforms) | 268,313 | 296,600 |
| Number of reconstructed genes (Trinity estimate) | 132,944 | n/a |
| Min / median / mean / max transcript lengths | 185 / 577 / 914 / 15,159 | 112 / 363 / 832 / 15,665 |
| Number of predicted protein ORFs (TransDecoder) | 131,098 | 118,984 |
| % of full-length ORFs (TransDecoder estimate) | 54.7 | 60.4 |
| Min / median / mean / max ORF lengths | 85 / 200 / 282 / 4,982 | 85 / 191 / 255 / 5,091 |
| Transcriptome completeness (BUSCO, embryophyta_odb10 lineage) | C: 95.2% [S: 10.5%, D: 84.7%], F: 2.7%, M: 2.1% | C: 84.1% [S: 18.6%, D: 65.5%], F: 11.1%, M: 4.8% |
| % of reads aligned to the transcriptome (Bowtie2) | 87.5% | 83.3% |

  1. Only Trinity estimates the number of reconstructed genes, which it does by grouping transcript isoforms that likely originated from the same gene.
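
The isoform-to-gene grouping described in the note can be illustrated with a minimal Python sketch. It assumes Trinity's usual transcript naming, in which an ID such as `TRINITY_DN1000_c115_g5_i1` carries the gene as the prefix ending in `_g<N>` and the isoform as the `_i<M>` suffix. The IDs below and the helper name `group_isoforms_by_gene` are made up for illustration and are not part of transXpress. The grouping decision is made by Trinity during assembly; this code only reads the result back from the IDs.

```python
import re
from collections import defaultdict

# Assumed Trinity-style ID layout: <gene prefix ending in _g<N>>_i<M>
ISOFORM_ID = re.compile(r"^(?P<gene>.+_g\d+)_i\d+$")


def group_isoforms_by_gene(transcript_ids):
    """Map each gene ID to the list of isoform IDs that carry its prefix."""
    genes = defaultdict(list)
    for tid in transcript_ids:
        match = ISOFORM_ID.match(tid)
        if match is None:
            raise ValueError(f"not a Trinity-style transcript ID: {tid!r}")
        genes[match.group("gene")].append(tid)
    return dict(genes)


# Hypothetical IDs, for illustration only.
example_ids = [
    "TRINITY_DN1000_c115_g5_i1",
    "TRINITY_DN1000_c115_g5_i2",
    "TRINITY_DN1000_c115_g6_i1",
    "TRINITY_DN2041_c0_g1_i1",
]

genes = group_isoforms_by_gene(example_ids)
print(f"{len(example_ids)} isoforms grouped into {len(genes)} genes")
```

Applied to the Trinity column of the table, the same relationship gives 268,313 isoforms over 132,944 genes, or roughly two isoforms per gene on average.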