Table 1 Performance analysis of real plant transcriptome assembly

From: A consensus-based ensemble approach to improve transcriptome assembly

| Assembler | A. thaliana Col-0 (48,359)ᵃ: # of contigsᵇ | A. thaliana Col-0: BUSCOᶜ | A. thaliana Col-0: EVALᵈ | M. charantia (45,859)ᵃ: # of contigsᵇ | M. charantia: BUSCOᶜ | M. charantia: EVALᵈ | G. hirsutum (70,478)ᵃ: # of contigsᵇ | G. hirsutum: BUSCOᶜ | G. hirsutum: EVALᵈ |
|---|---|---|---|---|---|---|---|---|---|
| [Genome-guided] | | | | | | | | | |
| Bayesembler | 45,583 (94) [10,611] | 1677 | −3.18 | 44,524 (97) | 1629 | −1.99 | NAᵉ | NAᵉ | NAᵉ |
| Cufflinks | 34,771 (72) [9,886] | 1593 | −2.95 | 38,655 (84) | 1662 | −1.64 | 98,590 (140) | 1908 | −3.44 |
| Scallop | 73,172 (151) [13,272] | 1841 | −2.82 | 68,593 (150) | 1855 | −1.49 | 183,625 (261) | 1967 | −3.32 |
| StringTie2 | 37,211 (77) [12,619] | 1819 | −2.82 | 44,149 (90) | 1847 | −1.51 | 108,389 (154) | 1998 | −3.24 |
| [De novo] | | | | | | | | | |
| IDBA-Tran | 52,982 (110) [6,871] | 1287 | −2.92 | 106,170 (232) | 657 | −1.82 | 215,470 (306) | 411 | −5.71 |
| rnaSPAdes | 101,444 (210) [8,521] | 1475 | −2.74 | 91,791 (200) | 1051 | −1.50 | 211,024 (299) | 1533 | −3.66 |
| SOAPdenovo | 68,141 (141) [8,973] | 1689 | −2.93 | 115,117 (251) | 1776 | −1.47 | 224,718 (318) | 971 | −6.22 |
| Trinity | 50,023 (103) [7,501] | 1573 | −2.89 | 90,347 (197) | 1779 | −1.44 | 212,852 (302) | 1921 | −3.60 |
| [Ensemble] | | | | | | | | | |
| EvidentialGene | 83,749 (173) [10,892] | 1899 | −2.87 | 148,938 (324) | 1983 | −1.60 | 726,828 (1,031) | 2070 | −4.07 |
| Concatenation | 70,444 (146) [12,184] | 1855 | −2.71 | 66,613 (145) | 1923 | −1.53 | 97,419 (138) | 2010 | −3.76 |
| ConSemble3+d | 49,713 (103) [12,659] | 1833 | −2.73 | 90,781 (198) | 1858 | −1.42 | 114,974 (163) | 1312 | −5.53 |
| ConSemble3+g | 24,355 (50) [11,322] | 1587 | −1.18 | 24,699 (74) | 1718 | −2.20 | 49,530 (54) | 1689 | −1.13 |
| TransBorrow | 1521 (3) [734] | 29 | −2.40 | 53,191 (116) | 1896 | −1.66 | 6234 (9) | 79 | −2.83 |

ᵃ Number of protein-coding transcripts reported [36, 40, 41].
ᵇ The proportion (%) of unique protein sequences in the assembly, relative to the number of proteins in the reference transcriptome, is shown in parentheses. The number of contigs whose sequences matched those in the Araport11 reference transcriptome [52] is shown in square brackets.
ᶜ Number of complete BUSCOs identified from the 2121 orthologs in the Eudicot set.
ᵈ RSEM-EVAL scores (× 10⁻⁹) from DETONATE.
ᵉ Bayesembler was unable to run on this dataset.
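
The parenthesised percentages and the BUSCO counts in Table 1 are both ratios against a fixed reference total, so they can be spot-checked with a few lines of arithmetic. The sketch below is illustrative only. The helper name `percent_of_reference` is hypothetical, and it assumes the contig count can stand in for the number of unique protein sequences. That assumption reproduces the two A. thaliana rows used here, but the table notes do not state it and it may not hold for every row.

```python
def percent_of_reference(count: int, reference_total: int) -> int:
    """Return count as a whole-number percentage of reference_total."""
    if reference_total <= 0:
        raise ValueError("reference_total must be positive")
    return round(100 * count / reference_total)


# Totals taken from the table header and its notes.
ARABIDOPSIS_REFERENCE = 48_359  # protein-coding transcripts, A. thaliana Col-0
EUDICOT_BUSCO_TOTAL = 2_121     # orthologs in the Eudicot BUSCO set

# Assumption: contig counts approximate the unique-protein counts.
bayesembler_pct = percent_of_reference(45_583, ARABIDOPSIS_REFERENCE)
cufflinks_pct = percent_of_reference(34_771, ARABIDOPSIS_REFERENCE)

# BUSCO completeness as a percentage (not reported in the table itself).
bayesembler_busco_pct = percent_of_reference(1_677, EUDICOT_BUSCO_TOTAL)

print(bayesembler_pct, cufflinks_pct, bayesembler_busco_pct)
```

Expressing the BUSCO counts as percentages of 2121 makes completeness easier to compare across assemblers at a glance, although the table reports raw counts only.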