
Table 2 Number of benchmark proteins in orthologous pairs recovered perfectly or as essentially complete by coding regions assembled by different methods

From: SAUTE: sequence assembly using target enrichment

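| Read set | Target species | Ortho pairs: count | Ortho pairs: median identity (%) | Perfect: rnaSP | Perfect: Trinity | Perfect: SPAln | Perfect: Clust | Perfect: SP10 | Essentially complete: rnaSP | Essentially complete: Trinity | Essentially complete: SPAln | Essentially complete: Clust | Essentially complete: SP10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Corn | Z. marina | 2710 | 57.78 | 537 | 539 | 213 | 312 | **548** | 1434 | **1450** | 1037 | 916 | 1344 |
| | A. tauschii | 2888 | 77.94 | 550 | 555 | 410 | 510 | **664** | 1491 | 1514 | 1360 | 1395 | **1552** |
| | S. bicolor | 2871 | 92.60 | 549 | 554 | 463 | 572 | **666** | 1488 | 1507 | 1399 | 1464 | **1549** |
| Thale cress | P. axillaris | 2287 | 61.56 | **1616** | 0 | 598 | 1216 | 1598 | 1902 | 0 | 1364 | 1503 | **1933** |
| | B. rapa | 2312 | 85.18 | 1636 | 0 | 1255 | 1405 | **1870** | 1924 | 0 | 1879 | 1697 | **2107** |
| | A. thaliana | 2317 | 100.00 | 1640 | 0 | 1689 | 1375 | **1951** | 1928 | 0 | 1991 | 1668 | **2122** |
| Worm | T. spiralis | 2194 | 40.92 | **1588** | 0 | 156 | 646 | 1046 | **1940** | 0 | 638 | 867 | 1269 |
| | C. briggsae | 3112 | 86.46 | 2230 | 0 | 1216 | 2016 | **2436** | 2707 | 0 | 2336 | 2486 | **2731** |
| | C. elegans | 3130 | 100.00 | 2244 | 0 | 2276 | 2176 | **2640** | 2724 | 0 | 2720 | 2624 | **2851** |
| Mouse | P. cinereus | 9005 | 77.04 | 2824 | 2880 | 1954 | 2185 | **3308** | 3790 | 3743 | 3393 | 3202 | **4274** |
| | H. sapiens | 9109 | 86.73 | 2831 | 2889 | 2212 | 2568 | **3469** | 3803 | 3753 | 3537 | 3640 | **4356** |
| | M. musculus | 9177 | 100.00 | 2858 | 2914 | 2862 | 3001 | **3686** | 3832 | 3781 | 3850 | 3982 | **4460** |
| Human | P. cinereus | 8984 | 78.80 | 2796 | 3107 | 2144 | 2206 | **3340** | 4383 | 4645 | 3995 | 3551 | **4792** |
| | M. musculus | 9109 | 86.73 | 2824 | 3134 | 2372 | 2401 | **3496** | 4425 | 4684 | 4168 | 3784 | **4875** |
| | H. sapiens | 9157 | 100.00 | 2839 | 3152 | 2994 | 2601 | **3760** | 4448 | 4708 | 4630 | 4032 | **5013** |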
  1. SAUTE_PROT low (SP10), run with a maximum of 10 variants reported per graph, SPAligner (SPAln), and CLUSTER (Clust) used proteins from the target species to assemble the read set. rnaSPAdes (rnaSP) and Trinity are de novo assemblers. The median percent identity between orthologous protein pairs ranges from 40.92% to 100%. In each row, the counts for the method that finds the largest number of proteins as perfect or as essentially complete are in bold.
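
The per-row comparison described in the footnote can be reproduced with a short script. The following is a minimal sketch, not part of the original article: it assumes three rows copied by hand from Table 2, and the names `METHODS`, `ROWS`, `best_method`, and `best_by_row` are hypothetical helpers introduced only for illustration.

```python
# Method order follows the table columns: rnaSP, Trinity, SPAln, Clust, SP10.
METHODS = ["rnaSP", "Trinity", "SPAln", "Clust", "SP10"]

# Three rows copied from Table 2 (assumption: transcribed by hand).
# Key: (read set, target species)
# Value: (perfect counts, essentially complete counts), in METHODS order.
ROWS = {
    ("Corn", "Z. marina"): (
        [537, 539, 213, 312, 548],
        [1434, 1450, 1037, 916, 1344],
    ),
    ("Worm", "T. spiralis"): (
        [1588, 0, 156, 646, 1046],
        [1940, 0, 638, 867, 1269],
    ),
    ("Human", "H. sapiens"): (
        [2839, 3152, 2994, 2601, 3760],
        [4448, 4708, 4630, 4032, 5013],
    ),
}


def best_method(counts):
    """Return (method, count) for the largest count in one group of columns.

    If two methods tie, the one listed first in METHODS is returned.
    """
    index = max(range(len(counts)), key=counts.__getitem__)
    return METHODS[index], counts[index]


def best_by_row(rows):
    """Map each row key to its best method for both recovery categories."""
    return {
        key: {
            "perfect": best_method(perfect),
            "essentially_complete": best_method(complete),
        }
        for key, (perfect, complete) in rows.items()
    }


BEST = best_by_row(ROWS)

for (read_set, species), winners in BEST.items():
    print(read_set, species, winners)
```

Each group of five columns is compared separately, so a row can have one best method for perfect recovery and a different one for essentially complete recovery, as in the corn and Z. marina row.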