Table 11 Summary of ribodbmaker pass/fail outcomes for Yarza(SSU), SilvaMicrosporidia(SSU), SilvaRef(LSU), and SilvaParc(LSU) datasets

From: Ribovore: ribosomal RNA sequence analysis for GenBank submissions and database curation

| Dataset/test | Yarza pass/fail | SilvaMicrosporidia pass/fail | SilvaRef pass/fail | SilvaParc pass/fail |
|---|---|---|---|---|
| Ambiguous nucleotides | 8493/277 | 1435/26 | 2928/65 | 385607/8640 |
| Specified species | 7543/1227 | 774/687 | 2089/904 | 204923/189324 |
| Vector contamination | 8735/35 | 460/1 | 2989/4 | 393938/305 |
| Self-repeats | 8635/135 | 1418/43 | 2929/64 | 390305/3942 |
| Ribotyper | 8085/685 | 1140/321 | 2925/68 | 145450/248797 |
| Riboaligner | 7720/1050 | 998/463 | 1583/1410 | 135252/258995 |
| Length in range? | 7617/103 | 957/41 | 1187/396 | 132984/2268 |
| Expected span? | 7612/5 | 669/288 | 990/197 | 103016/29968 |
| All 1-sequence tests? | 6359/2411 | 405/1056 | 759/2234 | 70127/324120 |
| Ingroup analysis (many) | 5158/1201 | 288/117 | 649/110 | 56606/13521 |
| Ingroup analysis (1) | 3246/3113 | 136/269 | 478/281 | 7129/52998 |

  1. All tests except the ingroup analysis depend only on the sequence being tested. The four tests for ambiguous nucleotides, specified species, vector contamination, and self-repeats are run on all sequences, so a sequence may fail more than one test. Only sequences that pass the ribotyper test are eligible as input to riboaligner. Only sequences that pass the riboaligner test are eligible to be tested for length and alignment span. Only sequences that pass all 1-sequence tests are eligible for ingroup analysis. The ingroup analysis can either allow many sequences from the same taxon to pass or limit the number of passing sequences to 1 per taxon (argument --fione). The many option is the more meaningful test; the 1 option is shown only for comparison. Data and instructions for reproducing these comparisons are available at https://github.com/nawrockie/ribovore-paper-2021-supplementary-material