Table 5 Results on GISAID catalogs

From: MALVIRUS: an integrated application for viral variant analysis

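| Pipeline | \(min_{lin}\) | \(max_{lin}\) | Min support | Precision | Recall | Time (s) | No. of correct lineages |
|---|---|---|---|---|---|---|---|
| A | 5 | 20 | 2 | .953 | .947 | 38 | 40 |
| A | 5 | 20 | 5 | .951 | .945 | 19 | 40 |
| B | 5 | 20 | 2 | .992 | .967 | 48 | 40 |
| B | 5 | 20 | 5 | .993 | .960 | 21 | 40 |
| A | 50 | 50 | 2 | .933 | .918 | 897 | 40 |
| A | 50 | 50 | 5 | .942 | .948 | 355 | 40 |
| B | 50 | 50 | 2 | .960 | .962 | 2465 | 40 |
| B | 50 | 50 | 5 | .972 | .960 | 677 | 40 |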

  1. For each catalog, we report the precision and recall achieved by MALVIRUS in genotyping its variations, the average running time, and the number of input samples (out of 41) assigned to the correct lineage. We considered 8 catalogs, each built using pipeline A or B on the set of assemblies retrieved from GISAID, prefiltered using \(\tau _N = 5\%\) and then subsampled using different combinations of the parameters \(min_{lin}\) and \(max_{lin}\). We also removed from each catalog all variations present in fewer than 2 or 5 assemblies (Min support column).
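
The minimum-support filter mentioned in the note can be illustrated with a minimal Python sketch. It is not the MALVIRUS implementation: the function name `filter_by_min_support`, the dictionary layout (variation identifier mapped to the set of assemblies carrying it), and the toy identifiers are all assumptions made for illustration only.

```python
from typing import Dict, Set


def filter_by_min_support(
    variation_to_assemblies: Dict[str, Set[str]],
    min_support: int,
) -> Dict[str, Set[str]]:
    """Keep only variations carried by at least `min_support` assemblies.

    Hypothetical helper: the mapping from variation identifiers to
    assembly identifiers is an assumed representation of a catalog.
    """
    return {
        variation: assemblies
        for variation, assemblies in variation_to_assemblies.items()
        if len(assemblies) >= min_support
    }


# Toy catalog with made-up identifiers (not real GISAID data).
catalog = {
    "var_1": {"asm_a", "asm_b", "asm_c", "asm_d", "asm_e"},
    "var_2": {"asm_a", "asm_b"},
    "var_3": {"asm_c"},
}

kept_support_2 = filter_by_min_support(catalog, min_support=2)
kept_support_5 = filter_by_min_support(catalog, min_support=5)
```

With this toy input, a minimum support of 2 keeps `var_1` and `var_2`, while a minimum support of 5 keeps only `var_1`. A stricter threshold yields a smaller catalog, which is consistent with the lower running times the table reports for Min support 5.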