Table 1 Summary of twelve original genome binners and three refining genome binners

From: Evaluating metagenomics tools for genome binning with real metagenomic datasets and CAMI datasets

| Genome binner | Parameters | Model | Version to validate | Publication | Last update | Resources |
| --- | --- | --- | --- | --- | --- | --- |
| MaxBin | k-mer frequencies, coverage, single-copy genes | Expectation-maximization, bin number estimated from single-copy marker gene analysis | 2.2.6 | 2014 | 2019 | https://sourceforge.net/projects/maxbin |
| MetaBat | 4-mer frequencies, coverage | Modified K-medoids algorithm | 1 & 2.13 | 2015 | 2020 | https://bitbucket.org/berkeleylab/metabat/src/master |
| Groopm | coverage, contig length, tetranucleotide frequency | Two-way clustering, Hough partitioning, self-organizing map | 2 | 2014 | 2017 | https://github.com/timbalam/GroopM |
| CONCOCT | k-mer frequencies, coverage | Gaussian mixture models, bin number determined by variational Bayesian inference | 1.0.0 | 2014 | 2019 | https://github.com/BinPro/CONCOCT |
| MyCC | k-mer frequencies, coverage (optional), universal single-copy genes | Affinity propagation | 1 | 2016 | 2017 | https://sourceforge.net/projects/sb2nhri |
| MetaWatt | tetranucleotide frequency, coverage | Initial clustering based on an empirical relationship between the mean tetranucleotide frequency and its average standard deviation, followed by interpolated Markov models | 3.5.3 | 2012 | 2016 | https://sourceforge.net/projects/metawatt |
| BMC3C | frequency variation of oligonucleotides, coverage, codon usage | Ensemble k-means, then construction of a weighted graph that is partitioned by normalized cuts [49, 50] | - | 2018 | 2018 | http://mlda.swu.edu.cn/codes.php?name=BMC3C |
| Binsanity | coverage, tetranucleotide frequency, percent GC content | Affinity propagation | 0.2.8 | 2017 | 2020 | https://github.com/edgraham/BinSanity |
| Autometa | sequence homology, single-copy genes, 5-mer frequency, coverage | Lowest common ancestor analysis, DBSCAN algorithm, supervised decision tree classifier to recruit unclustered contigs | - | 2019 | 2020 | https://bitbucket.org/jason_c_kwan/autometa/src/master |
| COCACOLA | k-mer frequency, coverage, co-alignment, paired-end read linkage | K-means based on L1 distance, non-negative matrix factorization with sparse regularization, hierarchical clustering | - | 2017 | 2017 | https://github.com/younglululu/COCACOLA |
| SolidBin-naive | single-copy marker genes, tetranucleotide frequencies, coverage, pairwise constraints | Semi-supervised spectral normalized cut | 1.1 | 2019 | 2020 | https://github.com/sufforest/SolidBin |
| Vamb | tetranucleotide frequencies, coverage | Variational autoencoders, iterative medoid clustering algorithm | 2.0.1 | 2018 | 2020 | https://github.com/RasmussenLab/vamb |
| DAS Tool | original binner output bin sets | Refines bins according to the contigs shared between two original binner results | 1.1.1 | 2018 | 2019 | https://github.com/cmks/DAS_Tool |
| MetaWrap | original binner output bin sets | Separates every pair of contigs placed in different bins, then selects the best bin sets according to completion and contamination | 1.2.2 | 2018 | 2019 | https://github.com/bxlab/metaWRAP |
| Binning_refiner | original binner output bin sets, single-copy genes | Scores bins based on single-copy genes and picks high-scoring bins iteratively | 1.4.0 | 2017 | 2019 | https://github.com/songweizhi/Binning_refiner |
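
Most binners in Table 1 list tetranucleotide (4-mer) frequencies as a composition feature. As a rough illustration only, and not the implementation used by any tool in the table, the sketch below counts overlapping 4-mers in a contig and normalizes them to relative frequencies. The function name `tetranucleotide_frequencies` and the choice to skip windows containing ambiguous bases are assumptions made for this example. Real binners typically also merge reverse-complement k-mers and apply their own normalization.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(contig, k=4):
    """Return a dict mapping each of the 4**k possible k-mers to its relative frequency.

    Hypothetical helper for illustration; not taken from any binner in Table 1.
    """
    seq = contig.upper()
    alphabet = set("ACGT")
    # Enumerate all possible k-mers so every contig yields a vector of the same length.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]

    counts = Counter()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are ignored.
        if set(window) <= alphabet:
            counts[window] += 1

    total = sum(counts.values())
    if total == 0:
        # Contig shorter than k, or no unambiguous window: return an all-zero vector.
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}


if __name__ == "__main__":
    freqs = tetranucleotide_frequencies("ACGTACGTNACGT")
    print(len(freqs), freqs["ACGT"])
```

Returning a fixed-length vector over all 256 tetranucleotides keeps contigs comparable, which is what clustering methods such as k-medoids or affinity propagation require as input.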