TY  - JOUR
AU  - Austerlitz, Frederic
AU  - David, Olivier
AU  - Schaeffer, Brigitte
AU  - Bleakley, Kevin
AU  - Olteanu, Madalina
AU  - Leblois, Raphael
AU  - Veuille, Michel
AU  - Laredo, Catherine
PY  - 2009
DA  - 2009/11/10
TI  - DNA barcode analysis: a comparison of phylogenetic and statistical classification methods
JO  - BMC Bioinformatics
SP  - S10
VL  - 10
IS  - 14
AB  - DNA barcoding aims to assign individuals to given species according to their sequence at a small locus, generally part of the CO1 mitochondrial gene. Amongst other issues, this raises the question of how to deal with within-species genetic variability and potential transpecific polymorphism. In this context, we examine several assignation methods belonging to two main categories: (i) phylogenetic methods (neighbour-joining and PhyML) that attempt to account for the genealogical framework of DNA evolution and (ii) supervised classification methods (k-nearest neighbour, CART, random forest and kernel methods). These methods range from basic to elaborate. We investigated the ability of each method to correctly classify query sequences drawn from samples of related species using both simulated and real data. Simulated data sets were generated using coalescent simulations in which we varied the genealogical history, mutation parameter, sample size and number of species.
SN  - 1471-2105
UR  - https://doi.org/10.1186/1471-2105-10-S14-S10
DO  - 10.1186/1471-2105-10-S14-S10
ID  - Austerlitz2009
ER  -