TY  - JOUR
AU  - Misof, Bernhard
AU  - Meyer, Benjamin
AU  - von Reumont, Björn Marcus
AU  - Kück, Patrick
AU  - Misof, Katharina
AU  - Meusemann, Karen
PY  - 2013
DA  - 2013/12/03
TI  - Selecting informative subsets of sparse supermatrices increases the chance to find correct trees
JO  - BMC Bioinformatics
SP  - 348
VL  - 14
IS  - 1
AB  - Character matrices with extensive missing data are frequently used in phylogenomics with potentially detrimental effects on the accuracy and robustness of tree inference. Therefore, many investigators select taxa and genes with high data coverage. Drawbacks of these selections are their exclusive reliance on data coverage without consideration of actual signal in the data which might, thus, not deliver optimal data matrices in terms of potential phylogenetic signal. In order to circumvent this problem, we have developed a heuristics implemented in a software called mare which (1) assesses information content of genes in supermatrices using a measure of potential signal combined with data coverage and (2) reduces supermatrices with a simple hill climbing procedure to submatrices with high total information content. We conducted simulation studies using matrices of 50 taxa × 50 genes with heterogeneous phylogenetic signal among genes and data coverage between 10-30%.
SN  - 1471-2105
UR  - https://doi.org/10.1186/1471-2105-14-348
DO  - 10.1186/1471-2105-14-348
ID  - Misof2013
ER  - 