Volume 14 Supplement 17
Feature selection and prediction with a Markov blanket structure learning algorithm
© Tan and Liu; licensee BioMed Central Ltd. 2013
Published: 22 October 2013
Classification and prediction are common tasks in machine learning. For example, many studies have attempted to predict gene expression from information such as DNA sequence, the expression of other genes, or epigenetic modifications. Many existing methods, such as neural networks and support vector machines, have been used to make these predictions. Unfortunately, these black-box techniques offer little insight into the reasoning behind the predictions. In many cases, relatively few attributes contribute to the classification accuracy. Bayesian networks explicitly encode the relationships among attributes that are used to make predictions. In a Bayesian network, the Markov blanket (MB) of the class variable contains all of the information necessary to predict its value. In this work, we propose an algorithm that learns only the MB of the class variable; all other attributes are removed. Our algorithm therefore combines classification and feature selection. Results on benchmark machine learning datasets indicate that our feature selection technique reduces the size of some datasets by more than 80%. Accuracy results suggest that the classification ability of our algorithm is competitive with existing state-of-the-art techniques.
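As a rough illustration of the Markov blanket idea (not the authors' algorithm), the sketch below computes the MB of a class node in a toy Bayesian network structure: its parents, its children, and its children's other parents. By the Markov condition, keeping only these attributes is sufficient for predicting the class variable. The toy network and variable names are hypothetical.

```python
# Minimal sketch: the Markov blanket of a class node in a known DAG structure.
# The toy network and variable names are hypothetical, for illustration only.

# DAG encoded as node -> list of its parents
parents = {
    "C":  ["A1"],          # the class variable C has parent A1
    "A2": ["C", "A3"],     # A2 is a child of C; A3 is a co-parent (spouse)
    "A3": [],
    "A4": ["A2"],          # A4 lies outside the Markov blanket of C
    "A1": [],
}

def markov_blanket(node, parents):
    """Return the parents, children, and children's other parents of `node`."""
    mb = set(parents.get(node, []))                       # parents
    children = [v for v, ps in parents.items() if node in ps]
    mb.update(children)                                   # children
    for ch in children:
        mb.update(p for p in parents[ch] if p != node)    # spouses
    return mb

print(markov_blanket("C", parents))   # {'A1', 'A2', 'A3'}; A4 is discarded
```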
Materials and methods
In a classification problem, we are given a dataset consisting of a set of attributes A and a class variable C. The dataset is split into a training set D_tr and a testing set D_te. The goal is to learn a classifier from D_tr that correctly predicts C in D_te. In this study, we compared the performance of our Markov blanket structure learning algorithm with other classical classifiers: C4.5, the optimal Bayesian network, the Tree Augmented Naïve Bayes network, and Markov Blanket Hill Climbing. A brief introduction to these classifiers follows.
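The snippet below is a minimal sketch of this train/test evaluation protocol, assuming scikit-learn is available. The dataset and classifier are stand-ins: DecisionTreeClassifier implements CART rather than C4.5, and the breast cancer data is simply a convenient UCI-derived benchmark.

```python
# Sketch of the train/test evaluation protocol described above (assumes
# scikit-learn). DecisionTreeClassifier (CART) is only a rough stand-in
# for C4.5, not the same algorithm.
from sklearn.datasets import load_breast_cancer      # a UCI benchmark dataset
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)           # attributes A and class C

# Split into a training set D_tr and a testing set D_te
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Learn a classifier from D_tr ...
clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)

# ... and measure how well it predicts C in D_te
print("test accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```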
Markov blanket feature selection algorithm
Discussion and conclusions
The compression ratio decreases as the number of variables in the dataset increases. This suggests that, even as dataset sizes increase, only a few attributes are helpful in predicting the class variable. The compression ratio is unaffected by the number of records in the dataset. This suggests that, even when given many records, our algorithm does not select many attributes in an attempt to overfit the dataset. Ignoring unimportant attributes does not significantly affect classification accuracy: despite compressing the data by more than 70% on average, the classification accuracy is rarely more than 5% below that of the best classifier. Identifying MB variables could significantly reduce the cost of diagnostic lab tests by focusing attention on only the most relevant attributes.
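This excerpt does not give the exact formula for the compression figures quoted above. The snippet below shows one plausible reading, in which compression is the fraction of attributes removed once only the Markov blanket of the class variable is kept; the numbers used are hypothetical.

```python
# Hypothetical illustration of a compression figure: the fraction of
# attributes removed when only the Markov blanket of the class is kept.
# (This definition is an assumption; the paper's exact formula is not
# reproduced in this excerpt.)
def compression(total_attributes: int, mb_attributes: int) -> float:
    return 1.0 - mb_attributes / total_attributes

# e.g. a dataset with 60 attributes whose class has a 12-variable MB
print(f"{compression(60, 12):.0%} of the attributes are removed")  # 80%
```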
- Quinlan JR: C4.5: programs for machine learning. Machine Learning. 1994, 16 (3): 235-240.
- Malone B, Yuan C, Hansen E, Bridges S: Improving the scalability of optimal Bayesian network learning with external-memory frontier breadth-first branch and bound search. Proceedings of the Twenty-Seventh Annual Conference on Uncertainty in Artificial Intelligence. Edited by: Cozman FG, Pfeffer A. 2011, Barcelona: AUAI Press, 479-488.
- Friedman N, Geiger D, Goldszmidt M: Bayesian network classifiers. Machine Learning. 1997, 29: 131-163. 10.1023/A:1007465528199.
- Tsamardinos I, Brown LE, Aliferis CF: The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning. 2006, 65 (1): 31-78. 10.1007/s10994-006-6889-7.
- Bache K, Lichman M: UCI Machine Learning Repository. [http://archive.ics.uci.edu/ml]
- Liu Z, Malone B, Yuan C: Empirical evaluation of scoring functions for Bayesian network model selection. BMC Bioinformatics. 2012, 13 (Suppl 15): S14-10.1186/1471-2105-13-S15-S14.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.