An Improved IAMB Algorithm for Markov Blanket Discovery

Authors: Yishi Zhang, Zigang Zhang, Kaijun Liu, Gangyi Qian
Journal: J. Comput.
Finding an efficient way to discover the Markov blanket is one of the core issues in data mining. This paper first discusses the problems in the IAMB algorithm, a typical algorithm for discovering the Markov blanket of a target variable from training data, and then proposes an improved algorithm, λ-IAMB, whose improvements cover two aspects: code optimization and an improved strategy for conditional independence testing. Experimental results show that…
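
For orientation, the baseline IAMB procedure that the abstract refers to can be sketched as a growing phase followed by a shrinking phase. The sketch below is not the paper's λ-IAMB. It assumes discrete data, uses empirical conditional mutual information with a fixed threshold in place of a proper statistical independence test, and the names `cmi`, `iamb`, `threshold`, and the toy dataset are all hypothetical.

```python
from collections import Counter
from itertools import product
from math import log


def cmi(rows, x, y, z):
    """Empirical conditional mutual information I(x; y | z) in nats."""
    n = len(rows)

    def key(r):
        return tuple(r[v] for v in z)

    n_xyz = Counter((r[x], r[y], key(r)) for r in rows)
    n_xz = Counter((r[x], key(r)) for r in rows)
    n_yz = Counter((r[y], key(r)) for r in rows)
    n_z = Counter(key(r) for r in rows)
    return sum(
        c / n * log(n_z[zv] * c / (n_xz[xv, zv] * n_yz[yv, zv]))
        for (xv, yv, zv), c in n_xyz.items()
    )


def iamb(rows, target, threshold=0.01):
    """Baseline IAMB sketch; the fixed threshold is an illustrative assumption."""
    candidates = [v for v in rows[0] if v != target]
    mb = []
    # Growing phase: repeatedly add the variable most associated with the
    # target given the current blanket, while that association is non-negligible.
    while True:
        rest = [v for v in candidates if v not in mb]
        if not rest:
            break
        best = max(rest, key=lambda v: cmi(rows, v, target, mb))
        if cmi(rows, best, target, mb) <= threshold:
            break
        mb.append(best)
    # Shrinking phase: drop variables that are (approximately) independent
    # of the target given the rest of the blanket.
    for v in list(mb):
        others = [u for u in mb if u != v]
        if cmi(rows, v, target, others) <= threshold:
            mb.remove(v)
    return mb


# Toy data (hypothetical): T = A AND B, while C is unrelated to T.
rows = [
    {"A": a, "B": b, "C": c, "T": a & b}
    for a, b, c in product((0, 1), repeat=3)
]
markov_blanket = iamb(rows, "T")
```

On this toy dataset the sketch selects A and B and leaves out C. On real data the fixed threshold would normally be replaced by a significance test such as G² or χ².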

Markov blanket: Efficient strategy for feature subset selection method for high dimensional microarray cancer datasets

The results show that the performance measures of classification algorithms based on Markov Blanket model mostly offer better accuracy rates than other types of classical classification algorithms for the cancer Microarray datasets.

Detecting high-dimensional genetic associations using a Markov-Blanket in a family-based study

The proposed MB-TDT method can identify a minimal set of causal SNPs associated with a specific disease, thus avoiding an exhaustive search, and shows higher power in many cases and lower false positive rates in others.

Stochastic Complexity for Testing Conditional Independence on Discrete Data

It is shown that the proposed test, SCI, is an asymptotically unbiased as well as L2 consistent estimator for conditional mutual information (CMI) and can be reformulated to find a sensible threshold for CMI that works well on limited samples.

Approximating Algorithmic Conditional Independence for Discrete Data

This work proposes a new conditional independence test based on the notion of algorithmic independence, SCI, which is an asymptotically unbiased estimator for conditional mutual information (CMI) as well as L2 consistent and can be reformulated to find a sensible threshold for CMI that works well given only limited data.

Causal Discovery by Telling Apart Parents and Children

This work shows how through algorithmic information theory SCI, a highly robust, effective and computationally efficient test for conditional independence, can be obtained, and outperforms the state of the art when applied in constraint-based inference methods such as stable PC.

Inferring Gene Regulatory Networks Using the Improved Markov Blanket Discovery Algorithm

A novel network inference method based on the improved Markov blanket discovery algorithm, IMBDANET, is proposed to infer GRNs, and experimental results show that the proposed method can be effectively used to infer GRNs.

Testing Conditional Independence on Discrete Data using Stochastic Complexity

A new test based on the notion of algorithmic independence that the authors instantiate using stochastic complexity is proposed, SCI, which is an asymptotically unbiased as well as L2 consistent estimator for conditional mutual information (CMI).

Swamping and masking in Markov boundary discovery

A theoretical improvement on LCMB is made and the experimental results reveal that LRH is much more efficient than the existing two LCMB algorithms and that WLCMB can further improve LCMB.

Improved IAMB with Expanded Markov Blanket for High-Dimensional Time Series Prediction

Empirical results show that the method based on EMB for macroeconomic prediction has less mean-square forecast error than other classic methods, especially when predicting values with sharp fluctuation.

Markov Boundary Discovery Based on Variant Ridge Regularized Linear Models

A novel variant of ridge regularized linear models (VRRLMs) is presented to identify a subset of the Markov boundary from data sets with collinear and non-collinear variables, and the relationship between the covariance matrix and the collinearity of variables is revealed in theory.



Speculative Markov blanket discovery for optimal feature selection

A novel algorithm for the induction of Markov blankets from data, called Fast-IAMB, that employs a heuristic to quickly recover the Markov blanket and performs in many cases faster and more reliably than existing algorithms without adversely affecting the accuracy of the recovered Markov covers.

Algorithms for Large Scale Markov Blanket Discovery

A low-order polynomial algorithm and several variants that soundly induce the Markov Blanket under certain broad conditions in datasets with thousands of variables are introduced and compared to other state-of-the-art local and global methods with excellent results.

HITON: A Novel Markov Blanket Algorithm for Optimal Variable Selection

A novel, sound, sample-efficient, and highly-scalable algorithm for variable selection for classification, regression and prediction called HITON, which reduces the number of variables in the prediction models by three orders of magnitude relative to the original variable set while improving or maintaining accuracy.

Programs for Machine Learning

In his new book, C4.5: Programs for Machine Learning, Quinlan has put together a definitive, much needed description of his complete system, including the latest developments, which will be a welcome addition to the library of many researchers and students.

Bayesian Network Induction via Local Neighborhoods

This work presents an efficient algorithm for learning Bayes networks from data by first identifying each node's Markov blankets, then connecting nodes in a maximally consistent way, and proves that under mild assumptions, the approach requires time polynomial in the size of the data and the number of nodes.

Selection of Relevant Features in Machine Learning

This paper describes the problem of selecting relevant features for use in machine learning in terms of heuristic search through a space of feature sets, and identifies four dimensions along which approaches to the problem can vary.

A Bayesian method for the induction of probabilistic networks from data

This paper presents a Bayesian method for constructing probabilistic networks from databases, focusing on constructing Bayesian belief networks, and extends the basic method to handle missing data and hidden variables.

Toward Optimal Feature Selection

An efficient algorithm for feature selection which computes an approximation to the optimal feature selection criterion is given, showing that the algorithm effectively handles datasets with a very large number of features.

Comparing Bayesian Network Classifiers

Experimental results show the obtained classifiers are competitive with (or superior to) the best known classifiers, based on both Bayesian networks and other formalisms; and that the computational time for learning and using these classifiers is relatively small.

Toward integrating feature selection algorithms for classification and clustering

  • Huan Liu, Lei Yu
  • Computer Science
    IEEE Transactions on Knowledge and Data Engineering
  • 2005
With the categorizing framework, the efforts toward building an integrated system for intelligent feature selection are continued, and an illustrative example is presented to show how existing feature selection algorithms can be integrated into a meta algorithm that can take advantage of individual algorithms.