A Unified Framework for Knowledge Intensive Gradient Boosting: Leveraging Human Experts for Noisy Sparse Domains

Harsha Kokel, Phillip Odom, Shuo Yang, S. Natarajan
Incorporating richer human inputs, including qualitative constraints such as monotonic and synergistic influences, has long been pursued in AI. Inspired by this, we consider the problem of using such influence statements in the successful gradient-boosting framework. We develop a unified framework for classification and regression settings that incorporates such constraints effectively and efficiently to accelerate learning of a better model. Our results in a large number of…
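As a rough illustration of what a monotonic influence statement asserts, the sketch below counts how often a simple one-split model contradicts the statement "the output should not decrease as x increases". This is not the paper's algorithm; the names stump and monotonic_violations, the threshold, and the sample values are all hypothetical and chosen only for the example.

```python
from itertools import combinations


def stump(threshold, left_value, right_value):
    """Hypothetical one-split model: left_value if x <= threshold, else right_value."""
    return lambda x: left_value if x <= threshold else right_value


def monotonic_violations(model, xs, increasing=True):
    """Count pairs a < b where the model's output moves against the stated influence."""
    count = 0
    for a, b in combinations(sorted(set(xs)), 2):  # every pair satisfies a < b
        change = model(b) - model(a)
        if (change < 0) if increasing else (change > 0):
            count += 1
    return count


# Illustrative inputs only (assumed, not taken from the paper).
xs = [30, 45, 55, 70]
consistent = stump(50, 0.2, 0.7)    # output rises across the split
inconsistent = stump(50, 0.7, 0.2)  # output falls across the split

print(monotonic_violations(consistent, xs))    # 0
print(monotonic_violations(inconsistent, xs))  # 4
```

A count of zero means the model agrees with the expert's statement on the sampled values; a knowledge-intensive learner would treat a nonzero count as a signal to adjust the model.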
Lightweight surrogate random forest support for model simplification and feature relevance
Experiments show that the proposed lightweight surrogate random forest algorithm outperforms black-box AI models in model transparency, memory requirements, and the interpretation of feature relevance.
A Probabilistic Approach to Extract Qualitative Knowledge for Early Prediction of Gestational Diabetes
This work applies the Qualitative Knowledge Extraction method to early prediction of gestational diabetes on clinical study data; empirical results demonstrate that the extracted rules are both interpretable and valid.
Knowledge Infused Policy Gradients for Adaptive Pandemic Control
A mathematical framework for KIPG methods is introduced that can induce relevant feature counts over multi-relational features of the world, handle latent non-homogeneity in different contributing features across the pandemic timeline and infuse knowledge as functional constraints in a principled manner.
Towards advancing the earthquake forecasting by machine learning of satellite data
This paper investigates physical and dynamic changes of seismic data and develops a novel machine learning method, namely Inverse Boosting Pruning Trees (IBPT), to issue short-term forecasts based on the satellite data of 1371 earthquakes of magnitude six or above, chosen for their impact on the environment.
Cost-sensitive Boosting Pruning Trees for depression detection on Twitter.
The paper argues that depression can be identified at an early stage by mining online social behaviours with a novel classifier, Cost-sensitive Boosting Pruning Trees (CBPT), which demonstrates strong classification ability on two publicly accessible Twitter depression detection datasets.


Deep Lattice Networks and Partial Monotonic Functions
Experiments show that six-layer monotonic deep lattice networks achieve state-of-the-art performance for classification and regression with monotonicity guarantees.
Human-Guided Learning for Probabilistic Logic Models
The empirical evidence shows that human advice can effectively accelerate learning in noisy, structured domains where so far humans have been merely used as labelers or as designers of the (initial or final) structure of the model.
Learning from Imbalanced Data in Relational Domains: A Soft Margin Approach
This work modifies the objective function of the learning problem to explicitly include the trade-off between false positives and false negatives, and shows empirically that this approach handles the class imbalance problem more successfully than the original framework, which weighed all examples equally.
Learning from Sparse Data by Exploiting Monotonicity Constraints
This paper shows how to interpret knowledge of qualitative influences, and in particular of monotonicities, as constraints on probability distributions, and to incorporate this knowledge into Bayesian network learning algorithms.
Effective Monotone Knowledge Integration in Kernel Support Vector Machines
The results show that the proposed techniques can significantly improve accuracy when the unconstrained model is not already fully monotone, which often occurs at smaller sample sizes.
The Adviceptron: Giving Advice to the Perceptron
A novel approach for incorporating prior knowledge into the perceptron that takes into account both label feedback and prior knowledge, in the form of soft polyhedral advice, to make increasingly accurate predictions on subsequent rounds.
Knowledge Intensive Learning: Combining Qualitative Constraints with Causal Independence for Parameter Learning in Probabilistic Models
This work derives an algorithm based on gradient descent for estimating the parameters of a Bayesian network in the presence of causal independencies in the form of Noisy-Or and qualitative constraints such as monotonicities and synergies.
XGBoost: A Scalable Tree Boosting System
This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
Learning Bayesian Networks with qualitative constraints
  • Yan Tong, Q. Ji
  • Computer Science
  • 2008 IEEE Conference on Computer Vision and Pattern Recognition
  • 2008
A closed-form solution to systematically combine the limited training data with some generic qualitative knowledge for BN parameter learning that can robustly and accurately estimate the BN model parameters.
Knowledge-Based Artificial Neural Networks
These tests show that the networks created by KBANN generalize better than a wide variety of learning systems, as well as several techniques proposed by biologists.