Task-wise Split Gradient Boosting Trees for Multi-center Diabetes Prediction

  • Mingcheng Chen, Zhenghui Wang, Zhiyun Zhao, Weinan Zhang, Xiawei Guo, Jian Shen, Yanru Qu, Jieli Lu, Min Xu, Yu Xu, Tiange Wang, Mian Li, Weiwei Tu, Yong Yu, Yufang Bi, Weiqing Wang, Guang Ning
  • Published 14 August 2021
  • Computer Science
  • Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
Diabetes prediction is an important data science application in the social healthcare domain. The task poses two main challenges: data heterogeneity, since demographic and metabolic data are of different types, and data insufficiency, since the number of diabetes cases in a single medical center is usually limited. To tackle these challenges, we employ gradient boosting decision trees (GBDT) to handle data heterogeneity and introduce multi-task learning (MTL) to solve…


Tree-Based Ensemble Multi-Task Learning Method for Classification and Regression
A new tree-based ensemble multi-task learning method for classification and regression (MT-ExtraTrees), based on Extremely Randomized Trees, which shares data between tasks while minimizing negative transfer, retains the ability to learn non-linear solutions, and scales well to large datasets.
XGBoost: A Scalable Tree Boosting System
This paper proposes a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and provides insights on cache access patterns, data compression, and sharding to build a scalable tree boosting system called XGBoost.
Clustered Multi-Task Learning Via Alternating Structure Optimization
The equivalence relationship between ASO and CMTL is shown, providing significant new insights into ASO as well as their inherent relationship, and the proposed convex CMTL formulation is significantly more efficient, especially for high-dimensional data.
An accelerated gradient method for trace norm minimization
This paper exploits the special structure of the trace norm to propose an extended gradient algorithm that converges as O(1/k), and further proposes an accelerated gradient algorithm that achieves the optimal convergence rate of O(1/k^2) for smooth problems.
Product-Based Neural Networks for User Response Prediction over Multi-Field Categorical Data
This article discusses an insensitive gradient issue in DNN-based models and proposes the Product-based Neural Network, which adopts a feature extractor to explore feature interactions, and generalizes the kernel product to a net-in-net architecture that can generalize previous models.
A Survey on Multi-Task Learning
  • Yu Zhang, Qiang Yang
  • Computer Science, Mathematics
  • ArXiv
  • 2017
A survey for MTL is given, which classifies different MTL algorithms into several categories, including feature learning approach, low-rank approach, task clustering approaches, task relation learning approaches, and decomposition approach, and then discusses the characteristics of each approach.
A decision tree framework for understanding blast-induced mild Traumatic Brain Injury in a military medical database
A decision tree classifier is developed to classify symptom progression based on the outputs from the Constrained Spectral Partitioning algorithm, and results are presented that illustrate the adaptability of these algorithms for utilization as decision rules for the treatment of patients following blast-induced mTBI.