Cost-sensitive Selection of Variables by Ensemble of Model Sequences

Authors: Donghui Yan, Zhiwei Qin, Songxiang Gu, Haiping Xu, Ming Shao
Journal: Knowl. Inf. Syst.
Many applications require collecting data on different variables or measurements across many system performance metrics. We refer to these broadly as measures or variables. Data collection along each measure often incurs a cost, so it is desirable to account for the cost of measures in modeling. This is a fairly new class of problems in the area of cost-sensitive learning. A few attempts have been made to incorporate costs in combining and selecting measures. However, existing studies either do…
A Deep Neural Network Based Approach to Building Budget-Constrained Models for Big Data Analysis
This paper introduces a Deep Neural Network (DNN) based approach to big data analysis that eliminates less important input features to bring the model cost within a given budget.


The Foundations of Cost-Sensitive Learning
It is argued that changing the balance of negative and positive training examples has little effect on the classifiers produced by standard Bayesian and decision tree learning methods, and the recommended way of applying one of these methods is to learn a classifier from the training set and then to compute optimal decisions explicitly using the probability estimates given by the classifier.
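The recommendation above, to compute decisions explicitly from a classifier's probability estimates, can be sketched as follows. This is a minimal illustration for the two-class case, not code from the paper: the function names and the example cost matrix `COSTS` are assumptions, and `cost[i][j]` is taken to mean the cost of predicting class i when the true class is j.

```python
# Hypothetical cost matrix: cost[i][j] = cost of predicting i when truth is j.
# Here a missed positive (predict 0, truth 1) costs 10; a false alarm costs 1.
COSTS = [[0.0, 10.0],
         [1.0, 0.0]]


def min_expected_cost_decision(p_positive, cost):
    """Return the prediction (0 or 1) with the lowest expected cost."""
    p = [1.0 - p_positive, p_positive]
    # Expected cost of each candidate prediction under the estimated probabilities.
    expected = [sum(cost[i][j] * p[j] for j in range(2)) for i in range(2)]
    return min(range(2), key=lambda i: expected[i])


def positive_threshold(cost):
    """Probability at or above which predicting positive is no more costly."""
    c00, c01 = cost[0]
    c10, c11 = cost[1]
    return (c10 - c00) / (c10 - c00 + c01 - c11)
```

With `COSTS` as above, the threshold is 1/11, so an example with an estimated positive probability of 0.2 is labeled positive even though the negative class is more likely.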
Cost-sensitive learning by cost-proportionate example weighting
Costing, a method based on cost-proportionate rejection sampling and ensemble aggregation, is proposed; it achieves excellent predictive performance on two publicly available datasets while drastically reducing the computation required by other methods.
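Cost-proportionate rejection sampling keeps each training example with probability proportional to its cost, and an ensemble is then built from several such samples. The sketch below illustrates the idea under stated assumptions: the function names, the caller-supplied `train` callback, and the use of majority voting to aggregate are all hypothetical choices, not details taken from the paper.

```python
import random


def rejection_sample(examples, costs, rng, z=None):
    """Keep each example with probability cost / z, where z >= max(costs)."""
    z = max(costs) if z is None else z
    return [ex for ex, c in zip(examples, costs) if rng.random() < c / z]


def costing_ensemble(examples, costs, train, n_rounds=10, seed=0):
    """Train one model per rejection-sampled subset.

    `train` is a caller-supplied function (an assumption of this sketch)
    that maps a list of examples to a callable model.
    """
    rng = random.Random(seed)
    return [train(rejection_sample(examples, costs, rng))
            for _ in range(n_rounds)]


def predict_majority(models, x):
    """Aggregate by majority vote (one possible aggregation rule)."""
    votes = [model(x) for model in models]
    return max(set(votes), key=votes.count)
```

Because each sample is typically much smaller than the full training set, each round trains on less data, which is where the reduction in computation comes from.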
Experiments with a New Boosting Algorithm
This paper describes experiments carried out to assess how well AdaBoost, with and without pseudo-loss, performs on real learning problems, and compares boosting to Breiman's "bagging" method when used to aggregate various classifiers.
Higher criticism thresholding: Optimal feature selection when useful features are rare and weak
In the most challenging rare/weak (RW) settings, higher criticism thresholding (HCT) uses an unconventionally low threshold, which keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance.
DC2: A Divide-and-conquer Algorithm for Large-scale Kernel Learning with Application to Clustering
The DC2 algorithm is proposed, which achieves the efficiency of sampling-based large-scale kernel methods while enabling parallel multicore or clustered computation. It is as accurate as some of the fastest approximate spectral clustering algorithms while maintaining a running time close to that of K-means clustering.
Random Forests
Internal estimates monitor error, strength, and correlation; these are used to show the response to increasing the number of features used in the forest and are also applicable to regression.