Scoring levels of categorical variables with heterogeneous data

Authors: Eugene Tuv and George C. Runger
Journal: IEEE Intelligent Systems
Heterogeneous (mixed-type) data present significant challenges in both supervised and unsupervised learning. The situation is even more complicated when nominal variables have several levels (values) that make using indicator variables (for every categorical level) infeasible. With unsupervised learning, several fairly involved, computationally intensive, nonlinear multivariate techniques iteratively alternate data transformations with optimal scoring. These seek to optimize an objective on the…


