Unsupervised dimensionality reduction versus supervised regularization for classification from sparse data

@article{Clark2019UnsupervisedDR,
  title={Unsupervised dimensionality reduction versus supervised regularization for classification from sparse data},
  author={Jessica Clark and F. Provost},
  journal={Data Mining and Knowledge Discovery},
  year={2019},
  volume={33},
  pages={871--916}
}
Unsupervised matrix-factorization-based dimensionality reduction (DR) techniques are popularly used for feature engineering with the goal of improving the generalization performance of predictive models, especially with massive, sparse feature sets. Often DR is employed for the same purpose as supervised regularization and other forms of complexity control: exploiting a bias/variance tradeoff to mitigate overfitting. Contradicting this practice, there is consensus among existing expert…
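
The comparison the abstract describes can be illustrated with a minimal sketch. It assumes scikit-learn and SciPy are available, uses synthetic sparse binary data, and picks `n_components`, `C`, and the AUC metric purely for illustration. The helper name `compare_dr_vs_regularization` is hypothetical, and none of this reproduces the paper's datasets or experimental protocol.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def compare_dr_vs_regularization(X, y, n_components=20, C=0.1, cv=5):
    """Return mean cross-validated AUC for two complexity-control strategies.

    Hypothetical helper; all parameter defaults are illustrative assumptions.
    """
    # Unsupervised DR: project onto a low-rank SVD basis, then fit a
    # nearly unregularized logistic regression (very large C).
    dr_model = make_pipeline(
        TruncatedSVD(n_components=n_components, random_state=0),
        LogisticRegression(C=1e6, max_iter=1000),
    )
    # Supervised regularization: L2-penalized logistic regression fit
    # directly on the original sparse features.
    reg_model = LogisticRegression(C=C, max_iter=1000)
    return {
        "svd_then_logreg": cross_val_score(
            dr_model, X, y, cv=cv, scoring="roc_auc"
        ).mean(),
        "l2_logreg": cross_val_score(
            reg_model, X, y, cv=cv, scoring="roc_auc"
        ).mean(),
    }


# Synthetic sparse binary feature matrix (an assumption, not the paper's data).
rng = np.random.default_rng(0)
X = sparse.random(500, 2000, density=0.01, format="csr", random_state=0)
X.data[:] = 1.0
w = rng.normal(size=2000)
y = (X @ w + rng.normal(scale=0.5, size=500) > 0).astype(int)

scores = compare_dr_vs_regularization(X, y)
print(scores)
```

In practice both `n_components` and `C` would be tuned, for example by nested cross-validation, before the two approaches are compared; the fixed values here only keep the sketch short.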

Citations

Publications citing this paper.

A benchmarking study of classification techniques for behavioral data

  • International Journal of Data Science and Analytics
  • 2019
