On the Existence and Significance of Data Preprocessing Biases in Web-Usage Mining

@article{Zheng2003OnTE,
  title={On the Existence and Significance of Data Preprocessing Biases in Web-Usage Mining},
  author={Zhiqiang Zheng and Balaji Padmanabhan and Steven Orla Kimbrough},
  journal={INFORMS Journal on Computing},
  year={2003},
  volume={15},
  pages={148--170}
}
The literature on web-usage mining is replete with data preprocessing techniques, which correspond to many closely related problem formulations. We survey data preprocessing techniques for session-level pattern discovery and compare three of these techniques in the context of understanding session-level purchase behavior on the web. Using real data collected from 20,000 users’ browsing behavior over a period of six months, four different models (linear regressions, logistic regressions, neural…