Corpus ID: 232380320

Predictive and explanatory models might miss informative features in educational data

@inproceedings{Young2021PredictiveAE,
  title={Predictive and explanatory models might miss informative features in educational data},
  author={Nicholas T. Young and M. Caballero},
  year={2021}
}
Variables with little variation are common in educational data mining (EDM) and discipline-based education research (DBER), owing to the demographics of higher education and the questions we ask. Yet little work has examined how to analyze such data. We therefore conducted a simulation study using logistic regression, penalized regression, and random forest, systematically varying the fraction of positive outcomes, feature imbalances, and odds ratios. We find the algorithms treat…
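
The simulation design described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`simulate`, `sample_odds_ratio`), the parameter names, and the parameter values are all hypothetical. The sketch assumes a single binary feature and a logistic data-generating model, and it recovers the odds ratio from a 2x2 table rather than fitting any of the three models named above.

```python
import math
import random


def simulate(n, intercept, feature_fraction, odds_ratio, seed=0):
    """Draw n (x, y) pairs from an assumed logistic data-generating model.

    x is a binary feature equal to 1 with probability feature_fraction
    (the feature imbalance). y is a binary outcome whose log-odds are
    intercept + log(odds_ratio) * x, so the intercept controls the
    fraction of positive outcomes.
    """
    rng = random.Random(seed)
    beta = math.log(odds_ratio)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < feature_fraction else 0
        p = 1.0 / (1.0 + math.exp(-(intercept + beta * x)))
        y = 1 if rng.random() < p else 0
        data.append((x, y))
    return data


def sample_odds_ratio(data):
    """Estimate the odds ratio from the 2x2 table of (x, y) counts.

    Each cell starts at 0.5 (Haldane-Anscombe correction) so the
    estimate stays finite when a rare cell is empty.
    """
    counts = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}
    for x, y in data:
        counts[(x, y)] += 1
    return (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])


if __name__ == "__main__":
    # Hypothetical settings: vary the feature imbalance, hold the rest fixed.
    for fraction in (0.5, 0.1, 0.01):
        sample = simulate(500, intercept=-1.0, feature_fraction=fraction,
                          odds_ratio=2.0, seed=42)
        print(fraction, round(sample_odds_ratio(sample), 2))
```

Running the loop with a small sample size shows how the estimate can drift from the true odds ratio as the feature becomes more imbalanced, which is the kind of behavior a study of this design would quantify across many replications.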
