PUMiner: Mining Security Posts from Developer Question and Answer Websites with PU Learning

@inproceedings{Le2020PUMinerMS,
  title={PUMiner: Mining Security Posts from Developer Question and Answer Websites with PU Learning},
  author={T. H. Le and David Hin and Roland Croft and M. Babar},
  booktitle={Proceedings of the 17th International Conference on Mining Software Repositories},
  year={2020}
}
  • T. H. Le, David Hin, Roland Croft, M. Babar
  • Published 2020
  • Computer Science
  • Proceedings of the 17th International Conference on Mining Software Repositories
Security is an increasing concern in software development. Developer Question and Answer (Q&A) websites provide a large amount of security discussion. Existing studies have used human-defined rules to mine security discussions, but these works still miss many posts, which may lead to an incomplete analysis of the security practices reported on Q&A websites. Traditional supervised Machine Learning methods can automate the mining process; however, the required negative (non-security) class is too…
1 Citation
Challenges in Docker Development: A Large-scale Study Using Stack Overflow
  • 1
