EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models

@article{Anderson2018EMBERAO,
  title={EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models},
  author={Hyrum S. Anderson and Phil Roth},
  journal={CoRR},
  year={2018},
  volume={abs/1804.04637}
}
This paper describes EMBER: a labeled benchmark dataset for training machine learning models to statically detect malicious Windows portable executable files. The dataset includes features extracted from 1.1M binary files: 900K training samples (300K malicious, 300K benign, 300K unlabeled) and 200K test samples (100K malicious, 100K benign). To accompany the dataset, we also release open source code for extracting features from additional binaries so that additional sample features can be…
