AI Total: Analyzing Security ML Models with Imperfect Data in Production

Authors: Awalin Sopan and Konstantin Berlin
Published in: 2021 IEEE Symposium on Visualization for Cyber Security (VizSec)
Development of new machine learning models is typically done on manually curated data sets, making them unsuitable for evaluating the models’ performance during operations, where the evaluation needs to be performed automatically on incoming streams of new data. Unfortunately, pure reliance on a fully automatic pipeline for monitoring model performance makes it difficult to understand if any observed performance issues are due to model performance, pipeline issues, emerging data distribution… 

Learning from Context: A Multi-View Deep Learning Architecture for Malware Detection
A multi-view deep neural network architecture is proposed, which takes feature vectors from the PE file content as well as corresponding file paths as inputs and outputs a detection score, and finds that this model learns useful aspects of the file path for classification, resulting in a 26.6% improvement in the true positive rate at a 0.001 false positive rate (FPR) and a 64.
Building a Machine Learning Model for the SOC, by the Input from the SOC, and Analyzing it for the SOC
This work has developed a system that shows the prediction for an alert and the prediction explanation to security analysts during their daily workflow of investigating individual security alerts and presents the aggregated model analytics to managers and stakeholders to help them understand the model and decide on when to trust the model.
Software Engineering for Machine Learning: A Case Study
  • Saleema Amershi, A. Begel, +6 authors, T. Zimmermann
  • Computer Science
  • 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), 2019
A study observing software teams at Microsoft as they develop AI-based applications finds that various teams have integrated the AI development workflow into preexisting, well-evolved, Agile-like software engineering processes, and offers insights into several essential engineering challenges that organizations may face in creating large-scale AI solutions for the marketplace.
Deep neural network based malware detection using two dimensional binary program features
A deep neural network based malware detection system developed by Invincea is introduced, which achieves a usable detection rate at an extremely low false positive rate and scales to real-world training example volumes on commodity hardware.
A Survey of Visualization Systems for Malware Analysis
A systematic overview and categorization of malware visualization systems from the perspective of visual analytics is provided, which helps to reveal data types that are currently underrepresented, enabling new research opportunities in the visualization community.
ModelTracker: Redesigning Performance Analysis Tools for Machine Learning
ModelTracker is presented, an interactive visualization that subsumes information contained in numerous traditional summary statistics and graphs while displaying example-level performance and enabling direct error examination and debugging.
The ML test score: A rubric for ML production readiness and technical debt reduction
This paper presents 28 specific tests and monitoring needs, drawn from experience with a wide range of production ML systems to help quantify these issues and present an easy to follow road-map to improve production readiness and pay down ML technical debt.
The goods, the bads and the uglies: Supporting decisions in malware detection through visual analytics
The paper presents a visual analytics solution that supports the analysis of the classification system presented in AMICO, giving the user a better understanding of the classification decisions and the ability to change the classification results.
“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI
This paper defines, identifies, and presents empirical evidence on Data Cascades—compounding events causing negative, downstream effects from data issues—triggered by conventional AI/ML practices that undervalue data quality.
A Survey on Deep Learning with Noisy Labels: How to train your model when you cannot trust on the annotations?
  • F. Cordeiro, G. Carneiro
  • Computer Science
  • 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2020
A survey of the main techniques in the literature for improving the training of deep learning models in the presence of noisy labels is presented, in which the algorithms are classified into the following groups: robust losses, sample weighting, sample selection, meta-learning, and combined approaches.