Lessons Learned Developing a Visual Analytics Solution for Investigative Analysis of Scamming Activities

Jay Koven, Cristian Felix, Hossein Siadati, Markus Jakobsson, Enrico Bertini
IEEE Transactions on Visualization and Computer Graphics
The forensic investigation of communication datasets that contain unstructured text, social network information, and metadata is a complex task that is becoming more important due to the immense amount of data being collected. Currently, there are limited approaches that allow an investigator to explore the network, text, and metadata in a unified manner. We developed Beagle as a forensic tool for email datasets that allows investigators to flexibly form complex queries in order to discover…

Lessons Learned Developing and Extending a Visual Analytics Solution for Investigative Analysis of Scamming Activities
This paper proposes and demonstrates, via implementation, several additional visualizations intended to help group and analyze e-mail data more efficiently, and presents a case study that shows the potential use of the tool in a real-world scenario.
A benchmark for visual analysis of insider threat detection
  • Ying ZHAO, Kui YANG, +6 authors Xiaoping FAN
  • 2020
We introduce an open-source benchmark data set tailored to the visual analytics community, called ITD-2018, which is specifically designed for the insider threat detection domain. The background of the…
Visual Analytics of Anomalous User Behaviors: A Survey
This work surveys the state of the art in visual analytics of anomalous user behaviors, classifies the research into four categories (social interaction, travel, network communication, and transaction), and examines the works in each category in terms of data types, anomaly detection techniques, and visualization techniques.
Communication Analysis through Visual Analytics: Current Practices, Challenges, and New Frontiers
A conceptual framework and design space is constructed based on the existing research landscape, technical considerations, and communication research that describes the properties, capabilities, and composition of communication analysis systems through 30 criteria in four analysis dimensions to pave the path for the formalization of digital communication analysis through visual analytics.
Modeling a Functional Engine for the Opinion Mining as a Service using Compounded Score Computation and Machine Learning
This paper proposes a design framework for evolving the classification engine for opinion mining, using score-based computation with a customized Vader algorithm and a machine learning model that supports classification of a large corpus of unstructured text data.
A Hybrid Approach with Machine Learning Towards Opinion Mining for Complex Textual Content
The paper introduces a novel solution in which two approaches, a hybrid approach and a machine learning approach, are used to perform opinion mining on complex textual content.
ARGUS: Interactive Visual Analytics Framework for the Discovery of Disruptions in Bio-Behavioral Rhythms
An intuitive Rhythm Deviation score is designed that analyzes users’ smartphone sensor data, extracts the underlying 24-hour rhythms, and quantifies their degree of irregularity; it is visualized using a glyph that makes it easy to recognize deviations and disruptions in the regularity of HBRs.
ARGUS: Interactive visual analysis of disruptions in smartphone-detected Bio-Behavioral Rhythms
An intuitive Rhythm Deviation Score (RDS) is designed that analyzes users’ smartphone sensor data, extracts the underlying twenty-four-hour rhythms, and quantifies their degree of irregularity; it is visualized using a glyph that makes it easy to recognize disruptions in the regularity of HBRs.


A Framework for the Forensic Investigation of Unstructured Email Relationship Data
It is posited that visualisation of unstructured data can greatly aid the examiner in their analysis of evidence discovered during an investigation, and the applicability of the approach is demonstrated by applying relevant stages of the framework to the Enron email corpus.
Forensic triage of email network narratives through visualisation
A novel approach that automates the visualisation of both quantitative data and qualitative data within emails to aid the triage of evidence during a forensics investigation is proposed.
Knowledge Generation Model for Visual Analytics
A knowledge generation model for visual analytics is proposed that ties together these diverse frameworks, yet retains previously developed models (e.g., KDD process) to describe individual segments of the overall visual analytic processes.
Combining Computational Analyses and Interactive Visualization for Document Exploration and Sensemaking in Jigsaw
A visual analytics approach that integrates multiple text analysis algorithms with a suite of interactive visualizations to provide a flexible and powerful environment that allows analysts to explore collections of documents while sensemaking.
EmailTime: Visual analytics of emails
It is suggested that integrating both statistics and visualizations to display information about email datasets may simplify their evaluation.
An Insight-Based Longitudinal Study of Visual Analytics
The main focus of this work is to capture the entire process that an analyst goes through, from a raw data set to the insights sought from the data.
Post-retrieval search hit clustering to improve information retrieval effectiveness: Two digital forensics case studies
A self-organizing neural network is used to conceptually cluster search hits retrieved during a real-world digital forensic investigation, and the results indicate that the clustering process significantly reduces the information retrieval overhead of the digital forensic text string search process.
Collaborative Brushing and Linking for Co‐located Visual Analytics of Document Collections
Cambiera is presented, a tabletop visual analytics tool that supports individual and collaborative information foraging activities in large text document collections. Collaborative brushing and linking is defined as an awareness mechanism that enables analysts to follow their own hypotheses during collaborative sessions while still remaining aware of the group's activities.
Adding Semantics to Email Clustering
A novel unsupervised approach is put forward that treats GSPs as pseudo class labels and conducts email clustering in a supervised manner, although no human labeling is involved, which is expected to improve the clustering performance.
Discovering important nodes through graph entropy the case of Enron email database
This work exploits an information theoretic model that combines information theory with statistical techniques from the areas of text mining and natural language processing to show how entropy models on graphs are relevant to the study of information flow in an organization.