• Publications
Identifying Authorship by Byte-Level N-Grams: The Source Code Author Profile (SCAP) Method
TLDR
We present a new approach, called the SCAP (Source Code Author Profiles) approach, based on byte-level n-gram profiles representing the source code author’s style.
  • 108
  • 11
Effective identification of source code authors using byte-level information
TLDR
We present a new approach, called the SCAP (Source Code Author Profiles) approach, based on byte-level n-gram profiles in order to represent a source code author's style.
  • 88
  • 5
Source Code Author Identification Based on N-gram Author Profiles
TLDR
We present a new approach, called the SCAP (Source Code Author Profiles) approach, based on byte-level n-gram profiles in order to represent a source code author’s style.
  • 44
  • 5
Author Identification in Imbalanced Sets of Source Code Samples
TLDR
We present a systematic experimental study of source code author identification in skewed training sets where the training samples are unequally distributed over the candidate authors.
  • 7
  • 3
Source Code Authorship Analysis For Supporting the Cybercrime Investigation Process
TLDR
We present the set of tools and techniques used to achieve the goal of authorship identification and a new taxonomy on source code authorship analysis.
  • 50
  • 1
Supporting the cybercrime investigation process: Effective discrimination of source code authors based on byte-level information
TLDR
We propose a simplified profile and a new similarity measure that is less complicated than the algorithm used in text authorship attribution and seems more suitable for source code identification, since it is better able to deal with very small training sets.
  • 15
  • 1
A methodology to assess the impact of design patterns on software quality
TLDR
This paper introduces a theoretical/analytical methodology to compare sets of "canonical" solutions to design problems.
  • 53
Examining the significance of high-level programming features in source code author classification
TLDR
The use of Source Code Author Profiles (SCAP) represents a new, highly accurate approach to source code authorship identification that is, unlike previous methods, language independent.
  • 42
The significance of user-defined identifiers in Java source code authorship identification
TLDR
We assess the importance of three types of identifiers in source code author classification for two different Java program data sets.
  • 2