• Corpus ID: 230523761

Label Augmentation via Time-based Knowledge Distillation for Financial Anomaly Detection

H. Shen, Eren Kursun
Detecting anomalies has become increasingly critical to the financial services industry. Anomalous events often indicate illegal activities such as fraud, identity theft, network intrusion, account takeover, and money laundering. Financial anomaly detection use cases face serious challenges due to the dynamic nature of the underlying patterns, especially in adversarial environments where fraud tactics change constantly. While retraining the models with the new patterns is absolutely…

Anomaly Detection in Finance: Editors' Introduction
A select set of papers from the KDD Workshop on Anomaly Detection in Finance held at Halifax, Nova Scotia on Aug 14, 2017 are published in this issue of the Proceedings of Machine Learning Research.
Computer-Assisted Fraud Detection, From Active Learning to Reward Maximization
The setting of 'computer-assisted fraud detection' is introduced, where the goal is to minimize the number of non-fraudulent operations submitted to an oracle, and a simple meta-algorithm is shown to provide competitive results in this scenario on benchmark datasets.
Born Again Neural Networks
This work studies KD from a new perspective: rather than compressing models, students are trained parameterized identically to their teachers, and shows significant advantages from transferring knowledge between DenseNets and ResNets in either direction.
FitNets: Hints for Thin Deep Nets
This paper extends the idea of a student network that could imitate the soft output of a larger teacher network or ensemble of networks, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student.
Label Refinery: Improving ImageNet Classification through Label Progression
The effects of various properties of labels are studied, an iterative procedure that updates the ground truth labels after examining the entire dataset is introduced, and significant gain is shown using refined labels across a wide range of models.
Model compression
This work presents a method for "compressing" large, complex ensembles into smaller, faster models, usually without significant loss in performance.
Distilling the Knowledge in a Neural Network
This work shows that the acoustic model of a heavily used commercial system can be significantly improved by distilling the knowledge in an ensemble of models into a single model, and introduces a new type of ensemble composed of one or more full models and many specialist models that learn to distinguish fine-grained classes that the full models confuse.
XGBoost: A Scalable Tree Boosting System
This paper proposes a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning and provides insights on cache access patterns, data compression and sharding to build a scalable tree boosting system called XGBoost.
The Precision-Recall Plot Is More Informative than the ROC Plot When Evaluating Binary Classifiers on Imbalanced Datasets
It is shown that the visual interpretability of ROC plots in the context of imbalanced datasets can be deceptive with respect to conclusions about the reliability of classification performance, owing to an intuitive but wrong interpretation of specificity.
The relationship between Precision-Recall and ROC curves
It is shown that a deep connection exists between ROC space and PR space, such that a curve dominates in ROC space if and only if it dominates in PR space.