SMOTE: Synthetic Minority Over-sampling Technique
- N. Chawla, K. Bowyer, L. Hall, W. Kegelmeyer
- Computer Science
- Journal of Artificial Intelligence Research
- 2002
Combining over-sampling of the minority (abnormal) class with under-sampling of the majority class can achieve better classifier performance (in ROC space) than under-sampling the majority class alone. The approach is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
metapath2vec: Scalable Representation Learning for Heterogeneous Networks
- Yuxiao Dong, N. Chawla, A. Swami
- Computer Science
- Knowledge Discovery and Data Mining
- 4 August 2017
Two scalable representation learning models, metapath2vec and metapath2vec++, are developed that not only outperform state-of-the-art embedding models in various heterogeneous network mining tasks but also discern the structural and semantic correlations between diverse network objects.
SMOTEBoost: Improving Prediction of the Minority Class in Boosting
- N. Chawla, A. Lazarevic, L. Hall, K. Bowyer
- Computer Science
- European Conference on Principles of Data Mining…
- 22 September 2003
This paper presents a novel approach for learning from imbalanced data sets, based on a combination of the SMOTE algorithm and the boosting procedure, which shows improvement in prediction performance on the minority class and overall improved F-values.
New perspectives and methods in link prediction
- Ryan Lichtenwalter, Jake T. Lussier, N. Chawla
- Computer Science
- Knowledge Discovery and Data Mining
- 25 July 2010
This paper examines important factors for link prediction in networks and provides a general, high-performance framework for the prediction task. It presents an effective flow-based prediction algorithm, offers formal bounds on imbalance in sparse network link prediction, and employs an evaluation method appropriate for the observed imbalance.
Heterogeneous Graph Neural Network
- Chuxu Zhang, Dongjin Song, Chao Huang, A. Swami, N. Chawla
- Computer Science
- Knowledge Discovery and Data Mining
- 25 July 2019
HetGNN, a heterogeneous graph neural network model, is proposed that can outperform state-of-the-art baselines in various graph mining tasks: link prediction, recommendation, node classification and clustering, and inductive node classification and clustering.
Editorial: special issue on learning from imbalanced data sets
- N. Chawla, N. Japkowicz, Aleksander Kołcz
- Computer Science
- SIGKDD Explorations
- 1 June 2004
Data Mining for Imbalanced Datasets: An Overview
- N. Chawla
- Computer Science
- The Data Mining and Knowledge Discovery Handbook
- 2005
This chapter discusses some of the sampling techniques used for balancing datasets and the performance measures more appropriate for mining imbalanced datasets.
A Deep Neural Network for Unsupervised Anomaly Detection and Diagnosis in Multivariate Time Series Data
- Chuxu Zhang, Dongjin Song, N. Chawla
- Computer Science
- AAAI Conference on Artificial Intelligence
- 20 November 2018
This paper proposes a Multi-Scale Convolutional Recurrent Encoder-Decoder (MSCRED) to perform anomaly detection and diagnosis in multivariate time series data, and demonstrates that MSCRED can outperform state-of-the-art baseline methods.
SVMs Modeling for Highly Imbalanced Classification
- Yuchun Tang, Yanqing Zhang, N. Chawla, S. Krasser
- Computer Science
- IEEE Transactions on Systems, Man, and…
- 1 February 2009
Of the four SVM variations considered in this paper, the novel granular SVMs-repetitive undersampling algorithm (GSVM-RU) is the best in terms of both effectiveness and efficiency.
A unifying view on dataset shift in classification
- J. G. Moreno-Torres, T. Raeder, R. Alaíz-Rodríguez, N. Chawla, F. Herrera
- Computer Science
- Pattern Recognition
- 2012