SMOTE: Synthetic Minority Over-sampling Technique
Combining over-sampling of the minority (abnormal) class with under-sampling of the majority (normal) class can achieve better classifier performance (in ROC space) than under-sampling the majority class alone. The method is evaluated using the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy.
metapath2vec: Scalable Representation Learning for Heterogeneous Networks
Two scalable representation learning models, namely metapath2vec and metapath2vec++, are developed that are able to not only outperform state-of-the-art embedding models in various heterogeneous network mining tasks, but also discern the structural and semantic correlations between diverse network objects.
SMOTEBoost: Improving Prediction of the Minority Class in Boosting
This paper presents a novel approach for learning from imbalanced data sets, based on a combination of the SMOTE algorithm and the boosting procedure, which shows improvement in prediction performance on the minority class and overall improved F-values.
New perspectives and methods in link prediction
This paper examines important factors for link prediction in networks, provides a general, high-performance framework for the prediction task, presents an effective flow-based prediction algorithm and formal bounds on imbalance in sparse network link prediction, and employs an evaluation method appropriate for the observed imbalance.
Data Mining for Imbalanced Datasets: An Overview
  • N. Chawla
  • Computer Science
    The Data Mining and Knowledge Discovery Handbook
  • 2005
This chapter discusses some of the sampling techniques used for balancing datasets and the performance measures more appropriate for mining imbalanced datasets.
SVMs Modeling for Highly Imbalanced Classification
Of the four SVM variations considered in this paper, the novel granular SVMs-repetitive undersampling algorithm (GSVM-RU) is the best in terms of both effectiveness and efficiency.
Heterogeneous Graph Neural Network
HetGNN, a heterogeneous graph neural network model, is proposed that can outperform state-of-the-art baselines in various graph mining tasks, namely link prediction, recommendation, node classification and clustering, and inductive node classification and clustering.
A unifying view on dataset shift in classification
This work attempts to present a unifying framework through the review and comparison of some of the most important works in the literature on dataset shift, where different names are often used to refer to the same concepts.
A Deep Neural Network for Unsupervised Anomaly Detection and Diagnosis in Multivariate Time Series Data
This paper proposes a Multi-Scale Convolutional Recurrent Encoder-Decoder (MSCRED) to perform anomaly detection and diagnosis in multivariate time series data, and demonstrates that MSCRED can outperform state-of-the-art baseline methods.