An Unbalanced Dataset Classification Approach Based on v-Support Vector Machine

@article{Zhao2006AnUD,
  title={An Unbalanced Dataset Classification Approach Based on v-Support Vector Machine},
  author={Yinggang Zhao and Qinming He},
  journal={2006 6th World Congress on Intelligent Control and Automation},
  year={2006},
  volume={2},
  pages={10496--10501},
  url={https://api.semanticscholar.org/CorpusID:17786485}
}
The ν-support vector machine (ν-SVM) is a reformulation of the standard SVM whose parameter ν has a more intuitive meaning than C (the penalty constant in the standard SVM); the paper also gives an equation relating ν and C.
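
For orientation, the ν-SVM classification problem can be written as below. This is a sketch of the standard formulation from "New Support Vector Algorithms" (Schölkopf et al.), not necessarily the notation or the ν-C equation used by Zhao and He; the symbols w, b, ξ_i, ρ and ℓ (the number of training points) are assumptions taken from that formulation.

```latex
% Standard nu-SVM primal (Schölkopf et al.); notation is assumed, not taken from Zhao and He.
\[
\begin{aligned}
\min_{w,\,b,\,\xi,\,\rho}\quad & \tfrac{1}{2}\lVert w\rVert^{2} - \nu\rho + \frac{1}{\ell}\sum_{i=1}^{\ell}\xi_i \\
\text{s.t.}\quad & y_i\bigl(\langle w, x_i\rangle + b\bigr) \ge \rho - \xi_i,\qquad \xi_i \ge 0,\qquad \rho \ge 0 .
\end{aligned}
\]
```

In this formulation ν is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors, which is the sense in which it is easier to interpret than C. In the same standard treatment, if the ν-SVM solution has ρ > 0, a C-SVM with C = 1/ρ (with the slack term scaled as C/ℓ) yields the same decision function.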

Artificial intelligence techniques for unbalanced datasets in real world classification tasks

In this chapter a survey on the problem of classification tasks in unbalanced datasets is presented, and the main approaches to improve the generally not satisfactory results obtained by standard classifiers such as decision trees and support vector machines are described.

Feature Relevance Assessment in Automatic Inter-patient Heart Beat Classification

The results show that the choice of features is of major importance and that some commonly used feature sets do not help classification performance, which shows that this issue must be addressed to grasp the importance of the pathological cases.

Multi-Class Classification via Subspace Modeling

Experimental results show that the proposed Multi-class Subspace Modeling (MSM) classification framework outperforms the compared classifiers on 10 data sets, on 8 of them at a significance confidence level higher than 99.5%.

Analysis of binary feature mapping rules for promoter recognition in imbalanced DNA sequence datasets using Support Vector Machine

R. Damasevicius · Biology, Computer Science · 2008
A machine learning method, called support vector machine (SVM), is used for classification of DNA sequences and promoter recognition and the results of classification for drosophila and human sequence datasets are presented.

Classification for Fraud Detection with Social Network Analysis

A new method is proposed that identifies patterns in the social networks of fraudulent organizations and uses them to enrich the description of each entity; the enriched description is then used jointly with balancing techniques to produce a better classifier for identifying fraud.

MovieGEN : A Movie Recommendation System

This paper implements MovieGEN, an expert system for movie recommendation that takes in the users’ personal information and predicts their movie preferences using well-trained support vector machine (SVM) models, and traverses the parameter space, enabling the customizability of the system.

Associating Financial Trading Volume Volatility and Information Volume based on Neural Network and Support Vector Machine

This paper attempts to employ two approaches to forecast the stock market volatility using the online financial information, and delves into the associations between the trading volume volatility and the online financial information volume.

Financial volatility forecasting based on inter-company connections and support vector machine

An approach is presented that uses inter-company connections to mine the associations between volatility and online financial information volume for the US market, applying GARCH theory in conjunction with a support vector machine to obtain a modified non-linear learning model.

A Tutorial on Support Vector Machines for Pattern Recognition

Several arguments that support the observed high accuracy of SVMs are reviewed, and numerous examples and proofs of most of the key theorems are given.

SMOTE: Synthetic Minority Over-sampling Technique

A combination of the method of oversampling the minority (abnormal) class and under-sampling the majority class can achieve better classifier performance (in ROC space) and a combination of these methods and the area under the Receiver Operating Characteristic curve (AUC) and the ROC convex hull strategy is evaluated.

Class-Boundary Alignment for Imbalanced Dataset Learning

The class-boundary-alignment algorithm is proposed to augment SVMs to deal with imbalanced training-data problems posed by many emerging applications (e.g., image retrieval, video surveillance, and gene profiling).

New Support Vector Algorithms

A new class of support vector algorithms for regression and classification is proposed whose parameterization eliminates one of the other free parameters of the algorithm: the accuracy parameter in the regression case, and the regularization constant C in the classification case.

Text Categorization with Support Vector Machines: Learning with Many Relevant Features

SVMs achieve substantial improvements over the currently best performing methods and behave robustly over a variety of different learning tasks, eliminating the need for manual parameter tuning.

Prediction of Generalization Ability in Learning Machines

The main goal of the dissertation is to merge theory and practice: to develop theoretically based, but experimentally adapted tools that allow an accurate prediction of the generalization error of an arbitrarily complex classifier.

Robust Classification for Imprecise Environments

It is shown that it is possible to build a hybrid classifier that will perform at least as well as the best available classifier for any target conditions, and in some cases, the performance of the hybrid actually can surpass that of the best known classifier.

Support vector machine active learning for image retrieval

This work proposes the use of a support vector machine active learning algorithm for conducting effective relevance feedback for image retrieval and achieves significantly higher search accuracy than traditional query refinement schemes after just three to four rounds of relevance feedback.

Controlling the Sensitivity of Support Vector Machines

Two schemes for adjusting the sensitivity and specificity of Support Vector Machines and the description of their performance using receiver operating characteristic (ROC) curves are discussed and their use on real-life medical diagnostic tasks is illustrated.