Trung Le

We introduce a new model for handling imbalanced data sets in novelty detection problems, where the normal class of the training data set can be either the majority or the minority class. The key idea is to construct an optimal hypersphere such that the inside margin between the surface of this sphere and the normal data and the outside margin between that surface and the…
Support Vector Machine (SVM) is a very well-known tool for classification and regression problems. Many applications require SVMs with non-linear kernels for accurate classification. Training time complexity for SVMs with non-linear kernels is typically quadratic in the size of the training dataset. In this paper, we depart from the very well-known…
Current data description learning methods for novelty detection, such as support vector data description and small sphere with large margin, construct a spherically shaped boundary around a normal data set to separate this set from abnormal data. The volume of this sphere is minimized to reduce the chance of accepting abnormal data. However, those learning…
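For readers unfamiliar with the sphere construction mentioned above, the standard soft-margin support vector data description problem can be written as follows. This is the textbook formulation, not necessarily the one used in the truncated abstract; the symbols $R$ (radius), $c$ (center), $\xi_i$ (slack variables), $C$ (trade-off constant), and $\phi$ (feature map) are notation assumed here for illustration.

```latex
\begin{aligned}
\min_{R,\,c,\,\xi}\quad & R^{2} + C \sum_{i=1}^{n} \xi_{i} \\
\text{s.t.}\quad & \lVert \phi(x_{i}) - c \rVert^{2} \le R^{2} + \xi_{i}, \quad i = 1, \dots, n, \\
& \xi_{i} \ge 0, \quad i = 1, \dots, n.
\end{aligned}
```

Minimizing $R^{2}$ shrinks the sphere, while the slack terms allow a few normal points to fall outside it at a cost controlled by $C$.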
One of the most challenging problems in kernel online learning is to bound the model size. Budgeted kernel online learning addresses this issue by bounding the model size to a predefined budget. However, determining an appropriate value for such a predefined budget is arduous. In this paper, we propose the Nonparametric Budgeted Stochastic Gradient Descent…
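The idea of bounding the model size to a predefined budget can be illustrated with a minimal sketch. The code below is a budgeted kernel perceptron that discards its oldest support vector once the budget is exceeded; it is a simple stand-in chosen for illustration, not the Nonparametric Budgeted Stochastic Gradient Descent method named in the abstract. The class name, the RBF kernel, and the `budget` and `gamma` parameters are all assumptions.

```python
import math
from collections import deque


def rbf(x, y, gamma=1.0):
    """RBF kernel between two equal-length sequences of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


class BudgetedKernelPerceptron:
    """Kernel perceptron whose support set never exceeds `budget` points.

    Hypothetical illustration: when the budget is exceeded, the oldest
    support vector is dropped (one of several possible removal rules).
    """

    def __init__(self, budget, gamma=1.0):
        self.budget = budget
        self.gamma = gamma
        self.support = deque()  # (point, label) pairs, oldest first

    def score(self, x):
        # Signed sum of kernel values against the stored support vectors.
        return sum(y * rbf(sx, x, self.gamma) for sx, y in self.support)

    def predict(self, x):
        return 1 if self.score(x) >= 0 else -1

    def update(self, x, y):
        # Store the point only on a mistake (or a zero score).
        if y * self.score(x) <= 0:
            self.support.append((x, y))
            if len(self.support) > self.budget:
                self.support.popleft()  # enforce the budget


model = BudgetedKernelPerceptron(budget=2)
for point, label in [((0.0, 0.0), -1), ((3.0, 3.0), 1), ((0.1, 0.0), -1)]:
    model.update(point, label)
```

The third point is already classified correctly, so it is not stored and the model keeps exactly two support vectors. The difficulty noted in the abstract is visible here: `budget` has to be chosen by hand before training.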
Support Vector Data Description (SVDD) is a well-known supervised learning method for novelty detection. For its classification task, SVDD requires a fully labeled dataset. Nonetheless, contemporary datasets typically consist of a collection of labeled data samples together with a much larger collection of unlabeled ones. This fact impedes the usage of SVDD…
One-class Support Vector Machine (OCSVM) is a well-known method for novelty detection. However, OCSVM treats all negative data samples as a single common symbol and is therefore unable to use the information they carry. Furthermore, OCSVM requires a fully labeled data set and cannot work efficiently with data sets containing both labeled and unlabeled data…
Background: The Comparative Data Analysis Ontology (CDAO) is an ontology developed as part of the EvoInfo and EvoIO groups, supported by the National Evolutionary Synthesis Center, to provide semantic descriptions of data and transformations commonly found in the domain of phylogenetic analysis. The core concepts of the ontology enable the description of…