• Publications
Weighted Distance Weighted Discrimination and Its Asymptotic Properties
While Distance Weighted Discrimination (DWD) is an appealing approach to classification in high dimensions, it was designed for balanced datasets. In the case of unequal costs, biased sampling, or …
  • 77
  • 10
Flexible high-dimensional classification machines and their asymptotic properties
We investigate two popular large-margin classification methods, Support Vector Machine (SVM) and Distance Weighted Discrimination (DWD), in two contexts: high-dimensional, low-sample-size data and imbalanced data.
  • 26
  • 4
Adaptive weighted learning for unbalanced multicategory classification
In multicategory classification, standard techniques typically treat all classes equally. This treatment can be problematic when the dataset is unbalanced in the sense that certain classes have very …
  • 62
  • 2
Distance-weighted Support Vector Machine
A novel linear classification method that possesses the merits of both the Support Vector Machine (SVM) and the Distance-weighted Discrimination (DWD) is proposed in this article.
  • 22
  • 2
Partial Least Squares (PLS) Applied to Medical Bioinformatics
We explain how to address multicollinearity in least squares regression by performing a hypothesis-driven preliminary research study and sensitivity analysis rather than a combinatorial analysis.
  • 17
  • 1
Stabilized Nearest Neighbor Classifier and its Statistical Properties
In this article, we introduce a general measure of classification instability (CIS) to quantify the sampling variability of the prediction made by a classification method.
  • 9
  • 1
Asymptotic Properties of Distance-Weighted Discrimination
While Distance-Weighted Discrimination (DWD) is an appealing approach to classification in high dimensions, it was designed for balanced data sets. In the case of unequal costs, biased sampling or …
  • 9
  • 1
Sparse Fisher's linear discriminant analysis for partially labeled data
We propose a semi-supervised sparse LDA classifier to take advantage of the seemingly useless unlabeled data.
  • 4
  • 1
Significance Analysis of High-Dimensional, Low-Sample Size Partially Labeled Data
We propose a significance analysis approach for partially labeled data that makes use of the whole data and tries to test the class difference as if all the labels were observed.
  • 3
  • 1
Learning Confidence Sets using Support Vector Machines
We propose a support vector classifier to construct confidence sets by empirical risk minimization.
  • 6