Guangzhe Fan

Tree models are valuable tools for predictive modeling and data mining. Traditional tree-growing methodologies such as CART are known to suffer from problems including greediness, instability, and bias in split rule selection. Alternative tree methods, including Bayesian CART (Chipman et al., 1998; Denison et al., 1998), random forests (Breiman, 2001a),(More)
Kernel-Induced Classification Trees and Random Forests. Guangzhe Fan, Department of Statistics and Actuarial Science, University of Waterloo. (This is a revised version for Technometrics submission, © all rights reserved.) Abstract: A recursive-partitioning procedure using kernel functions is proposed for classification problems. We call it KICT: kernel-induced(More)
The compress-and-forward relay scheme developed by Cover and El Gamal (1979) is modified by realizing that it is not necessary for the destination to decode the compressed observation of the relay, and that even if the compressed observation is to be decoded, this can be done more easily by joint decoding with the original message, rather than in a successive(More)
Scientists and others today often collect samples of curves and other functional data. Multivariate classification methods cannot be applied directly to functional data because of the curse of dimensionality and the difficulty of taking into account the correlation and order of functional data. We extend the kernel-induced random forest method(More)
We propose a simple kernel-based nearest neighbor approach for handwritten digit classification. The "distance" here is actually a kernel defining the similarity between two images. We carefully study the effects of different numbers of neighbors and weighting schemes and report the results. With only a few nearest neighbors (or most similar images) to vote,(More)
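The idea in the abstract above, voting among the few most similar training images with similarity given by a kernel, can be illustrated with a minimal sketch. The abstract does not name the kernel or the weighting schemes studied, so the RBF kernel, the `gamma` value, the function names, and the similarity-weighted vote below are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.05):
    """Similarity between two flattened images.
    The RBF form and gamma are assumed; the abstract does not specify a kernel."""
    diff = a - b
    return float(np.exp(-gamma * np.dot(diff, diff)))

def kernel_knn_predict(x, train_X, train_y, k=3, weighted=True, kernel=rbf_kernel):
    """Classify x by a vote among the k most similar training points.
    weighted=True weights each vote by its kernel similarity (one possible scheme);
    weighted=False gives every neighbor an equal vote."""
    sims = np.array([kernel(x, t) for t in train_X])
    top = np.argsort(sims)[::-1][:k]  # indices of the k largest similarities
    votes = {}
    for i in top:
        label = train_y[i]
        votes[label] = votes.get(label, 0.0) + (sims[i] if weighted else 1.0)
    return max(votes, key=votes.get)

# Toy stand-in for flattened digit images: two well-separated classes.
train_X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
train_y = [0, 0, 1, 1, 1]
prediction = kernel_knn_predict(np.array([0.2, 0.1]), train_X, train_y, k=3)
```

Because neighbors are ranked by similarity rather than distance, the largest kernel values are selected; swapping in a different kernel only requires passing another function as `kernel`.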
High-resolution nuclear magnetic resonance (NMR) spectra contain important biomarkers that have potential for early diagnosis of disease and subsequent monitoring of its progression. Traditional feature extraction and analysis methods have been carried out in the original frequency spectrum domain. In this study, we conduct feature selection based on a(More)