Gerda Claeskens

Support vector machines for classification have the advantage that the curse of dimensionality is circumvented. It has been shown that a reduction of the dimension of the input space leads to even better results. For this purpose, we propose two information criteria which can be computed directly from the definition of the support vector machine. We assess...
Penalized spline-based additive models allow a simple mixed model representation where the variance components control departures from linear models. The smoothing parameter is the ratio between the random-coefficient and error variances, and tests for linear regression reduce to tests for zero random-coefficient variances. We propose exact likelihood and...
Dealing with missing data via parametric multiple imputation methods usually implies stating several strong assumptions about both the distribution of the data and the underlying regression relationships. If such parametric assumptions do not hold, the multiply imputed data are not appropriate and might produce inconsistent estimators and thus misleading...
When the data do not come from the assumed parametric model, the usual asymptotic chi-squared distribution under the null hypothesis remains valid for "robustified" Wald and score test statistics. In this paper we compare the performance of this chi-squared approximation to that of a semiparametric bootstrap method. The bootstrap approximation is based...
Recently, Hjort and Claeskens (2003) developed an asymptotic theory for model selection, model averaging and post-model selection/averaging inference using likelihood methods in parametric models, along with associated confidence statements. In this paper, we consider a semiparametric version of this problem, wherein the likelihood depends on parameters and...
Standard variable selection procedures, primarily developed for the construction of outcome prediction models, are routinely applied when assessing exposure effects in observational studies. We argue that this tradition is sub-optimal and prone to yield bias in exposure effect estimators as well as their corresponding uncertainty estimators. We weigh the...
Mixed models, with both random and fixed effects, are most often estimated on the assumption that the random effects are normally distributed. In this paper we propose several formal tests of the hypothesis that the random effects and/or errors are normally distributed. Most of the proposed methods can be extended to generalized linear models where tests...
To predict future values of a time series, one needs to specify a forecasting model. A popular choice is an autoregressive time series model, where the order of the model is chosen by an information criterion. We propose an extension of the Focussed Information Criterion (FIC) for model-order selection with focus on a high predictive...
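
For readers unfamiliar with the baseline that the last abstract starts from, here is a minimal sketch of choosing an autoregressive order with an information criterion. It uses plain AIC with a least-squares fit, not the FIC extension the abstract proposes. The function names (`ar_aic`, `select_order`), the simulated AR(2) series, and the Gaussian form of the AIC are assumptions made for illustration only.

```python
import numpy as np


def ar_aic(series, order, max_order):
    """AIC of an AR(order) model fitted by least squares.

    All candidate orders are fitted on the same observations
    (those after the first max_order) so their AIC values are comparable.
    """
    y = np.asarray(series, dtype=float)
    target = y[max_order:]
    # Column k-1 holds the series lagged by k steps, aligned with target.
    lags = np.column_stack(
        [y[max_order - k: len(y) - k] for k in range(1, order + 1)]
    )
    design = np.column_stack([np.ones(len(target)), lags])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    n = len(target)
    rss = float(resid @ resid)
    # Gaussian AIC up to an additive constant; the parameter count is
    # order lag coefficients + intercept + error variance.
    return n * np.log(rss / n) + 2 * (order + 2)


def select_order(series, max_order):
    """Return the AIC-minimising order in 1..max_order and all scores."""
    scores = {p: ar_aic(series, p, max_order) for p in range(1, max_order + 1)}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data: a stationary AR(2) series, simulated for illustration.
rng = np.random.default_rng(0)
n_obs = 500
x = np.zeros(n_obs)
for t in range(2, n_obs):
    x[t] = 0.6 * x[t - 1] - 0.5 * x[t - 2] + rng.normal()

best_order, aic_scores = select_order(x, max_order=6)
print("selected order:", best_order)
```

AIC scores every candidate by overall fit. A focussed criterion instead ranks models by the estimated risk for a specific quantity of interest, here the prediction, so the two approaches can select different orders.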