Learn More
In this article, we present FactoMineR an R package dedicated to multivariate data analysis. The main features of this package is the possibility to take into account different types of variables (quantitative or categorical), different types of structure on the data (a partition on the variables, a hierarchy on the variables, a partition on the(More)
This paper combines three exploratory data analysis methods, principal component methods, hierarchical clustering and partitioning, to enrich the description of the data. Principal component methods are used as preprocessing step for the clustering in order to denoise the data, transform categorical data in continuous ones or balanced groups of variables.(More)
A common approach to deal with missing values in multivariate exploratory data analysis consists in minimizing the loss function over all non-missing elements. This can be achieved by EM-type algorithms where an iterative imputation of the missing values is performed during the estimation of the axes and components. This paper proposes such an algorithm,(More)
Principal component analysis (PCA) is a standard technique to summarize the main structures of a data table containing the measurements of several quantitative variables for a number of individuals. Here, we study the case where some of the data values are missing and propose a review of methods which accommodate PCA to missing data. In plant ecology, this(More)
The available methods to handle missing values in principal component analysis only provide point estimates of the parameters (axes and components) and estimates of the missing values. To take into account the variability due to missing values a multiple imputation method is proposed. First a method to generate multiple imputed data sets from a principal(More)
This paper considers watermarking detection, also known as zero-bit watermarking. A watermark, carrying no hidden message, is inserted in content. The watermark detector checks for the presence of this particular weak signal in content. The paper aims at looking to this problem from a classical detection theory point of view, but with side information(More)
Cross-validation is a tried and tested approach to select the number of components in principal component analysis (PCA), however, its main drawback is its computational cost. In a regression (or in a non parametric regression) setting, criteria such as the general cross-validation one (GCV) provide convenient approximations to leave-one-out(More)
Principal component analysis (PCA) is a well-established dimensionality reduction method commonly used to denoise and visualise data. A classical PCA model is the fixed effect model in which data are generated as a fixed structure of low rank corrupted by noise. Under this model, PCA does not provide the best recovery of the underlying signal in terms of(More)