Finding relevant spectral regions between spectroscopic techniques by use of cross model validation and partial least squares regression.

Abstract

In this paper, we extend the concept of cross model validation (CMV) to multiple X and Y variables, where different spectroscopic techniques serve as X and Y data in a regression context. For the first dataset, on marzipan samples, the main objective was to find significant regions in the spectral data and to discuss the issue of false discovery, i.e. combinations of variables that are erroneously found to be significant. A permutation test within the framework of CMV showed that no regression coefficients in the partial least squares regression (PLSR) model between FT-IR and VIS/NIR spectra were significant at the 5% level. We believe the reason is that CMV acts as a strong filter against spurious correlations. Corresponding CH- and OH-bands between FT-IR and NIR spectra gave significant regions. For the second dataset, the results from CMV are interpreted in more detail with chemical background knowledge in mind. Most of the significant regions found between the Raman and NIR spectra could be interpreted from the chemical composition of the oil mixtures. Some regions were more difficult to interpret, which could be due to systematic baseline effects in the NIR data.
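
As a rough illustration of the idea behind CMV, the following Python sketch wraps a jack-knife significance test on PLS regression coefficients in an outer cross-validation loop and reports how often each X variable is flagged. It is not the authors' implementation. The function names, the one-component PLS1 model, the fixed threshold `t_crit=2.0`, the segment counts, and the synthetic data are all assumptions made for illustration. With several Y variables, one simple option would be to run it once per Y column.

```python
import numpy as np

def pls1_coefs(X, y):
    """Regression coefficients of a one-component PLS1 model (centered data)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc                 # weight vector: covariance direction
    w /= np.linalg.norm(w)
    t = Xc @ w                    # scores
    q = (t @ yc) / (t @ t)        # y-loading
    return w * q

def jackknife_significant(X, y, n_segments=5, t_crit=2.0):
    """Flag coefficients that are large relative to their jack-knife spread."""
    b = pls1_coefs(X, y)
    folds = np.array_split(np.arange(len(y)), n_segments)
    b_sub = np.array([pls1_coefs(np.delete(X, f, axis=0), np.delete(y, f))
                      for f in folds])
    # Jack-knife style standard error (assumed form, for illustration only)
    s = np.sqrt(((b_sub - b) ** 2).sum(axis=0) * (n_segments - 1) / n_segments)
    return np.abs(b) > t_crit * s

def cmv_selection_frequency(X, y, n_outer=5, n_inner=5, t_crit=2.0):
    """Fraction of outer segments in which each variable is flagged.

    Assumes the rows are already in random order.
    """
    folds = np.array_split(np.arange(len(y)), n_outer)
    hits = np.zeros(X.shape[1])
    for f in folds:
        # The left-out outer segment plays no part in the inner significance test
        hits += jackknife_significant(np.delete(X, f, axis=0),
                                      np.delete(y, f), n_inner, t_crit)
    return hits / n_outer

# Synthetic data (hypothetical): only variable 0 is related to y
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=100)

freq = cmv_selection_frequency(X, y)
# Reference run with the link between X and y broken by permutation
freq_null = cmv_selection_frequency(X, rng.permutation(y))
```

A variable would then be retained only if its frequency in `freq` is high, for example if it is flagged in every outer segment. Comparing against `freq_null` gives an impression of how often variables are flagged when no real relationship exists.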

Cite this paper

@article{Westad2007FindingRS, title={Finding relevant spectral regions between spectroscopic techniques by use of cross model validation and partial least squares regression.}, author={Frank Westad and Nils Kristian Afseth and Rasmus Bro}, journal={Analytica chimica acta}, year={2007}, volume={595 1-2}, pages={323-7} }