Fractionating wheat proteins by reversed-phase high-performance liquid chromatography yields extremely complex chromatograms. The data they contain may relate to many characteristics of milled wheat, such as the volume of a loaf of bread or the texture of the dough produced, but such relationships are not readily apparent from the raw data. We report our experiences with two dimension-reduction techniques that are widely cited in the chemometrics literature: principal component analysis (PCA) and partial least squares (PLS). Each of these methods replaces the original observation vectors with weighted averages of their components, where the weights are selected according to a data-dependent criterion. The analysis then operates on these weighted averages rather than on the original, high-dimensional data. To elucidate the properties of significance tests and other inferences, we focus on the special case where only one factor is selected. We show how to use simulation to compute the appropriate significance level of the regression on the PLS scores. The common technique of using the F distribution to compute significance levels for PLS regression can be extremely liberal, that is, it can reject the null hypothesis far more often than the nominal level suggests. The interpretation of PLS weights requires considerable care.
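The one-factor case described above can be sketched in a few lines. The abstract does not specify the simulation scheme, so the sketch below assumes a permutation-based Monte Carlo null (shuffling the response to break any link with the predictors). The function names `pls1_f_stat` and `simulated_p_value`, the data dimensions, and the random data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pls1_f_stat(X, y):
    """F statistic for regressing y on the first PLS score (one factor)."""
    Xc = X - X.mean(axis=0)          # center each predictor
    yc = y - y.mean()                # center the response
    w = Xc.T @ yc                    # weights proportional to covariance with y
    w /= np.linalg.norm(w)           # normalize to unit length
    t = Xc @ w                       # PLS scores: weighted combination of predictors
    b = (t @ yc) / (t @ t)           # least-squares slope of y on the scores
    resid = yc - b * t
    ss_reg = b * b * (t @ t)         # regression sum of squares (1 df)
    ss_res = resid @ resid           # residual sum of squares (n - 2 df)
    return ss_reg / (ss_res / (len(y) - 2))

def simulated_p_value(X, y, n_sim=999, seed=0):
    """Monte Carlo p-value: compare the observed F to F values from permuted y.

    Assumption: permuting y is used here as the null simulation; the weights
    are re-estimated for every permuted response, so the data-dependent
    selection of the weights is reflected in the null distribution.
    """
    rng = np.random.default_rng(seed)
    observed = pls1_f_stat(X, y)
    null = np.array([pls1_f_stat(X, rng.permutation(y)) for _ in range(n_sim)])
    return (1 + np.sum(null >= observed)) / (n_sim + 1)

# Illustrative data: 30 samples, 100 predictors, response unrelated to X.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 100))
y = rng.normal(size=30)

p_sim = simulated_p_value(X, y)

# Share of pure-noise responses whose F exceeds 4.20, the approximate
# 5% critical value of the F(1, 28) distribution.
null_f = np.array([pls1_f_stat(X, rng.normal(size=30)) for _ in range(200)])
share_above_nominal = np.mean(null_f > 4.20)
print(p_sim, share_above_nominal)
```

Because the weights are chosen to maximize covariance with the response, the scores are tuned to `y` before the regression is run. Comparing the resulting F statistic to a reference distribution that ignores this selection is what makes the nominal test liberal; `share_above_nominal` gives a rough view of how often that happens for unrelated data of this shape.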
ASJC Scopus subject areas
- Analytical Chemistry
- Process Chemistry and Technology
- Computer Science Applications