Principal component analysis optimization of a PM2.5 land use regression model with small monitoring network.


The use of land-use regression (LUR) techniques for modeling small-scale variations in intraurban air pollution has increased over the last decade. The most appealing feature of LUR techniques is their economical monitoring requirements. In this study, principal component analysis (PCA) was employed to optimize an LUR model for PM2.5. The PM2.5 monitoring network consisted of 13 sites, which constrained the regression model to a maximum of one independent variable. An optimized surrogate of vehicle emissions was produced by PCA and employed as the predictor variable in the model. The surrogate consisted of a linear combination of several traffic variables (e.g., vehicle miles traveled, speed, traffic demand, road length, and time) obtained from a road network used for traffic modeling. The vehicle-emissions surrogate produced by the PCA had greater predictive capacity (R2 = 0.458) than the best-performing single traffic variable, Traffic Demand summarized for a 1 km buffer (R2 = 0.341). The PCA-based method was effective at increasing the fit of an ordinary LUR model by optimizing the utilization of a PM2.5 dataset from a small-n monitoring network. In general, the method can contribute to LUR techniques in two major ways: 1) by improving the predictive power of the input variable, through substituting a principal component for a single variable, and 2) by creating an orthogonal set of predictor variables, thus fulfilling the no-collinearity assumption of linear regression methods. The proposed PCA method should be universally applicable to LUR methods and will expand their economic attractiveness.
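
The general idea of replacing a single traffic variable with a principal component can be illustrated with a minimal sketch. It is not the study's implementation: it assumes the surrogate is simply the first principal component of standardized traffic variables (the abstract does not state which component or preprocessing was used), and the 13-site dataset, the variable count, and the helper names `pca_scores` and `r_squared` are all hypothetical, with synthetic data generated only for illustration.

```python
import numpy as np

def pca_scores(X):
    """PCA on standardized columns of X via SVD.

    Returns (scores, loadings): scores has one column per component,
    loadings has one row per component (unit-length principal axes).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt.T, Vt

def r_squared(x, y):
    """R^2 of the one-predictor least-squares fit y ~ intercept + slope * x."""
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data (assumption): 13 sites, 5 correlated traffic
# variables that all reflect one latent traffic intensity plus noise.
rng = np.random.default_rng(0)
n_sites, n_vars = 13, 5
latent = rng.normal(size=n_sites)
traffic = latent[:, None] + 0.8 * rng.normal(size=(n_sites, n_vars))
pm25 = 10.0 + 2.0 * latent + rng.normal(size=n_sites)

scores, loadings = pca_scores(traffic)
surrogate = scores[:, 0]  # first principal component as the single predictor

print("R^2 using PC1 surrogate:", round(r_squared(surrogate, pm25), 3))
for j in range(n_vars):
    print(f"R^2 using traffic variable {j}:",
          round(r_squared(traffic[:, j], pm25), 3))

# Component scores are mutually orthogonal, so using several of them as
# predictors would not introduce collinearity.
gram = scores.T @ scores
print("max off-diagonal of scores' Gram matrix:",
      np.abs(gram - np.diag(np.diag(gram))).max())
```

With only 13 sites, a single predictor keeps the model within the constraint described above; whether the component outperforms every individual variable depends on the data, so the comparison is printed rather than assumed.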

DOI: 10.1016/j.scitotenv.2012.02.068