#### Abstract

Many aspects of the classical price-setting newsvendor problem have been studied in the literature, and most of the results pertain to the case where the price-demand relationship and the demand distribution are explicitly provided. In practice, however, both must be modeled and estimated from historical sales data. Furthermore, for statistical accuracy the demand response model must include many other drivers besides price, along with conditional heteroskedasticity effects in the demand distribution. In this paper we develop a practical framework for data-driven, distribution-free, multivariate modeling of the price-setting newsvendor problem, which includes statistical estimation and price optimization methods for estimating the optimal solutions and associated confidence intervals. The specific novelty of the framework is that the statistical estimation methods are carried out in close conjunction with the requirements of the optimization problem, which in this context requires the estimation of three distinct aspects of the demand distribution, namely the mean, the quantile, and the superquantile (also known as conditional value-at-risk, CVaR). We investigate different statistical estimators, which are broadly based on generalized linear regression (GLR), mixed-quantile regression (MQR), and superquantile regression (SQR), respectively. Our results extend the previous literature, notably by incorporating heteroskedasticity in MQR and by obtaining a novel and exact large-scale decomposition method that is computationally efficient for SQR (these extensions are of independent interest, besides the application discussed here). Our detailed computational experiments indicate that quantile-based methods such as MQR and SQR provide better solutions for a wide range of demand distributions, although for certain location-scale demand distributions that are similar to the Normal distribution, GLR may be preferable.
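To illustrate why the mean, quantile, and superquantile all enter the optimization, the following minimal sketch computes them from a plain demand sample for a fixed price. It is not the GLR, MQR, or SQR estimators of the paper: it assumes the textbook newsvendor with unit cost `cost`, no salvage value, and no shortage penalty, and it ignores covariates and heteroskedasticity. Under those assumptions the optimal order quantity is the demand quantile at level `(price - cost) / price`, and the optimal expected profit equals `price * mean - cost * superquantile`. All function and parameter names are hypothetical.

```python
import math


def empirical_quantile(samples, alpha):
    """Smallest sample value whose empirical CDF is at least alpha."""
    xs = sorted(samples)
    k = max(1, math.ceil(alpha * len(xs)))
    return xs[k - 1]


def empirical_superquantile(samples, alpha):
    """Upper-tail CVaR via the Rockafellar-Uryasev formula q + E[(X - q)^+] / (1 - alpha)."""
    q = empirical_quantile(samples, alpha)
    mean_excess = sum(max(x - q, 0.0) for x in samples) / len(samples)
    return q + mean_excess / (1.0 - alpha)


def newsvendor_summary(samples, price, cost):
    """Order quantity and expected profit for a fixed price.

    Assumption (not from the paper): no salvage value or shortage penalty,
    with 0 < cost < price so that the critical fractile lies in (0, 1).
    """
    if not 0 < cost < price:
        raise ValueError("require 0 < cost < price")
    alpha = (price - cost) / price          # critical fractile
    mean = sum(samples) / len(samples)
    quantile = empirical_quantile(samples, alpha)
    superquantile = empirical_superquantile(samples, alpha)
    return {
        "order_quantity": quantile,
        "expected_profit": price * mean - cost * superquantile,
    }


if __name__ == "__main__":
    demand = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(newsvendor_summary(demand, price=10.0, cost=5.0))
```

In the price-setting problem, these three quantities depend on price and the other demand drivers, which is why they are estimated by regression rather than from a single pooled sample as in this sketch.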