Rajen Dinesh Shah

Stability selection was recently introduced by Meinshausen and Bühlmann as a very general technique designed to improve the performance of a variable selection algorithm. It is based on aggregating the results of applying a selection procedure to subsamples of the data. We introduce a variant, called complementary pairs stability selection, and derive…
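The subsample-and-aggregate idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "complementary pairs" means repeatedly splitting the data into two disjoint halves and running the base selector on each half, and the base selector `top_q_by_correlation`, the number of pairs, and the threshold value are all hypothetical choices made for the example.

```python
import numpy as np


def top_q_by_correlation(X, y, q):
    """Hypothetical base selector: the q columns most correlated with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Proportional to |marginal correlation|; the small constant avoids 0/0.
    scores = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) + 1e-12)
    return np.argsort(scores)[-q:]


def complementary_pairs_stability_selection(
    X, y, selector, n_pairs=50, threshold=0.6, seed=0
):
    """Aggregate a selector over disjoint half-samples of the data.

    selector(X_sub, y_sub) must return an array of unique column indices.
    Returns the indices whose selection frequency reaches `threshold`,
    together with the full vector of selection frequencies.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    half = n // 2
    counts = np.zeros(p)
    for _ in range(n_pairs):
        perm = rng.permutation(n)
        # Two disjoint subsamples of size floor(n / 2) form one pair.
        for idx in (perm[:half], perm[half:2 * half]):
            counts[selector(X[idx], y[idx])] += 1
    freq = counts / (2 * n_pairs)
    return np.flatnonzero(freq >= threshold), freq
```

A call such as `complementary_pairs_stability_selection(X, y, lambda A, b: top_q_by_correlation(A, b, 5))` returns the variables that were picked in at least 60% of the half-samples. Any selection procedure with the same signature, for example a Lasso fit at a fixed penalty, could be substituted for the hypothetical selector.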
We would like to begin by congratulating the authors on their fine paper. Handling highly correlated variables is one of the most important issues facing practitioners in high-dimensional regression problems, and in some ways it is surprising that it has not received more attention up to this point. The authors have made substantial progress towards…
Finding interactions between variables in large and high-dimensional data sets is often a serious computational challenge. Most approaches build up interaction sets incrementally, adding variables in a greedy fashion. The drawback is that potentially informative high-order interactions may be overlooked. Here, we propose an alternative approach for…
Purpose: Current diagnostic tests for diffuse large B-cell lymphoma use the updated WHO criteria based on biologic, morphologic, and clinical heterogeneity. We propose a refined classification system based on subset-specific B-cell-associated gene signatures (BAGS) in the normal B-cell hierarchy, hypothesizing that it can provide new biologic insight and…
Spheres are widely used as the basis for the design of multiparticulate drug delivery systems. Although the extrusion and spheronization processes are frequently used to produce such spheres, there is a lack of basic understanding of these processes and of the requisite properties of excipients and formulations. It is hypothesized that the rheological or…
Extrusion-spheronization is a popular means of producing spheres which can be coated to form a controlled-release system. In the extrusion process, stress is necessary to force a wet mass through small orifices, and as a result, frictional heat builds up at the screen. Therefore, the quantitative measurement of the screen pressure and screen temperature is…
We study the problem of high-dimensional regression when there may be interacting variables. Approaches using sparsity-inducing penalty functions such as the Lasso can be useful for producing interpretable models. However, when the number of variables runs into the thousands, and so even two-way interactions number in the millions, these methods may become…
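The scale mentioned in the abstract above follows from simple counting: p variables give p(p - 1)/2 distinct two-way interactions. The short sketch below only illustrates that arithmetic; the function name and the example values of p are chosen for this example and do not come from the paper.

```python
from math import comb


def n_two_way_interactions(p):
    """Number of distinct unordered pairs among p variables."""
    return comb(p, 2)


# A few thousand variables already yield millions of candidate interactions.
for p in (1000, 2000, 5000):
    print(p, n_two_way_interactions(p))
```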
Large-scale regression problems where both the number of variables, p, and the number of observations, n, may be large and in the order of millions or more, are becoming increasingly common. Typically the data are sparse: only a fraction of a percent of the entries in the design matrix are non-zero. Nevertheless, often the only computationally feasible…
We study large-scale regression analysis where both the number of variables, p, and the number of observations, n, may be large and in the order of millions or more. This is very different from the now well-studied high-dimensional regression context of “large p, small n”. For example, in our “large p, large n” setting, an ordinary least squares estimator…