Rajen Dinesh Shah

Learn More
Stability selection was recently introduced by Meinshausen and Bühlmann as a very general technique designed to improve the performance of a variable selection algorithm. It is based on aggregating the results of applying a selection procedure to subsamples of the data. We introduce a variant, called complementary pairs stability selection, and derive(More)
Finding interactions between variables in large and high-dimensional data sets is often a serious computational challenge. Most approaches build up interaction sets incremen-tally, adding variables in a greedy fashion. The drawback is that potentially informative high-order interactions may be overlooked. Here, we propose an alternative approach for(More)
PURPOSE Current diagnostic tests for diffuse large B-cell lymphoma use the updated WHO criteria based on biologic, morphologic, and clinical heterogeneity. We propose a refined classification system based on subset-specific B-cell-associated gene signatures (BAGS) in the normal B-cell hierarchy, hypothesizing that it can provide new biologic insight and(More)
We study large-scale regression analysis where both the number of variables, p, and the number of observations, n, may be large and in the order of millions or more. This is very different from the now well-studied high-dimensional regression context of " large p, small n ". For example, in our " large p, large n " setting, an ordinary least squares(More)
When performing regression on a dataset with p variables, it is often of interest to go beyond using main linear effects and include interactions as products between individual variables. For small-scale problems, these interactions can be computed explicitly but this leads to a computational complexity of at least O(p2) if done naively. This cost can be(More)
  • 1