Learn More
Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified. To identify further genetic risk factors, here we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the(More)
We develop a cyclical blockwise coordinate descent algorithm for the multi-task Lasso that efficiently solves problems with thousands of features and tasks. The main result shows that a closed-form Winsorization operator can be obtained for the sup-norm penalized least squares regression. This allows the algorithm to find solutions to very large-scale(More)
Recent methods for estimating sparse undirected graphs for real-valued data in high dimensional problems rely heavily on the assumption of normality. We show how to use a semiparametric Gaus-sian copula—or " nonparanormal " —for high dimensional inference. Just as additive models extend linear models by replacing linear functions with a set of(More)
We present a new class of models for high-dimensional nonparametric regression and classification called sparse additive models (SpAM). Our methods combine ideas from sparse linear modeling and additive nonparametric regression. We derive a method for fitting the models that is effective even when the number of covariates is larger than the sample size. A(More)
In this paper, we propose a semiparametric approach, named nonparanor-mal skeptic, for efficiently and robustly estimating high dimensional undirected graph-ical models. To achieve modeling flexibility, we consider Gaussian Copula graphical models (or the nonparanormal) as proposed by Liu et al. (2009). To achieve estimation robustness, we exploit(More)
Protein-protein interactions play a pivotal role in coordinating many cellular processes. Determination of subcellular localization of interacting proteins and visualization of dynamic interactions in living cells are crucial to elucidate cellular functions of proteins. Using fluorescent proteins, we previously developed a bimolecular fluorescence(More)
We study the estimation of low rank matrices via nonconvex optimization. Compared with convex relaxation, nonconvex optimization exhibits superior empirical performance for large scale instances of low rank matrix estimation. However, the understanding of its theoretical guarantees are limited. In this paper, we define the notion of projected oracle(More)
In this paper, we focus on the application of the Peaceman-Rachford splitting method (PRSM) to a convex minimization model with linear constraints and a separable objective function. Compared to the Douglas-Rachford splitting method (DRSM), another splitting method from which the alternating direction method of multipliers originates, PRSM requires more(More)
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promises for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and(More)