We analyze tests for long-run abnormal returns and document that two approaches yield well-specified test statistics in random samples. The first uses a traditional event study framework and buy-and-hold abnormal returns calculated using carefully constructed reference portfolios. Inference is based on either a skewness-adjusted t-statistic or the …
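The two ingredients named in this abstract, buy-and-hold abnormal returns measured against a reference portfolio and a skewness-adjusted t-statistic, can be sketched as follows. This is a minimal illustration of the standard formulas only; the construction of the reference portfolios is not shown, and the function names and inputs are illustrative rather than the authors' code.

```python
import numpy as np

def buy_and_hold_abnormal_returns(firm_returns, benchmark_returns):
    """BHAR over the event window: compounded firm return minus the
    compounded return on the matched reference portfolio.

    Both inputs are arrays of shape (n_firms, n_periods) holding simple
    (not log) periodic returns.
    """
    bh_firm = np.prod(1.0 + firm_returns, axis=1) - 1.0
    bh_bench = np.prod(1.0 + benchmark_returns, axis=1) - 1.0
    return bh_firm - bh_bench

def skewness_adjusted_t(bhar):
    """One common form of the skewness-adjusted t-statistic for the
    mean buy-and-hold abnormal return."""
    n = len(bhar)
    s = bhar.std(ddof=1)
    S = bhar.mean() / s
    gamma = np.mean((bhar - bhar.mean()) ** 3) / s ** 3  # skewness estimate
    return np.sqrt(n) * (S + gamma * S ** 2 / 3.0 + gamma / (6.0 * n))
```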
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about …
Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the …
A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons …
The least-absolute-deviations (LAD) estimator for a median-regression model does not satisfy the standard conditions for obtaining asymptotic refinements through use of the bootstrap because the LAD objective function is not smooth. This paper overcomes this problem by smoothing the objective function so that it becomes differentiable. The smoothed …
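As a rough sketch of the idea, the kinked LAD criterion Σ|y_i − x_i′β| can be rewritten as Σ(y_i − x_i′β)(2·1{y_i − x_i′β ≥ 0} − 1), and the indicator replaced by a smooth CDF-type kernel so the objective becomes differentiable. The snippet below does this with a normal CDF and a user-supplied bandwidth h; both are illustrative choices, not the paper's exact kernel or bandwidth rule.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def smoothed_lad_objective(beta, X, y, h):
    """Smoothed LAD criterion: |u| = u * (2*I(u >= 0) - 1), with the
    indicator replaced by a smooth CDF-type kernel evaluated at u / h.
    The standard normal CDF is used here purely for illustration."""
    u = y - X @ beta
    return np.mean(u * (2.0 * norm.cdf(u / h) - 1.0))

def smoothed_lad(X, y, h):
    """Minimize the smoothed objective, starting from the OLS fit."""
    beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
    res = minimize(smoothed_lad_objective, beta0, args=(X, y, h), method="BFGS")
    return res.x
```

Because the smoothed criterion is differentiable, a gradient-based optimizer such as BFGS can be applied directly, which is what the non-smooth LAD objective rules out.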
Data from the Encyclopedia of DNA Elements (ENCODE) project show over 9640 human genome loci classified as long noncoding RNAs (lncRNAs), yet only ~100 have been deeply characterized to determine their role in the cell. To measure the protein-coding output from these RNAs, we jointly analyzed two recent data sets produced in the ENCODE project: tandem mass …
Collinearity and near-collinearity of predictors cause difficulties when doing regression. In these cases, variable selection becomes untenable because of mathematical issues concerning the existence and numerical stability of the regression coefficients, and interpretation of the coefficients is ambiguous because gradients are not defined. Using a …
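For context, near-collinearity of the kind discussed here is usually diagnosed with the condition number of the design matrix and with variance inflation factors; the sketch below computes both. It is a generic diagnostic, not the method developed in the paper, and the thresholds for "large" values are left to the reader.

```python
import numpy as np

def condition_number(X):
    """Condition number of the column-standardized design matrix;
    large values signal near-collinearity."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    s = np.linalg.svd(Z, compute_uv=False)
    return s.max() / s.min()

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 is from regressing
    column j on an intercept and the remaining columns."""
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        xj = X[:, j]
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, xj, rcond=None)
        resid = xj - A @ coef
        r2 = 1.0 - resid.var() / xj.var()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs
```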
Variational methods for parameter estimation are an active research area, potentially offering computationally tractable heuristics with theoretical performance bounds. We build on recent work that applies such methods to network data, and establish asymptotic normality rates for parameter estimates of stochastic blockmodel data, by either maximum …
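A minimal sketch of the kind of estimator being analyzed: mean-field variational EM for a K-block stochastic blockmodel, with synchronous updates of the variational membership probabilities. This is a textbook-style illustration under simplifying assumptions (binary, symmetric adjacency matrix with zero diagonal, no degree correction), not the specific algorithm or regularity conditions studied in the paper.

```python
import numpy as np

def variational_em_sbm(A, K, n_iter=100, seed=0):
    """Mean-field variational EM for a K-block stochastic blockmodel.

    A: symmetric (n, n) binary adjacency matrix with zero diagonal.
    Returns block proportions pi, edge-probability matrix B, and the
    variational membership probabilities tau of shape (n, K).
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    tau = rng.dirichlet(np.ones(K), size=n)  # soft memberships
    eps = 1e-10
    for _ in range(n_iter):
        # M-step: block proportions and edge probabilities given tau,
        # excluding self-pairs (i == j) from the weights.
        pi = tau.mean(axis=0)
        colsum = tau.sum(axis=0)
        weights = np.outer(colsum, colsum) - tau.T @ tau
        edges = tau.T @ A @ tau  # A's zero diagonal already drops i == j
        B = np.clip(edges / np.clip(weights, eps, None), eps, 1 - eps)
        # E-step (synchronous): log tau_{ik} is proportional to
        # log pi_k + sum_{j != i} sum_l tau_{jl} [A_ij log B_kl + (1 - A_ij) log(1 - B_kl)]
        logB, log1mB = np.log(B), np.log(1.0 - B)
        log_tau = (np.log(pi + eps)
                   + A @ tau @ logB.T
                   + (1.0 - A - np.eye(n)) @ tau @ log1mB.T)
        log_tau -= log_tau.max(axis=1, keepdims=True)
        tau = np.exp(log_tau)
        tau /= tau.sum(axis=1, keepdims=True)
    return pi, B, tau
```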
Local linearization techniques are an important class of nonparametric system identification methods. Identifying local linearizations in practice involves solving a linear regression problem that is ill-posed. The problem can be ill-posed either if the dynamics of the system lie on a manifold of lower dimension than the ambient space or if there are not enough …
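One common way to stabilize the ill-posed regression described here is kernel-weighted least squares with Tikhonov (ridge) regularization around the operating point; the sketch below takes that route. The Gaussian weighting, bandwidth, and ridge penalty are illustrative assumptions and not necessarily the regularization analyzed in the paper.

```python
import numpy as np

def local_linearization(X, Y, x0, bandwidth, ridge=1e-6):
    """Estimate a local affine model Y ≈ A (x - x0) + b around x0 by
    kernel-weighted least squares with a Tikhonov (ridge) penalty.

    X: (n, d) regressor states, Y: (n, m) responses (e.g. next states),
    x0: (d,) operating point.
    """
    diffs = X - x0
    w = np.exp(-0.5 * np.sum(diffs ** 2, axis=1) / bandwidth ** 2)  # Gaussian weights
    Z = np.column_stack([diffs, np.ones(len(X))])                   # affine term
    WZ = Z * w[:, None]
    G = Z.T @ WZ + ridge * np.eye(Z.shape[1])                       # regularized Gram matrix
    theta = np.linalg.solve(G, WZ.T @ Y)                            # (d + 1, m)
    A, b = theta[:-1].T, theta[-1]
    return A, b
```

The ridge term keeps the normal equations solvable even when the weighted data lie near a lower-dimensional manifold or when too few nearby samples are available, which is precisely the failure mode the abstract describes.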
Matching estimators are widely used in empirical economics for the evaluation of programs or treatments. Researchers using matching methods often apply the bootstrap to calculate the standard errors. However, no formal justification has been provided for the use of the bootstrap in this setting. In this article, we show that the standard bootstrap is, in …
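To make the setting concrete, the sketch below implements a simple nearest-neighbor matching estimator of the average treatment effect on the treated (ATT), together with the naive nonparametric bootstrap standard error whose validity the article examines. The single scalar covariate and fixed number of matches are simplifying assumptions for illustration only.

```python
import numpy as np

def matching_att(y, w, x, m=1):
    """Nearest-neighbor matching estimator of the ATT, matching each
    treated unit to its m closest controls on a scalar covariate x."""
    treated = np.where(w == 1)[0]
    control = np.where(w == 0)[0]
    effects = []
    for i in treated:
        d = np.abs(x[control] - x[i])
        nn = control[np.argsort(d)[:m]]
        effects.append(y[i] - y[nn].mean())
    return np.mean(effects)

def naive_bootstrap_se(y, w, x, m=1, reps=200, seed=0):
    """Standard nonparametric bootstrap standard error -- the procedure
    whose justification for matching estimators the article questions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = []
    for _ in range(reps):
        idx = rng.integers(0, n, size=n)
        if w[idx].min() == w[idx].max():  # resample lacks one group; skip
            continue
        stats.append(matching_att(y[idx], w[idx], x[idx], m))
    return np.std(stats, ddof=1)
```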