Improving massive experiments with threshold blocking
This work presents an algorithm that implements a widely applicable class of blocking, threshold blocking, that solves the blocking problem of minimizing the maximum distance between any two units within the same group and constructs the groups flexibly for any chosen minimum size.
Blocking estimators and inference under the Neyman-Rubin model
We derive the variances of estimators for sample average treatment effects under the Neyman-Rubin potential outcomes model for arbitrary blocking assignments and an arbitrary number of treatments.
Does Dividing the Range by Four Provide an Accurate Estimate of a Standard Deviation in Family Science Research? A Teaching Editorial
Occasionally, scientific reports have omitted information on standard deviations, making estimates of effect sizes very difficult to impossible. In such situations, several scholars have ...
Generalized full matching and extrapolation of the results from a large-scale voter mobilization experiment.
Matching is an important tool in causal inference. The method provides a conceptually straightforward way to make groups of units comparable on observed characteristics. The use of the method is, ...
Generalized Full Matching
A generalization of full matching is introduced that inherits its optimality properties but allows the investigator to specify any desired structure of the matched groups over any number of treatment conditions. The work also describes a new approximation algorithm to derive generalized full matchings.
Applications of Integer Programming Methods to Solve Statistical Problems
This work develops a new method for blocking in randomized experiments that works for an arbitrary number of treatments, and provides the first polynomial time approximately optimal blocking algorithm for when there are more than two treatment categories.
Improving Experiments by Optimal Blocking: Minimizing the Maximum Within-block Distance
We develop a new method for blocking in randomized experiments that works for an arbitrary number of treatments. We analyze the following problem: given a threshold for the minimum number of units to ...
A new method for quantifying network cyclic structure to improve community detection
This paper introduces renewal non-backtracking random walks (RNBRW) as a way of quantifying network cyclic structure and gives simulation results showing that pre-weighting edges through RNBRW may substantially improve the performance of common community detection algorithms.
Hybridized Threshold Clustering for Massive Data
Through simulation results and by applying the methodology to several real datasets, it is shown that IHTC combined with k-means or HAC substantially reduces the run time and memory usage of the original clustering algorithms while still preserving their performance.
New methods for incorporating network cyclic structures to improve community detection
Simulation results suggest that pre-weighting edges by the proposed methods can substantially improve the performance of popular community detection algorithms, and that the methods are especially efficient for the challenging case of detecting communities in sparse graphs.