What Makes A Good Fisherman? Linear Regression under Self-Selection Bias
@article{Cherapanamjeri2022WhatMA,
  title   = {What Makes A Good Fisherman? Linear Regression under Self-Selection Bias},
  author  = {Yeshwanth Cherapanamjeri and Constantinos Daskalakis and Andrew Ilyas and Manolis Zampetakis},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2205.03246}
}
In the classical setting of self-selection, the goal is to learn $k$ models simultaneously from observations $(x^{(i)}, y^{(i)})$, where $y^{(i)}$ is the output of one of the $k$ underlying models on input $x^{(i)}$. In contrast to mixture models, where we observe the output of a randomly selected model, here the observed model depends on the outputs themselves and is determined by some known selection criterion. For example, we might observe the highest output, the smallest output, or the median output…
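To make the observation model concrete, below is a minimal simulation sketch of self-selected data under the "max" selection rule. All parameter choices and variable names (n, d, k, W, the noise level) are illustrative assumptions for this sketch, not values from the paper.

```python
import numpy as np

# Minimal sketch: generate data from k linear models under self-selection
# with the "max" selection rule. All parameter choices are illustrative.
rng = np.random.default_rng(0)

n, d, k = 1000, 5, 3                 # samples, covariate dimension, number of models
W = rng.normal(size=(k, d))          # unknown regressors w_1, ..., w_k (to be learned)

X = rng.normal(size=(n, d))                        # covariates x^(i)
outputs = X @ W.T + 0.1 * rng.normal(size=(n, k))  # noisy outputs of all k models

# Only the selected output is observed -- here, the maximum. Unlike a mixture
# model, which model is observed depends on the outputs themselves.
y = outputs.max(axis=1)

# Depending on the variant of the problem, the identity of the selected
# model may be observed alongside y, or hidden from the learner.
idx = outputs.argmax(axis=1)
```

Under the max rule, the observed response is the largest of the $k$ noisy model outputs; other selection criteria such as min or median correspond to swapping the reduction applied across the $k$ outputs.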