Corpus ID: 7420342

Linear regression without correspondence

@inproceedings{Hsu2017LinearRW,
  title={Linear regression without correspondence},
  author={Daniel J. Hsu and Kevin Shi and Xiaorui Sun},
  booktitle={NIPS},
  year={2017}
}
This article considers algorithmic and statistical aspects of linear regression when the correspondence between the covariates and the responses is unknown. First, a fully polynomial-time approximation scheme is given for the natural least squares optimization problem in any constant dimension. Next, in a noise-free setting where the covariates are i.i.d. draws from a standard multivariate normal distribution, an efficient algorithm based on lattice basis reduction is shown to exactly recover the unknown linear function in arbitrary dimension. Finally, lower bounds on the signal-to-noise ratio are established for approximate recovery of the unknown linear function by any estimator.
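As a concrete illustration of the problem setup (not of the paper's approximation scheme or lattice-based algorithm), the following Python sketch generates responses observed in an unknown order, y = ΠXw*, and recovers w by brute-force search over permutations. All names are hypothetical, and the search is only feasible for very small n.

# Brute-force baseline for linear regression without correspondence.
# Only a sketch: feasible for tiny n, not the paper's efficient methods.
from itertools import permutations

import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 2
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = rng.permutation(X @ w_star)          # responses observed in unknown order

best_loss, best_w = np.inf, None
for perm in permutations(range(n)):
    # For a fixed candidate correspondence, the inner problem is ordinary least squares.
    w, *_ = np.linalg.lstsq(X, y[list(perm)], rcond=None)
    loss = np.sum((X @ w - y[list(perm)]) ** 2)
    if loss < best_loss:
        best_loss, best_w = loss, w

print("recovered w:", best_w, "true w:", w_star)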
Linear Regression Without Correspondences via Concave Minimization
TLDR
The resulting algorithm outperforms state-of-the-art methods for fully shuffled data and remains tractable for up to 8-dimensional signals, an untouched regime in prior work.
An Algebraic-Geometric Approach to Shuffled Linear Regression
TLDR
Using the machinery of algebraic geometry it is proved that as long as the independent samples are generic, this polynomial system is always consistent with at most $n!$ complex roots, regardless of any type of corruption inflicted on the observations.
An Algebraic-Geometric Approach for Linear Regression Without Correspondences
TLDR
The machinery of algebraic geometry is used, with symmetric polynomials extracting permutation-invariant constraints that the parameters of the linear regression model must satisfy, to prove that as long as the independent samples are generic, this polynomial system is always consistent with at most $n!$ complex roots, regardless of any type of corruption inflicted on the observations.
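A minimal numerical check of the permutation-invariant constraints described above, under the noiseless model y = ΠXw: because power sums are symmetric polynomials, the identity sum_i y_i^k = sum_i (x_i^T w)^k holds for every k. The snippet below (hypothetical names, numpy only) only verifies these constraints; it does not solve the resulting polynomial system.

# Verify that power-sum constraints are invariant to the unknown permutation.
import numpy as np

rng = np.random.default_rng(1)
n, d = 5, 3
X = rng.standard_normal((n, d))
w = rng.standard_normal(d)
y = rng.permutation(X @ w)               # shuffled, noiseless responses

for k in range(1, n + 1):
    lhs = np.sum(y ** k)                 # computable from the shuffled data alone
    rhs = np.sum((X @ w) ** k)           # polynomial of degree k in w
    assert np.isclose(lhs, rhs)
print("all", n, "power-sum constraints hold")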
Robust approximate linear regression without correspondence
TLDR
The effectiveness of the proposed framework is demonstrated in an important computational neuroscience application: a neuron matching problem in which outliers naturally arise in both the source and target nematodes.
Generalized Shuffled Linear Regression
TLDR
This work generalizes the formulation of shuffled linear regression to a broader range of conditions where only part of the data should correspond, and presents a remarkably simple yet effective optimization algorithm with guaranteed global convergence.
Isotonic regression with unknown permutations: Statistics, computation, and adaptation
TLDR
It is shown that natural modifications of existing estimators fail to satisfy at least one of the desiderata of optimal worst-case statistical performance, computational efficiency, and fast adaptation in the multivariate case.
Optimal Estimator for Unlabeled Linear Regression
TLDR
This paper proposes a one-step estimator for unlabeled linear regression that is optimal from both the computational and the statistical point of view, exhibiting the same order of computational complexity as the oracle case.
A Sparse Representation-Based Approach to Linear Regression with Partially Shuffled Labels
TLDR
It turns out that in this situation, estimation of the regression parameter on the one hand and recovery of the underlying permutation on the other hand can be decoupled so that the computational hardness associated with the latter can be sidestepped.
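In the spirit of the decoupling described above, a small number of mismatched labels can be treated as sparse corruptions of the responses, so the regression parameter can be estimated with a robust loss before any permutation is recovered. The sketch below uses SciPy's Huber loss as a stand-in and is not the paper's sparse-representation algorithm; all names and constants are illustrative.

# Robust fit under partially shuffled labels: mismatches act as sparse outliers.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(4)
n, d, k = 300, 5, 30                      # k of the n responses are mismatched
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.05 * rng.standard_normal(n)
idx = rng.choice(n, size=k, replace=False)
y[idx] = y[rng.permutation(idx)]          # shuffle a small subset of labels

fit = least_squares(lambda w: X @ w - y, x0=np.zeros(d), loss="huber", f_scale=0.1)
print("parameter error:", np.linalg.norm(fit.x - w_star))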
Linear regression with partially mismatched data: local search with theoretical guarantees
TLDR
This paper uses an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatched pairs, and proposes and studies a simple greedy local search algorithm for this problem that enjoys strong theoretical guarantees and appealing computational performance.
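A hedged sketch of a simple alternating scheme related to the local search described above (not the paper's exact algorithm): for a fixed w, the correspondence minimizing the squared error matches the sorted responses with the sorted predictions (a consequence of the rearrangement inequality); for a fixed correspondence, w is refit by ordinary least squares. Names and constants are illustrative, and such schemes typically need several random restarts in practice.

# Alternate between matching responses to predictions and refitting w.
import numpy as np

def alternating_fit(X, y, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[1])
    y_sorted = np.sort(y)
    for _ in range(iters):
        pred = X @ w
        ranks = np.argsort(np.argsort(pred))   # rank of each prediction
        y_matched = y_sorted[ranks]            # match sorted y to sorted predictions
        w, *_ = np.linalg.lstsq(X, y_matched, rcond=None)
    return w

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 3))
w_star = rng.standard_normal(3)
y = rng.permutation(X @ w_star + 0.01 * rng.standard_normal(200))
print(alternating_fit(X, y), w_star)           # may require several restarts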
A Hypergradient Approach to Robust Regression without Correspondence
TLDR
Thorough numerical experiments show that ROBOT achieves better performance than existing methods in both linear and nonlinear regression tasks, including real-world applications such as flow cytometry and multi-object tracking.
...
...

References

Showing 1-10 of 26 references
Linear regression with an unknown permutation: Statistical and computational limits
TLDR
This work analyzes the problem of permutation recovery in a random design setting in which the entries of the matrix A are drawn i.i.d. from a standard Gaussian distribution, and establishes sharp conditions on the SNR, sample size n, and dimension d under which Π* is exactly and approximately recoverable.
Denoising linear models with permuted data
TLDR
This work focuses on the denoising problem and characterizes the minimax error rate up to logarithmic factors, provides an exact algorithm for the noiseless problem, and demonstrates its performance on an image point-cloud matching task.
Unbiased estimates for linear regression via volume sampling
TLDR
The methods are used to obtain a volume sampling algorithm that is faster than the state of the art and to bound the total loss of the estimated least-squares solution on all labeled columns.
Linear Regression with Shuffled Labels
TLDR
This work proposes several estimators that recover the weights of a noisy linear model from labels that are shuffled by an unknown permutation, and shows that the analog of the classical least-squares estimator produces inconsistent estimates in this setting.
Sketching as a Tool for Numerical Linear Algebra
TLDR
This survey highlights the recent advances in algorithms for numerical linear algebra that have come from the technique of linear sketching, and considers least squares as well as robust regression problems, low rank approximation, and graph sparsification.
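A minimal sketch-and-solve example in the spirit of the survey above: compress an overdetermined least-squares problem with a random Gaussian sketching matrix of m << n rows and solve the smaller problem. The sketch size m below is an illustrative choice, not a tuned or theoretical value.

# Sketch-and-solve least squares with a Gaussian sketching matrix.
import numpy as np

rng = np.random.default_rng(3)
n, d, m = 10_000, 20, 400
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

S = rng.standard_normal((m, n)) / np.sqrt(m)       # Gaussian sketch
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

print("relative error:", np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))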
Adaptive estimation of a quadratic functional by model selection
We consider the problem of estimating $\|s\|^2$ when $s$ belongs to some separable Hilbert space $H$ and one observes the Gaussian process $Y(t) = \langle s, t \rangle + \sigma L(t)$, for all $t \in H$, where $L$ is some Gaussian isonormal process.
Unlabeled sensing: Reconstruction algorithm and theoretical guarantees
TLDR
This paper considers the situation in which the order of noisy samples, taken from a linear measurement system, is missing and proposes a much more efficient algorithm based on a geometrical viewpoint of the problem.
One-dimensional empirical measures, order statistics, and Kantorovich transport distances
This work is devoted to the study of rates of convergence of the empirical measures $\mu_n = \frac{1}{n}\sum_{k=1}^{n} \delta_{X_k}$, $n \geq 1$, over a sample $(X_k)_{k \geq 1}$ of independent identically distributed real-valued random variables.
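For context on the one-dimensional setting above, the Kantorovich (Wasserstein) distance admits a closed form via quantile functions, and for empirical measures it reduces to a sum over order statistics. These are standard identities, not statements taken from the reference itself:
\[
  W_p^p(\mu, \nu) = \int_0^1 \bigl| F^{-1}(t) - G^{-1}(t) \bigr|^p \, dt,
  \qquad
  W_p^p\Bigl(\tfrac{1}{n}\textstyle\sum_{k=1}^n \delta_{X_k},\ \tfrac{1}{n}\sum_{k=1}^n \delta_{Y_k}\Bigr)
  = \frac{1}{n} \sum_{k=1}^n \bigl| X_{(k)} - Y_{(k)} \bigr|^p ,
\]
where $F$ and $G$ are the distribution functions of $\mu$ and $\nu$, and $X_{(k)}$, $Y_{(k)}$ denote order statistics.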
Non-asymptotic theory of random matrices: extreme singular values
TLDR
This survey addresses the non-asymptotic theory of extreme singular values of random matrices with independent entries and focuses on recently developed geometric methods for estimating the hard edge of random matrices (the smallest singular value).
Near-Optimal Coresets for Least-Squares Regression
TLDR
Deterministic, low-order polynomial-time algorithms are given to construct such coresets with approximation guarantees, together with lower bounds indicating that there is not much room for improvement upon the results.
...
...