Corpus ID: 51880421

Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers

@inproceedings{Ma2018GradientDF,
  title={Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers},
  author={Yao Ma and Alexander Olshevsky and Csaba Szepesvari and Venkatesh Saligrama},
  booktitle={ICML},
  year={2018}
}
We consider worker skill estimation for the single-coin Dawid-Skene crowdsourcing model. In practice, skill estimation is challenging because worker assignments are sparse and irregular due to the arbitrary and uncontrolled availability of workers. We formulate skill estimation as a rank-one correlation-matrix completion problem, where the observed components correspond to observed label correlations between workers. We show that the correlation matrix can be successfully recovered and skills…
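To make the rank-one completion idea in the abstract concrete, here is a minimal sketch, not the authors' algorithm as published. It assumes skills s_i in (0, 1] with expected pairwise label correlation C_ij = s_i s_j, and it runs plain gradient descent on the squared error over the observed pairs. The function name `estimate_skills`, the step size, the iteration count, and the positive initialization are all illustrative assumptions.

```python
import numpy as np

def estimate_skills(C, mask, steps=2000, lr=0.05, seed=0):
    """Hypothetical helper: fit x so that x_i * x_j matches C_ij on observed pairs.

    Minimizes f(x) = 1/2 * sum_{i,j} mask_ij * (x_i * x_j - C_ij)^2
    by plain gradient descent. `mask` is a 0/1 matrix marking observed
    worker pairs (zero on the diagonal).
    """
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    # Positive initialization (an assumption): x and -x fit equally well,
    # so starting positive targets the positive-skill solution.
    x = rng.uniform(0.1, 1.0, size=n)
    for _ in range(steps):
        R = mask * (np.outer(x, x) - C)  # residuals on observed entries only
        grad = (R + R.T) @ x             # gradient of f with respect to x
        x -= lr * grad
    return x
```

With noiseless correlations and every off-diagonal pair observed, the estimate should approach the true skill vector. Sparser masks and noisy correlations are the harder regime the abstract refers to, and this sketch makes no claim about them.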
Adversarial Crowdsourcing Through Robust Rank-One Matrix Completion
TLDR
This work proposes a new algorithm combining alternating minimization with extreme-value filtering and provides necessary and sufficient conditions to recover the original rank-one matrix when some of the revealed entries are corrupted with perturbations that are unknown and can be arbitrarily large.
Crowdsourcing via Annotator Co-occurrence Imputation and Provable Symmetric Nonnegative Matrix Factorization
TLDR
This work recasts the pairwise co-occurrence based D&S model learning problem as a symmetric NMF (SymNMF) problem, which offers enhanced identifiability relative to CNMF.
A Worker-Task Specialization Model for Crowdsourcing: Efficient Inference and Fundamental Limits
TLDR
This work proposes a highly general d-type worker-task specialization model in which the reliability of each worker can change depending on the type of a given task, and where the number d of types can scale with the number of tasks.
Minimax Rank-$1$ Matrix Factorization
TLDR
This work considers the problem of recovering a rank-one matrix when a perturbed subset of its entries is revealed and proposes a method based on least squares in the log-space that matches the lower bounds that are derived for this problem in the small-perturbation regime.
Minimax Rank-1 Factorization
We consider the problem of recovering a rank-one matrix from a subset of entries subject to arbitrary perturbations, assuming we have no information about the magnitude of perturbation. We propose a…
Factorization Approach for Low-complexity Matrix Completion Problems: Exponential Number of Spurious Solutions and Failure of Gradient Methods
TLDR
This work investigates the landscape of B-M factorized polynomial-time solvable matrix completion (MC) problems, which are the most popular subclass of low-rank matrix optimization problems without the RIP condition, and defines a new complexity metric that potentially measures the solvability of low-rank matrix optimization problems based on the B-M factorization approach.
Crowdsourced Label Aggregation Using Bilayer Collaborative Clustering
TLDR
A novel bilayer collaborative clustering (BLCC) method for label aggregation in crowdsourcing that first generates conceptual-level features for the instances from their multiple noisy labels and infers the initially integrated labels by performing clustering on the conceptual-level features.
Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Rank-1 Robust Principal Component Analysis
TLDR
This work shows that the low-dimensional formulation of the symmetric and asymmetric positive rank-1 RPCA based on the Burer-Monteiro approach has a benign landscape, and provides strong deterministic and probabilistic guarantees for the exact recovery of the true principal components.
Crowdsourced Classification with XOR Queries: Fundamental Limits and An Efficient Algorithm
TLDR
This work considers an effective query type that asks for a "group attribute" of a chosen subset of objects and proposes an efficient inference algorithm that achieves the information-theoretic limit on the optimal number of queries to reliably recover unknown labels.
Exact Guarantees on the Absence of Spurious Local Minima for Non-negative Robust Principal Component Analysis
TLDR
This work shows that the low-dimensional formulation of the symmetric and asymmetric positive rank-1 RPCA based on the Burer-Monteiro approach has a benign landscape, and provides strong deterministic and probabilistic guarantees for the exact recovery of the true principal components.

References

SHOWING 1-10 OF 37 REFERENCES
Matrix Completion has No Spurious Local Minimum
TLDR
It is proved that the commonly used non-convex objective function for positive semidefinite matrix completion has no spurious local minima: all local minima must also be global.
Low-Rank Matrix Approximation with Weights or Missing Data Is NP-Hard
TLDR
This paper proves that computing an optimal WLRA is NP-hard, already when a rank-one approximation is sought, and shows that it is hard to compute approximate solutions to the WLRA problem with some prescribed accuracy.
Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing
TLDR
Experimental results demonstrate that the proposed algorithm for multi-class crowd labeling problems is comparable to the most accurate empirical approach, while outperforming several other recently proposed methods.
Error Rate Bounds and Iterative Weighted Majority Voting for Crowdsourcing
TLDR
Finite-sample exponential bounds on the error rate (in probability and in expectation) of general aggregation rules under the Dawid-Skene crowdsourcing model are provided; they can be used to analyze many aggregation methods, including majority voting, weighted majority voting and the oracle Maximum A Posteriori rule.
Efficient crowdsourcing for multi-class labeling
TLDR
It is shown that each task can be answered correctly with probability 1-ε as long as the redundancy per task is O((K/q) log(K/ε)), where each task is equally likely to have any of the K distinct answers and q is the crowd-quality parameter defined through a probabilistic model.
Budget-Optimal Task Allocation for Reliable Crowdsourcing Systems
TLDR
A new algorithm is given for deciding which tasks to assign to which workers and for inferring correct answers from the workers' answers, and it is shown that the minimum price necessary to achieve a target reliability scales in the same manner under both adaptive and nonadaptive scenarios.
Minimax Optimal Convergence Rates for Estimating Ground Truth from Crowdsourced Labels
Crowdsourcing has become a primary means for label collection in many real-world machine learning applications. A classical method for inferring the true labels from the noisy labels provided by…
Weighted Low-Rank Approximations
TLDR
This work provides a simple and efficient algorithm for solving weighted low-rank approximation problems, which, unlike their unweighted version, do not admit a closed-form solution in general.
Variational Inference for Crowdsourcing
TLDR
By choosing the prior properly, both BP and MF perform surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions.
Max-Margin Majority Voting for Learning from Crowds
TLDR
This paper presents max-margin majority voting (M$^3$V) to improve the discriminative ability of majority voting and further presents a Bayesian generalization to incorporate the flexibility of generative methods on modeling noisy observations with worker confusion matrices for different application settings.