Corpus ID: 225062565

Adversarial Crowdsourcing Through Robust Rank-One Matrix Completion

Authors: Qianqian Ma and Alexander Olshevsky
We consider the problem of reconstructing a rank-one matrix from a revealed subset of its entries when some of the revealed entries are corrupted with perturbations that are unknown and can be arbitrarily large. It is not known which revealed entries are corrupted. We propose a new algorithm combining alternating minimization with extreme-value filtering and provide sufficient and necessary conditions to recover the original rank-one matrix. In particular, we show that our proposed algorithm is… 
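The abstract names two ingredients, alternating minimization and extreme-value filtering, but is cut off before the details. As a rough illustration only, the sketch below shows one generic way such ingredients can be combined: each factor update is a trimmed mean that discards the largest and smallest candidate values before averaging. This is not the authors' algorithm. The function names, the `trim` parameter, the all-ones initialization, and the assumption that all revealed entries and factors are nonzero are hypothetical choices made for this example.

```python
import numpy as np


def trimmed_mean(values, trim):
    """Average after discarding the `trim` smallest and `trim` largest values."""
    v = np.sort(np.asarray(values, dtype=float))
    if v.size > 2 * trim:
        v = v[trim:v.size - trim]
    return v.mean()


def robust_rank_one_complete(M, mask, trim=1, iters=100):
    """Estimate x, y such that M[i, j] is close to x[i] * y[j] on revealed entries.

    M    : 2-D array; entries where mask is False are ignored.
    mask : boolean array, True where an entry of M is revealed.
    trim : number of extreme values filtered from each end per update
           (an illustrative stand-in for extreme-value filtering).
    Assumes the revealed entries and both factors are nonzero.
    """
    n_rows, n_cols = M.shape
    x = np.ones(n_rows)
    y = np.ones(n_cols)
    for _ in range(iters):
        # Fix y, update each x[i] from the revealed entries of row i.
        for i in range(n_rows):
            cols = np.flatnonzero(mask[i])
            if cols.size:
                x[i] = trimmed_mean(M[i, cols] / y[cols], trim)
        # Fix x, update each y[j] from the revealed entries of column j.
        for j in range(n_cols):
            rows = np.flatnonzero(mask[:, j])
            if rows.size:
                y[j] = trimmed_mean(M[rows, j] / x[rows], trim)
    return x, y


# Small usage example: a 5 x 6 rank-one matrix with two hidden entries
# and two revealed entries corrupted by large perturbations.
x_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_true = np.array([2.0, 1.0, 3.0, 1.5, 2.5, 4.0])
M_true = np.outer(x_true, y_true)
mask = np.ones(M_true.shape, dtype=bool)
mask[0, 5] = False
mask[3, 1] = False
M_obs = M_true.copy()
M_obs[1, 2] += 100.0
M_obs[4, 0] += 40.0
x_hat, y_hat = robust_rank_one_complete(M_obs, mask, trim=1, iters=100)
print(np.abs(np.outer(x_hat, y_hat) - M_true).max())
```

The factors are only determined up to a scaling (x can be multiplied by a constant and y divided by it), so the example compares the product `np.outer(x_hat, y_hat)` with the true matrix rather than the factors themselves.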


Rethinking Noisy Label Models: Labeler-Dependent Noise with Adversarial Awareness
A more principled model of label noise that generalizes instance-dependent noise to multiple labelers, based on the observation that modern datasets are typically annotated using distributed crowdsourcing methods, and shows that the proposed framework remains robust even in the presence of extreme adversarial label noise.
Detecting adversaries in Crowdsourcing
This work develops an approach that leverages the structure of second-order moments of annotator responses to identify large numbers of adversaries and mitigate their impact on the crowdsourcing task.
Generic Multi-label Annotation via Adaptive Graph and Marginalized Augmentation
A generic multi-label learning framework based on Adaptive Graph and Marginalized Augmentation (AGMA) for the semi-supervised setting, which uses a small amount of labeled data together with a large amount of unlabeled data to boost learning performance.
Generative Multi-Label Correlation Learning
A general and compact Multi-Label Correlation Learning (MUCO) framework that explicitly and effectively learns the latent label correlations by updating a label correlation tensor, which provides highly accurate and interpretable prediction results.
A General-Purpose Crowdsourcing Computational Quality Control Toolkit for Python
Crowd-Kit is a general-purpose crowdsourcing computational quality control toolkit that provides efficient Python implementations of computational quality control algorithms for crowdsourcing, including uncertainty measures and crowd consensus methods.
Semi-supervised Domain Adaptive Structure Learning
An adaptive structure learning method, inspired by multi-view learning, that regularizes the cooperation of SSL and DA by applying maximum mean discrepancy (MMD) distance minimization and self-training (ST) to project the contradictory structures into a shared view and make a reliable final decision.
Adaptive Trajectory Prediction via Transferable GNN
This work proposes a novel Transferable Graph Neural Network (T-GNN) framework that jointly performs trajectory prediction and domain alignment, and is the first to close the gap in benchmarks and techniques for practical pedestrian trajectory prediction across different domains.


Crowdsourcing with Arbitrary Adversaries
This work designs an efficient algorithm to consistently estimate the workers’ error probabilities in an adversarial scenario that allows for arbitrary adversaries, for which not only error probabilities can be high, but which can also perfectly collude.
Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers
This work formulates skill estimation as a rank-one correlation-matrix completion problem, where the observed components correspond to observed label correlations between workers, and derives sample complexity bounds in terms of spectral properties of the signless Laplacian of the sampling matrix.
Exact matrix completion via convex optimization
It is proved that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries, and that objects other than signals and images can be perfectly reconstructed from very limited information.
The Power of Convex Relaxation: Near-Optimal Matrix Completion
This paper shows that, under certain incoherence assumptions on the singular vectors of the matrix, recovery is possible by solving a convenient convex program as soon as the number of entries is on the order of the information theoretic limit (up to logarithmic factors).
Efficient crowdsourcing for multi-class labeling
It is shown that each task can be answered correctly with probability 1-ε as long as the redundancy per task is O((K/q) log (K/ε)), where each task has one of K distinct answers, all equally likely, and q is the crowd-quality parameter defined through a probabilistic model.
Crowdsourcing via Pairwise Co-occurrences: Identifiability and Algorithms
This paper proposes an algebraic algorithm reminiscent of convex geometry-based structured matrix factorization to solve the model identification problem efficiently, and an identifiability-enhanced algorithm for handling more challenging and critical scenarios.
Minimax Rank-1 Factorization
This work proposes a weighted log least square based algorithm whose performance for small disturbances matches exactly the fundamental lower bounds that are derived for this problem, and which are related to the spectral gap of a graph representing the revealed entries.
Crowdsourcing via Tensor Augmentation and Completion
This paper proposes a novel structured approach based on tensor augmentation and completion that uses tensor representation for the labeled data, augments it with a groundtruth layer, and explores two methods to estimate the ground truth layer via low rank tensor completion.
Achieving budget-optimality with adaptive schemes in crowdsourcing
This work characterizes the fundamental trade-off between budget and accuracy, introduces a novel adaptive scheme that matches this fundamental limit, and quantifies the fundamental gap between adaptive and non-adaptive schemes by comparing this trade-off with the one for non-adaptive schemes.
A Permutation-Based Model for Crowd Labeling: Optimal Estimation and Robustness
A permutation-based model for crowd labeled data is proposed that is a significant generalization of the classical Dawid-Skene model, and a new error metric is introduced by which to compare different estimators.