Maxima Units Search (MUS) algorithm: methodology and applications

@article{Egidi2016MaximaUS,
  title={Maxima Units Search (MUS) algorithm: methodology and applications},
  author={Leonardo Egidi and Roberta Pappad{\`a} and Francesco Pauli and Nicola Torelli},
  journal={arXiv: Computation},
  year={2016}
}
An algorithm for extracting identity submatrices of small rank and pivotal units from large and sparse matrices is proposed. The procedure has already been satisfactorily applied for solving the label switching problem in Bayesian mixture models. Here we introduce it on its own and explore possible applications in different contexts. 
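To make the goal of the abstract concrete, the following is a minimal sketch of what "extracting an identity submatrix" can mean on a small example. It is not the MUS procedure itself, which the abstract does not spell out: it is an exhaustive search whose cost grows combinatorially with the matrix size. It assumes a square 0/1 matrix and looks for a principal submatrix (same row and column indices). The function name `find_identity_submatrix` and the toy matrix `C` are hypothetical.

```python
from itertools import combinations


def find_identity_submatrix(matrix, k):
    """Return the first tuple of k indices whose principal k x k submatrix
    of `matrix` is the identity, or None if no such tuple exists.

    Illustration only: brute force over all index combinations,
    not the MUS algorithm.
    """
    n = len(matrix)
    for idx in combinations(range(n), k):
        # Diagonal entries must be 1, off-diagonal entries must be 0.
        if all(matrix[i][j] == (1 if i == j else 0) for i in idx for j in idx):
            return idx
    return None


# Toy symmetric 0/1 matrix (assumed data): units 0 and 1 always appear
# together, units 2 and 3 always appear together, unit 4 stands alone.
C = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
]

pivots = find_identity_submatrix(C, 3)
print(pivots)  # (0, 2, 4): one unit taken from each group
```

In this toy setting the selected indices play the role of pivotal units: each one represents a different group and never co-occurs with the others.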
pivmet: Pivotal Methods for Bayesian Relabelling and k-Means Clustering
TLDR
The R package pivmet presented in this paper includes different methods for extracting pivotal units from a dataset, and provides functions to perform consensus clustering based on pivotal units, which may improve on classical techniques.
K-means seeding via MUS algorithm
TLDR
A modified version of K-means is proposed, based on a suitable choice of the initial centers, that takes advantage of the information contained in a co-association matrix to define a pivot-based initialization step.
Relabelling in Bayesian mixture models by pivotal units
TLDR
A new procedure based on the post-MCMC relabelling of the chains is proposed, which performs a clustering technique on the similarity matrix, obtained through the MCMC sample, whose elements are the probabilities that any two units in the observed sample are drawn from the same component.
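The similarity matrix mentioned in these summaries can be sketched as follows. This is a minimal illustration, assuming the MCMC output is available as a list of allocation vectors (one component label per unit per draw). Each probability is estimated by the fraction of draws in which two units share a label. The function name `coassociation_matrix` and the toy draws are hypothetical, and the code is not taken from the pivmet package.

```python
def coassociation_matrix(allocations):
    """Estimate, for every pair of units, the probability that they belong
    to the same component, as the fraction of MCMC draws in which their
    labels coincide.

    allocations: list of draws, each a list of component labels (one per unit).
    """
    n_draws = len(allocations)
    n_units = len(allocations[0])
    counts = [[0] * n_units for _ in range(n_units)]
    for z in allocations:
        for i in range(n_units):
            for j in range(n_units):
                if z[i] == z[j]:
                    counts[i][j] += 1
    return [[c / n_draws for c in row] for row in counts]


# Toy allocations for 4 units over 4 draws (assumed data).
# The second draw is the first one with the two labels swapped.
draws = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]

S = coassociation_matrix(draws)
print(S[0][1])  # 0.75
print(S[0][3])  # 0.0
```

Because only equality of labels within a draw is used, the matrix is unaffected by label switching: the first two draws contribute identically even though their labels are permuted.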

References

Bayesian Solutions to the Label Switching Problem
TLDR
A fully Bayesian treatment of the permutations which performs better than alternatives and can even be used to compute summaries of the posterior samples for nonparametric Bayesian methods, for which no good solutions exist so far.
Relabelling in Bayesian mixture models by pivotal units
TLDR
A new procedure based on the post-MCMC relabelling of the chains is proposed, which performs a clustering technique on the similarity matrix, obtained through the MCMC sample, whose elements are the probabilities that any two units in the observed sample are drawn from the same component.
Dealing with label switching in mixture models
TLDR
It is demonstrated that imposing artificial identifiability constraints fails in general to solve the ‘label switching’ problem, and an alternative class of approaches, relabelling algorithms, which arise from attempting to minimize the posterior expected loss under a class of loss functions, is described.
Data clustering using evidence accumulation
  • A. Fred, Anil K. Jain
  • Computer Science
    Object recognition supported by user interaction for service robots
  • 2002
TLDR
Results on both synthetic and real data show the ability of the evidence accumulation method, which combines multiple K-means clusterings, to identify arbitrarily shaped clusters in multidimensional data.
Markov Chain Monte Carlo Methods and the Label Switching Problem in Bayesian Mixture Modeling
TLDR
The solutions to the label switching problem of Markov chain Monte Carlo methods, such as artificial identifiability constraints, relabelling algorithms and label invariant loss functions are reviewed.
An index of factorial simplicity
An index of factorial simplicity, employing the quartimax transformational criteria of Carroll, Wrigley and Neuhaus, and Saunders, is developed. This index is both for each row separately and for a
Algorithms for the Assignment and Transportation Problems
TLDR
In this paper, algorithms for the solution of the general assignment and transportation problems are presented, and the assignment algorithm is generalized to one for the transportation problem.
The Hungarian method for the assignment problem
  • 1955