# Bayesian Boolean Matrix Factorisation

```bibtex
@article{Rukat2017BayesianBM,
  title   = {Bayesian Boolean Matrix Factorisation},
  author  = {Tammo Rukat and Christopher C. Holmes and Michalis K. Titsias and Christopher Yau},
  journal = {ArXiv},
  year    = {2017},
  volume  = {abs/1702.06166}
}
```

Boolean matrix factorisation aims to decompose a binary data matrix into an approximate Boolean product of two low rank, binary matrices: one containing meaningful patterns, the other quantifying how the observations can be expressed as a combination of these patterns. We introduce the OrMachine, a probabilistic generative model for Boolean matrix factorisation and derive a Metropolised Gibbs sampler that facilitates efficient parallel posterior inference. On real world and simulated data, our…
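The Boolean product described above differs from an ordinary matrix product only in that addition is replaced by logical OR. A minimal sketch in NumPy (with hypothetical toy dimensions and random binary factors, not the paper's data or inference code) illustrates how a binary data matrix is reconstructed from two low-rank binary factor matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, L = 6, 8, 2  # observations, features, latent dimensions (toy sizes)

# Random binary factor matrices: U codes which patterns each observation
# uses; V codes which features each pattern activates.
U = rng.integers(0, 2, size=(N, L)).astype(bool)
V = rng.integers(0, 2, size=(L, D)).astype(bool)

# Boolean matrix product: X[n, d] = OR over l of (U[n, l] AND V[l, d]).
X = (U[:, :, None] & V[None, :, :]).any(axis=1)

# Equivalently, take the ordinary integer product and threshold at >= 1.
X_alt = (U.astype(int) @ V.astype(int)) >= 1
assert np.array_equal(X, X_alt)
```

The OrMachine places a probabilistic model over such factorisations; the sketch only shows the deterministic Boolean product that the generative model is built around.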


## 28 Citations

Bayesian Nonparametric Boolean Factor Models

- Computer Science, ArXiv
- 2019

This work lifts the restriction of a pre-specified number of latent dimensions by introducing an Indian Buffet Process prior over the factor matrices, enabling posterior inference to scale to billions of observations.

Probabilistic Boolean Tensor Decomposition

- Computer Science, ICML
- 2018

This work facilitates scalable sampling-based posterior inference by exploitation of the combinatorial structure of the factor conditionals in Boolean tensor decomposition, and provides an entirely novel perspective on relational properties of continuous data and, in the present example, on the molecular heterogeneity of cancer.

TensOrMachine: Probabilistic Boolean Tensor Decomposition

- Computer Science, ArXiv
- 2018

This work facilitates scalable sampling-based posterior inference by exploitation of the combinatorial structure of the factor conditionals in Boolean tensor decomposition, and provides an entirely novel perspective on relational properties of continuous data and, in the present example, on the molecular heterogeneity of cancer.

Recent Developments in Boolean Matrix Factorization

- Computer Science, ArXiv
- 2020

A concise summary of the efforts of all the communities studying Boolean Matrix Factorization is given, and some open questions which, in the authors' opinion, require further investigation are raised.

MEBF: a fast and efficient Boolean matrix factorization method

- Computer Science, ArXiv
- 2019

MEBF demonstrated superior performance, with lower reconstruction error, higher computational efficiency, and more accurate sparse patterns than popular methods such as ASSO, PANDA and MP, and revealed further potential in knowledge retrieval and data denoising.

Fast and Efficient Boolean Matrix Factorization by Geometric Segmentation

- Computer Science, AAAI
- 2020

MEBF (Median Expansion for Boolean Factorization) demonstrated superior performance, with lower reconstruction error, higher computational efficiency, and more accurate density patterns than popular methods such as ASSO, PANDA and Message Passing.

Bayesian Mean-parameterized Nonnegative Binary Matrix Factorization

- Computer Science, Data Min. Knowl. Discov.
- 2020

This work proposes a unified framework for Bayesian mean-parameterized nonnegative binary matrix factorization models (NBMF) and derives a novel collapsed Gibbs sampler and a collapsed variational algorithm to infer the posterior distribution of the factors.

Geometric All-Way Boolean Tensor Decomposition

- Computer Science, NeurIPS
- 2020

This work presents a computationally efficient BTD algorithm, GETF, that sequentially identifies the rank-1 basis components of a tensor from a geometric perspective; it significantly improves reconstruction accuracy and extraction of latent structures, and is an order of magnitude faster than other state-of-the-art methods.

Biclustering and Boolean Matrix Factorization in Data Streams

- Computer Science, Mathematics, Proc. VLDB Endow.
- 2020

An algorithm is provided that, after one pass over the stream, recovers the set of clusters on the right side of the graph using sublinear space; to the best of the authors' knowledge, this is the first algorithm with this property.

Boolean matrix factorization meets consecutive ones property

- Computer Science, Mathematics, SDM
- 2019

This paper studies a variant of Boolean matrix factorization in which the factor matrices are additionally required to have the consecutive ones property (OBMF), and develops a greedy algorithm in which, at each step, the authors look for the best rank-1 factorization.

## References

Showing 1–10 of 28 references.

MDL4BMF: Minimum Description Length for Boolean Matrix Factorization

- Computer Science, TKDD
- 2014

An existing algorithm for BMF is extended to use MDL to identify the best Boolean matrix factorization, analyze the complexity of the problem, and perform an extensive experimental evaluation to study its behavior.

Boolean Matrix Factorization and Noisy Completion via Message Passing

- Computer Science, ICML
- 2016

This empirical study demonstrates that message passing is able to recover low-rank Boolean matrices up to the boundaries of theoretically possible recovery, and compares favorably with the state of the art in real-world applications such as collaborative filtering with large-scale Boolean data.

The Discrete Basis Problem

- Computer Science, IEEE Transactions on Knowledge and Data Engineering
- 2008

This paper describes a matrix decomposition formulation for Boolean data, the Discrete Basis Problem, and gives a simple greedy algorithm for solving it and shows how it can be solved using existing methods.

Multi-assignment clustering for Boolean data

- Computer Science, ICML '09
- 2009

A generative method for clustering vectorial data, where each object can be assigned to multiple clusters using a deterministic annealing scheme, which decomposes the observed data into the contributions of individual clusters and infers their parameters.

Modeling Dyadic Data with Binary Latent Factors

- Computer Science, NIPS
- 2006

This work introduces binary matrix factorization, a novel model for unsupervised matrix decomposition, and shows how to extend it to an infinite model in which the number of features is not a priori fixed but is allowed to grow with the size of the data.

Why Does Deep and Cheap Learning Work So Well?

- Computer Science, ArXiv
- 2016

It is argued that when the statistical process generating the data is of a certain hierarchical form prevalent in physics and machine learning, a deep neural network can be more efficient than a shallow one.

Deep Exponential Families

- Computer Science, AISTATS
- 2015

This extensive study shows that going beyond one layer improves predictions for DEFs, and demonstrates that DEFs find interesting exploratory structure in large data sets and give better predictive performance than state-of-the-art models.

Probabilistic topic models

- Computer Science, Commun. ACM
- 2010

This survey reviews a suite of probabilistic topic modeling algorithms for managing large document archives and argues that they are well suited to handling large amounts of data.

Hierarchical compositional feature learning

- Computer Science, ArXiv
- 2016

Using MPMP as an inference engine for HCN makes new tasks simple: adding supervision information, classifying images, or performing inpainting all correspond to clamping some variables of the model to their known values and running MPMP on the rest.

A Non-Parametric Bayesian Method for Inferring Hidden Causes

- Computer Science, UAI
- 2006

This work presents a non-parametric Bayesian approach to structure learning with hidden causes that assumes that the number of hidden causes is unbounded, but only a finite number influence observable variables.