Corpus ID: 125743601

PhD Dissertation: Generalized Independent Components Analysis Over Finite Alphabets

  • Amichai Painsky
  • Published 2018
  • Mathematics, Computer Science
  • arXiv: Machine Learning
Independent component analysis (ICA) is a statistical method for transforming an observable multi-dimensional random vector into components that are as statistically independent as possible from each other. Usually the ICA framework assumes a model according to which the observations are generated (such as a linear transformation with additive noise). ICA over finite fields is a special case of ICA in which both the observations and the independent components are over a finite alphabet. In this…
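As a rough illustration of the finite-alphabet setting described above, the sketch below mixes two independent binary sources with XOR (a linear map over GF(2)) and then tries every invertible relabeling of the 2-bit alphabet, keeping the one whose output bits have the smallest sum of marginal entropies. The source probabilities, the mixing, and all function names are assumptions chosen for illustration. Exhaustive search is feasible only for tiny alphabets and is not presented here as the dissertation's method.

```python
import itertools
import math

def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def bit_marginals(pmf, d):
    """P(bit j = 1) for each of the d bits of a pmf over {0, ..., 2**d - 1}."""
    return [sum(p for x, p in enumerate(pmf) if (x >> j) & 1) for j in range(d)]

def sum_marginal_entropies(pmf, d):
    """Sum of the binary entropies of the d individual bits."""
    return sum(entropy([q, 1 - q]) for q in bit_marginals(pmf, d))

def best_relabeling(pmf, d):
    """Hypothetical helper: brute-force search over all alphabet permutations."""
    best_cost, best_perm = float("inf"), None
    for perm in itertools.permutations(range(2 ** d)):
        out = [0.0] * (2 ** d)
        for x, p in enumerate(pmf):
            out[perm[x]] += p  # symbol x is relabeled as perm[x]
        cost = sum_marginal_entropies(out, d)
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm

# Assumed independent binary sources: P(s0 = 1) = 0.1, P(s1 = 1) = 0.3.
p0, p1 = 0.1, 0.3
d = 2
observed = [0.0] * (2 ** d)
for s0 in (0, 1):
    for s1 in (0, 1):
        prob = (p0 if s0 else 1 - p0) * (p1 if s1 else 1 - p1)
        x0, x1 = s0, s0 ^ s1  # XOR mixing over GF(2)
        observed[x0 | (x1 << 1)] += prob

joint_entropy = entropy(observed)
mixed_cost = sum_marginal_entropies(observed, d)
best_cost, best_perm = best_relabeling(observed, d)

print(f"joint entropy:                 {joint_entropy:.4f} bits")
print(f"sum of marginals (mixed):      {mixed_cost:.4f} bits")
print(f"sum of marginals (best found): {best_cost:.4f} bits")
```

The sum of marginal entropies can never fall below the joint entropy, and the gap between the two is zero exactly when the bits are independent. Because the sources in this toy example are independent by construction, the search can close the gap completely; for a general distribution the best relabeling only makes the components as independent as possible.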
Linear Independent Component Analysis Over Finite Fields: Algorithms and Bounds
A basic lower bound is introduced that provides a fundamental limit to the ability of any linear solution to solve ICA over finite fields, and a greedy algorithm is presented that outperforms all currently known methods.
Lossless (and Lossy) Compression of Random Forests
This work introduces a novel method for lossless compression of tree-based ensemble methods, focusing on random forests, based on probabilistic modeling of the ensemble's trees, followed by model clustering via Bregman divergence.
Faster Algorithms for Binary Matrix Factorization
These techniques generalize to minimizing ‖U · V − A‖_p for p ≥ 1 in 2^{O(k^{…} log k)} poly(mn) time, which has a graph-theoretic consequence, namely a 2^{O(k^2)} poly(mn)-time algorithm to approximate a graph as a union of disjoint bicliques.
Innovation Representation of Stochastic Processes With Application to Causal Inference
This paper studies the representation of different stochastic processes as a memoryless innovation process triggering a dynamic system, and shows that such a representation is always feasible for innovation processes taking values over a continuous set.


Generalized Independent Component Analysis Over Finite Alphabets
This work considers a generalization of this framework in which an observation vector is decomposed to its independent components with no prior assumption on the way it was generated, and provides the first efficient and constructive set of solutions to Barlow's problem.
Binary independent component analysis: Theory, bounds and algorithms
This work introduces novel lower bounds and theoretical properties for the BICA problem, both under linear and non-linear transformations, and presents simple algorithms which apply the methodology and achieve favorable merits, both in terms of their accuracy and their practically optimal computational complexity.
Binary Independent Component Analysis With OR Mixtures
It is proved that bICA is uniquely identifiable under the disjunctive generation model, and a deterministic iterative algorithm to determine the distribution of the latent random variables and the mixing matrix is proposed.
Large Alphabet Source Coding Using Independent Component Analysis
This paper introduces a conceptual framework in which a large alphabet source is decomposed into “as statistically independent as possible” components and shows that in many cases, such decomposition results in a sum of marginal entropies which is only slightly greater than the entropy of the source.
Independent component analysis, A new concept?
  • P. Comon
  • Mathematics, Computer Science
  • Signal Process.
  • 1994
An efficient algorithm is proposed, which allows the computation of the ICA of a data matrix within a polynomial time and may actually be seen as an extension of the principal component analysis (PCA).
ICA in Boolean XOR Mixtures
It is shown that if none of the independent random sources is uniform (i.e., neither one has probability 0.5 for 1/0), then any invertible mixing is identifiable (up to permutation ambiguity).
Universal noiseless coding
  • L. Davisson
  • Mathematics, Computer Science
  • IEEE Trans. Inf. Theory
  • 1973
This paper considers noiseless coding for sources with unknown parameters, primarily in terms of variable-length coding, with performance measured as a function of the coding redundancy relative to the per-letter conditional source entropy given the unknown parameter.
Universal Compression of Memoryless Sources over Large Alphabets via Independent Component Analysis
This work proposes a conceptual framework in which a large alphabet memoryless source is decomposed into multiple 'as independent as possible' sources whose alphabet is much smaller, to efficiently find the ideal trade-off so that the overall compression size is minimal.
Blind Deconvolution of Multi-Input Single-Output Systems With Binary Sources
The problem of blind source separation for multi-input single-output (MISO) systems with binary inputs is treated, and the problem of misclassified observations is corrected using an iterative scheme based on the Viterbi algorithm for the decoding of a hidden Markov model (HMM).
Information Theoretical Analysis of Multivariate Correlation
The present paper gives various theorems, according to which Ctot(λ) can be decomposed in terms of the partial correlations existing in subsets of λ, and of quantities derivable therefrom.