• Corpus ID: 243860657

Exploratory Factor Analysis of Data on a Sphere

Fan Dai, Karin S. Dorman, Somak Dutta, Ranjan Maitra
Data on high-dimensional spheres arise frequently in many disciplines either naturally or as a consequence of preliminary processing and can have intricate dependence structure that needs to be understood. We develop exploratory factor analysis of the projected normal distribution to explain the variability in such data using a few easily interpreted latent factors. Our methodology provides maximum likelihood estimates through a novel fast alternating expectation profile conditional… 

A Matrix-Free Likelihood Method for Exploratory Factor Analysis of High-Dimensional Gaussian Data

  • Fan Dai, Somak Dutta, R. Maitra
  • Computer Science
    Journal of Computational and Graphical Statistics
  • 2020
A novel profile likelihood method for estimating the covariance parameters in exploratory factor analysis of high-dimensional Gaussian datasets with fewer observations than number of variables is proposed.

Directional data analysis under the general projected normal distribution.

Clustering on the Unit Hypersphere using von Mises-Fisher Distributions

A generative mixture-model approach to clustering directional data is proposed, based on the von Mises-Fisher distribution, which arises naturally for data distributed on the unit hypersphere. Two variants of the Expectation Maximization framework for estimating the mean and concentration parameters of this mixture are derived and analyzed.

Statistical analysis on high-dimensional spheres and shape spaces

We consider the statistical analysis of data on high-dimensional spheres and shape spaces. The work is of particular relevance to applications where high-dimensional data are available—a commonly

The General Projected Normal Distribution of Arbitrary Dimension: Modeling and Bayesian Inference

The general projected normal distribution is a simple and intuitive model for directional data in any dimension: a multivariate normal random vector divided by its length is the projection of that

Model-based clustering on the unit sphere with an illustration using gene expression profiles.

This work considers model-based clustering of data that lie on a unit sphere and proposes to model the clusters on the sphere with inverse stereographic projections of multivariate normal distributions.

Bi-cross-validation for factor analysis

A method based on bi-cross-validation is introduced, which uses randomly held-out submatrices of the data to choose the optimal number of factors. It performs better than many existing methods, especially when both the number of variables and the sample size are large and some of the factors are relatively weak.

Population Structure and Eigenanalysis

An approach to studying population structure (principal components analysis) is discussed that was first applied to genetic data by Cavalli-Sforza and colleagues, and results from modern statistics are used to develop formal significance tests for population differentiation.

Metric learning for text documents

  • G. Lebanon
  • Computer Science, Mathematics
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2006
Many algorithms in machine learning rely on being given a good distance metric over the input space. Rather than using a default metric such as the Euclidean metric, it is desirable to obtain a

EM algorithms for ML factor analysis

The details of EM algorithms for maximum likelihood factor analysis are presented for both the exploratory and confirmatory models. The algorithm is essentially the same for both cases and involves