Structured Latent Factor Analysis for Large-scale Data: Identifiability, Estimability, and Their Implications

Yunxiao Chen, Xiaoou Li, Siliang Zhang
Journal of the American Statistical Association, pages 1756-1770
Abstract: Latent factor models are widely used to measure unobserved latent traits in social and behavioral sciences, including psychology, education, and marketing. When used in a confirmatory manner, design information is incorporated as zero constraints on corresponding parameters, yielding structured (confirmatory) latent factor models. In this article, we study how such design information affects the identifiability and the estimation of a structured latent factor model. Insights are gained…
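The phrase "zero constraints on corresponding parameters" can be illustrated with a small sketch. It assumes the design information is given as a binary item-by-factor matrix (called `design` here) and that each item depends on the factors through a linear predictor. The names `design`, `masked_loadings`, and `linear_predictor`, and all numeric values, are hypothetical and are not taken from the article.

```python
import random


def masked_loadings(design, rng):
    """Draw a loading matrix whose zero pattern follows the design matrix.

    design[j][k] == 1 means item j is allowed to load on factor k;
    design[j][k] == 0 fixes that loading to zero (the zero constraint).
    The nonzero values are arbitrary illustrative draws, not estimates.
    """
    return [
        [rng.uniform(0.5, 1.5) if q == 1 else 0.0 for q in row]
        for row in design
    ]


def linear_predictor(loadings, theta):
    """Inner product of each item's loading row with one person's factor scores."""
    return [sum(a * t for a, t in zip(row, theta)) for row in loadings]


# Hypothetical design: 4 items measuring 2 factors.
design = [
    [1, 0],  # item 1 measures factor 1 only
    [1, 0],  # item 2 measures factor 1 only
    [0, 1],  # item 3 measures factor 2 only
    [1, 1],  # item 4 measures both factors
]

rng = random.Random(0)
loadings = masked_loadings(design, rng)
theta = [0.3, -1.2]  # one person's (assumed) factor scores
eta = linear_predictor(loadings, theta)
```

Under these assumptions, an item's predictor depends only on the factors its design row allows: for item 1 above, `eta[0]` changes with `theta[0]` but not with `theta[1]`.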

Identifying Interpretable Discrete Latent Structures from Discrete Data

A class of interpretable discrete latent structure models for discrete data and a general identifiability theory that is applicable to various types of latent structures are proposed, ranging from a single latent variable to deep layers of latent variables organized in a sparse graph.

A note on identifiability conditions in confirmatory factor analysis

Identifiability of Bifactor Models

The bifactor model and its extensions are multidimensional latent variable models, under which each item measures up to one subdimension on top of the primary dimension(s). Despite their wide

Determining the Number of Factors in High-Dimensional Generalized Latent Factor Models

An information criterion to determine the number of factors in generalized latent factor models is proposed and an error bound is established for the parameter estimates, which plays an important role in establishing the consistency of the proposed information criterion.

On Estimation in Latent Variable Models

This paper considers a gradient-based method that uses a variance reduction technique to accelerate the estimation procedure and shows convergence results for the proposed method under general and mild model assumptions.

Bayesian Pyramids: Identifiable Multilayer Discrete Latent Structure Models for Discrete Data

This article establishes the identifiability of Bayesian pyramids by developing novel transparent conditions on the pyramid-shaped deep latent directed graph, which can ensure Bayesian posterior consistency under suitable priors and can be a useful alternative to popular machine learning methods.

A Tensor-EM Method for Large-Scale Latent Class Analysis with Binary Responses

Theoretically, the clustering consistency of the MLE in assigning subjects into latent classes when N and J both go to infinity is established, and the proposed tensor-EM pipeline enjoys both good accuracy and computational efficiency for large-scale data with binary responses.

Estimation Methods for Item Factor Analysis: An Overview

This chapter discusses estimation methods for IFA models and their computation, with a focus on the situation where the sample size, the number of items, and the number of factors are all large.

A Note on Exploratory Item Factor Analysis by Singular Value Decomposition

This note provides the statistical underpinning of the singular value decomposition (SVD) algorithm and shows its statistical consistency under the same double asymptotic setting as in Chen et al. (2019b).

Computation for Latent Variable Model Estimation: A Unified Stochastic Proximal Framework

A unified formulation for the optimization problem is provided and then a quasi-Newton stochastic proximal algorithm is proposed, which is shown to be efficient and robust under various settings for latent variable model estimation.



Identifiability of restricted latent class models with binary responses

The identifiability of a family of restricted latent class models, where the restriction structures are needed to reflect pre-specified assumptions on the related assessment, is considered, and a new technique to establish the identifiability result is developed.

Latent Variable Models and Factor Analysis: A Unified Approach

A unified approach showing how such apparently diverse methods as Latent Class Analysis and Factor Analysis are actually members of the same family of latent variable modeling from a statistical perspective is provided.

Joint Maximum Likelihood Estimation for High-Dimensional Exploratory Item Factor Analysis

A notion of statistical consistency is established for a constrained JML estimator, under an asymptotic setting that both the numbers of items and people grow to infinity and that many responses may be missing.

The Sufficient and Necessary Condition for the Identifiability and Estimability of the DINA Model

This work gives the sufficient and necessary condition for identifiability of the basic DINA model, which not only addresses the open problem in Xu and Zhang on the minimal requirement for identifiability, but also sheds light on the study of more general CDMs, which often cover DINA as a submodel.

Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models

This area is a little more complicated than the Gaussian one, because there is no such unifying concept as the multivariate normal distribution. To begin with, there is the important distinction

Bi-cross-validation for factor analysis

A method based on bi-cross-validation, which uses randomly held-out submatrices of the data to choose the optimal number of factors, is introduced; it performs better than many existing methods, especially when both the number of variables and the sample size are large and some of the factors are relatively weak.

Joint Maximum Likelihood Estimation for High-dimensional Exploratory Item Response Analysis

A constrained joint maximum likelihood estimator is proposed for estimating both item and person parameters; it has good theoretical properties and computational advantages. Error bounds for parameter estimation are derived, and an efficient algorithm that can scale to very large datasets is developed.

Generalized latent trait models

A unified maximum likelihood method for estimating the parameters of the generalized latent trait model is presented, and the scoring of individuals on the latent dimensions is discussed.

Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications

This volume presents the important concepts required for implementing two disciplines of factor analysis - exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) with an emphasis on

Generalized latent variable models: multilevel, longitudinal, and structural equation models

Methodology. The omni-presence of latent variables: Introduction; 'True' variable measured with error; Hypothetical constructs; Unobserved heterogeneity; Missing values and counterfactuals; Latent responses