Eigenvalue-based model selection during latent semantic indexing

Miles Efron
This study describes amended parallel analysis (APA), a novel method for model selection in unsupervised learning problems such as information retrieval (IR). At issue is the selection of k, the number of dimensions retained under latent semantic indexing (LSI). APA is an elaboration of Horn’s parallel analysis, which advocates retaining eigenvalues larger than those that we would expect under term independence. APA operates by deriving confidence intervals on these “null eigenvalues.” The…
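The general idea in the abstract, retaining only eigenvalues that exceed what term independence would produce, can be illustrated with a minimal sketch. This is not the paper's APA procedure, whose details are not given here. It is a permutation-based variant of parallel analysis under stated assumptions: rows are documents, columns are terms, no column is constant, eigenvalues are taken from the term correlation matrix, and an upper percentile of simulated eigenvalues stands in for the confidence bound. The names `null_eigenvalue_bounds`, `select_k`, and `make_demo_matrix` are hypothetical.

```python
import numpy as np


def ranked_eigenvalues(X):
    """Eigenvalues of the column correlation matrix, largest first."""
    corr = np.corrcoef(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def null_eigenvalue_bounds(X, n_trials=50, percentile=95.0, seed=0):
    """Upper percentile of each ranked eigenvalue under column independence.

    Each trial permutes every column independently. This keeps each column's
    marginal distribution but breaks the dependence between columns.
    """
    rng = np.random.default_rng(seed)
    n_cols = X.shape[1]
    null_eigs = np.empty((n_trials, n_cols))
    for t in range(n_trials):
        shuffled = np.column_stack(
            [rng.permutation(X[:, j]) for j in range(n_cols)]
        )
        null_eigs[t] = ranked_eigenvalues(shuffled)
    return np.percentile(null_eigs, percentile, axis=0)


def select_k(X, n_trials=50, percentile=95.0, seed=0):
    """Count the leading observed eigenvalues that exceed their null bound."""
    observed = ranked_eigenvalues(X)
    bounds = null_eigenvalue_bounds(X, n_trials, percentile, seed)
    k = 0
    for obs, bound in zip(observed, bounds):
        if obs <= bound:
            break
        k += 1
    return k


def make_demo_matrix(n_docs=200, terms_per_factor=6, seed=1):
    """Synthetic data (an assumption for illustration): two latent factors,
    each driving its own block of columns, plus independent noise."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_docs, 2))
    blocks = [
        factors[:, [f]] + 0.5 * rng.normal(size=(n_docs, terms_per_factor))
        for f in range(2)
    ]
    return np.hstack(blocks)


if __name__ == "__main__":
    X = make_demo_matrix()
    print("selected k:", select_k(X))
```

On the synthetic matrix, only the eigenvalues tied to the two planted factors should clear the null bound. Real term-document matrices are sparse and much larger, so the permutation scheme and the use of a correlation matrix here are simplifying choices rather than recommendations.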


Publications citing this paper.

Development and Research of the Text Messages Semantic Clustering Methodology

2016 Third European Network Intelligence Conference (ENIC) • 2016

Instability of Relevance-Ranked Results Using Latent Semantic Indexing for Web Search

2010 43rd Hawaii International Conference on System Sciences • 2010


Publications referenced by this paper.

An improvement on Horn’s parallel analysis methodology for selecting the correct number of factors to retain

L. W. Glorfeld
Educational and Psychological Measurement • 1995

A Vector Space Model for Automatic Indexing

Commun. ACM • 1975

A Rationale and Test for the Number of Factors in Factor Analysis.

Psychometrika • 1965

Stopping rules in principal components analysis: A comparison of heuristical and

J. E. Efron 42 Jackson

Eigenvalue-Based Estimators for Optimal Dimensionality Reduction in Information Retrieval

M. Efron
PhD thesis • 2002

Text Retrieval and Filtering

The Information Retrieval Series • 1998
