# Estimation of the number of clusters on d-dimensional sphere

@article{Fujita2021EstimationOT, title={Estimation of the number of clusters on d-dimensional sphere}, author={Kazuhisa Fujita}, journal={ArXiv}, year={2021}, volume={abs/2011.07530} }

Spherical data are data distributed on a sphere. Such data appear in various fields such as meteorology, biology, and natural language processing. However, methods for analyzing spherical data are not yet sufficiently developed. One important issue is estimating the number of clusters in spherical data. To address this issue, I propose a new method called Spherical X-means (SX-means), which can estimate the number of clusters on a d-dimensional sphere. SX-means is the model-based…

#### References


PG-means: learning the number of clusters in data

- Computer Science
- NIPS
- 2006

A novel algorithm called PG-means is presented that learns the number of clusters in a classical Gaussian mixture model. It is robust and efficient, and provides a much more stable estimate of the number of clusters than existing methods.

Estimating the number of clusters using diversity

- Computer Science
- Artif. Intell. Res.
- 2018

It is shown that the difference between the global diversity of clusters and the sum of each cluster’s local diversity of their members can be used as an effective indicator of the optimality of the number of clusters, where the diversity is measured by Rao's quadratic entropy.

Clustering on the Unit Hypersphere using von Mises-Fisher Distributions

- Computer Science, Mathematics
- J. Mach. Learn. Res.
- 2005

A generative mixture-model approach to clustering directional data based on the von Mises-Fisher distribution, which arises naturally for data distributed on the unit hypersphere. The paper derives and analyzes two variants of the Expectation Maximization framework for estimating the mean and concentration parameters of this mixture.

A Clustering Method for Data in Cylindrical Coordinates

- Mathematics
- 2017

We propose a new clustering method for data in cylindrical coordinates based on the k-means. The goal of the k-means family is to maximize an optimization function, which requires a similarity. Thus,…

Generative model-based clustering of directional data

- Computer Science, Mathematics
- KDD '03
- 2003

Modeling text data by vMF distributions lends theoretical validity to the use of cosine similarity, which has been widely used by the information retrieval community. Results indicate that this approach yields superior clusterings, especially for difficult clustering tasks in high-dimensional spaces.

X-means: Extending K-means with Efficient Estimation of the Number of Clusters

- Computer Science
- ICML
- 2000

A new algorithm is introduced that efficiently searches the space of cluster locations and number of clusters to optimize the Bayesian Information Criterion (BIC) or the Akaike Information Criterion (AIC) measure.

Learning the k in k-means

- Computer Science, Mathematics
- NIPS
- 2003

An improved algorithm for learning k while clustering, based on a statistical test for the hypothesis that a subset of data follows a Gaussian distribution. It works well, and better than a recent method based on the BIC penalty for model complexity.

Some methods for classification and analysis of multivariate observations

- Mathematics
- 1967

The main purpose of this paper is to describe a process for partitioning an N-dimensional population into k sets on the basis of a sample. The process, which is called 'k-means,' appears to give…

Parameter estimation for von Mises–Fisher distributions

- Mathematics, Computer Science
- Comput. Stat.
- 2007

An iterative algorithm using fixed points to obtain the maximum likelihood estimate (m.l.e.) for κ is proposed, and it is proved that there is a unique local maximum for δ, i.e. the level of precision of the von Mises–Fisher distribution.

A Quantitative Discriminant Method of Elbow Point for the Optimal Number of Clusters in Clustering Algorithm

- Computer Science
- 2020

A new elbow point discriminant method is proposed that computes a statistical metric for estimating the optimal number of clusters when clustering a dataset. The experimental results demonstrate that the optimal cluster number estimated by the proposed method is better than that of the widely used Silhouette method.