Using heuristics to estimate an appropriate number of latent topics in source code analysis

@article{Grant2013UsingHT,
  title={Using heuristics to estimate an appropriate number of latent topics in source code analysis},
  author={Scott Grant and James R. Cordy and David B. Skillicorn},
  journal={Sci. Comput. Program.},
  year={2013},
  volume={78},
  pages={1663--1678}
}
Latent Dirichlet Allocation (LDA) is a data clustering algorithm that performs especially well for text documents. In natural-language applications it automatically finds groups of related words (called “latent topics”) and clusters the documents into sets that are about the same “topic”. LDA has also been applied to source code, where the documents are natural source code units such as methods or classes, and the words are the keywords, operators, and programmer-defined names in the code. The…
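As a rough illustration of the setup the abstract describes, the sketch below treats each source code unit as a "document" and counts its keywords, operators, and programmer-defined names as "words". It is not the authors' pipeline: the regular expression, the function names `tokenize` and `build_term_counts`, and the two toy code units are all assumptions made for this example. The resulting per-document counts are the kind of input a topic model such as LDA would consume; the model-fitting step itself is not shown.

```python
import re
from collections import Counter
from typing import Dict, List

# Hypothetical tokenizer (an assumption, not the paper's method): match either
# an identifier/keyword or a run of operator characters. Numeric literals,
# brackets, and semicolons are ignored.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[-+*/%=<>!&|^~]+")


def tokenize(code_unit: str) -> List[str]:
    """Split one source code unit into keyword, identifier, and operator tokens."""
    return TOKEN_RE.findall(code_unit)


def build_term_counts(units: Dict[str, str]) -> Dict[str, Counter]:
    """Map each unit name to a bag-of-words count of its tokens."""
    return {name: Counter(tokenize(body)) for name, body in units.items()}


# Two toy "documents", each standing in for a method body.
units = {
    "sum_list": "int total = 0; for (int i = 0; i < n; i++) total += xs[i]; return total;",
    "is_empty": "return n == 0;",
}

counts = build_term_counts(units)

for name, bag in counts.items():
    print(name, dict(bag))
```

Each `Counter` here is one row of a document-term matrix. Choosing how many latent topics to fit on top of such a matrix is the question the paper's title refers to.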
This paper has 33 citations.