Quang Vu Bui

Latent Dirichlet Allocation (LDA) is one of the most widely used topic models for discovering complex semantic structure. However, on massive text corpora LDA can be very slow, requiring days or even months to run. This problem has created particular interest in parallel solutions, such as Approximate Distributed LDA (AD-LDA), where clusters of computers are used…
In this work we address the problem of document clustering. Our approach is based on the following pipeline. First, we quantify the topics in a document. Then, the number of clusters is set automatically. Finally, a multi-criteria distance is defined to cluster the documents. The advantage of this approach is that it allows us to have a number of…