Clustering and Diversifying Web Search Results with Graph-Based Word Sense Induction

@article{Marco2013ClusteringAD,
  title={Clustering and Diversifying Web Search Results with Graph-Based Word Sense Induction},
  author={Di Marco, Antonio and Navigli, Roberto},
  journal={Computational Linguistics},
  year={2013},
  volume={39},
  pages={709--754}
}
Web search result clustering aims to facilitate information search on the Web. Rather than the results of a query being presented as a flat list, they are grouped on the basis of their similarity and subsequently shown to the user as a list of clusters. Each cluster is intended to represent a different meaning of the input query, thus taking into account the lexical ambiguity (i.e., polysemy) issue. Existing Web clustering methods typically rely on some shallow notion of textual similarity…
Retrieving web search results using Max–Max soft clustering for Hindi query
TLDR
This is the first attempt at fuzzy IR for queries in the Hindi language; experimental evaluations show promising results.
Multilingual Word Sense Induction to Improve Web Search Result Clustering
TLDR
Some preliminary ideas are given for applying the multilingual Word Sense Induction method to Web search result clustering in order to improve the WSI results.
Neural Embedding Language Models in Semantic Clustering of Web Search Results
TLDR
It is shown that in the task of semantically clustering search results, prediction-based models slightly but stably outperform traditional count-based ones, with the same training corpora.
Graph-Based Concept Clustering for Web Search Results
TLDR
This paper proposes a method to cluster web search results with high clustering quality using graph-based clustering over concepts extracted from an external knowledge source, and compares the clustering results of this method with two well-known search results clustering techniques, Suffix Tree Clustering and Lingo.
PageRank-based Word Sense Induction within Web Search Results Clustering
TLDR
The evaluation results show that PageRank-based sense induction achieves interesting results when compared to state-of-the-art content-based algorithms in the context of Web Search Results Clustering.
A Relative Study on Search Results Clustering Algorithms - K-means, Suffix Tree and LINGO
TLDR
A comparative analysis of three common search results clustering algorithms is carried out to study the performance of the web search engine using multiple test collections and evaluation measures.
Web Search Results Clustering Using Frequent Termset Mining
TLDR
This work acquires the senses of a query by means of a word sense induction method that identifies meanings as trees of closed frequent termsets, then clusters the search results based on their lexical and semantic intersection with the induced senses.
A comparison of graph-based word sense induction clustering algorithms in a pseudoword evaluation framework
TLDR
A self-sufficient pseudoword-based evaluation framework for graph-based WSI clustering algorithms is presented, defining a new evaluation measure (top2) and a secondary clustering process (hyperclustering).
A Novel Method for Clustering Web Search Results with Wikipedia Disambiguation Pages
TLDR
A novel method is proposed to cluster the search results of an ambiguous query into topics about the query constructed from Wikipedia disambiguation pages (WDP), together with a concept filtering method that filters semantically unrelated concepts in each topic.
Word Sense Induction with Closed Frequent Termsets
TLDR
The proposed sense induction algorithm (called SenseSearcher, or SnS) is based on closed frequent sets and, as a result, provides a multilevel sense representation, which can be used for clustering search results according to the discovered senses.

References

SHOWING 1-10 OF 125 REFERENCES
Inducing Word Senses to Improve Web Search Result Clustering
TLDR
This work first acquires the senses of a query by means of a graph-based clustering algorithm that exploits cycles in the co-occurrence graph of the query, then clusters the search results based on their semantic similarity to the induced word senses.
An Unsupervised Approach to Cluster Web Search Results Based on Word Sense Communities
TLDR
The clustering problem is reformalized as a word sense discovery problem and addressed with an unsupervised method, and the modularity score of the discovered keyword community structure is used to measure page clustering necessity.
Graph-based Word Clustering using a Web Search Engine
TLDR
An unsupervised algorithm for word clustering based on a word similarity measure by web counts, called Newman clustering, is proposed for efficiently identifying word clusters.
Web Search Clustering and Labeling with Hidden Topics
TLDR
This article introduces a novel framework for clustering Web search results in Vietnamese, which is able to cluster and label short snippets effectively in a topic-oriented manner without considering whole Web pages.
Word Sense Induction & Disambiguation Using Hierarchical Random Graphs
TLDR
The inferred hierarchical structures are applied to the problem of word sense disambiguation, where it is shown that the method performs significantly better than traditional graph-based methods and agglomerative clustering, yielding improvements over state-of-the-art WSD systems based on sense induction.
Clustering Web Search Results with Maximum Spanning Trees
TLDR
This work presents a novel method for clustering Web search results based on Word Sense Induction, which improves on classical search result clustering methods in terms of both clustering quality and degree of diversification.
Web document clustering: a feasibility demonstration
TLDR
To satisfy the stringent requirements of the Web domain, an incremental, linear time algorithm called Suffix Tree Clustering (STC) is introduced which creates clusters based on phrases shared between documents, showing that STC is faster than standard clustering methods in this domain.
Wikipedia as Sense Inventory to Improve Diversity in Web Search Results
TLDR
Wikipedia has much better coverage of search results; the distribution of senses in search results can be estimated using the internal graph structure of Wikipedia and the relative number of visits received by each sense in Wikipedia; and associating Web pages with Wikipedia senses using simple and efficient algorithms can produce modified rankings that cover 70% more Wikipedia senses than the original search engine rankings.
Word sense disambiguation in queries
TLDR
A new approach to determining the senses of words in queries using WordNet is presented; it has 100% applicability and 90% accuracy on the most recent robust track of the TREC collection of 250 queries, and its retrieval effectiveness is 7% better than the best reported result in the literature.
Information retrieval using word senses: root sense tagging approach
TLDR
This paper proposes a new method for using word senses in information retrieval: a root sense tagging method that assigns coarse-grained word senses defined in WordNet to query terms and document terms in an unsupervised way, using automatically constructed co-occurrence information.