Understanding user's query intent with Wikipedia

Jian Hu, G. Wang, Frederick H. Lochovsky, Jian-Tao Sun, Zheng Chen
The Web Conference
Understanding the intent behind a user's query can help a search engine automatically route the query to corresponding vertical search engines to obtain particularly relevant content, thus greatly improving user satisfaction. […] Moreover, the method is very general and can be easily applied to various intent domains. We demonstrate the effectiveness of this method in three different applications, i.e., travel, job, and person name. In each of the three cases, only a couple of seed intent…


Query Understanding via Intent Description Generation

A novel Contrastive Generation model, called CtrsGen, is proposed to address query understanding: given a query, it generates the intent description by contrasting the relevant documents with the irrelevant ones.

Query Classification by Leveraging Explicit Concept Information

This paper first leverages existing knowledge bases to enrich the short query at the concept level, then discusses the usage of the mined concept information and proposes a novel language-model-based query classification method that takes both words and concepts into consideration.

Mining Coordinated Intent Representation for Entity Search and Recommendation

A novel generative model is proposed to discover coordinated intent representations from entity search logs; it is effective for discovering meaningful coordinated shopping intents and can be directly used for improving the accuracy of product search and recommendation.

User Intent in Multimedia Search

A thorough survey of multimedia information retrieval research directed at the problem of enabling search engines to respond to user intent is presented, including a differentiation from related, often-confused concepts of search intent.

Deep Search Query Intent Understanding

This paper focuses on the design for predicting users' intents as they type in queries on-the-fly in typeahead search using character-level models and accurate word-level intent prediction models for complete queries.

Query classification using Wikipedia

R. Khoury. Int. J. Intell. Inf. Database Syst., 2011.
This paper develops a new query classification system that relies on the freely available online encyclopedia Wikipedia as a natural-language knowledge base, and exploits Wikipedia's structure to infer the correct classification of any given query.

Deep Query Intent Understanding at Scale

This paper focuses on the design for predicting users' intents as they type in queries on-the-fly in typeahead search using character-level models and accurate word-level intent prediction models for complete queries.

Scalable multi-dimensional user intent identification using tree structured distributions

A generic, extensible framework is presented for learning the multi-dimensional representation of user intent from the query words; empirical results show that FastQ yields accurate identification of intent when compared to a gold standard.

Identifying Web Queries with Question Intent

This work presents a supervised classification scheme, random forest over word-clusters for variable-length texts, which can model the query structure and substantially improves classification performance in the CQA-intent selection task compared to content-oriented classification, especially as query length grows.



Learning query intent from regularized click graphs

This work aims to drastically increase the amount of training data through semi-supervised learning with click graphs, inferring the class memberships of unlabeled queries from those of labeled ones according to their proximity in a click graph.

Improving automatic query classification via semi-supervised learning

An application of computational linguistics is used to develop an approach for mining the vast amount of unlabeled data in Web query logs to improve automatic topical Web query classification. The paper shows that this approach, in combination with manual matching and supervised learning, classifies a substantially larger proportion of queries than any single technique.

Classifying search engine queries using the web as background knowledge

This paper describes the architecture of a classification system that uses a web directory to identify the subject context in which the query terms are frequently used; the system received the Runner-Up Award for Query Categorization Performance of the KDD Cup 2005.

Building bridges for web query classification

A novel approach for QC is presented that outperforms the winning solution of the ACM KDDCUP 2005 competition and introduces category selection as a new method for narrowing down the scope of the intermediate taxonomy based on which the authors classify the queries.

Robust classification of rare queries using web knowledge

We propose a methodology for building a practical robust query classification system that can identify thousands of query classes with reasonable accuracy, while dealing in real-time with the query

Personal name classification in web queries

This paper develops four different methods for building probabilistic name-term dictionaries, in which each term is assigned a probability of being a name term, and compares these methods with baseline algorithms.

Inferring the most important types of a query: a semantic approach

In this paper we present a technique for ranking the most important types or categories for a given query. Rather than trying to find the category of the query, known as query categorization, our

Enhancing text clustering by leveraging Wikipedia semantics

A way to build a concept thesaurus based on the semantic relations (synonym, hypernym, and associative relation) extracted from Wikipedia is proposed, and a unified framework is developed to leverage these semantic relations in order to enhance traditional content similarity measures for text clustering.

Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis

This work proposes Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia, resulting in substantial improvements in the correlation of computed relatedness scores with human judgments.

Detecting online commercial intention (OCI)

A framework for building machine learning models to learn OCI from any Web page content is presented; it builds models to detect OCI from search queries and Web pages, and finds that frequent queries are more likely to have commercial intention.