Topical link analysis for web search

  • Lan Nie, Brian D. Davison, Xiaoguang Qi
  • Published 6 August 2006
  • Computer Science
  • Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Traditional web link-based ranking schemes use a single score to measure a page's authority without concern of the community from which that authority is derived. As a result, a resource that is highly popular for one topic may dominate the results of another topic in which it is less authoritative. To address this problem, we suggest calculating a score vector for each page to distinguish the contribution from different topics, using a random walk model that probabilistically combines page… 
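
The idea of a per-page score vector can be illustrated with a small example. Because the abstract is cut off before the model is described, the following is only a minimal sketch under stated assumptions and not the authors' method: it swaps in a simpler technique, one topic-biased random walk per topic (in the style of Topic-sensitive PageRank). The function name `topical_scores`, the toy graph, and the topic weights are all hypothetical.

```python
def topical_scores(links, page_topics, num_topics, damping=0.85, iters=50):
    """Return {page: [score per topic]} using one topic-biased walk per topic.

    Assumptions (hypothetical input format, not from the paper):
      links:       {page: [pages it links to]}; every target is also a key
      page_topics: {page: [weight per topic]}; each topic has some total weight
    """
    pages = sorted(links)
    scores = {p: [0.0] * num_topics for p in pages}
    for t in range(num_topics):
        # Teleport distribution for topic t, proportional to topic weight.
        total = sum(page_topics[p][t] for p in pages)
        teleport = {p: page_topics[p][t] / total for p in pages}
        rank = dict(teleport)
        for _ in range(iters):
            nxt = {p: (1.0 - damping) * teleport[p] for p in pages}
            for p in pages:
                out = links[p]
                if out:
                    # Split this page's mass evenly over its out-links.
                    share = damping * rank[p] / len(out)
                    for q in out:
                        nxt[q] += share
                else:
                    # Dangling page: return its mass via the teleport distribution.
                    for q in pages:
                        nxt[q] += damping * rank[p] * teleport[q]
            rank = nxt
        for p in pages:
            scores[p][t] = rank[p]
    return scores


# Toy graph with two topics (illustrative data only).
links = {"a": ["b"], "b": ["a", "c"], "c": ["a"], "d": ["c"]}
page_topics = {"a": [1.0, 0.0], "b": [0.5, 0.5], "c": [0.0, 1.0], "d": [0.0, 1.0]}
scores = topical_scores(links, page_topics, num_topics=2)
for page in sorted(scores):
    print(page, [round(s, 3) for s in scores[page]])
```

Each page ends up with one score per topic instead of a single number, so a page that is popular for one topic need not rank highly for another. At query time the vector could be combined with an estimate of the query's topic distribution; that step is left out here.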

Ranking by community relevance

This work decomposes a web page into separate subnodes with respect to each community pointing to it, to better model the query-specific reputation for each potential result.

C-Rank and its variants: A contribution-based ranking approach exploiting links and content

This paper addresses the problem in Web page ranking of effectively combining link and content information with efficiency high enough to be applicable to real-world search engines, and proposes three contribution-based models: C-Rank, PC-Rank and HC-Rank.

Web page importance ranking

  • W. Gaul
  • Computer Science
    Adv. Data Anal. Classif.
  • 2011
An approach is proposed that uses a set of interesting Web pages as starting point for a minimum walk algorithm to provide recommendations of additionally important Web information within a

Topical PageRank: A Model of Scientific Expertise for Bibliographic Search

ThemedPageRank, the combination of LDA-derived topics with PageRank, differs from previous models in that topics influence both the bias and the transition probabilities of PageRank, and it also incorporates the age of documents.

From Whence Does Your Authority Come? Utilizing Community Relevance in Ranking

This work applies a total of eighty queries over two real-world datasets to demonstrate that the use of community decomposition can consistently and significantly improve upon PageRank's top-ten results.

Generic Multi-Document Summarization Using Topic-Oriented Information

The topic-oriented PageRank (ToPageRank) model, in which topic information is fully incorporated, is proposed, and the topic-oriented HITS (ToHITS) model is designed to compare the influence of different graph-based algorithms.

Topic-based PageRank on author cocitation networks

  • Ying Ding
  • Computer Science
    J. Assoc. Inf. Sci. Technol.
  • 2011
This paper proposes topic-dependent ranks based on the combination of a topic model and a weighted PageRank algorithm, using the author-conference-topic model to extract the topic distribution of individual authors.

Mining Neighbors' Topicality to Better Control Authority Flow

This work separates page authority interaction by incorporating the topical context and the relationship between associated pages, and proposes a probabilistic method to model authority flows from different sources of neighbor pages.

Topic-based ranking in Folksonomy via probabilistic model

This paper puts forward a topic-sensitive tag ranking (TSTR) approach that ranks tags automatically according to their topic relevance, and applies it to tag recommendation, demonstrating that the proposed tag ranking approach boosts the performance of social-tagging applications.

RankTopic: Ranking Based Topic Modeling

Experimental results show that, with a proper balancing parameter, RankTopic performs much better than some baseline models and is comparable with the state-of-the-art link-combined relational topic model (RTM) in generalization performance, document clustering and classification.



Topic-sensitive PageRank

A set of PageRank vectors is proposed, biased using a set of representative topics, to capture more accurately the notion of importance with respect to a particular topic; these vectors are shown to generate more accurate rankings than a single, generic PageRank vector.

The Intelligent surfer: Probabilistic Combination of Link and Content Information in PageRank

This paper improves PageRank by using a more intelligent surfer, one that is guided by a probabilistic model of the relevance of a page to a query. Experiments indicate that the resulting algorithm significantly outperforms PageRank in the (human-rated) quality of the pages returned, while remaining efficient enough to be used in today's large search engines.

The PageRank Citation Ranking : Bringing Order to the Web

This paper describes PageRank, a method for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.

Improved algorithms for topic distillation in a hyperlinked environment

This paper addresses the problem of topic distillation on the World Wide Web, namely, given a typical user query, finding quality documents related to the query topic, by augmenting a previous connectivity-analysis-based algorithm with content analysis.

The stochastic approach for link-structure analysis (SALSA) and the TKC effect

Block-level link analysis

Based on block-level link analysis, two new algorithms are proposed, Block Level PageRank and Block Level HITS, whose performances are studied extensively using web data.

Identifying link farm spam pages

Algorithms for detecting link farms automatically are presented by first generating a seed set based on the common link set between incoming and outgoing links of Web pages and then expanding it, providing a modified web graph to use in ranking page importance.

Mining the Web's Link Structure

Clever is a search engine that analyzes hyperlinks to uncover two types of pages: authorities, which provide the best source of information on a given topic; and hubs, which provide collections of links to authorities.

A Web surfer model incorporating topic continuity

A surfer model is proposed that incorporates information about topic continuity derived from the surfer's history and, unlike earlier models, simultaneously captures the interrelationship between categorization (context) and ranking of Web documents.

The Anatomy of a Large-Scale Hypertextual Web Search Engine