• Publications
  • Influence
Empirical study of topic modeling in Twitter
Social networks such as Facebook, LinkedIn, and Twitter have been a crucial source of information for a wide spectrum of users. In Twitter, popular information that is deemed important by the...
  • 832
  • 59
Predicting popular messages in Twitter
Social network services have become a viable source of information for users. In Twitter, information deemed important by the community propagates through retweets. Studying the characteristics of...
  • 470
  • 38
Web page classification: Features and algorithms
Classification of Web page content is essential to many tasks in Web information retrieval such as maintaining Web directories and focused crawling. The uncontrolled nature of Web content presents...
  • 453
  • 25
Identifying link farm spam pages
With the increasing importance of search in guiding today's web traffic, more and more effort has been spent to create search engine spam. Since link analysis is one of the most important factors in...
  • 267
  • 25
Topical locality in the Web
Most web pages are linked to others with related content. This idea, combined with another that says that text in, and possibly around, HTML anchors describes the pages to which they point, is the...
  • 325
  • 24
Predicting Sequences of User Actions
People display regularities in almost everything they do. This paper proposes characteristics of an idealized algorithm that, when applied to sequences of user actions, would allow a user interface...
  • 226
  • 21
Detection of Harassment on Web 2.0
Web 2.0 has led to the development and evolution of web-based communities and applications. These communities provide places for information sharing and collaboration. They also open the door for...
  • 250
  • 18
Co-factorization machines: modeling user interests and predicting individual decisions in Twitter
Users of popular services like Twitter and Facebook are often simultaneously overwhelmed with the amount of information delivered via their social connections and miss out on much content that they...
  • 170
  • 15
Recognizing Nepotistic Links on the Web
The use of link analysis and page popularity in search engines has grown recently to improve query result rankings. Since the number of such links contributes to the value of the document in such...
  • 201
  • 13
Topical TrustRank: using topicality to combat web spam
Web spam is behavior that attempts to deceive search engine ranking algorithms. TrustRank is a recent algorithm that can combat web spam. However, TrustRank is vulnerable in the sense that the seed...
  • 165
  • 12