Web spam pages use various techniques to achieve higher-than-deserved rankings in a search engine's results. While human experts can identify spam, it is too expensive to manually evaluate a large number of pages. Instead, we propose techniques to semi-automatically separate reputable, good pages from spam. We first select a small set of seed pages to be …
Web spamming refers to actions intended to mislead search engines into ranking some pages higher than they deserve. Recently, the amount of web spam has increased dramatically, leading to a degradation of search results. This paper presents a comprehensive taxonomy of current spamming techniques, which we believe can help in developing appropriate …
Tagging systems allow users to interactively annotate a pool of shared resources using descriptive tags. As tagging systems are gaining in popularity, they become more susceptible to tag spam: misleading tags that are generated in order to increase the visibility of some resources or simply to confuse users. We introduce a framework for modeling …
Link spam is used to increase the ranking of certain target web pages by misleading the connectivity-based ranking algorithms in search engines. In this paper we study how web pages can be interconnected in a spam farm in order to optimize rankings. We also study alliances, that is, interconnections of spam farms. Our results identify the optimal structures …
Link spamming intends to mislead search engines and trigger an artificially high link-based ranking of specific target web pages. This paper introduces the concept of spam mass, a measure of the impact of link spamming on a page's ranking. We discuss how to estimate spam mass and how the estimates can help identify pages that benefit significantly from …
Yahoo! Answers represents a new type of community portal that allows users to post questions and/or answer questions asked by other members of the community, already featuring a very large number of questions and several million users. Other recently launched services, like Microsoft's Live QnA and Amazon's Askville, follow the same basic interaction model.
Tagging systems allow users to interactively annotate a pool of shared resources using descriptive strings called tags. Tags are used to guide users to interesting resources and help them build communities that share their expertise and resources. As tagging systems are gaining in popularity, they become more susceptible to tag spam: …
A novel dicistrovirus (strain NB-1/2011/HUN, KJ802403) genome was detected from guano collected from an insectivorous bat (species Pipistrellus pipistrellus) in Hungary, using viral metagenomics. The complete genome of NB-1 is 9136 nt in length, excluding the poly(A) tail. NB-1 has a genome organization typical of a dicistrovirus with multiple 3BVPg and a …