The World Conversation: Web Page Metadata Generation From Social Sources

Omar Alonso, Sushma Nagesh Bannur, Kartikay Khandelwal, Shankar Kalyanaraman
Proceedings of the 24th International Conference on World Wide Web
Over the past couple of years, social networks such as Twitter and Facebook have become the primary source for consuming information on the Internet. One of the main differentiators of this content from traditional information sources available on the Web is the fact that these social networks surface individuals' perspectives. When social media users post and share updates with friends and followers, some of those short fragments of text contain a link and a personal comment about the web page… 
How it Happened: Discovering and Archiving the Evolution of a Story Using Social Signals
This work proposes the problem of automatic event story generation and archiving by combining social and news data to construct a new type of document in the form of a Wiki-like page structure by using a timeline algorithm as the base for a story.
What's Happening and What Happened: Searching the Social Web
The goal is to find links that are relevant on social networks as a mechanism to discover what people are talking about at a given point in time and make such information searchable and persistent.
Proposal of a New Social Signal for Excluding Common Web Pages in Multiple Social Networking Services
A new social signal is proposed that assesses the degree to which a certain web page is a hot-topic web page only in an SNS by combining the social signals of SNSs, and it is shown that by acquiring web pages on the basis of the magnitude of the proposed new social signal, hot-topic web pages in multiple SNSs are excludable.
Social Information Access
This chapter offers an introduction to the emerging field of social information access, a stream of research that explores methods for organizing the past interactions of users in a community in order to provide future users with better access to information.
A Lightweight Representation of News Events on Social Media
This work proposes a lightweight representation of newsworthy social media data that leverages microblog features, such as redundancy and re-sharing capabilities, by using surrogate texts from shared URLs and word embeddings, to achieve comparable clustering results to those obtained by using the complete data.
Gaining historical and international relations insights from social media: spatio-temporal real-world news analysis using Twitter
The hypothesis is that including social, temporal, and spatial information in the event representation enables the analysis of historical world news from a social and geopolitical perspective and facilitates new information retrieval tasks related to historical event information extraction and international relations analysis.
Social Search
This chapter begins by framing the social search landscape in terms of the sources of data available and the ways in which this can be leveraged before, during, and after search.
Automatic Generation of Event Timelines from Social Data
A technique that uses social information as relevance surrogates to generate an informative timeline using a variation of pseudo relevance feedback that is automatically generated using social data without external evidence is presented.
Short video metadata acquisition game
  • Ales Masiar, Jakub Simko
  • Computer Science
  • 2015 10th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP)
  • 2015
A human computation game is presented that acquires metadata for short videos found in the Vine social media service, and it is shown that the acquired keywords are both correct and, for the most part, represent new information atop existing video descriptions.


Describing the Web in less than 140 Characters
This paper investigates how people deal with the strong limitation of 140 characters per message, showing that this constraint encourages people to perform a good synthesis of the content they are linking to and efficiently cluster the actual content of the linked pages with an algorithm based on lexical proximities between messages.
Incorporating social anchors for ad hoc retrieval
It is shown that by incorporating social anchor features, search effectiveness for "ad hoc" tasks can be significantly improved compared to state-of-the-art approaches.
The Metadata Triumvirate: Social Annotations, Anchor Texts and Search Queries
  • Michael G. Noll, C. Meinel
  • Computer Science
  • 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology
  • 2008
A large research data set called CABS120k is introduced, which is created from a variety of information sources such as AOL500k, the Open Directory Project, Google and the WWW in general to investigate several characteristics of metadata including length, novelty, diversity, and similarity.
Kondenzer: Exploration and visualization of archived social media
Kondenzer is presented - an offline system for condensing, archiving and visualizing social data that creates digests of social data using a combination of filtering, duplicate removal and efficient clustering.
Social annotations: utility and prediction modeling
A taxonomy of social relevance aspects that influence the utility of social annotations in search, spanning query classes, the social network, and content relevance is introduced.
Understanding Document Aboutness Step One: Identifying Salient Entities
This work proposes salience classification functions that incorporate various cues from document content, web search logs, and a large web graph that significantly outperform competitive baselines and the previous state of the art, while keeping the human annotation cost to a minimum.
Analysis of anchor text for web search
The main premise is that anchor text behaves very much like real user queries and consensus titles, so an understanding of how anchor text is related to a document will likely lead to better understanding of how to translate a user’s query into high quality search results.
Social annotations in web search
This work asks how to best present social annotations on search results, and attempts to find an answer through mixed-method eye-tracking and interview experiments, by recommending improvements to the design and content of social annotations to make them more noticeable and useful.
Social summarization in collaborative web search
This paper focuses on the role of snippets in collaborative web search and describes a technique for summarizing search results that harnesses the collaborative search behaviour of communities of like-minded searchers to produce snippets that are more focused on the preferences of the searchers.
Exploiting Anchor Text as a Lexical Resource
It is found that for many target pages, incoming anchors form a miniature corpus of reference expressions whose properties with relation both to other target sites and to each other can be put to use for mining lexical information.