Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies
- Max Grusky, Mor Naaman, Yoav Artzi
- Computer Science · North American Chapter of the Association for…
- 30 April 2018
The NEWSROOM dataset is presented, a summarization dataset of 1.3 million articles and summaries written by authors and editors in newsrooms of 38 major news publications between 1998 and 2017, and the summaries combine abstractive and extractive strategies.
Why we tag: motivations for annotation in mobile and online media
- Morgan G. Ames, Mor Naaman
- Computer Science · International Conference on Human Factors in…
- 29 April 2007
The incentives for annotation in Flickr, a popular web-based photo-sharing system, and ZoneTag, a cameraphone photo capture and annotation tool that uploads images to Flickr, are investigated to offer a taxonomy of motivations for annotation along two dimensions (sociality and function).
HT06, tagging paper, taxonomy, Flickr, academic article, to read
- Cameron A. Marlow, Mor Naaman, D. Boyd, Marc Davis
- Computer Science · UK Conference on Hypertext
- 22 August 2006
A model of tagging systems, specifically in the context of web-based systems, is offered to help illustrate the possible benefits of these tools and a simple taxonomy of incentives and contribution models is provided to inform potential evaluative frameworks.
Is it really about me?: message content in social awareness streams
- Mor Naaman, Jeffrey Boase, Chih‐Hui Lai
- Computer Science · Conference on Computer Supported Cooperative Work
- 6 February 2010
A content-based categorization of the types of messages posted by Twitter users is developed; the resulting analysis shows two common types of user behavior in terms of message content and exposes differences between users with respect to these activities.
Beyond Trending Topics: Real-World Event Identification on Twitter
- H. Becker, Mor Naaman, L. Gravano
- Computer Science · International Conference on Web and Social Media
- 5 July 2011
This paper explores approaches for analyzing the stream of Twitter messages to distinguish between messages about real-world events and non-event messages, and relies on a rich family of aggregate statistics of topically similar message clusters.
Generating diverse and representative image search results for landmarks
This work uses a combination of context- and content-based tools to generate representative sets of images for location-driven features and landmarks, a common search task.
Learning similarity metrics for event identification in social media
A variety of techniques for learning multi-feature similarity metrics for social media documents in a principled manner are explored and evaluation results suggest that the approach identifies events more effectively than the state-of-the-art strategies on which they are built.
Towards automatic extraction of event and place semantics from flickr tags
- T. Rattenbury, Nathaniel Good, Mor Naaman
- Computer Science · Annual International ACM SIGIR Conference on…
- 23 July 2007
An approach is presented for extracting the semantics of tags, unstructured text labels assigned to resources on the Web, based on each tag's usage patterns; the Scale-structure Identification method is shown to outperform existing techniques.
Towards quality discourse in online news comments
The complex interplay between the needs and desires of news commenters and the functioning of different journalistic approaches toward managing comment quality is examined, and tensions and opportunities for value-sensitive innovation within such online communities are explored.
World explorer: visualizing aggregate data from unstructured text in geo-referenced collections
- Shane Ahern, Mor Naaman, Rahul Nair, J. Yang
- Computer Science · ACM/IEEE Joint Conference on Digital Libraries
- 18 June 2007
This work analyzes the tags associated with the geo-referenced Flickr images to generate aggregate knowledge in the form of "representative tags" for arbitrary areas in the world, and uses these tags to create a visualization tool, World Explorer, that can help expose the content of the data, using a map interface to display the derived tags and the original photo items.