• Corpus ID: 220936610

SHARI - An Integration of Tools to Visualize the Story of the Day

@article{Jones2020SHARIA,
  title={SHARI - An Integration of Tools to Visualize the Story of the Day},
  author={Shawn M. Jones and Alexander C. Nwala and Martin Klein and Michele C. Weigle and Michael L. Nelson},
  journal={ArXiv},
  year={2020},
  volume={abs/2008.00139}
}
Tools such as Google News and Flipboard exist to convey daily news, but what about the past? In this paper, we describe how to combine several existing tools with web archive holdings to perform news analysis and visualization of the "biggest story" for a given date. StoryGraph clusters news articles together to identify a common news story. Hypercane leverages ArchiveNow to store URLs produced by StoryGraph in web archives. Hypercane analyzes these URLs to identify the most common terms… 

References

Generating Stories From Archived Collections
TLDR
This work proposes the Dark and Stormy Archive (DSA) framework and finds that the stories automatically generated by DSA are indistinguishable from those created by human subject domain experts, while both kinds of stories are easily distinguished from randomly generated stories.
365 Dots in 2019: Quantifying Attention of News Sources
TLDR
This work measures the overlap of topics in online news articles from a variety of sources and scores news stories by the degree of attention they receive in near-real time, enabling multiple studies, including identifying topics that receive the most attention from news organizations and distinguishing slow news days from major news days.
The Many Shapes of Archive-It
TLDR
This work focuses on the collections within Archive-It, a subscription service started by the Internet Archive in 2005 for the purpose of allowing organizations to create their own collections of archived web pages, or mementos, and proposes using structural metadata as an additional way to understand these collections.
Scraping SERPs for Archival Seeds: It Matters When You Start
TLDR
The findings suggest that, due to the difficulty in retrieving the URIs of news stories from Google, collection building that originates from search engines should begin as soon as possible in order to capture the first stages of events, and should persist in order to capture the evolution of the events.
Focused Crawl of Web Archives to Build Event Collections
TLDR
The results show that focused crawling on the archived web can be done and indeed results in highly relevant collections, especially for events that happened further in the past.
Bootstrapping Web Archive Collections from Social Media
TLDR
The results showed that social media sources such as Reddit, Storify, Twitter, and Wikipedia produce collections that are similar to Archive-It collections, and curators may consider extracting URIs from these sources in order to begin or augment collections about various news topics.
ArchiveNow: Simplified, Extensible, Multi-Archive Preservation
TLDR
This module allows a user to submit a URI of a web page for archiving at several configured web archives, and provides the user with links to the archived copies of the web page.
HTTP Framework for Time-Based Access to Resource States - Memento
TLDR
The HTTP-based Memento framework bridges the present and past Web by introducing datetime negotiation and TimeMaps, a variation on content negotiation that leverages the given resource's URI and a user agent's preferred datetime.
Raintale – A Storytelling Tool For Web Archives
  • https://ws-dl.blogspot.com/2019/07/2019-07-11-raintale-storytelling-tool.html
  • 2019
A Preview of MementoEmbed: Embeddable Surrogates for Archived Web Pages. https://ws-dl.blogspot
  • 2018