A Multimodal Analytics Platform for Journalists Analyzing Large-Scale, Heterogeneous Multilingual, and Multimedia Content

  title={A Multimodal Analytics Platform for Journalists Analyzing Large-Scale, Heterogeneous Multilingual, and Multimedia Content},
  author={Stefanos Vrochidis and Anastasia Moumtzidou and Ilias Gialampoukidis and Dimitris Liparas and Gerard Casamayor and Leo Wanner and Nicolaus Heise and Tilman Wagner and Andriy Bilous and Emmanuel Jamin and Boyan Simeonov and Vladimir Alexiev and Reinhard Busch and Ioannis Arapakis and Yiannis Kompatsiaris},
  journal={Frontiers in Robotics and AI},
Analysts and journalists must deal with very large, heterogeneous, and multilingual data volumes that need to be analyzed, understood, and aggregated. An automated and simplified editorial and authoring process could significantly reduce time, labor, and costs. Therefore, there is a need for unified access to multilingual and multicultural news story material, beyond the level of a nation, ensuring context-aware, spatiotemporal, and semantic interpretation, correlating…

A Relational Aggregated Disjoint Multimedia Search Results Approach using Semantics

The relational aggregation approach assembles the disjoint multimedia vertical snippets, fully blends them, and re-ranks them into a single linear list based on their semantic similarity to the user's query, retaining the inter-relationships between multimedia contents and enhancing multimedia search.



Diamonds in the rough: Social media visual analytics for journalistic inquiry

This work presents a visual analytic tool, Vox Civitas, designed to help journalists and media professionals extract news value from large-scale aggregations of social media content around broadcast events.

A Survey on Visual Analytics of Social Media Data

This comprehensive survey characterizes this fast-growing area, summarizes the state-of-the-art techniques for analyzing social media data, and classifies existing techniques into two categories: gathering information and understanding user behaviors.

Event analysis in social multimedia: a survey

This comprehensive survey covers event-based analysis over social multimedia data, including event enrichment, detection, and categorization; it introduces each paradigm and summarizes related research efforts.

Hashtagger+: Efficient High-Coverage Social Tagging of Streaming News

This work proposes Hashtagger+, an efficient learning-to-rank framework for merging news and social streams in real-time, by recommending Twitter hashtags to news articles, and improves the efficiency and coverage of a state-of-the-art hashtag recommendation model.

A Hybrid Framework for News Clustering Based on the DBSCAN-Martingale and LDA

This work proposes a novel density-based news clustering framework in which news articles are assigned to topics by the well-established Latent Dirichlet Allocation, while the number of clusters is estimated by the novel DBSCAN-Martingale, which filters noise out of the dataset and progressively extracts clusters from an OPTICS reachability plot.

Influence-based Twitter browsing with NavigTweet

University of Surrey Participation in TREC8: Weirdness Indexing for Logical Document Extrapolation and Retrieval (WILDER)

This paper describes the development of a prototype document retrieval system based on frequency calculations and corpora comparison techniques, and uses term identification and extraction techniques for identifying topics discussed in a given text.
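The corpus-comparison idea mentioned above can be illustrated with a relative frequency ratio: a term that is much more frequent in a specialist text than in a general reference corpus is a candidate topic term. The following is a minimal sketch under stated assumptions, not the system described in the paper; the function names and the smoothing constant are hypothetical choices made for illustration.

```python
from collections import Counter


def relative_frequency_ratio(term, specialist_tokens, general_tokens, smoothing=1.0):
    """Ratio of a term's relative frequency in a specialist text to its
    relative frequency in a general reference corpus."""
    specialist_counts = Counter(specialist_tokens)
    general_counts = Counter(general_tokens)
    specialist_rel = specialist_counts[term] / len(specialist_tokens)
    # Smoothing (an assumption of this sketch) avoids division by zero
    # for terms that never occur in the general corpus.
    general_rel = (general_counts[term] + smoothing) / (len(general_tokens) + smoothing)
    return specialist_rel / general_rel


def rank_terms(specialist_tokens, general_tokens):
    """Return the distinct specialist terms, highest ratio first."""
    terms = set(specialist_tokens)
    return sorted(
        terms,
        key=lambda t: relative_frequency_ratio(t, specialist_tokens, general_tokens),
        reverse=True,
    )


specialist = "the retrieval system ranks the retrieval results".split()
general = "the cat sat on the mat near the door".split()
ranked = rank_terms(specialist, general)
```

In this toy input, "retrieval" scores well above 1 because it is absent from the general text, while "the" scores below 1 because it is common in both.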

News Articles Classification Using Random Forests and Weighted Multimodal Features

The main contribution of this work is the introduction of a news article classification framework based on Random Forests and multimodal features (textual and visual), as well as a late fusion strategy that makes use of Random Forests' operational capabilities.
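Late fusion in general means that each modality is classified separately and the per-modality outputs are combined afterwards. The sketch below shows one generic form, a weighted average of class probabilities. It is an illustration under assumptions, not the paper's exact strategy: the probabilities and the modality weights are hypothetical, and how the weights are derived from the Random Forests is not specified here.

```python
def late_fusion(probs_by_modality, weights):
    """Weighted average of per-modality class probability dictionaries."""
    total_weight = sum(weights[m] for m in probs_by_modality)
    classes = set()
    for probs in probs_by_modality.values():
        classes.update(probs)
    return {
        c: sum(weights[m] * probs_by_modality[m].get(c, 0.0) for m in probs_by_modality)
        / total_weight
        for c in classes
    }


def predict(probs_by_modality, weights):
    """Pick the class with the highest fused probability."""
    fused = late_fusion(probs_by_modality, weights)
    return max(fused, key=fused.get)


# Hypothetical outputs of a textual and a visual classifier for one article.
probs = {
    "textual": {"sports": 0.6, "politics": 0.4},
    "visual": {"sports": 0.3, "politics": 0.7},
}
# Hypothetical weights; trusting the textual model more flips the decision.
label_text_heavy = predict(probs, {"textual": 0.7, "visual": 0.3})
label_equal = predict(probs, {"textual": 0.5, "visual": 0.5})
```

With the text-heavy weights the fused prediction is "sports"; with equal weights it is "politics", which shows how much the choice of weights matters.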

Support vector machines and Word2vec for text classification with semantic features

This work demonstrates the effectiveness of word2vec by showing that tf-idf and word2vec combined can outperform tf-idf alone, because word2vec provides complementary features (e.g., semantics that tf-idf cannot capture).
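One simple way to combine the two representations is to concatenate a document's tf-idf vector with the mean of its word embeddings and pass the result to a classifier such as an SVM. The sketch below assumes that combination; the paper may combine the features differently. The tiny embedding table is a hypothetical stand-in for a trained word2vec model, and the tf-idf variant (term frequency times log(N/df) + 1) is one of several common definitions.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocab, vectors) with tf * (log(N / df) + 1) weights."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        counts = Counter(d)
        vectors.append([counts[t] / len(d) * idf[t] for t in vocab])
    return vocab, vectors


def mean_embedding(tokens, embeddings, dim):
    """Average the embeddings of known tokens; zeros if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def combined_features(docs, embeddings, dim):
    """Concatenate each document's tf-idf vector with its mean embedding."""
    _, tfidf = tfidf_vectors(docs)
    return [tv + mean_embedding(d, embeddings, dim) for tv, d in zip(tfidf, docs)]


# Hypothetical 2-dimensional embeddings standing in for word2vec output.
toy_embeddings = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "film": [0.0, 1.0]}
docs = [["good", "film"], ["bad", "film"]]
features = combined_features(docs, toy_embeddings, dim=2)
```

Each feature vector has one tf-idf entry per vocabulary term followed by the embedding dimensions, so the classifier sees both exact-term evidence and the dense semantic signal.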