A Multimodal Analytics Platform for Journalists Analyzing Large-Scale, Heterogeneous Multilingual, and Multimedia Content

@article{Vrochidis2018AMA,
  title={A Multimodal Analytics Platform for Journalists Analyzing Large-Scale, Heterogeneous Multilingual, and Multimedia Content},
  author={Stefanos Vrochidis and Anastasia Moumtzidou and Ilias Gialampoukidis and Dimitris Liparas and Gerard Casamayor and Leo Wanner and Nicolaus Heise and Tilman Wagner and Andriy Bilous and Emmanuel Jamin and Boyan Simeonov and Vladimir Alexiev and Reinhard Busch and Ioannis Arapakis and Yiannis Kompatsiaris},
  journal={Frontiers in Robotics and AI},
  year={2018},
  volume={5}
}
Analysts and journalists must deal with very large, heterogeneous, and multilingual data volumes that need to be analyzed, understood, and aggregated. An automated and simplified editorial and authoring process could significantly reduce time, labor, and costs. Therefore, there is a need for unified access to multilingual and multicultural news story material, beyond the level of a nation, ensuring context-aware, spatiotemporal, and semantic interpretation, correlating…

A Relational Aggregated Disjoint Multimedia Search Results Approach using Semantics

TLDR
The relational aggregation approach assembles the disjoint multimedia vertical snippets, fully blends them, and re-ranks them into a linearly ranked list based on their semantic similarity to the user’s query, retaining the inter-relationships between multimedia contents and enhancing multimedia search.

References

Diamonds in the rough: Social media visual analytics for journalistic inquiry

TLDR
This work presents a visual analytic tool, Vox Civitas, designed to help journalists and media professionals extract news value from large-scale aggregations of social media content around broadcast events.

A Survey on Visual Analytics of Social Media Data

TLDR
This comprehensive survey characterizes this fast-growing area, summarizes the state-of-the-art techniques for analyzing social media data, and classifies existing techniques into two categories: gathering information and understanding user behaviors.

Event analysis in social multimedia: a survey

TLDR
This comprehensive survey covers event-based analysis of social multimedia data, including event enrichment, detection, and categorization; it introduces each paradigm and summarizes related research efforts.

Hashtagger+: Efficient High-Coverage Social Tagging of Streaming News

TLDR
This work proposes Hashtagger+, an efficient learning-to-rank framework for merging news and social streams in real-time, by recommending Twitter hashtags to news articles, and improves the efficiency and coverage of a state-of-the-art hashtag recommendation model.

A Hybrid Framework for News Clustering Based on the DBSCAN-Martingale and LDA

TLDR
A novel density-based news clustering framework, in which the assignment of news articles to topics is done by the well-established Latent Dirichlet Allocation, but the estimation of the number of clusters is performed by the novel DBSCAN-Martingale, which allows for extracting noise from the dataset and progressively extracts clusters from an OPTICS reachability plot.

Influence-based Twitter browsing with NavigTweet

University of Surrey Participation in TREC8: Weirdness Indexing for Logical Document Extrapolation and Retrieval (WILDER)

TLDR
This paper describes the development of a prototype document retrieval system based on frequency calculations and corpora comparison techniques, and uses term identification and extraction techniques for identifying topics discussed in a given text.

Support vector machines and Word2vec for text classification with semantic features

TLDR
This work demonstrates the effectiveness of word2vec by showing that tf-idf and word2vec combined can outperform tf-idf alone, because word2vec provides features complementary to tf-idf (e.g. semantics that tf-idf can't capture).

An Efficient Method for Document Categorization Based on Word2vec and Latent Semantic Analysis

  • Ronghui Ju, Pan Zhou, C. Li, Lijun Liu
  • Computer Science
    2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing
  • 2015
TLDR
This is the first attempt to combine word2vec with LSA for document categorization; the method maps documents to a vector space while fully preserving their contents, and the results show an accuracy improvement of about 15% over traditional methods.