Bridging the Ultimate Semantic Gap: A Semantic Search Engine for Internet Videos

@inproceedings{Jiang2015BridgingTU,
  title={Bridging the Ultimate Semantic Gap: A Semantic Search Engine for Internet Videos},
  author={Lu Jiang and Shoou-I Yu and Deyu Meng and Teruko Mitamura and Alexander Hauptmann},
  booktitle={Proceedings of the 5th ACM on International Conference on Multimedia Retrieval},
  year={2015}
}
Semantic search in video is a novel and challenging problem in information and multimedia retrieval. Existing solutions are mainly limited to text matching, in which the query words are matched against the textual metadata generated by users. This paper presents a state-of-the-art system for event search without any textual metadata or example videos. The system relies on substantial video content understanding and allows for semantic search over a large collection of videos. The novelty and…
Citations

Text-to-video: a semantic search engine for internet videos
This paper presents a state-of-the-art system for event search without any user-generated metadata or example videos, known as text-to-video search, which relies on substantial video content understanding and allows for searching complex events over a large collection of videos.
Web-scale Multimedia Search for Internet Video Content
E-Lamp Lite is the first content-based semantic search system capable of indexing and searching a collection of 100 million videos, and is believed to be the first large-scale semantic search engine of its kind for Internet videos.

Fast and Accurate Content-based Semantic Search in 100M Internet Videos
This paper proposes a scalable solution to large-scale content-based semantic search in video that represents a video by a few salient and consistent concepts, which can be efficiently indexed by a modified inverted index.

Strategies for Searching Video Content with Text Queries or Video Examples
The proposed strategies were incorporated into the submission for the TRECVID 2014 Multimedia Event Detection evaluation, where the system outperformed other submissions on both text queries and video example queries, demonstrating the effectiveness of the proposed approaches.

Video Search via Ranking Network with Very Few Query Exemplars
Experimental results show the effectiveness of the proposed triplet ranking network-based method on video retrieval with only a handful of positive exemplars.

Semantic Based Video Retrieval System: Survey
A general discussion of the overall process of the semantic video retrieval phases, and a generic review of techniques that have been proposed to bridge the semantic gap, the major scientific problem in semantic-based video retrieval.

Semantic-Based Video Retrieval Survey
The different approaches to video retrieval are clearly and briefly categorized, and the methods that attempt to bridge the semantic gap are discussed in more detail.

Interpretable Embedding for Ad-Hoc Video Search
This paper empirically demonstrates that, by using either the embedding features or concepts, considerable search improvement is attainable on TRECVid benchmark datasets.

Extracting semantic knowledge from web context for multimedia IR: a taxonomy, survey and challenges
A data-driven taxonomy is introduced and used in a literature review of the most emblematic and important approaches that use context-based data for multimedia information retrieval on the Web, and important challenges and opportunities are identified.

Semantic Reasoning in Zero Example Video Event Retrieval
The Semantic Event Retrieval System is presented, which shows the importance of high-level concepts in a vocabulary for the retrieval of complex and generic high-level events, and uses a novel concept selection method (i-w2v) based on semantic embeddings.

References

Showing 1-10 of 53 references
When textual and visual information join forces for multimedia retrieval
This paper proposes and evaluates a video search framework that uses visual information to enrich classic text-based video retrieval, and attempts to overcome the so-called semantic gap by automatically mapping query text to semantic concepts.

Zero-shot video retrieval using content and concepts
This work introduces a new method for automatically identifying relevant concepts for a text query using the Markov Random Field (MRF) retrieval framework, and finds that concept-based retrieval significantly outperforms text-based approaches in recall.

Incremental Multimodal Query Construction for Video Search
An interactive system built on a state-of-the-art content-based video search pipeline is presented, which enables users to perform multimodal text-to-video and video-to-video search in large video collections, and to incrementally refine queries through relevance feedback and model visualization.

Can High-Level Concepts Fill the Semantic Gap in Video Retrieval? A Case Study With Broadcast News
It is concluded that concept-based video retrieval with fewer than 5000 concepts, detected with a minimal accuracy of 10% mean average precision, is likely to provide high-accuracy results in broadcast news retrieval.

n-gram Models for Video Semantic Indexing
The proposed n-gram modeling of shot sequences for video semantic indexing, in which semantic concepts are extracted from each video shot, improves robustness against occlusion and camera-angle changes by effectively using information from previous video shots.

Multimodal knowledge-based analysis in multimedia event detection
This work proposes a novel Adaptive Semantic Similarity (ASS) measure of textual similarity between ASR transcripts of videos, and incorporates acoustic concept indexing and classification to retrieve test videos, especially those with few spoken words.

Few-Example Video Event Retrieval using Tag Propagation
This paper proposes a tag-based video event retrieval system which propagates tags from a tagged video source to an unlabeled video collection without the need for any training examples, based on weighted-frequency neighbor voting using concept vector similarity.

Recommendations for video event recognition using concept vocabularies
The recommendation is that, for effective event recognition, the concept vocabulary should contain more than 200 concepts and be diverse, covering object, action, scene, people, animal, and attribute concepts.

Zero-Example Event Search using MultiModal Pseudo Relevance Feedback
The proposed MMPRF takes advantage of multiple modalities and multiple ranked lists to enhance event search performance in a principled way, and leverages not only semantic features but also non-semantic low-level features for event search in the absence of training data.

E-LAMP: integration of innovative ideas for multimedia event detection
The core methods and technologies of the recently developed Event Labeling through Analytic Media Processing (E-LAMP) framework are introduced, and a novel algorithm is developed to learn a more robust and discriminative intermediate feature representation from multiple features so that better event models can be built upon it.