Corpus ID: 212694843

TRECVID 2019: An evaluation campaign to benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & retrieval

@inproceedings{awad2019trecvid,
  title={TRECVID 2019: An evaluation campaign to benchmark Video Activity Detection, Video Captioning and Matching, and Video Search \& retrieval},
  author={G. Awad and A. Butt and Keith Curtis and Yooyoung Lee and J. Fiscus and A. Godil and Andrew Delgado and Jesse Zhang and Eliot Godard and Lukas L. Diduch and A. Smeaton and Yvette Graham and Wessel Kraaij and G. Qu{\'e}not},
  year={2019}
}
The TREC Video Retrieval Evaluation (TRECVID) 2019 was a TREC-style video analysis and retrieval evaluation whose goal remains to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last nineteen years this effort has yielded a better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. TRECVID has …
VIREO-EURECOM @ TRECVID 2019: Ad-hoc Video Search (AVS)
The systems developed for the Ad-hoc Video Search (AVS) task at TRECVID 2019 and the results achieved are described, and the advantages and shortcomings of these video search approaches are analyzed.
RUC_AIM3 at TRECVID 2019: Video to Text
This paper proposes a late-fusion strategy to ensemble different models to improve system generalization, and generates video representations with rich semantic information by fusing multi-modal features for both sub-tasks of the TRECVID 2019 Video to Text challenge.
IMFD IMPRESEE at TRECVID 2019: Ad-Hoc Video Search and Video To Text
A deep learning model based on Word2VisualVec++ is developed, extracting temporal information from the video using Dense Trajectories and a clustering approach to encode it into a single vector representation.
What Matters for Ad-hoc Video Search? A Large-scale Evaluation on TRECVID
  • Aozhu Chen, Fan Hu, Zihan Wang, Fangming Zhou, Xirong Li
  • Computer Science
  • ArXiv
  • 2021
For quantifying progress in Ad-hoc Video Search (AVS), the annual TRECVID AVS task is an important international evaluation. Solutions submitted by the task participants vary in terms of their …
Is the Reign of Interactive Search Eternal? Findings from the Video Browser Showdown 2020
Comprehensive and fair performance evaluation of information retrieval systems represents an essential task for the current information age. Whereas Cranfield-based evaluations with benchmark …
Renmin University of China and Zhejiang Gongshang University at TRECVID 2019: Learn to Search and Describe Videos
The 2019 edition of the TRECVID benchmark was a fruitful participation for the joint team; solutions based on two deep learning models, i.e. the W2VV++ network and the Dual Encoding Network, are developed.
VireoJD-MM @ TRECVid 2019: Activities in Extended Video (ActEV)
This paper describes the system developed for the Activities in Extended Video (ActEV) task at TRECVid 2019 and the results achieved, and extends the system in two respects: better object detection and advanced two-stream action classification.
Hybrid Sequence Encoder for Text Based Video Retrieval
This report presents a hybrid sequence encoder that makes use not only of multi-modal sources but also of feature extractors such as GRUs, aggregated vectors, and graph modeling for the AVS task.
NTT_CQUPT@TRECVID2019 ActEV: Activities in Extended Video
The system for activity detection in extended videos (ActEV) in TRECVID 2019 [4] is composed of five modules: object detection, activity proposal generation, feature extraction, classification, and post-processing.
Waseda_Meisei_SoftBank at TRECVID 2019: Ad-hoc Video Search
The Waseda Meisei SoftBank team participated in the TRECVID 2019 Ad-hoc Video Search (AVS) task and used two approaches for video retrieval from large-scale video data using query sentences: concept-based video retrieval for manually assisted runs and visual-semantic embedding for fully automatic runs.


TRECVid Semantic Indexing of Video: A 6-year Retrospective
The data, protocol and metrics used for the main and the secondary tasks, the results obtained and the main approaches used by participants are described.
Evaluation of automatic video captioning using direct assessment
It is shown how the direct assessment method is replicable and robust, and scales to settings where there are many caption-generation techniques to be evaluated, including the TRECVid video-to-text task in 2017.
Dual Encoding for Zero-Example Video Retrieval
This paper takes a concept-free approach, proposing a dual deep encoding network that encodes videos and queries into powerful dense representations of their own and establishes a new state-of-the-art for zero-example video retrieval.
A large-scale benchmark dataset for event recognition in surveillance video
We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms, with a focus on continuous visual event recognition (CVER) in outdoor …
TRECVID 2016: Evaluating Video Search, Video Event Detection, Localization, and Hyperlinking
George Awad, Jonathan Fiscus, David Joy, Martial Michel, Alan Smeaton, Wessel Kraaij, Maria Eskevich, Robin Aly, Roeland Ordelman, Marc Ritter, et al.
CIDEr: Consensus-based image description evaluation
A novel paradigm for evaluating image descriptions that uses human consensus is proposed, together with a new automated metric that captures human judgment of consensus better than existing metrics across sentences generated by various sources.
V3C - a Research Video Collection
This work states that existing video datasets used for research and experimentation are either not large enough to represent current collections or do not reflect the properties of video commonly found on the Internet in terms of content, length, or resolution.
Very Deep Convolutional Networks for Large-Scale Image Recognition
This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments
METEOR is described, an automatic metric for machine translation evaluation based on a generalized concept of unigram matching between the machine-produced translation and human-produced reference translations, which can be easily extended to include more advanced matching strategies.
A simple and efficient sampling method for estimating AP and NDCG
We consider the problem of large-scale retrieval evaluation. Recently two methods based on random sampling were proposed as a solution to the extensive effort required to judge tens of thousands of …