MSR-VTT: A Large Video Description Dataset for Bridging Video and Language
TLDR
We present MSR-VTT (standing for "MSR-Video to Text"), a new large-scale video benchmark for video understanding, especially the emerging task of translating video to text.
  • 416
  • 135
Image Retrieval: Current Techniques, Promising Directions, and Open Issues
TLDR
This paper provides a comprehensive survey of the technical achievements in the research area of image retrieval, especially content-based image retrieval, an area that has been so active and prosperous in the past few years.
  • 2,474
  • 92
Content-based image retrieval with relevance feedback in MARS
TLDR
We propose an integration approach for content-based retrieval of image and multimedia data using the term weighting and relevance feedback techniques developed in information retrieval (IR).
  • 891
  • 62
GeoMF: joint geographical modeling and matrix factorization for point-of-interest recommendation
TLDR
We first propose to exploit weighted matrix factorization for POI recommendation since it usually serves collaborative filtering with implicit feedback better.
  • 401
  • 53
Jointly Modeling Embedding and Translation to Bridge Video and Language
TLDR
This paper presents a novel unified framework, named Long Short-Term Memory with visual-semantic Embedding (LSTM-E), which can simultaneously explore the learning of LSTM and visual-semantic embedding.
  • 359
  • 48
Optimizing learning in image retrieval
  • Y. Rui, T. Huang
  • Computer Science
  • Proceedings IEEE Conference on Computer Vision…
  • 2000
TLDR
We present a vigorous optimization formulation of the learning process and solve the problem in a principled way.
  • 416
  • 46
Adaptive key frame extraction using unsupervised clustering
TLDR
We present an approach for key frame extraction based on unsupervised clustering which is both efficient and effective.
  • 587
  • 25
Constructing table-of-content for videos
TLDR
In this paper, we present an effective semantic-level ToC construction technique based on intelligent unsupervised clustering.
  • 300
  • 24
Highlight Detection with Pairwise Deep Ranking for First-Person Video Summarization
  • Ting Yao, T. Mei, Y. Rui
  • Computer Science
  • IEEE Conference on Computer Vision and Pattern…
  • 1 June 2016
TLDR
This paper studies the discovery of moments of a user's major or special interest (i.e., highlights) in a video, for generating summaries of first-person videos.
  • 158
  • 22
Better proposal distributions: object tracking using unscented particle filter
  • Y. Rui, Yunqiang Chen
  • Computer Science
  • Proceedings of the IEEE Computer Society…
  • 8 December 2001
TLDR
We introduce the unscented Kalman filter to solve non-linear non-Gaussian system problems.
  • 342
  • 20