This paper describes our solution for the MSR Video to Language Challenge. We start from the popular ConvNet + LSTM model, which we extend with two novel modules. One is early embedding, which enriches the current low-level input to the LSTM with tag embeddings. The other is late reranking, which re-scores generated sentences in terms of their relevance to a…
In this paper we summarize our experiments in the ImageCLEF 2015 Scalable Concept Image Annotation challenge. The RUCTencent team participated in all subtasks: concept detection and localization, and image sentence generation. For concept detection, we experiment with automated approaches to gather high-quality training examples from the Web, in…
This paper summarizes our efforts in our first participation in the Violent Scene Detection subtask of the MediaEval 2015 Affective Impact of Movies Task. We build violent scene detectors using both audio and visual cues. In particular, the audio cue is represented by bag-of-audio-words with Fisher vector encoding. The visual cue is exploited by…
The current state of research on exoskeleton intelligence systems worldwide, and the associated problems, are summarized. Human lower-limb movement data were collected using the VICON 460 infrared motion capture system, the NOVEL pedar insole plantar pressure measurement system, and a CASIO high-speed camera. After the data were analyzed, the key…
This abstract paper sketches our research towards Structured Semantic Embedding of multimedia data. Though a tag may have multiple senses with completely different visual imagery, current semantic embedding methods represent the tag by a single vector regardless of its senses. We challenge this convention, arguing the importance of adding semantic…