Learn More
Recently, the capability of character-level evaluation measures for machine translation output has been confirmed by several metrics. This work proposes translation edit rate on character level (CharacTER), which calculates the character level edit distance while performing the shift edit on word level. The novel metric shows high system-level correlation(More)
This paper addresses the problem of detecting coherent motions in crowd scenes and subsequently constructing semantic regions for activity recognition. We first introduce a coarse-to-fine thermal-diffusionbased approach. It processes input motion fields (e.g., optical flow fields) and produces a coherent motion filed, named as thermal energy field. The(More)
This paper addresses the problem of detecting coherent motions in crowd scenes and presents its two applications in crowd scene understanding: semantic region detection and recurrent activity mining. It processes input motion fields (e.g., optical flow fields) and produces a coherent motion field named thermal energy field. The thermal energy field is able(More)
In this paper, we propose a new approach which utilizes coherent motion regions to extract and visualize recurrent motion flows in crowded scene surveillance videos. The proposed approach first extract coherent motion regions from a crowded scene video. Then a frame-level clustering process is proposed to cluster frames into different(More)
Recently, the neural machine translation systems showed their promising performance and surpassed the phrase-based systems for most translation tasks. Retreating into conventional concepts machine translation while utilizing effective neural models is vital for comprehending the leap accomplished by neural machine translation over phrase-based methods. This(More)
Recurrent neural network (RNN), as a powerful contextual dependency modeling framework, has been widely applied to scene labeling problems. However, this work shows that directly applying traditional RNN architectures, which unfolds a 2D lattice grid into a sequence, is not sufficient to model structure dependencies in images due to the “impact vanishing”(More)
Recently, neural network models have achieved consistent improvements in statistical machine translation. However, most networks only use one-hot encoded input vectors of words as their input. In this work, we investigated the exponentially decaying bag-of-words input features for feed-forward neural network translation models and proposed to train the(More)
Recent advances in convolutional neural networks have shown promising results in 3D shape completion. But due to GPU memory limitations, these methods can only produce low-resolution outputs. To inpaint 3D models with semantic plausibility and contextual details, we introduce a hybrid framework that combines a 3D Encoder-Decoder Generative Adversarial(More)
Coherent motions, which represent coherent movements of massive individual particles, are pervasive in natural and social scenarios. Examples include traffic flows and parades of people (cf. Figs 1a and 2a). Since coherent motions can effectively decompose scenes into meaningful semantic parts and facilitate the analysis of complex crowd scenes, they are of(More)
Accurate road segmentation is a prerequisite for autonomous driving. Current state-of-the-art methods are mostly based on convolutional neural networks (CNNs). Nevertheless, their good performance is at expense of abundant annotated data and high computational cost. In this work, we address these two issues by a self-paced cross-modality transfer learning(More)
  • 1