Semantic matching is of central importance to many natural language tasks [2, 28]. A successful matching algorithm needs to adequately model the internal structures of language objects and the interaction between them. As a step toward this goal, we propose convolutional neural network models for matching two sentences, by adapting the convolutional(More)
Pairwise constraints specify whether or not two samples should be in the same cluster. Although they have been successfully incorporated into traditional clustering methods such as K-means, little progress has been made in combining them with spectral clustering. The major challenge in designing an effective constrained spectral clustering method is a sensible(More)
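The constrained-assignment idea behind this line of work can be sketched as a COP-KMeans-style step: when assigning a point, try centroids nearest-first and reject any assignment that would violate a cannot-link constraint with an already-labeled point. The function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def assign_with_constraints(x_idx, dists, labels, cannot_link):
    """Assign point x_idx to the nearest centroid that does not violate
    a cannot-link constraint with an already-labeled point.
    dists: distances from x_idx to each centroid.
    labels: current cluster label per point (-1 = unassigned).
    cannot_link: list of (i, j) pairs that must not share a cluster."""
    for c in np.argsort(dists):  # try centroids nearest-first
        ok = all(labels[j] != c for i, j in cannot_link if i == x_idx)
        if ok:
            return int(c)
    return -1  # no feasible cluster for this point
```

A must-link constraint would be handled symmetrically, by forcing the point into the cluster of its must-linked partner when one is already assigned.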
The attention mechanism advanced state-of-the-art neural machine translation (NMT) by jointly learning to align and translate. However, attention-based NMT ignores past alignment information, which often leads to over-translation and under-translation. In response to this problem, we maintain a coverage vector to keep track of the attention history. The coverage(More)
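The coverage idea described above can be sketched as follows: accumulate the attention weights assigned to each source position across decoding steps, and discount the raw attention scores of already-covered positions. The linear penalty with a fixed `beta` is a simplifying assumption for illustration; the paper learns how coverage interacts with attention.

```python
import numpy as np

def attend_with_coverage(scores, coverage, beta=1.0):
    """One decoding step: discount raw attention scores by how much each
    source position has already been attended to, then update coverage."""
    adjusted = scores - beta * coverage          # penalize covered positions
    weights = np.exp(adjusted - adjusted.max())  # numerically stable softmax
    weights /= weights.sum()
    coverage = coverage + weights                # accumulate attention history
    return weights, coverage
```

Over repeated steps with the same raw scores, attention mass shifts away from positions that have already been attended to, which is exactly the behavior meant to curb over-translation.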
We address an important problem in sequence-to-sequence (Seq2Seq) learning referred to as copying, in which certain segments in the input sequence are selectively replicated in the output sequence. A similar phenomenon is observable in human language communication. For example, humans tend to repeat entity names or even long phrases in conversation. The(More)
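The copying phenomenon described above is commonly modeled by mixing a generation distribution over the vocabulary with a copy distribution that routes attention mass to the tokens actually present in the source. The single scalar gate `p_copy` below is a simplification for illustration; the paper's model scores the two modes jointly rather than with one fixed gate.

```python
import numpy as np

def copy_mix(gen_probs, attn, src_ids, p_copy):
    """Blend a generation distribution over the vocabulary with a copy
    distribution induced by attention over the source tokens.
    gen_probs: generation distribution over the vocabulary (sums to 1).
    attn: attention weights over source positions (sums to 1).
    src_ids: vocabulary id of the token at each source position."""
    out = (1.0 - p_copy) * gen_probs.copy()
    for pos, tok in enumerate(src_ids):
        out[tok] += p_copy * attn[pos]  # add copy mass for the source token
    return out
```

Because both input distributions sum to one, the mixture remains a valid distribution, and tokens that appear in the source (e.g. entity names) receive extra probability mass.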
In this paper, we propose to employ the convolutional neural network (CNN) for learning to answer questions about an image. Our proposed CNN provides an end-to-end framework for learning not only the image representation and the composition model for the question, but also the inter-modal interaction between the image and the question, for the generation of the answer. More(More)
In this paper, we propose multimodal convolutional neural networks (m-CNNs) for matching images and sentences. Our m-CNN provides an end-to-end framework with convolutional architectures to exploit image representation, word composition, and the matching relations between the two modalities. More specifically, it consists of one image CNN encoding the image(More)
In graph-based learning models, entities are often represented as vertices in an undirected graph with weighted edges describing the relationships between entities. In many real-world applications, however, entities are often associated with relations of different types and/or from different sources, which can be well captured by multiple undirected graphs(More)