Publications
Convolutional Neural Networks for Sentence Classification
  • Yoon Kim
  • EMNLP
  • 25 August 2014
TLDR
The CNN models discussed herein, which allow for the use of both task-specific and static word vectors, improve upon the state of the art on 4 out of 7 tasks, including sentiment analysis and question classification.
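A minimal sketch of the multichannel idea in PyTorch, assuming a frozen "static" embedding table alongside a trainable task-specific one; the hyperparameters, class names, and random initialization are illustrative (the paper initializes both channels from pretrained word2vec):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultichannelTextCNN(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, n_filters=100,
                     kernel_sizes=(3, 4, 5), n_classes=2):
            super().__init__()
            self.static_emb = nn.Embedding(vocab_size, emb_dim)
            self.static_emb.weight.requires_grad = False          # "static" channel stays fixed
            self.tuned_emb = nn.Embedding(vocab_size, emb_dim)    # task-specific channel
            self.convs = nn.ModuleList(
                [nn.Conv2d(2, n_filters, (k, emb_dim)) for k in kernel_sizes])
            self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

        def forward(self, tokens):                                # tokens: (batch, seq_len)
            x = torch.stack([self.static_emb(tokens),
                             self.tuned_emb(tokens)], dim=1)      # (batch, 2, seq, emb)
            feats = []
            for conv in self.convs:
                h = F.relu(conv(x)).squeeze(3)                    # (batch, filters, seq-k+1)
                feats.append(F.max_pool1d(h, h.size(2)).squeeze(2))  # max-over-time pooling
            return self.fc(torch.cat(feats, dim=1))               # class logits

    logits = MultichannelTextCNN(vocab_size=5000)(torch.randint(0, 5000, (4, 20)))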
Character-Aware Neural Language Models
TLDR
A simple neural language model that relies only on character-level inputs is shown to encode both semantic and orthographic information from characters alone, suggesting that for many languages character inputs are sufficient for language modeling.
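A rough sketch of the word-from-characters pipeline (character CNN, a highway layer, then a word-level LSTM); the sizes, names, and single convolution width are illustrative assumptions rather than the paper's configuration:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CharAwareLM(nn.Module):
        def __init__(self, n_chars=100, n_words=10000, char_dim=16,
                     n_filters=64, kernel=5, hidden=256):
            super().__init__()
            self.char_emb = nn.Embedding(n_chars, char_dim)
            self.char_conv = nn.Conv1d(char_dim, n_filters, kernel, padding=kernel // 2)
            # highway layer: gated mix of a nonlinear transform and the input
            self.transform = nn.Linear(n_filters, n_filters)
            self.gate = nn.Linear(n_filters, n_filters)
            self.lstm = nn.LSTM(n_filters, hidden, batch_first=True)
            self.out = nn.Linear(hidden, n_words)          # predictions are over *words*

        def forward(self, chars):                # chars: (batch, n_words, chars_per_word)
            b, t, c = chars.shape
            x = self.char_emb(chars.view(b * t, c)).transpose(1, 2)   # (b*t, char_dim, c)
            x = F.relu(self.char_conv(x)).max(dim=2).values           # max over characters
            g = torch.sigmoid(self.gate(x))
            x = g * F.relu(self.transform(x)) + (1 - g) * x           # highway layer
            h, _ = self.lstm(x.view(b, t, -1))                        # word-level LSTM
            return self.out(h)                                        # (batch, t, vocab) logits

    logits = CharAwareLM()(torch.randint(0, 100, (2, 8, 12)))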
Adversarially Regularized Autoencoders
TLDR
This work proposes a flexible method for training deep latent variable models of discrete structures based on the recently proposed Wasserstein autoencoder (WAE), and shows that the latent representation can be trained to perform unaligned textual style transfer, giving improvements in both automatic and human evaluation over existing methods.
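A skeletal sketch of the adversarial-regularization piece only, assuming a GRU encoder that maps a sentence to a single code and a WGAN-style generator/critic pair over the code space; the decoder, reconstruction loss, and training loop are omitted, and all names and sizes are illustrative:

    import torch
    import torch.nn as nn

    emb_dim, hidden, latent, vocab = 64, 128, 32, 1000

    class Encoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.emb = nn.Embedding(vocab, emb_dim)
            self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
            self.to_z = nn.Linear(hidden, latent)
        def forward(self, tokens):
            _, h = self.rnn(self.emb(tokens))
            return self.to_z(h.squeeze(0))              # one latent code per sentence

    generator = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                              nn.Linear(hidden, latent))    # noise -> fake code
    critic = nn.Sequential(nn.Linear(latent, hidden), nn.ReLU(),
                           nn.Linear(hidden, 1))            # scores a code

    tokens = torch.randint(0, vocab, (8, 15))
    real_code = Encoder()(tokens)
    fake_code = generator(torch.randn(8, latent))
    # Critic objective: distinguish encoder codes from generated codes; the
    # encoder and generator are trained with the opposite sign so the two
    # code distributions are pushed together.
    critic_loss = critic(fake_code).mean() - critic(real_code).mean()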
Sequence-Level Knowledge Distillation
TLDR
It is demonstrated that standard knowledge distillation applied to word-level prediction can be effective for NMT, and two novel sequence-level versions of knowledge distillation are introduced that further improve performance and, somewhat surprisingly, seem to eliminate the need for beam search.
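A toy contrast between the two losses, with random tensors standing in for the teacher's and student's per-step outputs (greedy decoding stands in for the beam search used in the paper):

    import torch
    import torch.nn.functional as F

    vocab, batch, steps = 100, 4, 7
    teacher_logits = torch.randn(batch, steps, vocab)                    # trained teacher
    student_logits = torch.randn(batch, steps, vocab, requires_grad=True)  # student

    # Word-level KD: match the student's per-token distribution to the teacher's.
    word_kd = F.kl_div(F.log_softmax(student_logits, dim=-1),
                       F.softmax(teacher_logits, dim=-1),
                       reduction="batchmean")

    # Sequence-level KD: decode the teacher and train the student with ordinary
    # cross-entropy on that output, as if it were the reference translation.
    pseudo_targets = teacher_logits.argmax(dim=-1)                       # (batch, steps)
    seq_kd = F.cross_entropy(student_logits.view(-1, vocab),
                             pseudo_targets.view(-1))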
Temporal Analysis of Language through Neural Language Models
TLDR
A method for automatically detecting change in language across time through a chronologically trained neural language model, which identifies words such as "cell" and "gay" as having changed meaning over the period studied.
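A rough sketch of the general recipe for measuring drift, assuming word vectors are available for two time slices; the random vectors, the tiny vocabulary, and the orthogonal-Procrustes alignment step are stand-ins for the chronologically trained model's representations:

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = ["cell", "gay", "bread", "computer"]       # shared vocabulary
    emb_1900 = rng.normal(size=(len(vocab), 50))       # stand-ins for learned vectors
    emb_2000 = rng.normal(size=(len(vocab), 50))

    # Align the later space onto the earlier one (orthogonal Procrustes) so the
    # two sets of vectors are directly comparable.
    u, _, vt = np.linalg.svd(emb_2000.T @ emb_1900)
    aligned_2000 = emb_2000 @ (u @ vt)

    def cosine_distance(a, b):
        return 1.0 - a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    drift = {w: cosine_distance(emb_1900[i], aligned_2000[i])
             for i, w in enumerate(vocab)}
    print(sorted(drift, key=drift.get, reverse=True))  # most-changed words first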
Compound Probabilistic Context-Free Grammars for Grammar Induction
TLDR
A formalization of the grammar induction problem that models sentences as being generated by a compound probabilistic context-free grammar whose rule probabilities are modulated by a per-sentence continuous latent variable, inducing marginal dependencies beyond the traditional context-free assumptions.
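A minimal sketch of the "compound" idea with a small hypothetical grammar: rule probabilities are not global constants but the output of a network that conditions on a per-sentence latent vector z (the scoring network, the sizes, and the omission of parsing and inference are all simplifications):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    n_nt, n_pt, z_dim, emb = 10, 20, 16, 32          # nonterminals, preterminals
    nt_emb = nn.Embedding(n_nt, emb)
    score = nn.Sequential(nn.Linear(emb + z_dim, 64), nn.ReLU(),
                          nn.Linear(64, (n_nt + n_pt) ** 2))  # A -> B C expansions

    z = torch.randn(1, z_dim)                        # one latent draw per sentence
    parents = nt_emb(torch.arange(n_nt))             # (n_nt, emb)
    inputs = torch.cat([parents, z.expand(n_nt, -1)], dim=-1)
    rule_logprob = F.log_softmax(score(inputs), dim=-1)   # (n_nt, (n_nt+n_pt)^2)
    # rule_logprob[A] is a distribution over right-hand sides B C conditioned
    # on z; marginalizing over z yields the compound PCFG.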
Semi-Amortized Variational Autoencoders
TLDR
This work proposes a hybrid approach that uses amortized variational inference (AVI) to initialize the variational parameters and then runs stochastic variational inference (SVI) to refine them, enabling the use of rich generative models without the posterior-collapse phenomenon common when training VAEs for problems like text generation.
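A toy continuous-data sketch of the semi-amortized step, assuming a small Gaussian encoder/decoder pair: the encoder only initializes the variational parameters, which are then refined by a few gradient steps of SVI on the ELBO (backpropagating through the refinement to train the encoder, as the paper does, is omitted):

    import torch
    import torch.nn as nn

    x_dim, z_dim, svi_steps, lr = 20, 8, 5, 0.1
    encoder = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(), nn.Linear(64, 2 * z_dim))
    decoder = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, x_dim))

    def neg_elbo(x, mu, logvar):
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()        # reparameterize
        recon = ((decoder(z) - x) ** 2).sum(-1)                     # Gaussian reconstruction
        kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(-1)   # KL(q || N(0, I))
        return (recon + kl).mean()

    x = torch.randn(16, x_dim)
    mu, logvar = encoder(x).chunk(2, dim=-1)                 # amortized initialization
    mu = mu.detach().requires_grad_()
    logvar = logvar.detach().requires_grad_()
    for _ in range(svi_steps):                               # instance-specific SVI refinement
        loss = neg_elbo(x, mu, logvar)
        g_mu, g_logvar = torch.autograd.grad(loss, (mu, logvar))
        mu = (mu - lr * g_mu).detach().requires_grad_()
        logvar = (logvar - lr * g_logvar).detach().requires_grad_()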
Unsupervised Recurrent Neural Network Grammars
TLDR
An inference network parameterized as a neural CRF constituency parser is developed to maximize the evidence lower bound, applying amortized variational inference to unsupervised learning of recurrent neural network grammars (RNNGs).
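A sketch of one piece of the inference-network side only: an unlabeled binary TreeCRF over spans whose log-partition function is computed with the inside algorithm, with random span scores standing in for a neural parser's output; the RNNG generative model and the ELBO itself are omitted:

    import torch

    n = 6                                      # sentence length
    span_score = torch.randn(n + 1, n + 1)     # score[i][j] for the span covering words i..j-1

    chart = torch.full((n + 1, n + 1), float("-inf"))
    for i in range(n):                         # single-word spans
        chart[i][i + 1] = span_score[i][i + 1]
    for width in range(2, n + 1):              # longer spans, smallest first
        for i in range(0, n - width + 1):
            j = i + width
            splits = torch.stack([chart[i][k] + chart[k][j] for k in range(i + 1, j)])
            chart[i][j] = span_score[i][j] + torch.logsumexp(splits, dim=0)

    log_partition = chart[0][n]                # log-sum over all binary trees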
Structured Attention Networks
TLDR
This work shows that structured attention networks are simple extensions of the basic attention procedure that take attention beyond the standard soft-selection approach, for example attending to partial segmentations or to subtrees.
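A small sketch of the "segmentation" variant, assuming binary select/skip variables coupled in a linear chain: the attention weights are exact CRF marginals from forward-backward rather than a softmax, and the random scores stand in for query-key similarities:

    import torch

    T = 5
    unary = torch.randn(T, 2)        # unary[i][s]: score of position i in state s (0=skip, 1=select)
    trans = torch.randn(2, 2)        # trans[s][s']: score of moving from state s to s'

    alpha = torch.zeros(T, 2)
    beta = torch.zeros(T, 2)
    alpha[0] = unary[0]
    for t in range(1, T):            # forward pass
        alpha[t] = unary[t] + torch.logsumexp(alpha[t - 1].unsqueeze(1) + trans, dim=0)
    for t in range(T - 2, -1, -1):   # backward pass
        beta[t] = torch.logsumexp(trans + (unary[t + 1] + beta[t + 1]).unsqueeze(0), dim=1)

    log_z = torch.logsumexp(alpha[-1], dim=0)
    marginals = (alpha + beta - log_z).exp()   # p(z_t = s); each row sums to 1
    attention = marginals[:, 1]                # weight on "select" at each position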
Avoiding Latent Variable Collapse With Generative Skip Models
TLDR
It is shown that, compared to existing VAE architectures, generative skip models maintain similar predictive performance while exhibiting less latent-variable collapse and providing more meaningful representations of the data.
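A minimal sketch of the skip idea, assuming an MLP decoder: the latent z re-enters every layer through concatenation, so the decoder cannot easily ignore it (names and sizes are illustrative):

    import torch
    import torch.nn as nn

    z_dim, h_dim, x_dim = 16, 64, 100

    class SkipDecoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.layer1 = nn.Linear(z_dim, h_dim)
            self.layer2 = nn.Linear(h_dim + z_dim, h_dim)   # skip: z re-enters here
            self.out = nn.Linear(h_dim + z_dim, x_dim)      # ...and here
        def forward(self, z):
            h = torch.relu(self.layer1(z))
            h = torch.relu(self.layer2(torch.cat([h, z], dim=-1)))
            return self.out(torch.cat([h, z], dim=-1))      # logits for x

    x_logits = SkipDecoder()(torch.randn(8, z_dim))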
...