Publications
ReZero is All You Need: Fast Convergence at Large Depth
TLDR
This work shows that the simplest architecture change, gating each residual connection with a single zero-initialized parameter, satisfies initial dynamical isometry and outperforms more complex approaches; applied to language modeling, the method easily trains 120-layer Transformers.
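As a rough sketch of the gating idea summarized above (not the authors' reference implementation), a feed-forward residual block with a zero-initialized scalar gate might look like the following in PyTorch; the ReZeroBlock name and the layer sizes are illustrative assumptions.

import torch
import torch.nn as nn

class ReZeroBlock(nn.Module):
    # Residual block whose transformation is gated by a single scalar
    # initialized to zero, so the block is the identity map at init.
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.ff = nn.Sequential(
            nn.Linear(dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, dim),
        )
        # One learnable gate per block, initialized to zero.
        self.alpha = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x + alpha * F(x): at initialization this returns x exactly,
        # so very deep stacks remain trainable from the first step.
        return x + self.alpha * self.ff(x)

Because each block starts as the identity, signal and gradient norms are preserved through arbitrarily deep stacks, which is what lets such models scale to 120-plus layers.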
Improving Neural Story Generation by Targeted Common Sense Grounding
TLDR
A simple multi-task learning scheme is proposed that achieves quantitatively better common sense reasoning in language models by leveraging auxiliary training signals from datasets designed to provide common sense grounding.
Generating Personalized Recipes from Historical User Preferences
TLDR
This work proposes a new task of personalized recipe generation to help users with culinary preferences: expanding a name and incomplete ingredient details into complete natural-text instructions aligned with the user’s historical preferences.
Representation Learning for Information Extraction from Form-like Documents
TLDR
An extraction system is proposed that uses knowledge of the types of the target fields to generate extraction candidates, along with a neural network architecture that learns a dense representation of each candidate based on neighboring words in the document.
Like Hiking? You Probably Enjoy Nature: Persona-grounded Dialog with Commonsense Expansions
TLDR
This paper proposes to expand available persona sentences using existing commonsense knowledge bases and paraphrasing resources, giving dialog models access to an expanded and richer set of persona descriptions; it also introduces fine-grained grounding on personas by encouraging the model to make a discrete choice among persona sentences while synthesizing a dialog response.
Interview: Large-scale Modeling of Media Dialog with Discourse Patterns and Knowledge Grounding
TLDR
This work presents a dialog model that leverages external knowledge as well as dialog acts via auxiliary losses, and demonstrates that the model quantitatively and qualitatively outperforms strong discourse-agnostic baselines for dialog modeling, generating more specific and topical responses in interview-style conversations.
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
TLDR
GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics, is introduced, and the data for the 2021 shared task at the associated GEM Workshop are described.
An efficient iterative double auction for energy trading in microgrids
TLDR
Simulation results indicate that the proposed iterative double auction can achieve social welfare maximization while requiring only a reasonable amount of computational overhead.
Differential evolution based score level fusion for multi-modal biometric systems
TLDR
Experimental results show that the proposed method outperforms conventional score-level fusion rules (sum, product, tanh, exponential) when tested on two databases covering four modalities, confirming the effectiveness of score-level fusion.
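To illustrate the general flavor of such a method (the paper's exact objective and normalization are not reproduced here), the sketch below tunes weighted-sum fusion weights over per-modality match scores with SciPy's differential_evolution; the synthetic data and the Fisher-style separation objective are illustrative assumptions, not the paper's setup.

import numpy as np
from scipy.optimize import differential_evolution

def fuse(weights, scores):
    # Weighted-sum fusion of normalized per-modality match scores.
    w = np.asarray(weights)
    w = w / (w.sum() + 1e-12)
    return scores @ w

def neg_separation(weights, genuine, impostor):
    # Negative Fisher-style separation between fused genuine and
    # impostor score distributions (the optimizer minimizes this).
    g = fuse(weights, genuine)
    i = fuse(weights, impostor)
    return -((g.mean() - i.mean()) ** 2) / (g.var() + i.var() + 1e-12)

# Synthetic normalized scores for two modalities (hypothetical data).
rng = np.random.default_rng(0)
genuine = rng.normal([0.8, 0.6], 0.1, size=(200, 2))
impostor = rng.normal([0.3, 0.4], 0.1, size=(200, 2))

result = differential_evolution(
    neg_separation,
    bounds=[(0.0, 1.0)] * 2,   # one weight per modality
    args=(genuine, impostor),
    seed=0,
)
weights = result.x / (result.x.sum() + 1e-12)
print("fusion weights:", weights)

Differential evolution is a natural fit here because the weights enter the objective non-differentiably once thresholds or error rates are involved, and the search space is small.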
Deep Recurrent Neural Networks for Product Attribute Extraction in eCommerce
TLDR
Significant coverage of important product facets or attributes is achieved, which not only shows the efficacy of deep recurrent models over previous machine learning benchmarks but also greatly enhances the overall customer experience while shopping online.