Toyota Technological Institute at Chicago, University of Chicago
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
- Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
- Computer Science, ICLR
- 26 September 2019
This work presents two parameter-reduction techniques to lower memory consumption and increase the training speed of BERT, and uses a self-supervised loss that focuses on modeling inter-sentence coherence.
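One of the two parameter-reduction techniques, factorized embedding parameterization, can be illustrated with a simple parameter count: instead of one large vocabulary-by-hidden embedding matrix, ALBERT uses a small embedding size with a projection up to the hidden size. A minimal sketch, with illustrative (roughly BERT-base-like) sizes rather than the paper's exact configurations:

```python
# Parameter count for the embedding table: standard vs factorized
# parameterization. Sizes are illustrative, not taken from the paper.
V = 30000  # vocabulary size
H = 768    # Transformer hidden size
E = 128    # low-rank embedding size (E << H)

standard = V * H            # one V x H embedding matrix
factorized = V * E + E * H  # V x E lookup followed by an E x H projection

print(standard, factorized)  # factorized is far smaller when E << H
```

With these sizes the factorized version needs roughly 4M parameters instead of 23M, which is where much of the memory saving comes from.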
A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks
A simple baseline that uses probabilities from softmax distributions is presented; it is shown to be effective across computer vision, natural language processing, and automatic speech recognition tasks, and it is also shown that the baseline can sometimes be surpassed.
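The baseline scores each input by its maximum softmax probability: low-confidence predictions are flagged as likely misclassified or out-of-distribution. A minimal sketch of that idea (the logits here are hypothetical, and real use would compare scores against a threshold chosen on validation data):

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def max_softmax_score(logits):
    # confidence score: maximum softmax probability per input;
    # low scores suggest misclassified or out-of-distribution inputs
    return softmax(logits).max(axis=-1)

# hypothetical logits for two inputs: one confident, one uncertain
logits = np.array([[5.0, 0.1, 0.2],
                   [0.4, 0.5, 0.3]])
scores = max_softmax_score(logits)
```

The confident input receives a higher score than the uncertain one, which is exactly the signal the detector thresholds on.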
Gaussian Error Linear Units (GELUs)
An empirical evaluation of the GELU nonlinearity against the ReLU and ELU activations is performed, finding performance improvements across all considered computer vision, natural language processing, and speech tasks.
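The GELU weights its input by the standard Gaussian CDF, GELU(x) = x·Φ(x), and is commonly computed either exactly via the error function or with a tanh-based approximation. A minimal sketch of both forms:

```python
import math

def gelu(x):
    # exact GELU: x * Phi(x), where Phi is the standard normal CDF
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # widely used tanh approximation of the GELU
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

Unlike the ReLU's hard gate, the GELU passes inputs through smoothly in proportion to how likely they are under a standard Gaussian, which is the probabilistic view the paper develops.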
Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
A tagset is developed, data is annotated, features are developed, and results nearing 90% accuracy are reported on the problem of part-of-speech tagging for English data from the popular micro-blogging service Twitter.
Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters
- Olutobi Owoputi, Brendan T. O'Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider, Noah A. Smith
- Computer Science, NAACL
- 1 June 2013
This work systematically evaluates the use of large-scale unsupervised word clustering and new lexical features to improve tagging accuracy on Twitter and achieves state-of-the-art tagging results on both Twitter and IRC POS tagging tasks.
Towards Universal Paraphrastic Sentence Embeddings
This work considers the problem of learning general-purpose, paraphrastic sentence embeddings based on supervision from the Paraphrase Database, and compares six compositional architectures, finding that the most complex architectures, such as long short-term memory (LSTM) recurrent neural networks, perform best on the in-domain data.
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
A combination of automated and human evaluations show that SCPNs generate paraphrases that follow their target specifications without decreasing paraphrase quality when compared to baseline (uncontrolled) paraphrase systems.
Bridging Nonlinearities and Stochastic Regularizers with Gaussian Error Linear Units
An empirical evaluation of the GELU nonlinearity against the ReLU and ELU activations finds performance improvements across all tasks and suggests a new probabilistic understanding of nonlinearities.
From Paraphrase Database to Compositional Paraphrase Model and Back
This work proposes models to leverage the phrase pairs from the Paraphrase Database to build parametric paraphrase models that score paraphrase pairs more accurately than the PPDB’s internal scores while simultaneously improving its coverage.
Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise
It is demonstrated that robustness to label noise, even at severe strengths, can be achieved by using a set of trusted data with clean labels, and a loss correction that uses trusted examples in a data-efficient manner is proposed to mitigate the effects of label noise on deep neural network classifiers.