NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
- Kaustubh D. Dhole, Varun Gangal, Yue Zhang
- Computer Science
- ArXiv
- 6 December 2021
This paper presents NL-Augmenter, a new participatory Python-based natural language augmentation framework that supports the creation of both transformations and data splits according to specific features, and uses several of its transformations to analyze the robustness of popular natural language models.
Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data
- Dheeraj Mekala, Varun Gangal, Jingbo Shang
- Computer Science
- Conference on Empirical Methods in Natural…
- 22 September 2021
This work proposes a label-conditioned fine-tuning formulation to attune rich pre-trained generative language models to the iterative weak supervision strategy, and devises a regularization objective based on the coarse-fine label constraints derived from the problem setting, which yields further improvements over the prior formulation.
EUREKA: EUphemism Recognition Enhanced through Knn-based methods and Augmentation
- Sedrick Scott Keh, R. Bharadwaj, Emmy Liu, Simone Tedeschi, Varun Gangal, Roberto Navigli
- Computer Science
- FLP
- 23 October 2022
Using the augmented dataset and kNN-based methods, EUREKA was able to achieve state-of-the-art results on the public leaderboard of the Euphemism Detection Shared Task, ranking first with a macro F1 score of 0.881.
Investigating Robustness of Dialog Models to Popular Figurative Language Constructs
- Harsh Jhamtani, Varun Gangal, E. Hovy, Taylor Berg-Kirkpatrick
- Computer Science
- Conference on Empirical Methods in Natural…
- 1 October 2021
This work analyzes the performance of existing dialog models when the input dialog context uses figurative language, and proposes lightweight solutions to help existing models become more robust to figurative language.
Retrieve, Caption, Generate: Visual Grounding for Enhancing Commonsense in Text Generation Models
- Steven Y. Feng, Kevin Lu, Varun Gangal
- Computer Science
- AAAI Conference on Artificial Intelligence
- 8 September 2021
Comprehensive evaluation and analysis demonstrate that VisCTG noticeably improves model performance while successfully addressing several issues of the baseline generations, including poor commonsense, fluency, and specificity.
PINEAPPLE: Personifying INanimate Entities by Acquiring Parallel Personification Data for Learning Enhanced Generation
- Sedrick Scott Keh, Kevin Lu, E. Hovy
- Computer Science
- International Conference on Computational…
- 16 September 2022
Both automatic and human evaluations show that fine-tuning with PersonifCorp leads to significant gains in personification-related qualities such as animacy and interestingness, demonstrating a strong ability to generate diverse and creative personifications that enhance the overall appeal of a sentence.
PANCETTA: Phoneme Aware Neural Completion to Elicit Tongue Twisters Automatically
- Sedrick Scott Keh, Steven Y. Feng, Varun Gangal, Malihe Alikhani, E. Hovy
- Linguistics, Computer Science
- ArXiv
- 13 September 2022
Automatic and human evaluation, together with qualitative analysis, show that PANCETTA generates novel, phonetically difficult, fluent, and semantically meaningful tongue twisters.
OPERA: Operations-oriented Probabilistic Extraction, Reasoning, and Analysis
- E. Hovy, J. Carbonell, Varun Gangal
- Computer Science
- Text Analysis Conference
- 2019
The OPERA system of CMU and USC/ISI performs end-to-end information extraction from multiple media and languages (English, Russian, Ukrainian), integrates the results, builds Knowledge Bases about…