• Publications
Probing Toxic Content in Large Pre-Trained Language Models
TLDR
A method based on logistic regression classifiers is proposed to probe English, French, and Arabic PTLMs and quantify the potentially harmful content that they convey with respect to a set of templates, in order to assess and mitigate the toxicity transmitted by PTLMs.
DISCOS: Bridging the Gap between Discourse Knowledge and Commonsense Knowledge
TLDR
Experiments demonstrate that the proposed commonsense knowledge acquisition framework DISCOS can successfully convert discourse knowledge about eventualities from ASER, a large-scale discourse knowledge graph, into if-then commonsense knowledge defined in ATOMIC without any additional annotation effort.
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset
Reasoning over commonsense knowledge bases (CSKB) whose elements are in the form of free-text is an important yet hard task in NLP. While CSKB completion only fills the missing links within the
Acquiring and Modelling Abstract Commonsense Knowledge via Conceptualization
TLDR
This work thoroughly studies the possible role of conceptualization in commonsense reasoning, formulates a framework to replicate human conceptual induction from acquiring abstract knowledge about abstract concepts, and develops tools for contextualization on ATOMIC, a large-scale human-annotated CKG.
Do Boat and Ocean Suggest Beach? Dialogue Summarization with External Knowledge
TLDR
This paper addresses the problem of inferring Concepts Out of the Dialogue Context (CODC) in the dialogue summarization task and proposes a novel framework comprised of a CODC inference module leveraging external knowledge from WordNet and a knowledge attention module aggregating the inferred knowledge into a neural summarization model.
Weakly Supervised Text Classification using Supervision Signals from a Language Model
TLDR
A latent variable model is proposed to learn a word distribution learner which associates generated words to pre-defined categories and a document classifier simultaneously without using any annotated data.