Publications
Including Signed Languages in Natural Language Processing
TLDR
This position paper calls on the NLP community to include signed languages as a research area with high social and scientific impact, and urges the adoption of efficient tokenization methods, the development of linguistically-informed models, and the inclusion of local signed language communities as an active and leading voice in the direction of research.
“Caption” as a Coherence Relation: Evidence and Implications
We study verbs in image–text corpora, contrasting caption corpora, where texts are explicitly written to characterize image content, with depiction corpora, where texts and images may stand in more ...
Arrows are the Verbs of Diagrams
TLDR
This work establishes a novel analogy between arrows and verbs: it advocates representing arrows in terms of qualitatively different structural and semantic frames, and resolving frames to specific interpretations using shallow world knowledge.
ParsiNLU: A Suite of Language Understanding Challenges for Persian
TLDR
This work introduces ParsiNLU, the first benchmark for the Persian language covering a range of language understanding tasks, including reading comprehension and textual entailment; it presents the first results of state-of-the-art monolingual and multilingual pre-trained language models on this benchmark and compares them with human performance.
CITE: A Corpus of Image-Text Discourse Relations
TLDR
A novel crowd-sourced resource characterizes inferences in image-text contexts, in the domain of cooking recipes, in the form of coherence relations, aiding a better understanding of natural communication and common-sense reasoning.
Clue: Cross-modal Coherence Modeling for Caption Generation
TLDR
A new task for learning inferences in imagery and text, coherence relation prediction, is introduced, and it is shown that coherence annotations can be exploited to learn relation classifiers as an intermediary step, and also to train coherence-aware, controllable image captioning models.
Exploring Coherence in Visual Explanations
TLDR
A case study of instructions presented using text and pictures is used to motivate and describe an analysis of multimodal discourse interpretation in terms of coherence relations and to sketch a roadmap for operationalizing the approach in computer systems.
That and There: Judging the Intent of Pointing Actions with Robotic Arms
TLDR
The study indicates that human subjects show greater flexibility in interpreting the intent of referential pointing than of locating pointing, which needs to be more deliberate, and it examines the effects of variation in the environment and task context on the interpretation of pointing.
AI2D-RST: A multimodal corpus of 1000 primary school science diagrams
TLDR
A multi-layer annotation schema provides a rich description of diagrams: the grouping of diagram elements into perceptual units, the connections set up by diagrammatic elements such as arrows and lines, and the discourse relations between diagram elements, described using Rhetorical Structure Theory (RST).
...