Combining Language and Vision with a Multimodal Skip-gram Model
- Angeliki Lazaridou, N. Pham, Marco Baroni
- Computer Science
- North American Chapter of the Association for…
- 12 January 2015
Since they propagate visual information to all words, the MMSKIP-GRAM models discover intriguing visual properties of abstract words, paving the way to realistic implementations of embodied theories of meaning.
DISSECT - DIStributional SEmantics Composition Toolkit
- Georgiana Dinu, N. Pham, Marco Baroni
- Computer Science
- Annual Meeting of the Association for…
- 1 August 2013
DISSECT can also be useful to researchers and practitioners who need models of word meaning without composition, as it supports various methods for constructing distributional semantic spaces, assessing similarity, and even evaluating against benchmarks, all independent of the composition infrastructure.
General estimation and evaluation of compositional distributional semantic models
- Georgiana Dinu, N. Pham, Marco Baroni
- Computer Science
- CVSM@ACL
- 1 August 2013
An evaluation of alternative cDSMs under truly comparable conditions is presented, and the linguistically motivated functional model of Baroni and Zamparelli (2010) and Coecke et al. (2010) emerges as the winner in all the authors' tests.
A practical and linguistically-motivated approach to compositional distributional semantics
- Denis Paperno, N. Pham, Marco Baroni
- Computer Science
- Annual Meeting of the Association for…
- 1 June 2014
A new model that closely mimics the standard Montagovian semantic treatment of composition in distributional terms is presented, showing that it consistently outperforms a set of competitive rivals.
Jointly optimizing word representations for lexical and sentential tasks with the C-PHRASE model
- N. Pham, Germán Kruszewski, Angeliki Lazaridou, Marco Baroni
- Computer Science
- Annual Meeting of the Association for…
- 1 July 2015
C-PHRASE, a distributional semantic model that learns word representations by optimizing context prediction for phrases at all levels in a syntactic tree, outperforms the state-of-the-art C-BOW model on a variety of lexical tasks.
A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal
- D. Ghalandari, Chris Hokamp, N. Pham, John Glover, Georgiana Ifrim
- Computer Science
- Annual Meeting of the Association for…
- 1 May 2020
This work presents a new dataset for MDS that is large both in the total number of document clusters and in the size of individual clusters, and provides a quantitative analysis of the dataset and empirical results for several state-of-the-art MDS techniques.
Intensionality was only alleged: On adjective-noun composition in distributional semantics
- Gemma Boleda, Marco Baroni, N. Pham, L. McNally
- Computer Science
- International Conference on Computational…
- 1 March 2013
This work acknowledges the support of Spanish MICINN grant FFI2010-09464-E (McNally, Boleda), the ICREA Foundation (McNally), and Catalan AGAUR grant 2010BP-A00070 (Baroni, Pham).
A Multitask Objective to Inject Lexical Contrast into Distributional Semantics
- N. Pham, Angeliki Lazaridou, Marco Baroni
- Linguistics, Computer Science
- Annual Meeting of the Association for…
- 1 July 2015
The multitask Lexical Contrast Model (mLCM) is introduced: an extension of the effective Skip-gram method that optimizes semantic vectors on the joint tasks of predicting corpus contexts and making the representations of WordNet synonyms closer than those of matching WordNet antonyms.
Towards Multi-Agent Communication-Based Language Learning
- Angeliki Lazaridou, N. Pham, Marco Baroni
- Computer Science
- ArXiv
- 23 May 2016
An interactive multimodal framework for language learning is proposed, in which learners engage in cooperative referential games starting from a tabula rasa setup and thus develop their own language from the need to communicate in order to succeed at the game.
DynE: Dynamic Ensemble Decoding for Multi-Document Summarization
- Chris Hokamp, D. Ghalandari, N. Pham, John Glover
- Computer Science
- ArXiv
- 15 June 2020
This work proposes a simple decoding methodology which ensembles the output of multiple instances of the same model on different inputs, and obtains state-of-the-art results on several multi-document summarization datasets.