Cross-Document Language Modeling
- Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, Ido Dagan
- Computer Science · ArXiv
- 2021
The cross-document language model (CD-LM) improves masked language modeling for multi-document NLP tasks with two key ideas: pretraining with multiple related documents in a single input, and cross-document masking, which encourages the model to learn cross-document and long-range relationships.
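A minimal sketch of the multi-document input construction described above, using made-up separator and mask symbols rather than the paper's actual special tokens: related documents are concatenated into a single input and tokens are masked across all of them, so recovering a masked token can draw on evidence from the other documents.

```python
import random

# Illustrative cross-document example builder (symbols and masking rate are
# placeholders, not the paper's exact pretraining setup).
DOC_SEP, MASK = "<doc-s>", "<mask>"

def build_cross_document_example(documents, mask_prob=0.15, seed=0):
    rng = random.Random(seed)
    tokens = []
    for doc in documents:
        tokens.append(DOC_SEP)              # mark the document boundary
        tokens.extend(doc.split())
    masked, targets = [], []
    for tok in tokens:
        if tok != DOC_SEP and rng.random() < mask_prob:
            masked.append(MASK)             # predicted from the full multi-document context
            targets.append(tok)
        else:
            masked.append(tok)
            targets.append(None)            # not part of the masked-LM loss
    return masked, targets

if __name__ == "__main__":
    docs = ["the rover landed on mars in february",
            "nasa confirmed the mars rover touchdown"]
    inp, tgt = build_cross_document_example(docs)
    print(inp)
    print([t for t in tgt if t is not None])
```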
Paraphrasing vs Coreferring: Two Sides of the Same Coin
- Y. Meged, Avi Caciularu, Vered Shwartz, Ido Dagan
- Computer Science · Findings
- 30 April 2020
This work used annotations from an event coreference dataset as distant supervision to re-score heuristically extracted predicate paraphrases, and used the same re-ranking features as additional inputs to a state-of-the-art event coreference resolution model, yielding modest but consistent improvements to the model's performance.
Unsupervised Linear and Nonlinear Channel Equalization and Decoding Using Variational Autoencoders
- Avi Caciularu, D. Burshtein
- Computer Science · IEEE Transactions on Cognitive Communications and…
- 21 May 2019
A new approach to blind channel equalization and decoding based on variational inference, and variational autoencoders (VAEs) in particular, is introduced; significant and consistent improvements in the error rate of the reconstructed symbols are demonstrated over existing blind equalization methods, enabling faster channel acquisition.
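A toy sketch of the general idea under simplifying assumptions (BPSK symbols, a short ISI channel, a single convolutional encoder, and a mean-field approximation of the ELBO). It is not the paper's architecture, but it shows how the channel taps and the symbol posteriors can be estimated jointly from the received signal alone.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Blind equalization phrased as a VAE (simplified sketch): the "encoder" maps
# the received samples to per-symbol posteriors q(x=+1|y); the "decoder" is the
# channel model itself, whose taps are estimated jointly by minimizing an
# ELBO-style objective (reconstruction of y minus the entropy of q). Using the
# posterior mean of the symbols for reconstruction is a simplification.

N, TAPS = 2000, 3
true_h = torch.tensor([1.0, 0.5, 0.2])                  # unknown channel taps
x = torch.randint(0, 2, (N,)).float() * 2 - 1           # BPSK symbols in {-1, +1}
y = F.conv1d(x.view(1, 1, -1), true_h.flip(0).view(1, 1, -1), padding=TAPS - 1)
y = y[..., :N] + 0.05 * torch.randn(1, 1, N)            # noisy received signal

encoder = nn.Conv1d(1, 1, kernel_size=9, padding=4)     # q(x|y): one logit per symbol
h_hat = nn.Parameter(torch.randn(TAPS) * 0.1)           # channel estimate (decoder)
opt = torch.optim.Adam(list(encoder.parameters()) + [h_hat], lr=0.01)

for step in range(500):
    p = torch.sigmoid(encoder(y).squeeze())             # P(x_i = +1 | y)
    x_soft = 2 * p - 1                                   # posterior mean of each symbol
    y_hat = F.conv1d(x_soft.view(1, 1, -1), h_hat.flip(0).view(1, 1, -1),
                     padding=TAPS - 1)[..., :N]
    recon = ((y - y_hat) ** 2).mean()                    # reconstruction error
    entropy = -(p * torch.log(p + 1e-8) + (1 - p) * torch.log(1 - p + 1e-8)).mean()
    loss = recon - 0.01 * entropy                        # ELBO-like objective, up to constants
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    x_hat = torch.sign(2 * torch.sigmoid(encoder(y).squeeze()) - 1)
    err = (x_hat != x).float().mean().item()
    err = min(err, 1 - err)                              # blind recovery is up to a sign flip
print("estimated taps:", h_hat.detach().tolist(), "symbol error rate:", err)
```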
CDLM: Cross-Document Language Modeling
- Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, Ido Dagan
- Computer Science · Conference on Empirical Methods in Natural…
- 2 January 2021
This work introduces a new pretraining approach geared for multi-document language modeling, incorporating two key ideas into the masked language modeling self-supervised objective: pretraining over sets of multiple related documents, and predicting masked tokens with dynamic global attention that has access to the entire input, encouraging the model to learn cross-document relationships.
Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding
- Oren Barkan, Noam Razin, Itzik Malkiel, Ori Katz, Avi Caciularu, Noam Koenigstein
- Computer Science · AAAI Conference on Artificial Intelligence
- 14 August 2019
Distilled Sentence Embedding (DSE) is introduced, a model based on knowledge distillation from cross-attentive models and focused on sentence-pair tasks; it significantly outperforms several ELMo variants and other sentence embedding methods while accelerating the computation of query-candidate sentence-pair similarities.
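A toy sketch of the distillation recipe with placeholder encoders instead of BERT-scale models: a cross-attentive teacher scores sentence pairs jointly, and a bi-encoder student that embeds each sentence independently is trained to regress those scores, so pair similarity at inference reduces to a vector dot product. Vocabulary, architectures, and data below are all illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB, DIM = 100, 32

class PairTeacher(nn.Module):                 # jointly attends over both sentences
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.attn = nn.MultiheadAttention(DIM, num_heads=4, batch_first=True)
        self.score = nn.Linear(DIM, 1)
    def forward(self, a, b):
        x = self.emb(torch.cat([a, b], dim=1))
        h, _ = self.attn(x, x, x)             # attention across the concatenated pair
        return self.score(h.mean(dim=1)).squeeze(-1)

class SentenceStudent(nn.Module):             # encodes each sentence on its own
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.proj = nn.Linear(DIM, DIM)
    def encode(self, s):
        return self.proj(self.emb(s).mean(dim=1))
    def forward(self, a, b):
        return (self.encode(a) * self.encode(b)).sum(dim=-1)   # dot-product similarity

teacher, student = PairTeacher(), SentenceStudent()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(200):
    a = torch.randint(0, VOCAB, (16, 12))     # random "sentences" stand in for real text
    b = torch.randint(0, VOCAB, (16, 12))
    with torch.no_grad():
        target = teacher(a, b)                # expensive joint score, used as the label
    loss = nn.functional.mse_loss(student(a, b), target)
    opt.zero_grad(); loss.backward(); opt.step()

# At inference, candidate embeddings can be precomputed once and compared cheaply.
```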
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space
- Mor Geva, Avi Caciularu, Ke Wang, Yoav Goldberg
- Computer Science · Conference on Empirical Methods in Natural…
- 28 March 2022
This work reverse-engineers the operation of the feed-forward network (FFN) layers, one of the building blocks of transformer models, and shows that each FFN update can be decomposed into sub-updates corresponding to single FFN parameter vectors, each promoting concepts that are often human-interpretable.
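A small numerical illustration of that decomposition with random weights and a toy vocabulary (not a trained transformer): the FFN update is written as a sum of sub-updates, one per value vector, and each sub-update is projected through an assumed unembedding matrix to see which tokens it promotes.

```python
import torch

torch.manual_seed(0)

# A transformer FFN computes  sum_i m_i * v_i, where m_i are the post-activation
# coefficients and v_i are the rows of the second FFN matrix. Each sub-update
# m_i * v_i can be read in the vocabulary space via the unembedding matrix E.
d_model, d_ff, vocab = 16, 64, 50
W_in = torch.randn(d_ff, d_model)      # first FFN matrix ("keys")
W_out = torch.randn(d_ff, d_model)     # second FFN matrix, rows are value vectors v_i
E = torch.randn(vocab, d_model)        # unembedding (output embedding) matrix

x = torch.randn(d_model)               # hidden state entering the FFN
m = torch.relu(W_in @ x)               # activation coefficients m_i
ffn_out = W_out.t() @ m                # full FFN update

# The update equals the sum of its sub-updates.
sub_updates = m[:, None] * W_out       # d_ff sub-updates, one per value vector
assert torch.allclose(ffn_out, sub_updates.sum(dim=0), atol=1e-5)

# Rank the tokens promoted by the most strongly activated sub-updates.
for i in m.topk(3).indices:
    promoted = (E @ sub_updates[i]).topk(5).indices
    print(f"sub-update {i.item()} promotes token ids {promoted.tolist()}")
```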
Grad-SAM: Explaining Transformers via Gradient Self-Attention Maps
- Oren Barkan, Edan Hauon, Noam Koenigstein
- Computer Science · International Conference on Information and…
- 26 October 2021
A novel gradient-based method is presented that analyzes self-attention units and identifies the input elements that best explain the model's prediction, obtaining significant improvements over state-of-the-art alternatives.
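A schematic of the idea on a toy single-head classifier with random weights (not the paper's BERT setup): each self-attention map is weighted by the gradient of the predicted score with respect to it, negative contributions are clipped, and the result is averaged into a per-token relevance score.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab, dim, seq = 50, 32, 10

class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.cls = nn.Linear(dim, 2)
    def forward(self, ids):
        x = self.emb(ids)                                    # (batch, seq, dim)
        scores = self.q(x) @ self.k(x).transpose(-1, -2) / dim ** 0.5
        attn = torch.softmax(scores, dim=-1)                 # (batch, seq, seq) attention map
        attn.retain_grad()                                   # keep its gradient after backward
        self.attn = attn
        out = attn @ self.v(x)
        return self.cls(out.mean(dim=1))                     # (batch, 2) class scores

model = TinyClassifier()
ids = torch.randint(0, vocab, (1, seq))
logits = model(ids)
logits[0, logits[0].argmax()].backward()                     # gradient of the predicted class score

A, G = model.attn, model.attn.grad                           # attention map and its gradient
token_scores = torch.relu(A * G).mean(dim=1).squeeze(0)      # average over query positions
print(token_scores)                                          # one relevance score per input token
```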
Blind Channel Equalization Using Variational Autoencoders
- Avi Caciularu, D. Burshtein
- Computer Science · IEEE International Conference on Communications…
- 5 March 2018
A new maximum likelihood estimation approach for blind channel equalization, using variational autoencoders (VAEs), is introduced. Significant and consistent improvements in the error rate of the reconstructed symbols are demonstrated compared to existing blind equalization methods.
RecoBERT: A Catalog Language Model for Text-Based Recommendations
- Itzik Malkiel, Oren Barkan, Avi Caciularu, Noam Razin, Ori Katz, Noam Koenigstein
- Computer Science · Conference on Empirical Methods in Natural…
- 25 September 2020
This work introduces RecoBERT, a BERT-based approach for learning catalog-specialized language models for text-based item recommendations, and proposes novel training and inference procedures for scoring similarities between pairs of items that do not require item similarity labels.
Attentive Item2vec: Neural Attentive User Representations
- Oren Barkan, Avi Caciularu, Ori Katz, Noam Koenigstein
- Computer Science · IEEE International Conference on Acoustics…
- 15 February 2020
This work presents Attentive Item2vec (AI2V), a novel attentive version of Item2vec that employs a context-target attention mechanism to learn and capture different characteristics of a user's historical behavior with respect to a potential recommended item (the target).
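A toy sketch of context-target attention with random embeddings and made-up sizes, not AI2V itself: the candidate (target) item forms the attention query over the user's historical items, so the resulting user vector emphasizes different parts of the history for different candidates.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_items, dim = 1000, 32
item_emb = nn.Embedding(n_items, dim)       # shared item embeddings (Item2vec-style)
attn_proj = nn.Linear(dim, dim)             # projection used to form the attention query

def score(history_ids, target_id):
    h = item_emb(history_ids)                            # (len, dim) historical items
    q = attn_proj(item_emb(target_id))                   # (dim,) target-conditioned query
    weights = torch.softmax(h @ q / dim ** 0.5, dim=0)   # attention over the history
    user_vec = weights @ h                               # target-aware user representation
    return user_vec @ item_emb(target_id)                # affinity of this user to the target

history = torch.tensor([3, 17, 256, 42])                 # placeholder item ids
for cand in [7, 99]:
    print(cand, score(history, torch.tensor(cand)).item())
```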
...