Corpus ID: 220845540

Fantastic Embeddings and How to Align Them: Zero-Shot Inference in a Multi-Shop Scenario

Federico Bianchi, J. Tagliabue, Bingqing Yu, Luca Bigon, C. Greco
This paper addresses the challenge of leveraging multiple embedding spaces for multi-shop personalization, proving that zero-shot inference is possible by transferring shopping intent from one website to another without manual intervention. We detail a machine learning pipeline to train and optimize embeddings within shops first, and support the quantitative findings with additional qualitative insights. We then turn to the harder task of using learned embeddings across shops: if products from…
Shopping in the Multiverse: A Counterfactual Approach to In-Session Attribution
This work proposes to learn a generative browsing model over a target shop, leveraging the latent space induced by prod2vec embeddings, and to approach counterfactuals in analogy with treatments in formal semantics, explicitly modeling possible outcomes through alternative shopper timelines.
Query2Prod2Vec: Grounded Word Embeddings for eCommerce
We present Query2Prod2Vec, a model that grounds lexical representations for product search in product embeddings: in our model, meaning is a mapping between words and a latent space of products in a…
Language in a (Search) Box: Grounding Language Learning in Real-World Human-Machine Interaction
This work investigates grounded language learning through real-world data, by modelling teacher-learner dynamics through the natural interactions occurring between users and search engines, and shows how the resulting semantics for noun phrases exhibits compositional properties while being fully learnable without any explicit labelling.
SIGIR 2021 E-Commerce Workshop Data Challenge
The need for efficient procedures for personalization is even clearer if the authors consider the e-commerce landscape more broadly, and the constraints of the problem are stricter, due to smaller user bases and the realization that most users are not frequently returning customers.
"Are you sure?": Preliminary Insights from Scaling Product Comparisons to Multiple Shops
Preliminary results from building a comparison pipeline designed to scale in a multi-shop scenario are presented, along with the design choices and extensive benchmarks on multiple shops to stress-test it.
Aligning Hotel Embeddings using Domain Adaptation for Next-Item Recommendation
Online platforms often have multiple brands under the same group, which may target different customer profiles or have different domains. For example, in the hospitality domain,…
BERT Goes Shopping: Comparing Distributional Models for Product Representations
This work proposes to transfer BERT-like architectures to eCommerce: the model - Prod2BERT - is trained to generate representations of products through masked session modeling and provides guidelines to practitioners for training embeddings under a variety of computational and data constraints.
You Do Not Need a Bigger Boat: Recommendations at Reasonable Scale in a (Mostly) Serverless and Open Stack
This work proposes a template data stack for machine learning at “reasonable scale”, and details how modern open source can provide a pipeline processing terabytes of data with limited infrastructure work.
The Embeddings That Came in From the Cold: Improving Vectors for New and Rare Products with Content-Based Inference
This work shows how to inject product knowledge into behavior-based embeddings to provide the best accuracy with minimal engineering changes in existing infrastructure and without additional manual effort.


“An Image is Worth a Thousand Features”: Scalable Product Representations for In-Session Type-Ahead Personalization
It is shown how a shared vector space between similar shops can be used to improve the experience of users browsing across sites, opening up the possibility of applying zero-shot unsupervised personalization to increase conversions.
Meta-Graph: Few shot Link Prediction via Meta Learning
A new gradient-based meta learning framework, Meta-Graph, that can learn to quickly adapt to a new graph using only a small sample of true edges, enabling not only fast adaptation but also improved results at convergence.
A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings
This work proposes an alternative approach based on a fully unsupervised initialization that explicitly exploits the structural similarity of the embeddings, and a robust self-learning algorithm that iteratively improves this solution.
Personalized Query Auto-Completion Through a Lightweight Representation of the User Context
A novel method for personalized QAC that uses lightweight embeddings learnt through fastText. It significantly outperforms text-based personalization features studied in the literature, and adding text-based features on top of the proposed embedding-based features yields only minor improvements.
Meta-Prod2Vec: Product Embeddings Using Side-Information for Recommendation
This work proposes Meta-Prod2vec, a novel method to compute item similarities for recommendation that leverages existing item metadata and shows that the new item representations lead to better performance on recommendation tasks on an open music dataset.
Large-scale Collaborative Filtering with Product Embeddings
This approach combines neural attention mechanisms, which allow for context-dependent weighting of past behavioral signals, with representation learning techniques to produce models that obtain extremely high coverage, can easily incorporate new information as it becomes available, and are computationally efficient.
Word2vec applied to recommendation: hyperparameters matter
This work investigates the marginal importance of each hyperparameter in a recommendation setting through large hyperparameter grid searches on various datasets, and finds that optimal hyperparameter configurations for Natural Language Processing tasks and Recommendation tasks are noticeably different.
Revisiting Skip-Gram Negative Sampling Model with Regularization
This work revisits skip-gram negative sampling and rectifies the SGNS model with quadratic regularization, and shows that this simple modification suffices to structure the solution in the desired manner.
Are we really making much progress? A worrying analysis of recent neural recommendation approaches
A systematic analysis of algorithmic proposals for top-n recommendation tasks that were presented at top-level research conferences in the last years sheds light on a number of potential problems in today's machine learning scholarship and calls for improved scientific practices in this area.
Training Temporal Word Embeddings with a Compass
A new heuristic to train temporal word embeddings based on the Word2vec model, which consists in using atemporal vectors as a reference, i.e., as a compass, when training the representations specific to a given time interval.