Learning Latent Vector Spaces for Product Search

@inproceedings{Gysel2016LearningLV,
  title={Learning Latent Vector Spaces for Product Search},
  author={Christophe Van Gysel and Maarten de Rijke and Evangelos Kanoulas},
  booktitle={Proceedings of the 25th ACM International Conference on Information and Knowledge Management},
  year={2016}
}

We introduce a novel latent vector space model that jointly learns the latent representations of words, e-commerce products and a mapping between the two without the need for explicit annotations. The power of the model lies in its ability to directly model the discriminative relation between products and a particular word. We compare our method to existing latent vector space models (LSI, LDA and word2vec) and evaluate it as a feature in a learning to rank setting. Our latent vector space… 
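
As a rough, non-authoritative sketch of the idea in the abstract: the PyTorch snippet below jointly learns word embeddings, product embeddings, and a mapping between the two spaces, trained with a noise-contrastive discriminative objective over (word n-gram, product) pairs. The class and method names, dimensions, and exact loss are illustrative assumptions, not the paper's published formulation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentEntitySpace(nn.Module):
    """Sketch of a joint word/product vector space model (names hypothetical).

    Words and products get separate embedding matrices; a learned linear
    map with a tanh non-linearity projects an averaged word representation
    into the product space, so word sequences and products are directly
    comparable.
    """

    def __init__(self, vocab_size, num_products, word_dim=300, product_dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.product_emb = nn.Embedding(num_products, product_dim)
        self.project = nn.Linear(word_dim, product_dim)  # mapping between the two spaces

    def forward(self, word_ids):
        # word_ids: (batch, ngram_len), an n-gram drawn from a product's description.
        avg_words = self.word_emb(word_ids).mean(dim=1)   # (batch, word_dim)
        return torch.tanh(self.project(avg_words))        # (batch, product_dim)

    def nce_loss(self, word_ids, pos_products, num_neg=5):
        # Discriminative objective: the projected n-gram should score higher
        # against its source product than against randomly sampled negatives.
        query = self.forward(word_ids)                               # (batch, d)
        pos = self.product_emb(pos_products)                         # (batch, d)
        neg_ids = torch.randint(0, self.product_emb.num_embeddings,
                                (word_ids.size(0), num_neg))
        neg = self.product_emb(neg_ids)                              # (batch, num_neg, d)
        pos_score = (query * pos).sum(-1)                            # (batch,)
        neg_score = torch.bmm(neg, query.unsqueeze(-1)).squeeze(-1)  # (batch, num_neg)
        return (-F.logsigmoid(pos_score) - F.logsigmoid(-neg_score).sum(-1)).mean()

# Toy usage: a 10k-word vocabulary, 5k products, a batch of four trigrams.
model = LatentEntitySpace(vocab_size=10_000, num_products=5_000)
loss = model.nce_loss(torch.randint(0, 10_000, (4, 3)), torch.randint(0, 5_000, (4,)))
loss.backward()

At retrieval time, a free-text query would be projected with forward() and products ranked by similarity to their embeddings, which is what lets the learned space double as a feature in a learning-to-rank setting.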

Citations

Neural Vector Spaces for Unsupervised Information Retrieval
TLDR
It is found that an unsupervised ensemble of multiple models trained with different hyperparameter values performs better than a single cross-validated model, and therefore NVSM can safely be used for ranking documents without supervised relevance judgments.
Multi-modal Preference Modeling for Product Search
TLDR
This work proposes a multi-modal personalized product search method, which aims to search products which not only are relevant to the submitted textual query, but also match the user preferences from both textual and visual modalities.
Learning a Hierarchical Embedding Model for Personalized Product Search
TLDR
The hierarchical embedding model is the first latent space model that jointly learns distributed representations for queries, products, and users with a deep neural network; experiments show that it significantly outperforms existing product search baselines on multiple benchmark datasets.
Dynamic Bayesian Metric Learning for Personalized Product Search
TLDR
Experimental results on large datasets over a number of applications demonstrate that the proposed Dynamic Bayesian Metric Learning model outperforms the state-of-the-art algorithms, and can effectively capture the evolutions of semantic representations of different categories of entities over time.
Modeling User Behavior with Graph Convolution for Personalized Product Search
TLDR
This work uses an efficient jumping graph convolution to explore high-order relations that enrich product representations for user preference modeling, and addresses the limitations of prior art by exploring local and global user behavior patterns on a user successive behavior graph.
Learning a Joint Search and Recommendation Model from User-Item Interactions
TLDR
Inspired by neural approaches to collaborative filtering and language modeling approaches to information retrieval, this model is jointly optimized to predict user-item interactions and reconstruct the item textual descriptions.
Neural IR Meets Graph Embedding: A Ranking Model for Product Search
TLDR
Recent advances in graph embedding techniques are leveraged to let neural retrieval models exploit graph-structured data for automatic feature extraction, overcoming the long-tail problem of click-through data and incorporating external heterogeneous information to improve search results.
A Mixture-of-Experts Model for Learning Multi-Facet Entity Embeddings
TLDR
This paper proposes a model that learns several vectors for each entity, each of which intuitively captures a different aspect of the considered domain, and uses a mixture-of-experts formulation to jointly learn these facet-specific embeddings.
A Multi-task Learning Framework for Product Ranking with BERT
TLDR
The proposed model utilizes domain-specific BERT with fine-tuning to bridge the vocabulary gap and employs multi-task learning to optimize multiple objectives simultaneously, which yields a general end-to-end learning framework for product search.
Deep Neural Network and Boosting Based Hybrid Quality Ranking for e-Commerce Product Search
TLDR
This work proposes an e-commerce product search engine based on a similarity metric that works on top of query and product embeddings and demonstrates the effectiveness of context-aware embeddings in retrieving relevant products and the quality indicators in ranking high-quality products.

References

Showing 1–10 of 66 references
A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval
TLDR
A new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents is proposed.
Learning deep structured semantic models for web search using clickthrough data
TLDR
A series of new latent semantic models with a deep structure that project queries and documents into a common low-dimensional space where the relevance of a document given a query is readily computed as the distance between them are developed.
Unsupervised, Efficient and Semantic Expertise Retrieval
TLDR
An unsupervised discriminative model for the task of retrieving experts in online document collections achieves the retrieval performance levels of state-of-the-art document-centric methods with the low inference cost of so-called profile-centric approaches.
GloVe: Global Vectors for Word Representation
TLDR
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature: global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
Learning word embeddings efficiently with noise-contrastive estimation
TLDR
This work proposes a simple and scalable new approach to learning word embeddings based on training log-bilinear models with noise-contrastive estimation, and achieves results comparable to the best ones reported, using four times less data and more than an order of magnitude less computing time.
Distributed Representations of Sentences and Documents
TLDR
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of texts, such as sentences, paragraphs, and documents, and its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
Representation Learning for Measuring Entity Relatedness with Rich Information
TLDR
A framework of coordinate matrix factorization is proposed to construct low-dimensional continuous representations for entities, categories, and words in the same semantic space, and the model is shown to outperform both traditional entity relatedness algorithms and other representation learning models.
Efficient Estimation of Word Representations in Vector Space
TLDR
Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed and it is shown that these vectors provide state-of-the-art performance on the authors' test set for measuring syntactic and semantic word similarities.
Latent Dirichlet Allocation
Semantic hashing