Studying Product Competition Using Representation Learning

  • Fanglin Chen, Xiao Liu, Davide Proserpio, Isamar Troncoso, Feiyu Xiong
  • Published 21 May 2020
  • Business
  • Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Studying competition and market structure at the product level rather than the brand level can give firms insights into cannibalization and product line optimization. However, analyzing product-level competition is computationally challenging for the millions of products available on e-commerce platforms. We introduce Product2Vec, a method based on the representation learning algorithm Word2Vec, to study product-level competition when the number of products is large. The proposed model…
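
One way to picture the idea of applying Word2Vec to products is the sketch below. Because the abstract is cut off before it describes the model's input, the sketch assumes that shopping baskets play the role of sentences and products the role of words. The toy baskets, the names basket_pairs, train_embeddings and cosine, and all hyperparameters are hypothetical and not taken from the paper. It is a plain NumPy skip-gram with negative sampling, not the authors' implementation.

```python
import itertools
import numpy as np

# Toy shopping baskets (hypothetical data): each basket acts as a "sentence".
baskets = [
    ["cola_a", "chips", "salsa"],
    ["cola_b", "chips", "salsa"],
    ["cola_a", "chips"],
    ["cola_b", "salsa"],
    ["soap", "shampoo"],
    ["soap", "shampoo", "toothpaste"],
]

vocab = sorted({product for basket in baskets for product in basket})
index = {product: i for i, product in enumerate(vocab)}


def basket_pairs(baskets):
    """Yield (target, context) index pairs for every ordered pair of distinct items in a basket."""
    for basket in baskets:
        for target, context in itertools.permutations(basket, 2):
            yield index[target], index[context]


def train_embeddings(pairs, n_products, dim=8, negatives=3, epochs=200, lr=0.05, seed=0):
    """Skip-gram with negative sampling, written out in plain NumPy."""
    rng = np.random.default_rng(seed)
    w_in = rng.normal(scale=0.1, size=(n_products, dim))   # product vectors
    w_out = rng.normal(scale=0.1, size=(n_products, dim))  # context vectors
    for _ in range(epochs):
        for t, c in pairs:
            # One observed context (label 1) plus uniformly drawn negatives (label 0).
            # A negative may coincide with the true context; the sketch ignores that.
            targets = np.concatenate(([c], rng.integers(0, n_products, size=negatives)))
            labels = np.zeros(negatives + 1)
            labels[0] = 1.0
            scores = 1.0 / (1.0 + np.exp(-(w_out[targets] @ w_in[t])))
            grad = scores - labels                  # gradient of the logistic loss
            grad_in = grad @ w_out[targets]         # computed before w_out changes
            # Unbuffered update so repeated indices in `targets` accumulate.
            np.subtract.at(w_out, targets, lr * np.outer(grad, w_in[t]))
            w_in[t] -= lr * grad_in
    return w_in


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


pairs = list(basket_pairs(baskets))
emb = train_embeddings(pairs, n_products=len(vocab))
sim_colas = cosine(emb[index["cola_a"]], emb[index["cola_b"]])
```

In skip-gram models, items that share contexts tend to receive similar vectors, so a score such as sim_colas could serve as a rough closeness measure between two products. On a toy data set this small the values carry no real meaning.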

Scalable bundling via dense product embeddings
Proposes a new machine-learning-driven methodology for designing bundles in a large-scale, cross-category retail setting and develops heuristics for complementarity and substitutability among products, which are strong predictors of bundle success, robust across product categories, and generalize well to the retailer's entire assortment.
Extracting complements and substitutes from sales data: a network perspective
A network perspective is taken to help automatically identify complements and substitutes from sales transaction data, by developing appropriate null models to infer significant relations between products and designing measures based on random walks to quantify their importance.
Posterior summaries of grocery retail topic models: Evaluation, interpretability and credibility
A clustering methodology is introduced that post‐processes posterior LDA draws to summarise topic distributions as recurrent topics; selecting recurrent topics is shown to improve predictive likelihood as well as interpretability and credibility in grocery retail data.
A Simple Measure of Product Substitutability Based on Common Purchases
We propose a measure of product substitutability based on correlation of common purchases, which is fast to compute and easy to interpret. In an empirical study of a drugstore retail chain, we


P2V-MAP: Mapping Market Structures for Large Retail Assortments
A neural network language model is customized to derive latent product attributes by analyzing the co-occurrences of products in shopping baskets; applying dimensionality reduction to the latent attributes yields a two-dimensional product map.
SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements
SHOPPER is a sequential probabilistic model of market baskets that provides accurate predictions even under price interventions and helps identify complementary and substitutable pairs of products.
Testing Competitive Market Structures
An accurate understanding of the structure of competition is important in the formulation of many marketing strategies. For example, in new product launch, product reformulation, or positioning
Modeling Consumer Choice among SKUs
Most choice models in marketing implicitly assume that the fundamental unit of analysis is the brand. In reality, however, many more of the decisions made by consumers, manufacturers, and retailers
A Control Function Approach to Endogeneity in Consumer Choice Models
Endogeneity arises for numerous reasons in models of consumer choice. It leads to inconsistency with standard estimation methods that maintain independence between the model's error and the included
Competitor identification and competitor analysis: a broad-based managerial approach
Managerial myopia in identifying competitive threats is a well-recognized phenomenon (Levitt, 1960; Zajac and Bazerman, 1991). Identifying such threats is particularly problematic, since they may
Database Paper - The IRI Marketing Data Set
Describes a new data set comprising store sales and consumer panel data for 30 product categories, along with several potential applications of these data and the access protocol.
Learning word embeddings efficiently with noise-contrastive estimation
This work proposes a simple and scalable new approach to learning word embeddings based on training log-bilinear models with noise-contrastive estimation, and achieves results comparable to the best ones reported, using four times less data and more than an order of magnitude less computing time.
Automobile Prices in Market Equilibrium
This paper develops techniques for empirically analyzing demand and supply in differentiated product markets and then applies these techniques to the U.S. automobile industry. The authors' framework