Some Practice for Improving the Search Results of E-commerce

Fanyou Wu, Yang Liu, Rado Gazo, Benes Bedrich, Xiaobo Qu
In the Amazon KDD Cup 2022, we apply natural language processing methods to improve the quality of search results, which can significantly enhance user experience and engagement with e-commerce search engines. We discuss our practical solution for this competition, which ranked 6th in task 1, 2nd in task 2, and 2nd in task 3. The code is available at 


Shopping Queries Dataset: A Large-Scale ESCI Benchmark for Improving Product Search

This paper introduces the Shopping Queries Dataset, a large dataset of difficult Amazon search queries and results. It is publicly released to foster research on improving the quality of search results and is expected to become the gold standard for future research on product search.

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

This paper presents Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity.
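
As an illustration of comparing sentence embeddings by cosine similarity, the following is a minimal sketch in plain Python. The embedding values and the names `query_embedding` and `product_embeddings` are hypothetical; in practice the vectors would come from a sentence encoder, which is not shown here.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical 4-dimensional embeddings (assumption: real ones are produced
# by an encoder and have hundreds of dimensions).
query_embedding = [0.2, 0.1, 0.7, 0.0]
product_embeddings = {
    "product_a": [0.2, 0.1, 0.7, 0.0],
    "product_b": [0.0, 0.9, 0.1, 0.3],
}

# Rank products from most to least similar to the query.
ranked = sorted(
    product_embeddings,
    key=lambda name: cosine_similarity(query_embedding, product_embeddings[name]),
    reverse=True,
)
```

Because each sentence is embedded independently, product embeddings can be computed once and reused for every incoming query.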

Big Bird: Transformers for Longer Sequences

It is shown that BigBird is a universal approximator of sequence functions and is Turing complete, thereby preserving these properties of the quadratic, full attention model.

DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing

This paper presents a new pre-trained language model, DeBERTaV3, which improves the original DeBERTa model by replacing mask language modeling (MLM) with replaced token detection (RTD), a more

COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining

A self-supervised learning framework that pretrains language models by COrrecting and COntrasting corrupted text sequences; it not only outperforms recent state-of-the-art pretrained models in accuracy but also improves pretraining efficiency.

Distilling the Knowledge in a Neural Network

This work shows that the acoustic model of a heavily used commercial system can be significantly improved by distilling the knowledge in an ensemble of models into a single model. It also introduces a new type of ensemble composed of one or more full models and many specialist models, which learn to distinguish fine-grained classes that the full models confuse.
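
The distillation idea can be sketched as a soft-target loss: the student is trained to match the teacher's temperature-softened class probabilities. The code below is a minimal illustration under that assumption, not the training setup of any system mentioned here; the function names and the default temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter output."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    The result is multiplied by temperature**2, a common choice that keeps
    gradient magnitudes comparable when this term is mixed with a
    hard-label loss.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    cross_entropy = -sum(
        p * math.log(q) for p, q in zip(teacher_probs, student_probs)
    )
    return cross_entropy * temperature ** 2


# Hypothetical logits for a three-class problem.
teacher = [1.0, 2.0, 3.0]
loss_when_matching = soft_target_loss([1.0, 2.0, 3.0], teacher)
loss_when_reversed = soft_target_loss([3.0, 2.0, 1.0], teacher)
```

The loss is smallest when the student reproduces the teacher's distribution, so `loss_when_matching` is lower than `loss_when_reversed`.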

Learning to Rank with Nonsmooth Cost Functions

This paper presents LambdaRank, a class of simple, flexible algorithms that avoids the difficulties of nonsmooth cost functions by working with implicit cost functions, using neural network models, and that can be extended to any non-smooth and multivariate cost function.
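
To make the "implicit cost function" idea concrete, the sketch below computes per-document lambda values for a single query: instead of differentiating a ranking metric, each pair of documents contributes a push whose size is scaled by how much swapping the pair would change NDCG. This follows the commonly described NDCG-weighted formulation and is an assumption here, not necessarily the exact form used in the paper; all names and the `sigma` default are hypothetical.

```python
import math


def gain(label):
    """Graded relevance gain used by DCG."""
    return 2 ** label - 1


def discount(rank):
    """Position discount for a zero-based rank."""
    return 1.0 / math.log2(rank + 2)


def lambda_values(scores, labels, sigma=1.0):
    """Return one lambda per document; positive means 'raise this score'."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    rank_of = {doc: rank for rank, doc in enumerate(order)}
    ideal_dcg = sum(
        gain(label) * discount(rank)
        for rank, label in enumerate(sorted(labels, reverse=True))
    )
    lambdas = [0.0] * n
    if ideal_dcg == 0:
        return lambdas  # no relevant documents, nothing to learn from
    for i in range(n):
        for j in range(n):
            if labels[i] <= labels[j]:
                continue  # only pairs where i is more relevant than j
            # |change in NDCG| if documents i and j swapped positions
            delta_ndcg = abs(
                (gain(labels[i]) - gain(labels[j]))
                * (discount(rank_of[i]) - discount(rank_of[j]))
            ) / ideal_dcg
            # Pairwise logistic term: large when i is scored below j
            rho = 1.0 / (1.0 + math.exp(sigma * (scores[i] - scores[j])))
            lambdas[i] += sigma * rho * delta_ndcg
            lambdas[j] -= sigma * rho * delta_ndcg
    return lambdas


# The relevant document (label 1) is currently scored below the irrelevant one.
example = lambda_values(scores=[0.1, 0.9], labels=[1, 0])
```

In this example the first lambda is positive and the second is negative, so a gradient step would move the relevant document up and the irrelevant one down.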