• Corpus ID: 229297535

Ultra-Fast, Low-Storage, Highly Effective Coarse-grained Selection in Retrieval-based Chatbot by Using Deep Semantic Hashing

@article{Lan2020UltraFastLH,
  title={Ultra-Fast, Low-Storage, Highly Effective Coarse-grained Selection in Retrieval-based Chatbot by Using Deep Semantic Hashing},
  author={Tian Lan and Xian-Ling Mao and Xiaoyan Gao and Heyan Huang},
  journal={ArXiv},
  year={2020},
  volume={abs/2012.09647}
}
We study the coarse-grained selection module in the retrieval-based chatbot. Coarse-grained selection is a basic module in a retrieval-based chatbot, which constructs a rough candidate set from the whole database to speed up the interaction with customers. So far, there are two kinds of approaches for coarse-grained selection modules: (1) sparse representation; (2) dense representation. To the best of our knowledge, there is no systematic comparison between these two approaches in retrieval…

Contextual Fine-to-Coarse Distillation for Coarse-grained Response Selection in Open-Domain Conversations
TLDR
A Contextual Fine-to-Coarse (CFC) distilled model for coarse-grained response selection in open-domain conversations, where dense representations of the query, candidate response, and corresponding context are learned based on the multi-tower architecture.
Multiproxies Adaptive Distribution Loss with Weakly Supervised Feature Aggregation for Fine-Grained Retrieval
TLDR
A novel multiproxies adaptive distribution loss which can better characterize the intraclass variations and the degree of dispersion of each cluster center is proposed and a weakly supervised feature aggregation method based on channel weighting is proposed, which distinguishes the importance of different feature channels to obtain more representative image feature descriptors.

References

Distilling Knowledge for Fast Retrieval-based Chat-bots
TLDR
This paper proposes a new cross-encoder architecture and transfers knowledge from this model to a bi-encoder model using distillation, which effectively boosts bi-encoder performance at no cost during inference time.
Multi-Representation Fusion Network for Multi-Turn Response Selection in Retrieval-Based Chatbots
TLDR
This work proposes a multi-representation fusion network where the representations can be fused into matching at an early stage, at an intermediate stage, or at the last stage, and demonstrates the effect of each representation to matching, which sheds light on how to select them in practical systems.
Domain Adaptive Training BERT for Response Selection
TLDR
The powerful pre-trained language model Bidirectional Encoder Representations from Transformers (BERT) is utilized for a multi-turn dialog system, and a highly effective post-training method on a domain-specific corpus is proposed.
A Survey on Learning to Hash
TLDR
This paper presents a comprehensive survey of learning to hash algorithms, categorizes them according to the manner of preserving similarities into pairwise similarity preserving, multiwise similarity preserving, implicit similarity preserving, and quantization, and discusses their relations.
Convolutional Neural Networks for Text Hashing
TLDR
A novel text hashing framework with convolutional neural networks that first embeds the keyword features into compact binary code with a locality preserving constraint and incorporates the implicit features into the explicit features to fit the pretrained binary code.
Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots
TLDR
Experiments show that IMN outperforms the baseline models on all metrics, achieving a new state-of-the-art performance and demonstrating compatibility across domains for multi-turn response selection.
Multi-hop Selector Network for Multi-turn Response Selection in Retrieval-based Chatbots
TLDR
The side effect of using too many context utterances is analyzed, and a multi-hop selector network (MSN) is proposed to alleviate the problem. Results show that MSN outperforms some state-of-the-art methods on three public multi-turn dialogue datasets.
Context-to-Session Matching: Utilizing Whole Session for Response Selection in Information-Seeking Dialogue Systems
TLDR
The response and its context are considered as a whole session, and the task of matching the query's context with the sessions is explored. The proposed context-to-session method significantly outperforms the strong baselines.
Billion-Scale Similarity Search with GPUs
TLDR
This paper proposes a novel design for k-selection that enables the construction of high-accuracy brute-force, approximate, and compressed-domain search based on product quantization, and applies it in different similarity search scenarios.
Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
TLDR
This work develops a new transformer architecture, the Poly-encoder, that learns global rather than token-level self-attention features, and shows that the models achieve state-of-the-art results on four tasks.