Match²: A Matching over Matching Model for Similar Question Identification

  • Zizhen Wang, Yixing Fan, Jiafeng Guo, Liu Yang, Ruqing Zhang, Yanyan Lan, Xueqi Cheng, Hui Jiang, Xiaozhao Wang
  • Published 21 June 2020
  • Computer Science
  • Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval
Community Question Answering (CQA) has become a primary means for people to acquire knowledge, where people are free to ask questions or submit answers. To enhance the efficiency of the service, similar question identification becomes a core task in CQA which aims to find a similar question from the archived repository whenever a new question is asked. However, it has long been a challenge to properly measure the similarity between two questions due to the inherent variation of natural language… 

PerCQA: Persian Community Question Answering Dataset

This paper presents PerCQA, the first Persian dataset for CQA, which contains questions and answers crawled from the most well-known Persian forum. It provides rigorous annotation guidelines developed in an iterative process, followed by the annotation of question-answer pairs in SemEvalCQA format.

A Discriminative Semantic Ranker for Question Retrieval

DenseTrans is a densely connected Transformer, which learns semantic embeddings for texts based on Transformer layers to keep the discriminative power of the learned representations, and can obtain significant gains in recall against strong term-based methods as well as state-of-the-art embedding-based methods.

A lightweight semantic‐enhanced interactive network for efficient short‐text matching

A lightweight Semantic‐Enhanced Interactive Network (SEIN) model for efficient short‐text matching is proposed, which focuses on integrating semantic information and interactive information of text while simplifying the structure of other modules.

Semantic Models for the First-Stage Retrieval: A Comprehensive Review

The current landscape of first-stage retrieval models is described under a unified framework to clarify the connection between classical term-based retrieval methods, early semantic retrieval methods, and neural semantic retrieval methods.

Mining and searching association relation of scientific papers based on deep learning

The research on mining and searching the association relationship of scientific papers based on deep learning has far-reaching practical significance and can help to design applications to serve scientific researchers.

An Efficient and Robust Semantic Hashing Framework for Similar Text Search

A general unsupervised encoder-decoder semantic hashing framework, namely MASH (short for Memory-bAsed Semantic Hashing), is proposed to learn balanced and compact hash codes for similar text search, with the goal of retaining as much semantic information as possible.



Improving Question Retrieval in Community Question Answering Using World Knowledge

This work proposes a way to build a concept thesaurus based on the semantic relations extracted from the world knowledge of Wikipedia and develops a unified framework to leverage these semantic relations in order to enhance the question similarity in the concept space.

Question Retrieval with High Quality Answers in Community Question Answering

A topic-based language model is proposed that matches questions not only at the term level but also at the topic level, and that can significantly outperform state-of-the-art retrieval models in CQA.

Question-answer topic model for question retrieval in community question answering

A novel Question-Answer Topic Model (QATM) is proposed to learn the latent topics aligned across the question-answer pairs to alleviate the lexical gap problem, with the assumption that a question and its paired answer share the same topic distribution.

Approaches to Exploring Category Information for Question Retrieval in Community Question-Answer Archives

This article presents several new approaches to exploiting the category information of questions for improving the performance of question retrieval, and it applies these approaches to existing question retrieval models, including a state-of-the-art question retrieval model.

Thread-Level Information for Comment Classification in Community Question Answering

Two ways of modeling dependencies between the answer labels captured by structured prediction models are explored, showing that the thread-level features consistently improve the performance for a variety of machine learning models, yielding state-of-the-art results.

Towards faster and better retrieval models for question search

This paper proposes a faster and better retrieval model for question search by leveraging user-chosen categories and shows that the proposed techniques are more effective and efficient than a variety of baseline methods.

Learning the Latent Topics for Question Retrieval in Community QA

This paper proposes a topic model that incorporates category information into the process of discovering the latent topics in the content of questions, and combines the latent-topic-based semantic similarity with the translation-based language model in a unified framework for question retrieval.

Question Condensing Networks for Answer Selection in Community Question Answering

This paper proposes the Question Condensing Networks (QCN) to make use of the subject-body relationship of community questions and shows that QCN outperforms all existing models on two CQA datasets.

Adaptive Multi-Attention Network Incorporating Answer Information for Duplicate Question Detection

An answer-information-enhanced adaptive multi-attention network (AMAN) is proposed for this task; it takes full advantage of the semantic information in the paired answers while alleviating the noise introduced by adding the answers.

CQArank: jointly model topics and expertise in community question answering

This work proposes the Topic Expertise Model (TEM), a novel probabilistic generative model with GMM hybrid, to jointly model topics and expertise by integrating textual content modeling and link structure analysis, and proposes CQARank to measure user interests and expertise scores under different topics.