Convex Aggregation for Opinion Summarization

Hayate Iso, Xiaolan Wang, Yoshihiko Suhara, Stefanos Angelidis, Wang-Chiew Tan
Recent approaches for unsupervised opinion summarization have predominantly used the review reconstruction training paradigm. An encoder-decoder model is trained to reconstruct single reviews and learns a latent review encoding space. At summarization time, the unweighted average of latent review vectors is decoded into a summary. In this paper, we challenge the convention of simply averaging the latent vector set, and claim that this simplistic approach fails to consider variations in the… 
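The aggregation step described above can be sketched in a few lines. The encodings, relevance scores, and softmax weighting below are illustrative assumptions for contrast with the unweighted mean, not the paper's actual model:

```python
import numpy as np

# Hypothetical latent encodings of 4 reviews, dimension 3.
Z = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.7, 0.1, 0.2],
    [0.0, 0.9, 0.9],   # an outlier review
])

# Conventional approach: unweighted average of the latent vectors.
z_mean = Z.mean(axis=0)

def convex_combination(Z, scores):
    """Aggregate latent vectors with weights on the probability simplex."""
    w = np.exp(scores - scores.max())
    w /= w.sum()                      # softmax: w >= 0, sum(w) == 1
    return w @ Z

# Illustrative relevance scores that down-weight the outlier review.
scores = np.array([1.0, 1.0, 1.0, -2.0])
z_convex = convex_combination(Z, scores)
```

With uniform scores the convex combination reduces exactly to the unweighted mean; non-uniform scores let the aggregate reflect how representative each review is.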
Comparative Opinion Summarization via Collaborative Decoding
A comparative summarization framework, CoCoSum, is developed, consisting of two base summarization models that jointly generate contrastive and common summaries; it produces higher-quality contrastive and common summaries than state-of-the-art opinion summarization models.
Beyond Opinion Mining: Summarizing Opinions of Customer Reviews
This three-hour tutorial will provide a comprehensive overview of major advances in opinion summarization, including how summarizers can be trained in the unsupervised, few-shot, and supervised regimes.
Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents by Sampling Summary Views
The method, FactorSum, achieves this disentanglement by factorizing summarization into two steps through an energy function: generating abstractive summary views and combining these views into a final summary, following budget and content guidance.
Unsupervised Extractive Opinion Summarization Using Sparse Coding
Semantic Autoencoder (SemAE) performs extractive opinion summarization in an unsupervised manner, using dictionary learning to implicitly capture semantic information from the review text and learning a latent representation of each sentence over semantic units.
Generating Sentences from a Continuous Space
This work introduces and studies an RNN-based variational autoencoder generative model that incorporates distributed latent representations of entire sentences, allowing it to explicitly model holistic properties of sentences such as style, topic, and high-level syntactic features.
Unsupervised Opinion Summarization as Copycat-Review Generation
A generative model for a review collection is defined that capitalizes on the intuition that, when generating a new review given a set of other reviews of a product, one should be able to control the “amount of novelty” going into the new review or, equivalently, vary the extent to which it deviates from the input.
MeanSum: A Neural Model for Unsupervised Multi-Document Abstractive Summarization
This work considers the setting where only documents are available, with no summaries provided, and proposes an end-to-end neural model architecture for unsupervised abstractive summarization; the generated summaries are shown to be highly abstractive, fluent, relevant, and representative of the average sentiment of the input reviews.
LexRank: Graph-based Lexical Centrality as Salience in Text Summarization
LexRank, a new approach for computing sentence importance based on eigenvector centrality in a graph representation of sentences, is considered; the thresholded LexRank method outperforms other degree-based techniques, including continuous LexRank.
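As a concrete sketch, thresholded LexRank amounts to damped power iteration over a row-normalized adjacency matrix built from pairwise sentence similarities. The similarity values and parameters below are illustrative assumptions, not the paper's reference implementation:

```python
import numpy as np

def lexrank(S, threshold=0.1, damping=0.85, iters=100):
    """Thresholded LexRank: eigenvector centrality on a sentence graph.

    S is a symmetric matrix of pairwise sentence similarities (e.g. TF-IDF
    cosine). Edges below `threshold` are dropped, rows are normalized into
    a row-stochastic matrix, and the stationary distribution is found by
    power iteration with PageRank-style damping.
    """
    n = len(S)
    A = (S >= threshold).astype(float)
    np.fill_diagonal(A, 0.0)                      # no self-loops
    row_sums = A.sum(axis=1, keepdims=True)
    # Rows with no surviving edges fall back to the uniform distribution.
    P = np.where(row_sums > 0, A / np.maximum(row_sums, 1e-12), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = (1 - damping) / n + damping * (r @ P)
    return r / r.sum()

# Toy similarity matrix: sentence 0 is similar to both others.
S = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.05],
              [0.4, 0.05, 1.0]])
ranks = lexrank(S)
```

In this toy graph the most central sentence (index 0) receives the highest score, which is exactly the salience signal an extractive summarizer would select on.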
Unsupervised Opinion Summarization with Noising and Denoising
This paper enables the use of supervised learning in the setting where only documents are available, without ground-truth summaries, and introduces several linguistically motivated noise-generation functions together with a summarization model that learns to denoise the input and generate the original review.
Language Models are Unsupervised Multitask Learners
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
OpinionDigest: A Simple Framework for Opinion Summarization
OpinionDigest is an abstractive opinion summarization framework that uses an Aspect-Based Sentiment Analysis model to extract opinion phrases from reviews and trains a Transformer model to reconstruct the original reviews from these extractions.
Unsupervised Opinion Summarization with Content Planning
This work shows that explicitly incorporating content planning in a summarization model not only yields output of higher quality, but also allows the creation of synthetic datasets which are more natural, resembling real world document-summary pairs.
Extractive Opinion Summarization in Quantized Transformer Spaces
The Quantized Transformer is inspired by Vector-Quantized Variational Autoencoders and uses a clustering interpretation of the quantized space and a novel extraction algorithm to discover popular opinions among hundreds of reviews, a significant step towards opinion summarization of practical scope.
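The core vector-quantization step such models rely on, mapping each latent vector to its nearest codebook entry, can be sketched as follows; the toy codebook and latent vectors are illustrative assumptions, not the Quantized Transformer's trained codes:

```python
import numpy as np

def quantize(z, codebook):
    """Replace each latent vector with its nearest codebook entry (L2 distance)."""
    # Pairwise squared distances between latents (m, d) and codes (k, d).
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)            # index of the nearest code per latent
    return codebook[idx], idx

codebook = np.array([[0.0, 0.0], [1.0, 1.0]])   # k = 2 codes, d = 2
z = np.array([[0.1, 0.2], [0.9, 0.8]])          # two latent vectors
quantized, indices = quantize(z, codebook)
```

Clustering sentence representations around such discrete codes is what makes a "popular opinion" identifiable: many sentences mapping to the same code signal a frequently expressed view.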