Corpus ID: 9594633

Cross-Domain Sentiment Classification via Topic-Related TrAdaBoost

Xingchang Huang, Yanghui Rao, Haoran Xie, Tak-Lam Wong, Fu Lee Wang
Cross-domain sentiment classification aims to tag sentiments for a target domain using labeled data from a source domain. Because of the differences between domains, the accuracy of a classifier trained on the source domain may be very low. In this paper, we propose a boosting-based learning framework named TR-TrAdaBoost for cross-domain sentiment classification. We first explore the topic distribution of documents, and then combine it with the unigram TrAdaBoost. The topic distribution captures the domain information of…

Domain Adaptation for Sentiment Analysis: A Survey
Domain adaptation is a useful technique to combat the problem of data scarcity. It has been used for multiple NLP tasks such as part-of-speech tagging, dependency parsing, and named entity recognition.
Distributional Correspondence Indexing for Cross-Lingual and Cross-Domain Sentiment Classification (Extended Abstract)
A new domain adaptation (DA) method called Distributional Correspondence Indexing (DCI) for sentiment classification that derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot.
Enhanced cross-domain sentiment classification utilizing a multi-source transfer learning approach
State-of-the-art performance comparison proved that the cosine similarity-based transfer learning approach outperforms other approaches.
A Comparative Study Of Co-Occurrence Strategies for Building A Cross-Domain Sentiment Thesaurus
Experimental results show that BCP results outperform four baseline co-occurrence calculation methods (PMI, PMI-square, EMI, and G-means) in the task of cross-domain sentiment analysis.
Cross-Domain Sentiment Encoding through Stochastic Word Embedding
This work proposes to explore word polarity and occurrence information through a simple mapping, encode such information more accurately at lower computational cost, and take advantage of the stochastic embedding technique to tackle cross-domain sentiment alignment.
A context-based regularization method for short-text sentiment analysis
This paper uses contextual knowledge obtained from the data to improve the performance of sentiment classification. It incorporates the contextual knowledge as a regularization term in a supervised classification framework, which is then cast as an optimization problem to train a more accurate model.
A Convolution-LSTM-Based Deep Neural Network for Cross-Domain MOOC Forum Post Classification
A transfer learning framework called ConvL, based on a convolutional neural network and a long short-term memory model, is proposed to automatically identify whether a post expresses confusion, determine its urgency, and classify the polarity of its sentiment.
Transductive Learning with String Kernels for Cross-Domain Text Classification
An algorithm composed of two simple yet effective transductive learning approaches to further improve the results of string kernels in cross-domain settings by adapting string kernels to the test set without using the ground-truth test labels is formally described.
Clustering Word Embeddings with Self-Organizing Maps. Application on LaRoSeDa - A Large Romanian Sentiment Data Set
This paper introduces LaRoSeDa, a Large Romanian Sentiment Data Set, which is composed of 15,000 positive and negative reviews collected from the largest Romanian e-commerce platform, and replaces the k-means clustering algorithm with self-organizing maps (SOMs).
Improving the results of string kernels in sentiment analysis and Arabic dialect identification by adapting them to your test set
Two simple yet effective transductive learning approaches are applied to further improve the results of string kernels by adapting them to the test set, and significantly better accuracy rates are reported in English polarity classification and Arabic dialect identification.


Cross-domain sentiment classification via spectral feature alignment
This work develops a general solution to sentiment classification when the authors do not have any labels in a target domain but have some labeled data in a different domain, regarded as source domain and proposes a spectral feature alignment (SFA) algorithm to align domain-specific words from different domains into unified clusters, with the help of domain-independent words as a bridge.
Topic Correlation Analysis for Cross-Domain Text Classification
A novel approach named Topic Correlation Analysis (TCA), which extracts both the shared and the domain-specific latent features to facilitate effective knowledge transfer, is proposed, and the experimental results justify the superiority of the proposed method over the state-of-the-art baselines.
Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification
This work extends to sentiment classification the recently-proposed structural correspondence learning (SCL) algorithm, reducing the relative error due to adaptation between domains by an average of 30% over the original SCL algorithm and 46% over a supervised baseline.
Cross-domain and cross-category emotion tagging for comments of online news
An extensive set of experimental results on four datasets from popular online news services demonstrates the effectiveness of the proposed models in cross-domain emotion tagging for comments of online news in both the scenarios of sharing the same emotion categories or having different categories in the source and target domains.
Latent Dirichlet Allocation
We propose a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams [6], and…
Boosting for transfer learning
This paper presents a novel transfer learning framework called TrAdaBoost, which extends boosting-based learning algorithms and shows that this method can allow us to learn an accurate model using only a tiny amount of new data and a large amount of old data, even when the new data are not sufficient to train a model alone.
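
The instance-reweighting idea behind TrAdaBoost can be illustrated with a short sketch. It follows the commonly cited formulation of the base algorithm (old, source-domain instances are down-weighted when misclassified; new, target-domain instances are up-weighted when misclassified) and is not the topic-related TR-TrAdaBoost variant. The function name `tradaboost_update`, its argument layout, and the toy numbers are assumptions made for illustration only.

```python
import math


def tradaboost_update(weights, errors, n_source, n_rounds):
    """One TrAdaBoost-style reweighting round (illustrative sketch).

    weights  -- current instance weights, source instances first, then target
    errors   -- per-instance loss |h(x_i) - y_i| in [0, 1], same ordering
    n_source -- number of source-domain (old) instances
    n_rounds -- total number of boosting rounds planned
    Returns (new_weights, eps), where eps is the weighted target-domain error.
    """
    target_w = weights[n_source:]
    target_e = errors[n_source:]

    # Weighted error is measured on the target-domain instances only.
    eps = sum(w * e for w, e in zip(target_w, target_e)) / sum(target_w)
    if not 0.0 < eps < 0.5:
        raise ValueError("target error must lie in (0, 0.5) for this update")

    beta_t = eps / (1.0 - eps)  # per-round factor for target instances
    # Fixed factor for source instances; depends on n_source and n_rounds.
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))

    new_weights = []
    for i, (w, e) in enumerate(zip(weights, errors)):
        if i < n_source:
            # Misclassified source instances lose weight (beta <= 1).
            new_weights.append(w * beta ** e)
        else:
            # Misclassified target instances gain weight (beta_t < 1).
            new_weights.append(w * beta_t ** (-e))
    return new_weights, eps


if __name__ == "__main__":
    # Toy example: 2 source and 4 target instances, one mistake in each group.
    w, eps = tradaboost_update(
        weights=[1.0] * 6,
        errors=[1, 0, 1, 0, 0, 0],
        n_source=2,
        n_rounds=10,
    )
    print(eps, w)
```

In a full training loop the weights would be normalized, a base learner would be refit on the combined data each round, and the final prediction would combine the later rounds' hypotheses; those steps are omitted here.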