Addressing Age-Related Bias in Sentiment Analysis

@inproceedings{Diaz2018AddressingAB,
  title={Addressing Age-Related Bias in Sentiment Analysis},
  author={Mark Diaz and Isaac L. Johnson and Amanda Lazar and Anne Marie Piper and Darren Gergle},
  booktitle={Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems},
  year={2018}
}
Computational approaches to text analysis are useful in understanding aspects of online interaction, such as opinions and subjectivity in text. Yet, recent studies have identified various forms of bias in language-based models, raising concerns about the risk of propagating social biases against certain groups based on sociodemographic factors (e.g., gender, race, geography). In this study, we contribute a systematic examination of the application of language models to study discourse on aging… 

Sentiment analysis
Mitigation of Unintended Biases against Non-Native English Texts in Sentiment Analysis
TLDR
A generalisable framework is proposed for measuring and mitigating non-native speaker bias in sentiment analysis; the bias was discovered in lexicon-based sentiment analysis systems and mitigated by updating 4 lexicons for English cognates in 3 languages.
Debiasing Word Embeddings from Sentiment Associations in Names
TLDR
DebiasEmb is proposed, a skip-gram based word embedding approach that, for a given oracle sentiment classification model, will debias the name representations, such that they cannot be associated with either positive or negative sentiment.
Identification of Bias Against People with Disabilities in Sentiment Analysis and Toxicity Detection Models
TLDR
This paper examines sentiment and toxicity analysis models to understand in detail how they discriminate against people with disabilities (PWD), and presents the Bias Identification Test in Sentiments (BITS), a corpus of 1,126 sentences designed to probe sentiment analysis models for disability-related biases.
Multilingual Twitter Corpus and Baselines for Evaluating Demographic Bias in Hate Speech Recognition
TLDR
This work assembles and publishes a multilingual Twitter corpus for the task of hate speech detection with four inferred author demographic factors: age, country, gender and race/ethnicity, measures the performance of four popular document classifiers, and evaluates the fairness and bias of the baseline classifiers on the author-level demographic attributes.
Perturbation Sensitivity Analysis to Detect Unintended Model Biases
TLDR
A generic evaluation framework, Perturbation Sensitivity Analysis, is proposed, which detects unintended model biases related to named entities, and requires no new annotations or corpora to be employed.
Towards an Enhanced Understanding of Bias in Pre-trained Neural Language Models: A Survey with Special Emphasis on Affective Bias
TLDR
The attempt to draw a comprehensive view of bias in pre-trained language models, and especially the exploration of affective bias will be highly beneficial to researchers interested in this evolving field.
Balancing out Bias: Achieving Fairness Through Training Reweighting
TLDR
This paper introduces a very simple but highly effective method for countering bias using instance reweighting, based on the frequency of both task labels and author demographics, and extends the method in the form of a gated model which incorporates the author demographic as an input.
What do Bias Measures Measure?
TLDR
This work presents a comprehensive survey of existing bias measures in NLP as a function of the associated NLP tasks, metrics, datasets, and social biases and corresponding harms and proposes a documentation standard for bias measures to aid their development, categorization, and appropriate usage.

References

SHOWING 1-10 OF 96 REFERENCES
SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods
TLDR
A benchmark comparison of twenty-four popular sentiment analysis methods, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles is presented, highlighting the extent to which the prediction performance of these methods varies considerably across datasets.
VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text
TLDR
Interestingly, using the authors' parsimonious rule-based model to assess the sentiment of tweets, it is found that VADER outperforms individual human raters, and generalizes more favorably across contexts than any of their benchmarks.
Learning Word Vectors for Sentiment Analysis
TLDR
This work presents a model that uses a mix of unsupervised and supervised techniques to learn word vectors capturing semantic term-document information as well as rich sentiment content, and finds it outperforms several previously introduced methods for sentiment classification.
Enhanced Sentiment Learning Using Twitter Hashtags and Smileys
TLDR
A supervised sentiment classification framework which is based on data from Twitter, a popular microblogging service, is proposed, utilizing 50 Twitter tags and 15 smileys as sentiment labels, allowing identification and classification of diverse sentiment types of short texts.
Quantifying and Reducing Stereotypes in Word Embeddings
TLDR
A novel gender analogy task is created and combined with crowdsourcing to systematically quantify the gender bias in a given embedding, and an efficient algorithm is developed that reduces gender stereotype using just a handful of training examples while preserving the useful geometric properties of the embedding.
Lexicon-Based Methods for Sentiment Analysis
TLDR
The Semantic Orientation CALculator (SO-CAL) uses dictionaries of words annotated with their semantic orientation (polarity and strength), and incorporates intensification and negation, and is applied to the polarity classification task.
Opinion mining and sentiment analysis
TLDR
This paper aims to undertake a stepwise methodology to determine the effects of an average person's tweets over fluctuation of stock prices of a multinational firm called Samsung Electronics Ltd.
Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media
TLDR
This paper proposes a framework to quantify these distinct biases, applies it to politics-related queries on Twitter, and finds that both the input data and the ranking system contribute significantly to produce varying amounts of bias in the search results.
#Snowden: Understanding Biases Introduced by Behavioral Differences of Opinion Groups on Social Media
TLDR
A study of 10-month Twitter discussions on the controversial topic of Edward Snowden found that the minority group engaged in a "shared audiencing" practice with more persistent production of original tweets, focusing increasingly on inter-personal interactions with like-minded others.
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
TLDR
This work introduces a Sentiment Treebank that includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality, along with the Recursive Neural Tensor Network.