Tag Me a Label with Multi-arm: Active Learning for Telugu Sentiment Analysis

Sandeep Sricharan Mukku, Subba Reddy Oota, R. Mamidi
Sentiment analysis is one of the most active research areas in natural language processing and an extensively studied problem in data mining, web mining and text mining for the English language. With the proliferation of social media, data in regional languages is growing rapidly alongside English. Telugu is one such regional language with abundant data available on social media, but it is hard to find a labeled training set because human annotation is time-consuming and cost-ineffective…
An adaptable scheme to enhance the sentiment classification of Telugu language
A novel less error pruning-shortest description length (LEP-SDL) method for error pruning and an ant lion boosting model (ALBM) for opinion specification are developed and evaluated to assess the competence of the proposed model.
Light Gradient Boosting Machine for General Sentiment Classification on Short Texts: A Comparative Evaluation
A general multi-class sentiment classifier is proposed using the proven capabilities of Light Gradient Boosting Machine (LGBM) in dealing with high-dimensional and imbalanced data, and an LGBM model is trained to recognize one of three sentiments of tweets: positive, negative, or neutral.
Deep Sentiment Representation Through Char-level CNN and LSTM
A novel deep learning method using CNN and LSTM at the lowest atomic representation was developed. It outperformed existing state-of-the-art methods for a single language, and the model for multiple languages performed better than individual techniques.
Multi-Task Text Classification using Graph Convolutional Networks for Large-Scale Low Resource Language
A supervised graph reconstruction method for Telugu, Multi-Task Text GCN (MT-Text GCN), that simultaneously learns low-dimensional word and sentence graph embeddings from word-sentence graph reconstruction using a graph autoencoder (GAE) and performs multi-task text classification using these latent sentence graph embeddings.
Learning Representations for Text Classification of Indian Languages
This thesis presents a twin Bidirectional Long Short Term Memory (Bi-LSTM) network with shared parameters consolidated by a contrastive loss function (based on a similarity metric) to classify text into multiple categories based on its features.
Am I a Resource-Poor Language? Data Sets, Embeddings, Models and Analysis for four different NLP tasks in Telugu Language
These representations significantly improve the performance of four NLP tasks; the authors present benchmark results for Telugu and argue that the pretrained embeddings are competitive with or better than the existing multilingual pretrained models: mBERT, XLM-R, and IndicBERT.
Emotions are Universal: Learning Sentiment Based Representations of Resource-Poor Languages using Siamese Networks
Experiments reveal that SNASA outperforms the state-of-the-art sentiment analysis approaches based on distributional semantics, semantic rules, lexicon lists and deep neural network representations without sh.
A Novel Ant Colony Based DBN Framework to Analyze the Drug Reviews
Nazia Tazeen, K. Rani. Computer Science. International Journal of Intelligent Systems and Applications, 2021.
A novel Ant Colony-based Deep Belief Neural Network (AC-DBN) framework is proposed, and drug review tweets are used to perform sentiment classification with the proposed framework in a Python environment.
Contrastive Learning of Emoji-based Representations for Resource-Poor Languages
Experiments on large-scale Twitter datasets of resource-rich languages (English and Spanish) and resource-poor languages (Hindi and Telugu) reveal that CESNA outperforms the state-of-the-art emoji prediction approaches based on distributional semantics, semantic rules, lexicon lists and deep neural network representations without shared parameters.


A machine learning approach to sentiment analysis in multilingual Web texts
This paper presents machine learning experiments with regard to sentiment analysis in blog, review and forum texts found on the World Wide Web and written in English, Dutch and French and investigates the role of active learning techniques for reducing the number of examples to be manually annotated.
Active deep learning method for semi-supervised sentiment classification
Enhanced Sentiment Classification of Telugu Text using ML Techniques
Various Machine Learning techniques are explored for the classification of Telugu sentences into positive or negative polarities.
Shared Task on Sentiment Analysis in Indian Languages (SAIL) Tweets - An Overview
This is the first attempt at a sentiment analysis task on tweets for three Indian languages, namely Bengali, Hindi and Tamil; the main objective was to classify the tweets into positive, negative, and neutral polarity.
Active Learning for Imbalanced Sentiment Classification
This paper proposes a novel active learning approach, named co-selecting, that employs two feature subspace classifiers to collectively select the most informative minority-class samples for manual annotation, leveraging a certainty measurement and an uncertainty measurement to reduce human annotation effort.
Support Vector Machine Active Learning with Applications to Text Classification
Experimental results showing that employing the active learning method can significantly reduce the need for labeled training instances in both the standard inductive and transductive settings are presented.
Curious machines: active learning with structured instances
This thesis explores several important questions regarding active learning for tasks involving key person and organization names from text documents and the utility and promise of active learning algorithms in complex real-world learning systems.
An Analysis of Active Learning Strategies for Sequence Labeling Tasks
This paper surveys previously used query selection strategies for sequence models, and proposes several novel algorithms to address their shortcomings, and conducts a large-scale empirical comparison.
Active Learning Literature Survey
This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.
Active Learning
The key idea behind active learning is that a machine learning algorithm can perform better with less training if it is allowed to choose the data from which it learns. An active learner may pose
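
The key idea quoted above, that the learner chooses which examples get labeled, can be illustrated with a minimal sketch. It assumes a pool-based setting and the least-confidence heuristic, which is one common query strategy and not necessarily the one used by any paper listed here. The function name `least_confident_index` and the example probabilities are hypothetical.

```python
from typing import Sequence


def least_confident_index(class_probs: Sequence[Sequence[float]]) -> int:
    """Return the index of the pool item whose top class probability is lowest."""
    if not class_probs:
        raise ValueError("the unlabeled pool is empty")
    # Confidence of a prediction = probability of its most likely class.
    confidences = [max(probs) for probs in class_probs]
    # Query the item the model is least sure about (ties go to the first one).
    return min(range(len(confidences)), key=confidences.__getitem__)


# Hypothetical model outputs for three unlabeled sentences,
# given as (P(negative), P(positive)). These numbers are made up.
pool_probs = [(0.95, 0.05), (0.55, 0.45), (0.10, 0.90)]
query = least_confident_index(pool_probs)
print(query)  # prints 1: the second sentence is the most uncertain
```

In a full loop, the selected item would be sent to a human annotator, added to the labeled set, and the model retrained before the next query.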