UIT-VSFC: Vietnamese Students’ Feedback Corpus for Sentiment Analysis

Kiet Van Nguyen, Vu Duc Nguyen, Phu X. V. Nguyen, Tham T. H. Truong, Ngan Luu-Thuy Nguyen
2018 10th International Conference on Knowledge and Systems Engineering (KSE)

Students’ feedback is a vital resource for interdisciplinary research that combines two fields: sentiment analysis and education. […] The resource consists of over 16,000 sentences, which are human-annotated on the two tasks. To assess the quality of our corpus, we measure the inter-annotator agreement and the classification accuracy on UIT-VSFC. As a result, we achieved an inter-annotator agreement of 91.20% for the sentiment-based task and 71.07% for the topic-based task. In…
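
The abstract reports inter-annotator agreement as a percentage but, in this excerpt, does not say which agreement measure was used. As a minimal sketch, assuming two annotators who each label the same sentences, the following shows two common options: raw percent agreement and Cohen's kappa, which corrects for chance. The label names and the toy annotations are hypothetical and are not taken from UIT-VSFC.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of items on which two annotators chose the same label."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("annotations must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Expected agreement if both annotators labelled independently,
    # each following their own label distribution.
    labels = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[l] * counts_b[l] for l in labels) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy annotations (not real corpus data).
annotator_1 = ["positive", "negative", "neutral", "positive", "negative"]
annotator_2 = ["positive", "negative", "positive", "positive", "negative"]

agreement = percent_agreement(annotator_1, annotator_2)
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"percent agreement: {agreement:.2%}")
print(f"Cohen's kappa:     {kappa:.3f}")
```

Percent agreement is easy to read but is inflated when one label dominates, which is why a chance-corrected measure such as kappa is often reported alongside it.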


A new dataset of students’ feedback for aspect category detection and aspect-sentiment classification is presented, along with a series of experiments on the dataset using a combined BiLSTM-CNN model, compared with other machine learning approaches.

Sentiment Analysis Implementing BERT-based Pre-trained Language Model for Vietnamese

This work studies a sentiment analysis model built on PhoBERT, a pre-trained model that is a robust optimization of the prominent BERT model for Vietnamese, and employs alternative fine-tuning techniques to generalize the model to multi-class classification beyond the binary task.

Variants of Long Short-Term Memory for Sentiment Analysis on Vietnamese Students’ Feedback Corpus

According to the experimental results, the Dependency Tree-LSTM was not better than the LSTM model; however, combining the final hidden state vectors of the LSTM and Dependency Tree-LSTM models with a Support Vector Machine classifier yields an F1-score of 90.2%, which is higher than the performance of the LSTM model.

Sentiment Analysis of Students’ Feedback with NLP and Deep Learning: A Systematic Mapping Study

The mapping results showed that the field is rapidly growing, especially regarding the application of DL, which is the most recent trend, and various aspects that need to be considered in order to contribute to the maturity of research and development in the field were identified.

A corpus for aspect-based sentiment analysis in Vietnamese

A new annotated corpus for studies on the two sub-tasks: aspect detection and polarity detection is presented, including 7,828 restaurant reviews at document-level and a supervised learning method with rich features.

SA2SL: From Aspect-Based Sentiment Analysis to Social Listening System for Business Intelligence

This paper creates UIT-ViSFD, a Vietnamese Smartphone Feedback Dataset, as a new benchmark corpus built on a strict annotation scheme for evaluating aspect-based sentiment analysis. It consists of 11,122 human-annotated comments for mobile e-commerce and is freely available for research purposes.

Empirical Study of Text Augmentation on Social Media Text in Vietnamese

The data augmentation techniques are applied to solve the imbalance problem between classes of the dataset, increasing the prediction model's accuracy.


A new representation learning model, called a two-channel vector, is proposed to learn higher-level features of a document for SC; it can enhance the accuracy of SC problems compared to two single models and three state-of-the-art ensemble methods.

An Evaluation of the UIT-VSFC Dataset Using Modern Machine Learning Techniques and Word Embeddings

Deep neural network and recurrent neural network models are developed employing a word embedding method for two different tasks, i.e., topic and polarity classification, and experimental results show that DNN models outperform RNN models.

Emotion Recognition for Vietnamese Social Media Text

A standard Vietnamese Social Media Emotion Corpus (UIT-VSMEC) with exactly 6,927 emotion-annotated sentences is built, contributing to emotion recognition research in Vietnamese which is a low-resource language in natural language processing (NLP).



Mining Vietnamese Comparative Sentences for Sentiment Analysis

This paper presents a general framework for mining Vietnamese comparative sentences, introduces a new corpus for the task in Vietnamese, and conducts a series of experiments on that corpus to investigate the task in both linguistic and modeling aspects.

Sentiment Analysis for Vietnamese

  • Binh Thanh Kieu, S. Pham
  • Computer Science
  • 2010 Second International Conference on Knowledge and Systems Engineering
  • 2010
This paper addresses the problem at the sentence level and builds a rule-based system using the Gate framework, the first work that analyzes sentiment at sentence level in Vietnamese.

An empirical study on sentiment analysis for Vietnamese

An empirical study on machine learning based sentiment analysis for Vietnamese focuses on the task of sentiment classification, and introduces an annotated corpus for sentiment classification extracted from hotel reviews in Vietnamese.

Twitter as a Corpus for Sentiment Analysis and Opinion Mining

A novel solution to target-oriented sentiment summarization and SA of short informal texts with a main focus on Twitter posts known as “tweets” is introduced and it is shown that the hybrid polarity detection system not only outperforms the unigram state-of-the-art baseline, but also could be an advantage over other methods when used as a part of a sentiment summarizing system.

SemEval-2013 Task 2: Sentiment Analysis in Twitter

Crowdsourcing on Amazon Mechanical Turk was used to label a large Twitter training dataset, along with additional test sets of Twitter and SMS messages, for the task's two subtasks: A, an expression-level subtask, and B, a message-level subtask.

SemEval-2016 Task 4: Sentiment Analysis in Twitter

SemEval-2016 Task 4, the fourth year of the Sentiment Analysis in Twitter task, comprises five subtasks, three of which represent a significant departure from previous editions; the task continues to be very popular, attracting a total of 43 teams.

SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining

This work discusses SENTIWORDNET 3.0, a lexical resource explicitly devised for supporting sentiment classification and opinion mining applications, and reports on the improvements it embodies with respect to version 1.0.

SentBuk: Sentiment analysis for e-learning environments

SentBuk is a Facebook application that extracts information about the user sentiment automatically, in a non-intrusive way, so that adaptive e-learning systems can adapt any of their aspects according to each student sentiment, among other criteria.

Sentiment analysis in Facebook and its application to e-learning

Annotating Expressions of Opinions and Emotions in Language

The manual annotation process and the results of an inter-annotator agreement study on a 10,000-sentence corpus of articles drawn from the world press are presented.