Corpus ID: 1452658

Short Text Classification on Complaint Documents

@article{Hayati2016ShortTC,
  title={Short Text Classification on Complaint Documents},
  author={Shirley Anugrah Hayati and Alfan Farizki Wicaksono and Mirna Adriani},
  journal={Int. J. Comput. Linguistics Appl.},
  year={2016},
  volume={7},
  pages={129--143}
}
The Indonesian government has developed a system for citizens to voice their aspirations and complaints, which are then stored in the form of short documents. Unfortunately, the existing system employs human annotators to manually categorize the short documents, which is very expensive and time-consuming. Automatically classifying the short documents into their correct topics would therefore reduce manual work and increase the efficiency of the task. In this paper, we propose…
Automatic classification of complaint letters according to service provider categories
TLDR
The automatic text classification of complaint letters written in Hebrew that were sent to various companies from a wide variety of categories revealed that the most significant issues were related to poor service and delayed delivery.
Automatically Coding Occupation Titles to a Standard Occupation Classification
TLDR
This work implements flat and hierarchical models using Naïve Bayes, Maximum Entropy (MaxEnt), Support Vector Machines (SVM), and Convolutional Neural Networks (CNN) to code job titles to SOC, and shows that MaxEnt, SVM, and CNN perform similarly and are better than Naïve Bayes at coding job titles to SOC.

References

SHOWING 1-10 OF 17 REFERENCES
Short text classification based on strong feature thesaurus
TLDR
This paper presents a new method to tackle the problem of data sparseness in the classification of short texts using statistical methods by building a strong feature thesaurus (SFT) based on latent Dirichlet allocation (LDA) and information gain (IG) models.
Transductive LSI for Short Text Classification Problems
TLDR
This paper presents work that uses Transductive Latent Semantic Indexing (LSI) for text classification, and shows that tailoring the SVD process to the test examples can be even more useful than adding additional training data.
Automatic multilabel categorization using learning to rank framework for complaint text on Bandung government
  • Ahmad Fauzan, M. L. Khodra
  • Computer Science
  • 2014 International Conference of Advanced Informatics: Concept, Theory and Application (ICAICTA)
  • 2014
TLDR
The experiment results show that LambdaMART, which is a listwise approach in learning to rank, is the best algorithm for classifying the primary agency and the secondary agencies for complaint text.
Text Classification based on the Latent Topics of Important Sentences extracted by the PageRank Algorithm
TLDR
A method to raise the accuracy of text classification based on latent topics, reconsidering the techniques necessary for good classification by employing the k-means algorithm and investigating how it works for good clustering.
The use of bigrams to enhance text categorization
TLDR
An efficient text categorization algorithm is presented that generates bigrams selectively, looking for ones that have an especially good chance of being useful by using the information gain metric combined with various frequency thresholds.
Using Bigrams in Text Categorization
In the past decade a sufficient effort has been expended on attempting to come up with a document representation which is richer than the simple Bag-Of-Words (BOW). One of the widely explored…
Machine learning in automated text categorization
TLDR
This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.
Indonesian Twitter text authority classification for government in Bandung
  • Janice Laksana, A. Purwarianti
  • Computer Science
  • 2014 International Conference of Advanced Informatics: Concept, Theory and Application (ICAICTA)
  • 2014
TLDR
An automatic authority classification for Twitter text in Indonesian is proposed as part of the complaint management system; the best experimental result was achieved by the feature combination of 1-gram and complaint word, with Support Vector Machine and Label Power Set as the algorithm.
Concept-based Short Text Classification and Ranking
TLDR
This paper proposes using "Bag-of-Concepts" in short text representation, aiming to avoid the surface mismatching and handle the synonym and polysemy problem, and proposes a novel framework for lightweight short text classification applications.
Learning to classify short and sparse text & web with hidden topics from large-scale data collections
TLDR
A general framework for building classifiers that deal with short and sparse text and Web segments by making the most of hidden topics discovered from large-scale data collections; the framework is general enough to be applied to different data domains and genres, ranging from Web search results to medical text.