Automatic Multiclass Document Classification of Hindi Poems using Machine Learning Techniques

  • Kaushika Pal, Biraj V. Patel
  • Published 1 June 2020
  • Computer Science
  • 2020 International Conference for Emerging Technology (INCET)
Text classification for Indic languages faces fundamental challenges in achieving good accuracy, because the languages are morphologically rich and a great deal of information is fused into individual words. This paper demonstrates an implemented experiment that classifies Hindi poem documents into three classes: Shringar, Karuna and Veera. Poem content represents mood and has sentiments associated with it; the classification of emotions becomes more challenging when the language is… 
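
The truncated abstract does not say which features or classifiers the authors used, so the following is only a minimal sketch of three-class poem classification under stated assumptions: bag-of-words features, a multinomial Naive Bayes classifier with add-one smoothing, and a made-up toy corpus (`TOY_POEMS`). None of the function names, data or design choices come from the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: invented keyword strings, NOT the paper's dataset.
TOY_POEMS = [
    ("प्रेम प्रिय मिलन चाँदनी", "Shringar"),
    ("आँसू दुख विरह शोक", "Karuna"),
    ("वीर रण तलवार साहस", "Veera"),
]

def tokenize(text):
    """Split on whitespace after replacing the danda and basic punctuation."""
    for mark in "।,.!?":
        text = text.replace(mark, " ")
    return text.split()

def train(labelled_docs):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_docs:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab

def predict(model, text):
    """Return the class with the highest smoothed log-posterior."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # class prior
        class_total = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            smoothed = (word_counts[label][tok] + 1) / (class_total + len(vocab))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(TOY_POEMS)
print(predict(model, "रण में वीर का साहस"))
```

Whitespace tokenization treats every inflected form as a separate word, which is exactly where the morphological richness mentioned in the abstract hurts a bag-of-words model; a real system would likely need stemming or other normalization before counting.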


Punjabi Poetry Classification: The Test of 10 Machine Learning Algorithms
Results for Punjabi poetry classification revealed that 4 machine learning algorithms, namely Hyperpipes (HP), K-nearest neighbour (KNN), Naive Bayes (NB) and Support Vector Machine (SVM), with an accuracy of 50.63 %, 52.75 % and 58.79 % respectively, outperformed all other machine learning algorithms under the test.
Multi-Class Document Classification: Effective and Systematized Method to Categorize Documents
This research work combines Natural Language Processing and Machine Learning approaches for content-based classification of documents, and classifies documents with more than 70% accuracy for major Indian languages and more than 80% accuracy for English.
A Study of Text Classification Natural Language Processing Algorithms for Indian Languages
This study shows that supervised learning algorithms (Naive Bayes (NB), Support Vector Machine (SVM), Artificial Neural Network (ANN), and N-gram) performed better for the text classification task.
An Efficient Hindi Text Classification Model Using SVM
A Hindi text classification model is proposed, which accepts a set of known Hindi documents, preprocesses them at the document, sentence and word levels, extracts features, and trains an SVM classifier, which then classifies a set of unknown Hindi documents.
Multiclass classification and class based sentiment analysis for Hindi language
A model for classification of Hindi speech documents into multiple classes with the help of ontology is proposed and sentiment analysis is carried out using HindiSentiWordNet (HSWN) to determine the polarity of individual class.
Handwritten Hindi character recognition using k-means clustering and SVM
  • Akanksha Gaur, S. Yadav
  • Computer Science
    2015 4th International Symposium on Emerging Trends and Technologies in Libraries and Information Services
  • 2015
Recognition of Hindi characters is done by using a three step procedure, in which binarization of the image and separations of characters are performed.
Classification of children stories in hindi using keywords and POS density
This paper proposes a framework for classifying Hindi stories into three genres (fable, folk-tale and legend) using keyword and part-of-speech (POS) based features.
Model for Classification of Poems in Hindi Language Based on Ras
The developed model will classify poems into the Shringar, Hasya, Adbhuta, Shanta, Raudra, Veera, Karuna, Bhayanaka and Vibhasta rasas, using a mix of part-of-speech-based features and emotional
Emotion-specific features for classifying emotions in story text
The importance of story genre information in emotion classification was observed from the experiments conducted on classifying emotions within story genre, and SVM models outperformed other models in terms of classification accuracy.
Performance analysis of flexible zone based features to classify Hindi numerals
The performance of fixed boundary and flexible boundary is evaluated and performance for SVM is better than SVM for recognition of the digits and MLP based classifier is used.