Computational linguistic prosody rule-based unified technique for automatic metadata generation for Hindi poetry

Milind Kumar Audichya and Jatinderkumar R. Saini
2019 1st International Conference on Advances in Information Technology (ICAIT)
Generating metadata for poems from a unified set of rules is a complex problem in computational linguistics, and it is more tedious still for Hindi poems. Prosody, or ‘Chhand’ as it is called in Hindi, consists of several sets of rules that are used in constructing a Hindi poem. Currently, no metadata generator or technique exists that can generate metadata for Hindi poetry based on prosody or any other Hindi grammatical rule. In…

Stanza Type Identification using Systematization of Versification System of Hindi Poetry
The paper covers various challenges and the best possible solutions to them, describes the methodology for automatically generating metadata for “Chhand” from a poem's stanzas, and provides some advanced information and techniques for metadata generation for “Muktak Chhands”.
Towards Natural Language Processing with Figures of Speech in Hindi Poetry
This work, the first of its kind in Hindi Natural Language Processing (NLP), addresses Hindi figures of speech: it creates a systematic hierarchical structure of Hindi “Alankaar” types and sub-types and extends the work to identify a few of them.
Hindi Multi-document Word Cloud based Summarization through Unsupervised Learning
  • P. Bafna, Jatinderkumar R. Saini
  • Computer Science
    2019 9th International Conference on Emerging Trends in Engineering and Technology - Signal and Information Processing (ICETET-SIP-19)
  • 2019
The objective is to manage documents and summarize a Hindi corpus by applying token extraction and document clustering, using TF-IDF, cosine-based document similarity measures, and cluster dendrograms, in addition to various other Natural Language Processing (NLP) activities.
On Exhaustive Evaluation of Eager Machine Learning Algorithms for Classification of Hindi Verses
Text classification algorithms, along with Natural Language Processing (NLP), facilitate a fast, cost-effective, and scalable solution for the classification and prediction of verses in a Hindi corpus.
Hindi Poetry Classification using Eager Supervised Machine Learning Algorithms
Two eager machine learning algorithms are applied to a corpus of 450 Hindi poems, and each poem is classified based on the terms present in it, using misclassification error.
Hindi Verse Class Predictor using Concept Learning Algorithms
In this paper, 565 Hindi poems are classified based on four topics using lazy machine learning algorithms, namely K-nearest neighbours and regression; K-nearest neighbours performs better than linear regression.


Computational linguistics for metadata building (CLiMB): using text mining for the automatic identification, categorization, and disambiguation of subject terms for image metadata
A system using computational linguistic techniques to extract metadata for image access, developed in the Computational Linguistics for Metadata Building research project, and tested components, including phrase finding for the art and architecture domain, functional semantic labeling using machine learning, and disambiguation of terms in domain-specific text.
Metadata for content description in legal information
  • M. Sagri, D. Tiscornia
  • Computer Science
    14th International Workshop on Database and Expert Systems Applications, 2003. Proceedings.
  • 2003
The subject of this paper is a description of the Jur-Wordnet (Jur-IWN) project, an extension for legal domain of the Italian version of EuroWordNet database, linked to the Interlingual Index (ILI)
Punjabi Poetry Classification: The Test of 10 Machine Learning Algorithms
Results for Punjabi poetry classification revealed that 4 machine learning algorithms, namely Hyperpipes (HP), K-nearest neighbour (KNN), Naive Bayes (NB) and Support Vector Machine (SVM), with an accuracy of 50.63 %, 52.75 % and 58.79 % respectively, outperformed all other machine learning algorithms under test.
Sanskrit computational linguistics : third international symposium, Hyderabad, India, January 15-17, 2009 : proceedings
This chapter discusses Pāṇini's Grammar and Its Computerization: A Construction Grammar Approach, Annotating Sanskrit Texts Based on Śābdabodha Systems, and Translation Divergence in English-Sanskrit-Hindi Language Pairs.
Using Automatic Metadata Extraction to Build a Structured Syllabus Repository
This paper proposes an intelligent approach to automatically annotating freely available syllabi from the Web, to benefit the educational community through supporting services such as semantic search, and demonstrates the effectiveness of the extractor.
Rule-based word clustering for document metadata extraction
This paper introduces a domain rule-based word clustering method for cluster feature representation, formed from various domain databases and word orthographic properties, that outperforms distributional word clusters in the context of document metadata extraction.
Hanuman Chalisa - Wikipedia
  • Wikimedia Foundation, Inc., 27-Nov-2005. [Online]. Available: [Accessed: 07-Mar-2019]
  • 2005
PuPoCl: Development of Punjabi Poetry Classifier Using Linguistic Features and Weighting
  • INFOCOMP, vol. 16, no. 1–2, pp. 1–7, Dec. 2017 [Online]. Available: [Accessed: 12-Apr-2019]
  • 2017
वैदिक छंद - विकिपीडिया (Vedic Chhand - Wikipedia)
  • Wikimedia Foundation, Inc., 24-Dec-2014. [Online]. Available: 6%E0%A4%BF%E0%A4%95_%E0%A4%9B%E0%A4%82%E0%A4%A6. [Accessed: 09-Mar-2019]
  • 2014