A deep learning approach for Malayalam morphological analysis at character level

  • B. Premjith, K. Soman, M. A. Kumar
  • Procedia Computer Science
Abstract: Morphological analysis is one of the fundamental tasks in the computational processing of natural languages. It is the study of the rules of word construction through analysis of syntactic properties and morphological information. To perform this task, morphemes have to be separated from the original word, a process termed sandhi splitting. Sandhi splitting is important in the morphological analysis of agglutinative languages like Malayalam, because of the richness in…
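As a rough illustration of treating morpheme separation as a character-level labelling task, the sketch below cuts a word wherever a per-character label marks a boundary. The function name `split_by_labels`, the 0/1 label scheme, and the toy input are assumptions made for illustration; the truncated abstract does not state the labelling scheme the authors use, and real sandhi splitting also has to undo sound changes at the boundary, which this sketch ignores.

```python
def split_by_labels(word, labels):
    """Cut `word` after every character whose label is 1.

    `labels` holds one 0/1 value per character; in practice these would be
    predicted by a trained character-level tagger (hypothetical here).
    """
    if len(word) != len(labels):
        raise ValueError("need exactly one label per character")
    segments, start = [], 0
    for i, label in enumerate(labels):
        if label == 1:
            # Close the current segment at this character (inclusive).
            segments.append(word[start:i + 1])
            start = i + 1
    if start < len(word):
        # Keep whatever follows the last marked boundary.
        segments.append(word[start:])
    return segments


# Toy, non-linguistic input used only to show the mechanics.
segments = split_by_labels("abcdef", [0, 0, 1, 0, 0, 0])
```

Framing the task this way lets any sequence labeller (RNN, LSTM, GRU) be trained on pairs of words and boundary labels.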
Kazakh Language Open Vocabulary Language Model with Deep Neural Networks
This work builds a character-based language model for the Kazakh language using deep neural networks, specifically a Long Short-Term Memory model, with the aim of producing all possible correct words within the given context.
Deep learning based Character-level approach for Morphological Inflection Generation
A computational model for word formation in Sanskrit is proposed using deep learning based models to attain the morphological changes that a root word undergoes to result in the surface form.
A Sequential Labelling Approach for the Named Entity Recognition in Arabic Language Using Deep Learning Algorithms
A deep learning based approach for Arabic NER which makes use of well-known deep neural network architectures such as the recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU), along with stacked and bidirectional versions of these three architectures.
A Deep Learning Approach for Part-of-Speech Tagging in Nepali Language
A deep learning based POS tagger for Nepali text is proposed, built using Recurrent Neural Network, Long Short-Term Memory, and Gated Recurrent Unit models and their bidirectional variants. It shows significant improvement and outperforms state-of-the-art POS taggers with more than 99% accuracy.
Feedforward Approach to Sequential Morphological Analysis in the Tagalog Language
  • Arian Yambao, C. Cheng
  • Computer Science
  • 2020 International Conference on Asian Language Processing (IALP)
  • 2020
This paper presents a Tagalog morphological analyzer implemented using a Feed Forward Neural Network (FFNN) model and uses it to assess the model's capability of learning the language’s inflections.
ThamizhiMorph: A morphological parser for the Tamil language
This paper describes how ThamizhiMorph is designed using a Finite-State Transducer (FST) and implemented using Foma, and specifies a high-level meta-language to efficiently characterise the language’s inflectional morphology.
Embedding Linguistic Features in Word Embedding for Preposition Sense Disambiguation in English - Malayalam Machine Translation Context
The study showed that the classification accuracy is higher when both verb and noun class features are taken into consideration, and the same trend was observed when the training data contained only noun class features, i.e., noun class features dominate the verb class features.
A deep learning based Part-of-Speech (POS) tagger for Sanskrit language by embedding character level features
Various deep learning algorithms are used for implementing a POS tagger for Sanskrit using both unidirectional and bidirectional forms of RNN, LSTM and GRU, which outperformed all word level implementations in terms of accuracy, number of trainable parameters and storage requirement.
A Deep Learning based Interlingua Representation for Malayalam Documents
Compact representations of sentences, such as feature vectors, offer a better understanding of sentence formation. The majority of applications in natural language processing often require the help of such…
SinSpell: A Comprehensive Spelling Checker for Sinhala
It is found that the most common errors in a corpus of Sinhala documents were in vowel length and similar sounding letters, and this analysis was used to develop the suggestion generator and auto-corrector.


A sequence labeling approach to morphological analyzer for Tamil language
A novel approach is proposed to solve the morphological analysis problem using a machine learning methodology based on sequence labeling and training by kernel methods, which captures the nonlinear relationships of the morphological features from training data samples in a better and simpler way.
A Rule Based Approach for Root Word Identification in Malayalam Language
The Root Word Identifier proposed in this work is a rule based approach which automatically removes the inflected part and derives the root words using morphophonemic rules.
Malayalam Noun and Verb Morphological Analyzer: A Simple Approach
The grammatical behavior of the language, the formation of words with multiple suffixes and the preparation of the language are dealt with here, with examples of noun and verb forms in detail.
Long Short-Term Memory
A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
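The constant error carousel idea can be pictured with a single-unit memory cell whose state is carried forward with a fixed weight of 1. The following is a minimal sketch under simplifying assumptions: the function names, the weight dictionary keys, and the choice of sigmoid and tanh squashing functions are illustrative and are not taken from the summary above. The sketch has input and output gates only, with no forget gate.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit memory cell (scalar input and state).

    `w` is a dict of hypothetical scalar weights and biases.
    """
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate input
    # Carousel: the previous state is added back with weight 1, so it
    # persists unchanged whenever the input gate is closed (i near 0).
    c = c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# With a strongly negative input-gate bias the gate stays shut and the
# stored state survives many steps.
weights = {"wi": 0.1, "ui": 0.1, "bi": -50.0,
           "wo": 0.1, "uo": 0.1, "bo": 0.0,
           "wg": 0.1, "ug": 0.1, "bg": 0.0}
h, c = 0.0, 0.5
for _ in range(1000):
    h, c = lstm_step(1.0, h, c, weights)
```

Because the state update is additive rather than multiplicative, the stored value (and the error signal flowing back through it) neither decays nor grows while the gate is closed, which is what allows long time lags to be bridged.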