M. Anand Kumar

Short texts are posted and updated constantly. With the global upswing of such micro posts, the need to retrieve information from them has become pressing. This work focuses on knowledge extraction from micro posts, using entities as evidence. The extracted entities are then linked to their relevant DBpedia source by featurization, Part...
The objective of this experiment is to validate the performance of a distributional semantic representation of text on a question classification task and an information retrieval task. Following the distributional representation, first-level classification of the questions is performed, and relevant tweets with respect to the given...
This paper presents a morphological analyzer that uses a machine learning approach for complex agglutinative natural languages. Morphological analysis is concerned with retrieving the structure, the syntactic and morphological properties, or the meaning of a morphologically complex word. The morphological structure of an agglutinative language is unique and capturing...
Author attribution has become a more challenging area over the past decade. It is now an essential task in sectors such as forensic analysis, law, and journalism, as it helps to identify the author of a document. Here, unigram/bigram features along with latent semantic features from word space were taken and the similarity...
Clause boundary identification is an important task in natural language processing. Identifying the clauses in a sentence becomes difficult when clauses are embedded inside other clauses. In our approach, we use a dependency parser to identify clause boundaries. The dependency tag set contains 11 tags and is useful...
This work was carried out as part of the shared task on Entity Extraction from Social Media Text Indian Languages at the Forum for Information Retrieval and Evaluation (FIRE 2015). People now use social media platforms such as Facebook and Twitter extensively to exchange their thoughts. Twitter messages are growing rapidly and their style...
This paper implements Named Entity Recognition (NER) for four languages: English, Tamil, Hindi, and Malayalam. The results obtained from this work were submitted to the research evaluation workshop Forum for Information Retrieval and Evaluation (FIRE 2014). The system detects three levels of named entity tags, which are referred to as nested named...
Transliteration is the process of replacing the characters in one language with the corresponding phonetically equivalent characters of another language. India is a linguistically diverse country where people speak and understand many languages but do not know the script of some of them. Transliteration plays a major role in such cases.
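
To make the character-replacement idea in the transliteration abstract concrete, here is a minimal rule-based sketch that maps a handful of Devanagari characters to Latin letters. It is illustrative only and is not the method of the paper, whose abstract is truncated here. The mapping tables, the romanization choices ("aa", "ii") and the function name `transliterate` are all assumptions, and Hindi schwa deletion is deliberately not modeled.

```python
# Toy Devanagari-to-Latin transliteration (hypothetical tables, tiny coverage).
CONSONANTS = {
    "\u0915": "k",  # KA
    "\u092e": "m",  # MA
    "\u0932": "l",  # LA
    "\u0928": "n",  # NA
    "\u0930": "r",  # RA
}
VOWEL_SIGNS = {
    "\u093e": "aa",  # vowel sign AA
    "\u093f": "i",   # vowel sign I
    "\u0940": "ii",  # vowel sign II
    "\u0941": "u",   # vowel sign U
}
INDEPENDENT_VOWELS = {
    "\u0905": "a",   # letter A
    "\u0906": "aa",  # letter AA
    "\u0907": "i",   # letter I
}
VIRAMA = "\u094d"  # suppresses the inherent vowel of the preceding consonant


def transliterate(text: str) -> str:
    """Replace each known Devanagari character with a Latin equivalent."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in VOWEL_SIGNS:
                # An explicit vowel sign replaces the inherent vowel.
                out.append(VOWEL_SIGNS[nxt])
                i += 1
            elif nxt == VIRAMA:
                # Virama: the consonant carries no vowel at all.
                i += 1
            else:
                # No sign follows, so emit the inherent vowel "a".
                out.append("a")
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        else:
            # Characters outside the toy tables pass through unchanged.
            out.append(ch)
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("\u0915\u092e\u0932"))  # KA MA LA
    print(transliterate("\u0928\u093e\u092e"))  # NA + AA sign, MA
```

A lookup table like this only covers one-to-one character correspondences. Real transliteration between Indian languages and English is usually ambiguous in both directions, which is one reason learned models are often preferred over fixed rules.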