Niraj Aswani

Twitter is the largest source of microblog text, responsible for gigabytes of human discourse every day. Processing microblog text is difficult: the genre is noisy, documents have little context, and utterances are very short. As such, conventional NLP tools fail when faced with tweets and other microblog text. We present TwitIE, an open-source NLP pipeline…
The need for efficient corpus indexing and querying arises frequently in both machine learning-based and human-engineered natural language processing systems. This paper presents the ANNIC system, which can index documents not only by content, but also by their linguistic annotations and features. It also enables users to formulate versatile queries…
Instance unification determines whether two instances in an ontology refer to the same object in the real world. More specifically, this paper addresses the instance unification problem for person names. The approach combines the use of citation information (i.e., abstract, initials, titles and co-authorship information) with web mining, in order to gather…
This paper presents GATE Teamware – an open-source, web-based, collaborative text annotation framework. It enables users to carry out complex corpus annotation projects, involving distributed annotator teams. Different user roles are provided (annotator, manager, administrator) with customisable user interface functionalities, in order to support the…
Using semantic technologies for mining and intelligent information access to microblogs is a challenging, emerging research area. Unlike carefully authored news text and other longer content, tweets pose a number of new challenges, due to their short, noisy, context-dependent, and dynamic nature. Semantic annotation of tweets is typically performed in a…
When researching new product ideas or filing new patents, inventors need to retrieve all relevant pre-existing know-how and/or to exploit and enforce patents in their technological domain. However, this process is hindered by a lack of richer metadata, which, if present, would allow more powerful concept-based search to complement the current keyword-based…
The pancreas secretes a bicarbonate-rich fluid containing digestive enzymes via the ampulla of Vater into the duodenum. Defective secretion leads to maldigestion of fat and protein with increased faecal losses. Cystic fibrosis (CF) is the major cause of pancreatic exocrine failure in childhood, whereas pancreatic insufficiency in adults is commonly…
BACKGROUND: Pseudomonas aeruginosa is the commonest micro-organism associated with respiratory infections in cystic fibrosis. Retrospective studies have suggested that survival is increased by using an aggressive policy of intravenous antipseudomonal antibiotics at regular intervals, irrespective of symptoms. OBJECTIVES: To determine whether there is…