Kripabandhu Ghosh

The FIRE 2016 Microblog track focused on retrieval of microblogs (tweets posted on Twitter) during disaster events. A collection of about 50,000 microblogs posted during a recent disaster event was made available to the participants, along with a set of seven practical information needs during a disaster situation. The task was to retrieve microblogs…
IR methods are increasingly being applied over microblogs to extract real-time information, such as during disaster events. In such sites, most of the user-generated content is written informally – the same word is often spelled differently by different users, and words are shortened arbitrarily due to the length limitations on microblogs. Stemming is a…
This paper describes some preliminary results obtained by treating the tweet contextualization task as a passage retrieval task. Each tweet was submitted as a query to the Indri 5.2 search engine after some preprocessing. Either paragraphs or sentences were retrieved in response to a query. Passages retrieved from the same document were concatenated. This…
Information retrieval performance is hurt to a great extent by OCR errors. Much research has been reported on modelling and correction of OCR errors. However, all the existing systems make use of language-dependent resources or training texts to study the nature of errors. No research has been reported on improving retrieval performance from erroneous text…
E-discovery is the requirement that the documents and information in electronic form stored in corporate systems be produced as evidence in litigation. It has posed great challenges for legal experts. Legal searchers have always looked to find "any and all" evidence for a given case. Thus, a legal search system would essentially be a recall-oriented system. …
Microblogging sites like Twitter are increasingly being used for aiding post-disaster relief operations. In such situations, identifying needs and availabilities of various types of resources is critical for effective coordination of the relief operations. We focus on the problem of automatically identifying tweets that inform about needs and availabilities…