Prior efforts have shown that under certain situations, retrieval effectiveness may be improved via the use of data fusion techniques. Although these improvements have been observed from the fusion of result sets from several distinct information retrieval systems, it has often been thought that fusing different document retrieval strategies in a single …
Mobile SMS spam is on the rise and is a prevalent problem. While recent work has shown that simple machine learning techniques can distinguish between ham and spam with high accuracy, this paper explores the individual contributions of various textual features in the classification process. Our results reveal the surprising finding that simple is …
One of the tasks a Clinical Decision Support (CDS) system is designed to solve is retrieving the most relevant and actionable literature for a given medical case report. In this work, we present a query reformulation approach that addresses the unique formulation of case reports, making them suitable to be used on a general purpose search engine. …
In this work, we emphasize how to merge and re-rank contextual suggestions from the open Web based on a user's personal interests. We retrieve relevant results from the open Web by identifying context-independent queries, combining them with location information, and issuing the combined queries to multiple Web search engines. Our learning to rank model …
Interest in medical data mining is growing rapidly as more health-related data becomes available online. We propose methods for extracting Adverse Drug Reactions (ADRs) from forum posts and linking extracted ADRs to the drugs that users claim are responsible for them. We evaluate our methodology using a corpus of annotated forum posts. We find that our ADR …
Extraction and interpretation of temporal information from clinical text is essential for clinical practitioners and researchers. SemEval 2016 Task 12 (Clinical TempEval) addressed this challenge using the THYME corpus, a corpus of clinical narratives annotated with a schema based on TimeML guidelines. We developed and evaluated approaches for: …
Online mental health forums provide users with an anonymous support platform that is facilitated by moderators responsible for finding and addressing critical posts, especially those related to self-harm. Given the seriousness of these posts, it is important that the moderators are able to locate these critical posts quickly in order to respond with timely …
Many prior efforts have been devoted to the basic idea that data fusion techniques can improve retrieval effectiveness. Recent work in the area suggests that many approaches, particularly multiple-evidence combinations, can be a successful means of improving the effectiveness of a system. Unfortunately, the conditions favorable to effectiveness improvements …
Some recent topic model-based methods have been proposed to discover and summarize the evolutionary patterns of themes in temporal text collections. However, the theme patterns extracted by these methods are hard to interpret and evaluate. To produce a more descriptive representation of the theme pattern, we not only give new representations of sentences …