Lisa M. Gandy

BACKGROUND: User content posted through Twitter has been used for biosurveillance, to characterize public perception of health-related topics, and as a means of distributing information to the general public. Most of the existing work surrounding Twitter and health care has shown Twitter to be an effective medium for these problems, but more could be done to(More)
A useful approach for enabling computers to automatically create new content is utilizing the text, media, and information already present on the World Wide Web. The newly created content is known as "machine-generated content". For example, a machine-generated content system may create a multimedia news show with two animated anchors presenting a news(More)
Hearing people argue opposing sides of an issue can be a useful way to understand the topic; however, these debates or conversations often don’t exist. Unfortunately, generating interesting natural language conversations is a difficult problem and typically requires a deep model of both a domain and its language. Fortunately, there is a huge amount of(More)
In this paper we discuss a system, CashTagNN, which uses the sentiment and subjectivity scores of tweets that include cashtags of two companies, Apple and Johnson & Johnson, to model stock market movement, and in particular to predict opening and closing stock market prices. We demonstrate that by using only sentiment and subjectivity along with a neural(More)
Clustering algorithms are invaluable methods for organizing data into useful information. The CARD algorithm (Nasraoui et al., 2000) is one such algorithm, designed to organize user sessions into profiles, where each profile highlights a particular type of user. The CARD algorithm is a viable candidate for Web clustering. However, it does have(More)
Today’s low-cost digital data provides unprecedented opportunities for scientific discovery from synthesis studies. For example, the medical field is revolutionizing patient care by creating personalized treatment plans based upon mining electronic medical records, imaging, and genomics data. Standardized annotations are essential to subsequent analyses for(More)
The United States Senate has become increasingly partisan through the years; in fact it is quite unusual for a Senator to vote against his or her party. The Congressional Close Up system automatically reports on factors that might have influenced a bipartisan vote. The system begins by assigning a topic to the legislation on which a Senate or House vote was(More)
ClinicalTrials.org is a popular portal which physicians use to find clinical trials for their patients. However, the current setup of ClinicalTrials.org makes it difficult for oncologists to locate clinical trials for patients based on mutational status. We present CTMine, a system that mines ClinicalTrials.org for clinical trials per cancer mutation and(More)
Scientists have unprecedented access to a wide variety of high-quality datasets. These datasets, which are often independently curated, commonly use unstructured spreadsheets to store their data. Standardized annotations are essential to perform synthesis studies across investigators, but are often not used in practice. Therefore, accurately combining(More)
Shoppers with an internet-enabled computer have a wealth of product information available to them. By browsing to a variety of websites, users can conduct searches and compare prices, read reviews, and learn more about a product. These sites are pivot points for a user; once they are at Amazon.com’s landing page, for example, they can navigate outwards to a(More)