Mehrbod Sharifi

Story segmentation of news broadcasts has been shown to improve the accuracy of subsequent processes such as question answering and information retrieval. In previous work, a decision tree was trained on automatically extracted lexical and acoustic features to predict story boundaries, using hypothesized sentence boundaries to define potential ...
The goal of the ongoing project described in this paper is to evaluate the utility of Latent Semantic Analysis (LSA) for unsupervised word sense discrimination. The hypothesis is that LSA can be used to compute context vectors for ambiguous words, and that these vectors can be clustered so that each cluster corresponds to a different sense of the word. In this ...
While many cybersecurity tools are available to computer users, their default configurations often do not match the needs of specific users. Since most modern users are not computer experts, they are often unable to customize these tools and thus end up with either insufficient or excessive security. To address this problem, we are developing an automated assistant ...
We consider the task of scheduling a conference based on incomplete information about resources and constraints, which requires elicitation of additional data, and describe a learning procedure that improves elicitation strategies. We outline the representation of incomplete knowledge, and then describe an adaptive elicitation procedure, which learns to ...
We describe a crowdsourcing system, called SmartNotes, which detects security threats related to web browsing, such as Internet scams, deceptive sales of substandard products, and websites with intentionally misleading information. It combines automatically collected information about website reputation with user votes and comments, and uses it to ...
Information extraction techniques (such as Named Entity Recognition) have long been used to extract useful pieces of information from text. The types of information to be extracted are generally fixed and well defined (e.g., names of people, organizations, etc.). However, in some cases the user's goal is more abstract and the information types cannot be narrowly ...
An Internet scam is fraudulent or intentionally misleading information posted on the web, usually with the intent of tricking people into sending money or disclosing sensitive information. We describe an application of logistic regression to the detection of Internet scams. The developed system automatically collects 43 characteristic statistics of the websites ...