Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely affect the quality of its results. In this paper we propose simple, heuristic solutions to some of the problems with Naive Bayes classifiers, addressing both systemic issues as well ...
People often turn to their friends, families, and colleagues when they have questions. The recent, rapid rise of online social networking tools has made doing this on a large scale easy and efficient. In this paper we explore the phenomenon of using social network status messages to ask questions. We conducted a survey of 624 people, asking them to share ...
We formulate and study search algorithms that consider a user's prior interactions with a wide variety of content to personalize that user's current Web search. Rather than relying on the unrealistic assumption that people will precisely specify their intent when searching, we pursue techniques that leverage implicit information about the user's interests. ...
Relevance feedback has a history in information retrieval that dates back well over thirty years (cf. [SL96]). Relevance feedback is typically used for query expansion during short-term modeling of a user’s immediate information need and for user profiling during long-term modeling of a user’s persistent interests and preferences. Traditional relevance ...
This paper presents a modified diary study that investigated how people performed personally motivated searches in their email, in their files, and on the Web. Although earlier studies of directed search focused on keyword search, most of the search behavior we observed did not involve keyword search. Instead of jumping directly to their information target ...
Social networking Web sites are not just places to maintain relationships; they can also be valuable information sources. However, little is known about how and why people search socially generated content. In this paper we explore search behavior on the popular microblogging/social networking site Twitter. Using analysis of large-scale query logs and ...
People often repeat Web searches, both to find new information on topics they have previously explored and to re-find information they have seen in the past. The query associated with a repeat search may differ from the initial query but can nonetheless lead to clicks on the same results. This paper explores repeat search behavior through the analysis of a ...
Domain experts search differently than people with little or no domain knowledge. Previous research suggests that domain experts employ different search strategies and are more successful in finding what they are looking for than non-experts. In this paper we present a large-scale, longitudinal, log-based analysis of the effect of domain expertise on web ...
The Web is a dynamic, ever-changing collection of information. This paper explores changes in Web content by analyzing a crawl of 55,000 Web pages, selected to represent different user visitation patterns. Although change over long intervals has been explored on random (and potentially unvisited) samples of Web pages, little is known about the nature of ...