Michael Busch

The web today is increasingly characterized by social and real-time signals, which we believe represent two frontiers in information retrieval. In this paper, we present Earlybird, the core retrieval engine that powers Twitter's real-time search service. Although Earlybird builds and maintains inverted indexes like nearly all modern retrieval engines, its…
Cluster analysis is the art of detecting groups of similar objects in large data sets, without having specified these groups by means of explicit features. Among the various cluster algorithms that have been developed so far, the density-based algorithms rank among the most advanced and robust approaches. However, this paper shows that density-based cluster…
Many models of disease and rumor-spreading phenomena average the behavior of individuals in a population in order to obtain a coarse description of expected system behavior. For these types of models, we determine how close the coarse approximation is to its corresponding agent-based system. These findings lead to a general result on the logistic behavior…
Information propagation in social media depends not only on the static follower structure but also on the topic-specific user behavior. Hence, novel models incorporating dynamic user behavior are needed. To this end, we propose a model for individual social media users, termed a genotype. The genotype is a per-topic summary of a user's…
We explore a real-time Twitter search application where tweets are arriving at a rate of several thousand per second. Real-time search demands that they be indexed and searchable immediately, which leads to a number of implementation challenges. In this paper, we focus on one aspect: dynamic postings allocation policies for index structures that are…
Schools of fish and flocks of birds are examples of self-organized animal groups that arise through social interactions among individuals. We numerically study two individual-based models that recent empirical studies have suggested as explanations of self-organized group animal behavior: (i) a zone-based model where the group communication topology is determined…
Our framework is inspired by biology and evolution, similar to Reali and Griffiths [1]. We broaden the genotype interpretation beyond word variants, and demonstrate their predictive utility. Our goal is to treat the observable content as a genetic parcel of information that users pass on to one another, while potentially introducing a delay or alteration to…
Information propagation in social media depends not only on the static follower structure but also on the topic-specific user behavior. Hence, novel models incorporating dynamic user behavior are needed. To this end, we propose a model for individual social media users, termed a genotype. The genotype is a per-topic summary of a user’s interest, activity…