Baharan Mirzasoleiman

Learn More
Many large-scale machine learning problems (such as clustering, non-parametric learning, kernel machines, etc.) require selecting, out of a massive data set, a manageable yet representative subset. Such problems can often be reduced to maximizing a submodular set function subject to cardinality constraints. Classical approaches require centralized access to(More)
How can one summarize a massive data set "on the fly", i.e., without even having seen it in its entirety? In this paper, we address the problem of extracting representative elements from a large stream of data. I.e., we would like to select a subset of say k data points from the stream that are most representative according to some objective function. Many(More)
Is it possible to maximize a monotone submodular function faster than the widely used lazy greedy algorithm (also known as accelerated greedy), both in theory and practice? In this paper, we develop the first linear-time algorithm for maximizing a general monotone submodular function subject to a cardinality constraint. We show that our randomized(More)
Can we summarize multi-category data based on user preferences in a scalable manner? Many utility functions used for data summarization satisfy submodularity, a natural diminishing returns property. We cast personalized data summarization as an instance of a general submodular maximization problem subject to multiple constraints. We develop the first(More)
Submodular continuous functions are a category of (generally) non-convex/nonconcave functions with a wide spectrum of applications. We characterize these functions and demonstrate that they can be maximized efficiently with approximation guarantees. Specifically, I) for monotone submodular continuous functions with an additional diminishing returns(More)
Many technological networks can experience random and/or systematic failures in their components. More destructive situations can happen if the components have limited capacity, where the failure in one of them might lead to a cascade of failures in other components, and consequently break down the structure of the network. In this paper, the tolerance of(More)
Many large-scale machine learning problems – clustering, non-parametric learning, kernel machines, etc. – require selecting a small yet representative subset from a large dataset. Such problems can often be reduced to maximizing a submodular set function subject to various constraints. Classical approaches to submodular optimization require centralized(More)
How can one find a subset, ideally as small as possible, that well represents a massive dataset? I.e., its corresponding utility, measured according to a suitable utility function, should be comparable to that of the whole dataset. In this paper, we formalize this challenge as a submodular cover problem. Here, the utility is assumed to exhibit(More)
Social networking has become a part of daily life for many individuals across the world. Widespread adoption of various strategies in such networks can be utilized by business corporations as a powerful means for advertising. In this study, we investigated viral marketing strategies in which buyers are influenced by other buyers who already own an item.(More)