Networks have become ubiquitous. Communication networks, financial transaction networks, networks describing physical systems, and social networks are all becoming increasingly important in our day-to-day life. Often, we are interested in models of how nodes in the network influence each other (for example, who infects whom in an epidemiological network), ...
A key challenge for machine learning is tackling the problem of mining richly structured data sets, where the objects are linked in some way due to either an explicit or implicit relationship that exists between the objects. Links among the objects demonstrate certain patterns, which can be helpful for many machine learning tasks and are usually hard to ...
A large portion of real-world data is stored in commercial relational database systems. In contrast, most statistical learning methods work only with "flat" data representations. Thus, to apply these methods, we are forced to convert our data into a flat form, thereby losing much of the relational structure present in our database. This paper builds on ...
Many databases contain uncertain and imprecise references to real-world entities. The absence of identifiers for the underlying entities often results in a database that contains multiple references to the same entity. This can lead not only to data redundancy, but also to inaccuracies in query processing and knowledge extraction. These problems can be ...
Probabilistic soft logic (PSL) is a framework for collective, probabilistic reasoning in relational domains. PSL uses first-order logic rules as a template language for graphical models over random variables with soft truth values from the interval [0, 1]. Inference in this setting is a continuous optimization task, which can be solved efficiently. This ...
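As a rough illustration of the soft-truth idea above, the sketch below scores one hypothetical rule, Friends(A, B) & Votes(A, P) -> Votes(B, P), using the Lukasiewicz relaxation of logical AND with which PSL is usually described, and then picks the head's truth value by minimizing a weighted sum of hinge-shaped penalties. The rule, the predicate names, the weights, and the grid search (a stand-in for a proper convex solver) are all assumptions made for illustration; this is not the PSL implementation.

```python
def luk_and(a: float, b: float) -> float:
    """Lukasiewicz conjunction on soft truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def distance_to_satisfaction(body: float, head: float) -> float:
    """How far the rule body -> head is from being satisfied (0 means satisfied)."""
    return max(0.0, body - head)


def objective(y: float, body: float, rule_weight: float, prior_weight: float) -> float:
    # Weighted penalty for the hypothetical rule, plus a weak prior that
    # pushes Votes(B, P) toward 0 (the rule "Votes(B, P) -> false").
    return rule_weight * distance_to_satisfaction(body, y) + prior_weight * y


def infer_head(body: float, rule_weight: float = 2.0,
               prior_weight: float = 1.0, steps: int = 100) -> float:
    """Grid search over y in [0, 1]; a stand-in for a convex optimizer."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda y: objective(y, body, rule_weight, prior_weight))


# Hypothetical observed truth values (assumptions, not data from the paper).
friends_ab = 0.9
votes_a = 0.8
body_truth = luk_and(friends_ab, votes_a)   # about 0.7
votes_b = infer_head(body_truth)            # about 0.7 with the default weights
```

Because every penalty is a hinge (piecewise linear and convex) in the unknown truth value, the objective stays convex, which is what lets inference be posed as a continuous optimization problem rather than a combinatorial search.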
The dynamic nature of citation networks makes the task of ranking scientific articles hard. Citation networks are continually evolving because articles obtain new citations every day. For ranking scientific articles, we can define the popularity or prestige of a paper based on the number of past citations at the user query time; however, we argue that what ...
We generalize the graph streaming model to hypergraphs. In this streaming model, hyperedges arrive online and any computation has to be done on the fly using a small amount of space. Each hyperedge can be viewed as a set of elements (nodes), so we refer to our proposed model as the "set-streaming" model of computation. We consider the problem of " ...
In order to address privacy concerns, many social media websites allow users to hide their personal profiles from the public. In this work, we show how an adversary can exploit an online social network with a mixture of public and private user profiles to predict the private attributes of users. We map this problem to a relational classification problem and ...
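To make the attack setting above concrete, here is a minimal sketch of one simple relational baseline: guess a private user's attribute by majority vote over the attributes of friends whose profiles are public. This is an illustrative stand-in, not the method of the paper (the abstract is truncated before the method is described); the function name, the graph layout, and the toy data are all hypothetical.

```python
from collections import Counter


def predict_private(graph, public_labels, default=None):
    """Guess each private user's attribute from their public friends.

    graph: dict mapping user -> list of friends.
    public_labels: dict mapping public user -> attribute value.
    Returns a dict mapping each private user -> predicted value (or default).
    """
    predictions = {}
    for user, friends in graph.items():
        if user in public_labels:
            continue  # attribute already visible, nothing to predict
        votes = Counter(public_labels[f] for f in friends if f in public_labels)
        # Fall back to the default when no friend has a public profile.
        predictions[user] = votes.most_common(1)[0][0] if votes else default
    return predictions


# Toy network (hypothetical): "ann" and "eve" keep their profiles private.
toy_graph = {
    "ann": ["bob", "cat", "dan"],
    "bob": ["ann"],
    "cat": ["ann"],
    "dan": ["ann"],
    "eve": [],
}
toy_public = {"bob": "A", "cat": "A", "dan": "B"}
toy_predictions = predict_private(toy_graph, toy_public)
```

In the toy network, "ann" has two public friends labeled "A" and one labeled "B", so the vote returns "A", while "eve" has no friends and receives the default.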