A random graph model based on Kronecker products of probability matrices has recently been proposed as a generative model for large-scale real-world networks such as the web. This model simultaneously captures several well-known properties of real-world networks; in particular, it gives rise to a heavy-tailed degree distribution, has a low diameter, and …
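As a rough illustration of the construction described in this abstract, the sketch below builds an edge-probability matrix as a repeated Kronecker product of a small initiator matrix and then samples each edge independently. The function names, the 2×2 initiator values, and the independent-edge sampling step are assumptions made for illustration; the abstract itself does not specify them.

```python
import numpy as np


def kronecker_power(initiator, k):
    """Return the k-fold Kronecker product of `initiator` with itself (k >= 1)."""
    base = np.asarray(initiator, dtype=float)
    result = base.copy()
    for _ in range(k - 1):
        result = np.kron(result, base)
    return result


def sample_graph(initiator, k, seed=0):
    """Sample a directed graph: edge (i, j) is present with probability P[i, j].

    Assumption: edges are drawn independently, one Bernoulli trial per entry.
    Returns a boolean adjacency matrix with n**k rows for an n x n initiator.
    """
    rng = np.random.default_rng(seed)
    probs = kronecker_power(initiator, k)
    return rng.random(probs.shape) < probs


if __name__ == "__main__":
    # Hypothetical initiator; the entries are illustrative, not fitted values.
    theta = [[0.9, 0.5], [0.5, 0.1]]
    adjacency = sample_graph(theta, k=3)
    print(adjacency.shape, int(adjacency.sum()))
```

Materializing the full probability matrix costs space quadratic in the number of nodes, so this form is only practical for small `k`.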
Recent trends towards database outsourcing, as well as concerns and laws governing data privacy, have led to great interest in enabling secure database services. Previous approaches to enabling such a service have been based on data encryption, causing a large overhead in query processing. We propose a new, distributed architecture that allows an …
We consider a privacy threat to a social network in which the goal of an attacker is to obtain knowledge of a significant fraction of the links in the network. We formalize the typical social network interface and the information about links that it provides to its users in terms of lookahead. We consider a particular threat where an attacker subverts user …
We study the use of viral marketing strategies on social networks that seek to maximize revenue from the sale of a single product. We propose a model in which the decision of a buyer to buy the product is influenced by friends that own the product and the price at which the product is offered. The influence model we analyze is quite general, naturally …
P3P [23, 24] is a set of standards that allow corporations to declare their privacy policies. Hippocratic Databases [6] have been proposed to implement such policies within a corporation's datastore. From an individual end user's point of view, both of these rest on an uncomfortable philosophy of trusting corporations to protect his/her privacy. Recent …
We consider the problem of estimating the size of a collection of documents using only a standard query interface. Our main idea is to construct an unbiased and low-variance estimator that can closely approximate the size of any set of documents defined by certain conditions, including that each document in the set must match at least one query from a …
MOTIVATION: Genome analysis suggests that tandem duplication is an important mode of evolutionary novelty by permitting one copy of each gene to drift and potentially to acquire a new function. With more and more genomic sequences available, reconstructing duplication history has received extensive attention recently. RESULTS: An efficient method is …
SUMMARY: SEGID is a tool for finding conserved regions (regions of high scores) for a given (multiple) sequence alignment. It takes a (multiple) sequence alignment as its input and converts the alignment into a sequence of numbers, where each number is the alignment score of a column. Three algorithms are used to identify regions of high scores. A graphical …
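To make the idea of "regions of high scores" concrete, here is a minimal sketch that takes a list of per-column scores and returns the contiguous region with the largest total. It uses the standard maximum-sum segment scan, which is an assumption chosen for illustration; the abstract does not say which three algorithms SEGID uses, and the function name and sample scores are hypothetical.

```python
def max_scoring_region(scores):
    """Return (start, end, total) for the contiguous run with the largest sum.

    `start` is inclusive and `end` is exclusive, so scores[start:end] is the region.
    Assumption: `scores` holds one numeric alignment score per column.
    """
    if not scores:
        raise ValueError("scores must not be empty")

    best_start, best_end, best_sum = 0, 1, scores[0]
    cur_start, cur_sum = 0, 0
    for i, score in enumerate(scores):
        if cur_sum <= 0:
            # A non-positive running total cannot help; restart the region here.
            cur_start, cur_sum = i, score
        else:
            cur_sum += score
        if cur_sum > best_sum:
            best_start, best_end, best_sum = cur_start, i + 1, cur_sum
    return best_start, best_end, best_sum


if __name__ == "__main__":
    # Hypothetical column scores, e.g. after subtracting a background level.
    column_scores = [-1, 2, 3, -4, 5, -2]
    print(max_scoring_region(column_scores))
```

Because the scan only finds a single best region, raw scores that are all positive would yield the whole alignment; in practice one would likely shift the scores so that unconserved columns are negative.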