There has been much recent research on identifying global community structure in networks. However, most existing approaches require complete information of the graph in question, which is impractical for some networks, e.g. the World Wide Web (WWW). Algorithms for local community detection have been proposed but their results usually contain many outliers.
We present a discriminative method for learning selectional preferences from unlabeled text. Positive examples are taken from observed predicate-argument pairs, while negatives are constructed from unobserved combinations. We train a Support Vector Machine classifier to distinguish the positive from the negative instances. We show how to partition the ...
Many datasets can be described in the form of graphs or networks where nodes in the graph represent entities and edges represent relationships between pairs of entities. A common property of these networks is their community structure, considered as clusters of densely connected groups of vertices, with only sparser connections between groups. The ...
Web-scale data has been used in a diverse range of language research. Most of this research has used web counts for only short, fixed spans of context. We present a unified view of using web counts for lexical disambiguation. Unlike previous approaches, our supervised and unsupervised systems combine information from multiple and overlapping segments of ...
We present an automatic approach to determining whether a pronoun in text refers to a preceding noun phrase or is instead non-referential. We extract the surrounding textual context of the pronoun and gather, from a large corpus, the distribution of words that occur within that context. We learn to reliably classify these distributions as representing ...
MOTIVATION: The availability of the whole genomic sequences of HIV-1 viruses provides an excellent resource for studying the HIV-1 phylogenies using all the genetic materials. However, such huge volumes of data create computational challenges in both memory consumption and CPU usage. RESULTS: We propose the complete composition vector representation for an ...
We present a framework for visualizing remote distributed data sources using a multiuser immersive virtual reality environment. DIVE-ON is a system prototype that consolidates distributed data sources into a multidimensional data model, transports user-specified views to a 3D immersive display, and presents various data attributes and mining results as ...
Much structured data of scientific interest can be represented as networks, where sets of nodes or vertices are joined together in pairs by links or edges. Although these networks may belong to different research areas, there is one property that many of them do have in common: the network community structure, which means that there exists densely connected ...
Extracting information from large collections of structured, semi-structured or even unstructured data can be a considerable challenge when much of the hidden information is implicit within relationships among entities within the data. Social networks are such data collections in which relationships play a vital role in the knowledge these networks can ...