There has been much recent research on identifying global community structure in networks. However, most existing approaches require complete information of the graph in question, which is impractical for some networks, e.g. the World Wide Web (WWW). Algorithms for local community detection have been proposed but their results usually contain many outliers.
We present a discriminative method for learning selectional preferences from unlabeled text. Positive examples are taken from observed predicate-argument pairs, while negatives are constructed from unobserved combinations. We train a Support Vector Machine classifier to distinguish the positive from the negative instances. We show how to partition the ...
Much structured data of scientific interest can be represented as networks, where sets of nodes or vertices are joined together in pairs by links or edges. Although these networks may belong to different research areas, there is one property that many of them do have in common: the network community structure, which means that there exists densely connected ...
Many datasets can be described in the form of graphs or networks where nodes in the graph represent entities and edges represent relationships between pairs of entities. A common property of these networks is their community structure, considered as clusters of densely connected groups of vertices, with only sparser connections between groups. The ...
Web-scale data has been used in a diverse range of language research. Most of this research has used web counts for only short, fixed spans of context. We present a unified view of using web counts for lexical disambiguation. Unlike previous approaches, our supervised and unsupervised systems combine information from multiple and overlapping segments of ...
Traditional relation extraction seeks to identify pre-specified semantic relations within natural language text, while open Information Extraction (Open IE) takes a more general approach and looks for a variety of relations without restriction to a fixed relation set. With this generalization comes the question: what is a relation? For example, should the ...
MOTIVATION: The availability of the whole genomic sequences of HIV-1 viruses provides an excellent resource for studying the HIV-1 phylogenies using all the genetic materials. However, such huge volumes of data create computational challenges in both memory consumption and CPU usage. RESULTS: We propose the complete composition vector representation for an ...
We present an automatic approach to determining whether a pronoun in text refers to a preceding noun phrase or is instead non-referential. We extract the surrounding textual context of the pronoun and gather, from a large corpus, the distribution of words that occur within that context. We learn to reliably classify these distributions as representing ...
We present a framework for visualizing remote distributed data sources using a multiuser immersive virtual reality environment. DIVE-ON is a system prototype that consolidates distributed data sources into a multidimensional data model, transports user-specified views to a 3D immersive display, and presents various data attributes and mining results as ...