Existing graph-based ranking methods for keyphrase extraction compute a single importance score for each word via a single random walk. Motivated by the fact that both documents and words can be represented by a mixture of semantic topics, we propose to decompose the traditional random walk into multiple random walks specific to various topics. We thus build a ...
When we write or prepare to write a research paper, we always have appropriate references in mind. However, there are most likely references that we have missed and that should have been read and cited. As such, a good citation recommendation system would improve not only our paper but also, overall, the efficiency and quality of literature search. Usually, a citation's ...
We introduce a big data platform that provides various services for harvesting scholarly information and enabling efficient scholarly applications. The core architecture of the platform is built on a secured private cloud, crawls data using a scholarly-focused crawler that leverages a dynamic scheduler, processes by utilizing a MapReduce-based ...
Oriented patterns, e.g. fingerprints, consist of smoothly varying flow-like patterns, together with important singular points (i.e. cores and deltas) where the orientation changes abruptly. Gabor filters and anisotropic diffusion methods have been widely used to enhance oriented patterns. However, none of them copes well with regions of varying ...
Twitter user profiles contain rich information that allows researchers to infer particular attributes of users' identities. Knowing identity attributes such as gender, age, and/or nationality is a first step in many studies that seek to describe various phenomena related to computational social science. Often, it is through such attributes that studies of ...
Associating place name mentions in unstructured text with their actual references in geographic space is vital to enable spatial queries and analysis. In this paper, we introduce GeoTxt, a web API plus human-usable web tool designed and implemented to tackle three components of place-reference processing from text, namely: extraction, disambiguation, and ...
Citations are important in academic dissemination. To help researchers check the completeness of citations while authoring a paper, we introduce a citation recommendation system called RefSeer. Researchers can use it to find related works to cite while authoring papers. It can also be used by reviewers to check the completeness of a paper's references.
We explore a new metadata extraction framework without human annotators, with the ground truth harvested from the Web. A new training sample is selected based not only on its uncertainty and representativeness in the unlabeled pool, but also on its availability and credibility in Web knowledge bases. We construct a dataset of 4329 books with valid metadata and ...
Automatic citation recommendation can be very useful for authoring a paper and is an AI-complete problem due to the challenge of bridging the semantic gap between citation context and the cited paper. It is not always easy for knowledgeable researchers to give an accurate citation context for a cited paper or to find the right paper to cite given a context.