Face recognition: a convolutional neural-network approach
TLDR
A hybrid neural-network approach to human face recognition is presented that compares favourably with other methods; the work also analyzes the computational complexity and discusses how new classes could be added to the trained recognizer.
CiteSeer: an automatic citation indexing system
TLDR
CiteSeer has many advantages over traditional citation indexes, including the ability to create more up-to-date databases which are not limited to a preselected set of journals or restricted by journal publication delays, completely autonomous operation with a corresponding reduction in cost, and powerful interactive browsing of the literature using the context of citations.
Efficient identification of Web communities
TLDR
A focused crawler that crawls to a fixed depth can approximate community membership by augmenting the graph induced by the crawl with links to a virtual sink node.
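The virtual-sink idea can be sketched as a small maximum-flow computation. This is only an illustration under stated assumptions: the summary does not give the procedure, so the max-flow / min-cut framing, the capacities (seed count on each link, 1 on each sink edge), and the name `approximate_community` are all hypothetical choices made here. The community is taken to be every page still reachable from the seeds once the flow is saturated.

```python
from collections import defaultdict, deque

def approximate_community(links, seeds, edge_capacity=None):
    """Return the pages on the seed side of a minimum cut (hypothetical helper).

    links: iterable of (page, page) pairs found by the crawl.
    seeds: set of pages known to belong to the community.
    """
    seeds = set(seeds)
    nodes = set(seeds)
    for u, v in links:
        nodes.update((u, v))
    if edge_capacity is None:
        edge_capacity = len(seeds)  # assumption: link capacity = number of seeds

    SOURCE, SINK = object(), object()  # virtual source and virtual sink
    cap = defaultdict(lambda: defaultdict(int))
    for u, v in links:
        # Treat every crawled link as usable in both directions.
        cap[u][v] += edge_capacity
        cap[v][u] += edge_capacity
    for s in seeds:
        cap[SOURCE][s] = float("inf")
    for n in nodes - seeds:
        cap[n][SINK] = 1  # augment the crawl graph with links to the sink

    def reachable_with_parents():
        # Breadth-first search over edges with remaining capacity.
        parent = {SOURCE: None}
        queue = deque([SOURCE])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_with_parents()
        if SINK not in parent:
            # No augmenting path left: reachable pages form the community.
            return {n for n in parent if n is not SOURCE}
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), SINK
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = SINK
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
```

With two densely linked clusters joined by a single link, seeding the function with two pages from one cluster returns that cluster and leaves the other out, because cutting the single bridge is cheaper than cutting the dense links inside a cluster.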
Accessibility of information on the web
TLDR
As the web becomes a major communications medium, the data on it must be made more accessible, and that task falls to search engines.
Digital Libraries and Autonomous Citation Indexing
TLDR
Digital libraries incorporating ACI can help organize scientific literature and may significantly improve the efficiency of dissemination and feedback and speed the transition to scholarly electronic publishing.
Learning and Extracting Finite State Automata with Second-Order Recurrent Neural Networks
TLDR
It is shown that a recurrent, second-order neural network using a real-time, forward training algorithm readily learns to infer small regular grammars from positive and negative string training samples, and many of the neural net state machines are dynamically stable, that is, they correctly classify many long unseen strings.
ParsCit: an Open-source CRF Reference String Parsing Package
TLDR
ParsCit is described: a freely available, open-source reference string parsing package that wraps a trained conditional random field model with added functionality to identify reference strings in a plain text file and to retrieve the citation contexts.
Two supervised learning approaches for name disambiguation in author citations
TLDR
Two supervised learning approaches to disambiguating authors in citations are investigated: one uses the naive Bayes probability model, a generative model; the other uses support vector machines (SVMs) and the vector space representation of citations, a discriminative model.
Focused Crawling Using Context Graphs
TLDR
A focused crawling algorithm is presented that builds a model of the context in which topically relevant pages occur on the web. The model can capture typical link hierarchies within which valuable pages occur, as well as the content of documents that frequently co-occur with relevant pages.
Collaborative Filtering by Personality Diagnosis: A Hybrid Memory and Model-Based Approach
TLDR
This work describes and evaluates a new method called personality diagnosis (PD), which computes the probability that a user is of the same "personality type" as other users and, in turn, the likelihood that he or she will like new items.
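The two probabilities mentioned above can be illustrated with a rough sketch. It assumes that observed ratings are a user's "true" ratings plus Gaussian noise, which the summary does not specify; the function name `personality_diagnosis`, the `sigma` parameter, and the data layout are hypothetical choices for illustration only.

```python
import math

def personality_diagnosis(active, others, item, rating_scale, sigma=1.0):
    """Return {rating: probability} for the active user's rating of `item`.

    active: {item: rating} for the user we are predicting for.
    others: {user: {item: rating}} for everyone else.
    """
    def likelihood(x, y):
        # Unnormalised Gaussian: chance of observing rating x given true rating y.
        return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

    # Step 1: how likely is it that the active user shares each user's type?
    weights = {}
    for name, ratings in others.items():
        if item not in ratings:
            continue  # this user cannot inform a prediction for `item`
        w = 1.0
        for j, r in active.items():
            if j in ratings:
                w *= likelihood(r, ratings[j])
        weights[name] = w
    if not weights:
        raise ValueError("no other user has rated the requested item")

    # Step 2: turn those weights into a distribution over possible ratings.
    scores = {
        x: sum(w * likelihood(x, others[n][item]) for n, w in weights.items())
        for x in rating_scale
    }
    total = sum(scores.values())
    return {x: s / total for x, s in scores.items()}
```

A prediction is then the rating with the highest probability, for example `max(probs, key=probs.get)`. Returning the whole distribution rather than a single number keeps the uncertainty visible to the caller.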