Pradeep Muthukrishnan

We introduce the ACL Anthology Network (AAN), a comprehensive manually curated networked database of citations, collaborations, and summaries in the field of Computational Linguistics. We also present a number of statistics about the network including the most cited authors, the most central collaborators, as well as network statistics about the paper…
The number of research publications in various disciplines is growing exponentially. Researchers and scientists are increasingly finding themselves in the position of having to quickly understand large amounts of technical material. In this paper we present the first steps in producing an automatically generated, readily consumable, technical survey.
The ACL Anthology is a large collection of research papers in computational linguistics. Citation data was obtained using text extraction from a collection of PDF files with significant manual post-processing performed to clean up the results. Manual annotation of the references was then performed to complete the citation network. We analyzed the networks…
A key problem in document classification and clustering is learning the similarity between documents. Traditional approaches include estimating similarity between feature vectors of documents where the vectors are computed using TF-IDF in the bag-of-words model. However, these approaches do not work well when either similar documents do not use the same…
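The traditional baseline mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming whitespace tokenization, term frequency normalized by document length, an unsmoothed log IDF, and cosine similarity; the function names `tfidf_vectors` and `cosine` are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of strings.

    Assumptions for illustration: lowercase whitespace tokenization,
    tf = count / document length, idf = log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Under these assumptions, two documents on the same subject that share no vocabulary (for example "car repair" and "automobile maintenance") receive a similarity of zero, which is the weakness the abstract points to.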
The growth of the web has directly influenced the increase in the availability of relational data. One of the key problems in mining such data is computing the similarity between objects with heterogeneous feature types. For example, publications have many heterogeneous features like text, citations, authorship information, venue information, etc. In most…
We propose a new unsupervised method for topic detection that automatically identifies the different facets of an event. We use pointwise Kullback-Leibler divergence along with the Jaccard coefficient to build a topic graph which represents the community structure of the different facets. The problem is formulated as a weighted set cover problem with…
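For reference, the Jaccard coefficient named in this abstract is the size of the intersection of two sets divided by the size of their union. The sketch below shows only that standard definition; how the paper combines it with pointwise Kullback-Leibler divergence to weight the topic graph is not recoverable from the truncated text, so it is not attempted here. The function name `jaccard` and the convention of returning 0.0 for two empty sets are assumptions made for illustration.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two iterables of hashable items.

    Returns 0.0 when both inputs are empty (a convention chosen here,
    not something stated in the source).
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)
```

For example, two keyword sets that share two of four distinct terms have a coefficient of 0.5.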
Information and Decision Theoretic Approaches to Problems in Active Diagnosis, by Gowtham Bellala (Chair: Clayton D. Scott). In applications such as active learning or disease/fault diagnosis, one often encounters the problem of identifying an unknown object while minimizing the number of “yes” or “no” questions (queries) posed about that object. This problem…
1 Classification As Model of Human Categorization
1.1 Review of Classification in Machine Learning and Cognitive Psychology
1.2 Semi-Supervised Learning Assumptions
1.3 Translating Between ML…