Keyword searching and browsing in databases using BANKS
- Gaurav Bhalotia, Arvind Hulgeri, Charuta Nakhe, Soumen Chakrabarti, S. Sudarshan
- Computer Science
- Proceedings 18th International Conference on Data…
- 26 February 2002
This paper describes BANKS, a system that enables keyword-based search on relational databases together with data and schema browsing, and presents an efficient heuristic algorithm for finding and ranking query results.
Focused Crawling: A New Approach to Topic-Specific Web Resource Discovery
Bidirectional Expansion For Keyword Search on Graph Databases
- V. Kacholia, Shashank Pandit, Soumen Chakrabarti, S. Sudarshan, Rushi Desai, Hrishikesh Vijay Karambelkar
- Computer Science
- VLDB
- 30 August 2005
This paper proposes a new search algorithm, Bidirectional Search, which improves on Backward Expanding search by allowing forward search from potential roots towards leaves, and devises a novel search frontier prioritization technique based on spreading activation.
Enhanced hypertext categorization using hyperlinks
This work has developed a text classifier that misclassified only 13% of the documents in the well-known Reuters benchmark; this was comparable to the best results ever obtained and its technique also adapts gracefully to the fraction of neighboring documents having known topics.
Collective annotation of Wikipedia entities in web text
This work gives formulations for the trade-off between local spot-to-entity compatibility and measures of global coherence between entities, and investigates practical solutions based on local hill-climbing, rounding integer linear programs, and pre-clustering entities followed by local optimization within clusters.
Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text
Annotating and searching web tables using entities, types and relationships
This paper proposes new machine learning techniques to annotate table cells with entities that they likely mention, table columns with types from which entities are drawn for cells in the column, and relations that pairs of table columns seek to express, and a new graphical model for making all these labeling decisions for each table simultaneously.
Generalizing Across Domains via Cross-Gradient Training
- Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, S. Chaudhuri, P. Jyothi, Sunita Sarawagi
- Computer Science
- ICLR
- 15 February 2018
Empirical evaluation on three different applications establishes that (1) domain-guided perturbation provides consistently better generalization to unseen domains than generic instance perturbation methods, and that (2) data augmentation is a more stable and accurate method than domain adversarial training.
Dynamic personalized pagerank in entity-relation graphs
- Soumen Chakrabarti
- Computer Science
- WWW '07
- 8 May 2007
This paper presents HubRank, a new system for fast, dynamic, space-efficient proximity searches in ER graphs, and reports on experiments with CiteSeer's ER graph and millions of real CiteSeer queries.
Flow and stretch metrics for scheduling continuous job streams
This paper proposes two novel scheduling metrics, namely max-stretch and max-flow, which gauge the responsiveness of the scheduler to each job and avoid starvation of any job.
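As a rough illustration of the two metrics in the summary above, the sketch below uses the usual scheduling definitions, which this listing does not spell out and are therefore an assumption here: a job's flow is its completion time minus its release time, and its stretch is its flow divided by its processing time. The `Job` class and all function names are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    release: float      # time the job arrives (assumed field name)
    processing: float   # processing time the job requires
    completion: float   # time the job finishes under some schedule


def flow(job: Job) -> float:
    # Time the job spends in the system.
    return job.completion - job.release


def stretch(job: Job) -> float:
    # Flow normalized by the job's own processing time.
    return flow(job) / job.processing


def max_flow(jobs: list[Job]) -> float:
    # Worst-case flow over all jobs.
    return max(flow(j) for j in jobs)


def max_stretch(jobs: list[Job]) -> float:
    # Worst-case stretch over all jobs.
    return max(stretch(j) for j in jobs)


jobs = [
    Job(release=0, processing=8, completion=10),  # long job: flow 10, stretch 1.25
    Job(release=2, processing=1, completion=4),   # short job: flow 2, stretch 2.0
]
print(max_flow(jobs), max_stretch(jobs))
```

In this toy schedule the long job determines the maximum flow while the short job determines the maximum stretch. Because both metrics take a maximum over jobs, a schedule that keeps them small cannot leave any single job waiting indefinitely.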