David O. Holmes

We integrate structured data and text using the unchanged, standard relational model. We started with the premise that a relational system could be used to implement an Information Retrieval (IR) system. After implementing a prototype to verify that premise, we then began to investigate the performance of a parallel relational database system for this …
For TREC-10, we participated in the ad hoc and manual web tracks and in both the site-finding and cross-lingual tracks. For the ad hoc track, we did extensive calibrations and learned that combining similarity measures yields little improvement. This year, we focused on a single high-performance similarity measure. For site finding, we implemented several …
A scalable, parallel, relational-database-driven information retrieval engine is described. To support portability across a wide range of execution environments, including parallel multicomputers, all algorithms strictly adhere to the SQL-92 standard. By incorporating relevance feedback algorithms, accuracy was significantly enhanced over prior …
The relational platform provides a flexible, low-maintenance environment for integrating searches of structured and unstructured data. We present relational algebra for the Information Retrieval problem and SQL for leading probabilistic retrieval approaches. We tested 150 standard Text Retrieval Evaluation Conference queries against a collection of half a …
For TREC-9, we focused on effectiveness in the web track. The key techniques we employed were information fusion, entity-based relevance feedback, WordNet-based query parsing, and a user interface designed to assist with web-based manual queries. Our initial results are positive. For the manual task, forty of fifty queries are over the median. In the ad hoc, …
For TREC-5, we enhanced our existing prototype that implements relevance ranking using the AT&T DBC-1012 Model 4 parallel database machine to include relevance feedback. We identified SQL to compute relevance feedback and ran several experiments to identify good cutoffs for the number of documents that should be assumed to be relevant and the number of …
For TREC-4, we enhanced our existing prototype that implements relevance ranking using the AT&T DBC-1012 Model 4 parallel database machine to support the entire document collection. Additionally, we developed a special-purpose IR prototype to test a new index compression algorithm and to provide performance comparisons to the relational approach. We …