Optimal information retrieval when queries are not random

Abstract

We consider the complexity of the general information retrieval system design problem and of the design problem for multiattribute file systems based upon multiple key hashing (MKH). We first show that the problem of designing an optimal multiattribute file system is NP-hard. We then derive the performance formula for multiattribute file systems based upon the MKH method. We also show that the design problem for a multiattribute file system based upon the MKH method is related to the prime number problem, and that designing optimal multiattribute files based upon the MKH method can be reduced to finding minimal N-tuples, a problem discussed by Chang, Lee, and Du. We further present a very efficient method for designing good multiple key hashing functions in the case where the number of buckets is a power of a prime number. Finally, we propose a heuristic algorithm for designing good multiple key hashing functions in the general case.
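To make the design problem concrete, the following is a minimal sketch under the common textbook formulation of multiple key hashing, not the paper's own formula or algorithm. It assumes that attribute i is hashed into m_i values, that the number of buckets NB equals the product of the m_i, and that a partial match query must examine the product of m_i over its unspecified attributes. The function names, the brute-force search, and the example query probabilities are all hypothetical and chosen only for illustration.

```python
from math import prod


def factorizations(nb, n):
    """Yield every n-tuple of positive integers whose product is nb."""
    if n == 1:
        yield (nb,)
        return
    for d in range(1, nb + 1):
        if nb % d == 0:
            for rest in factorizations(nb // d, n - 1):
                yield (d,) + rest


def expected_buckets(sizes, query_probs):
    """Average number of buckets examined per query.

    sizes[i] is the assumed number of hash values for attribute i.
    query_probs maps a frozenset of UNSPECIFIED attribute indices
    to the probability of that query type (assumed to sum to 1).
    """
    return sum(
        p * prod(sizes[i] for i in unspecified)
        for unspecified, p in query_probs.items()
    )


def best_design(nb, n, query_probs):
    """Exhaustively pick the n-tuple of sizes minimizing expected_buckets."""
    return min(
        factorizations(nb, n),
        key=lambda sizes: expected_buckets(sizes, query_probs),
    )


# Hypothetical non-uniform workload over 3 attributes and 8 buckets:
# each query type specifies exactly one attribute.
example_probs = {
    frozenset({1, 2}): 0.5,  # attribute 0 specified
    frozenset({0, 2}): 0.3,  # attribute 1 specified
    frozenset({0, 1}): 0.2,  # attribute 2 specified
}

if __name__ == "__main__":
    sizes = best_design(8, 3, example_probs)
    print(sizes, expected_buckets(sizes, example_probs))
```

Under these assumptions, the attribute that is specified most often receives the largest share of the bucket address space. The exhaustive search is only practical for small NB and few attributes, which is consistent with the abstract's interest in efficient special cases and heuristics.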

DOI: 10.1016/0020-0255(84)90049-5

Cite this paper

@article{Chang1984OptimalIR, title={Optimal information retrieval when queries are not random}, author={Chin-Chen Chang}, journal={Inf. Sci.}, year={1984}, volume={34}, pages={199-223} }