The classical bag-of-words models for information retrieval (IR) fail to capture contextual associations between words. In this article, we propose to investigate pure high-order dependence among a number of words forming an inseparable semantic entity, that is, the high-order dependence that cannot be reduced to the random coincidence of lower-order…
Quantum theory (QT) has recently been employed to advance the theory of information retrieval (IR). A typical method, namely the Quantum Probability Ranking Principle (QPRP), was proposed to re-rank top retrieved documents by considering the inter-dependencies between documents through "quantum interference". In this paper, we attempt to explore…
The classical bag-of-words models fail to capture contextual associations between words. We propose to investigate the "high-order pure dependence" among a number of words forming a semantic entity, i.e., the high-order dependence that cannot be reduced to the random coincidence of lower-order dependence. We believe that identifying these high-order pure…
Estimating the probability of relevance for a document is fundamental in information retrieval. From a theoretical point of view, the estimation process carries risk, in the sense that the estimated probabilities may not match the actual ones exactly. The estimation risk is often considered to be dependent on the rank. For example, the probability ranking…
Query expansion, while generally effective in improving retrieval performance, may lead to the query-drift problem. Following the recent development of applying Quantum Mechanics (QM) to IR, we investigate the problem from a novel theoretical perspective inspired by photon polarization (a key QM experiment).
Recently, Quantum Theory (QT) has been employed to advance the theory of Information Retrieval (IR). Various analogies between QT and IR have been established. Among them, a typical one is applying the idea of photon polarization in IR tasks, e.g., for document ranking and query expansion. In this paper, we aim to further extend this work by constructing a…
Typical dimensionality reduction (DR) methods are often data-oriented, focusing on directly reducing the number of random variables (features) while retaining the maximal variations in the high-dimensional data. In unsupervised situations, one of the main limitations of these methods lies in their dependency on the scale of data features. This paper aims…