Robert Wing Pong Luk

A novel probabilistic retrieval model is presented. It forms a basis to interpret the TF-IDF term weights as making relevance decisions. It simulates the local relevance decision-making for every location of a document, and combines all of these “local” relevance decisions as the “document-wide” relevance decision for the document.
This work was supported by the RGC CERG project PolyU 5065/98E and the Departmental Grant H-ZJ84. Abstract: Pattern discovery from time series is of fundamental importance. Particularly when the domain-expert-derived patterns do not exist or are not complete, an algorithm to discover specific patterns or shapes automatically from the ...
Introduction: This is my personal “summary in 337 one-liners” of A Survey in Indexing and Searching XML Documents by Luk et al. (2002) [1]. I focus on technical aspects, omitting all system names and references. In my opinion, one cannot learn any technique from the survey: it only mentions various techniques but does not explain any. Alas, my 337 one-liners ...
In an ad-hoc retrieval task, the query is usually short and the user expects to find the relevant documents in the first several result pages. We explored the possibility of using Wikipedia's articles as an external corpus to expand ad-hoc queries. Results show promising improvements on measures that emphasize weak queries.
This paper discusses various issues about the rank equivalence, due to Lafferty and Zhai, between the log-odds ratio and the query likelihood of probabilistic retrieval models. It highlights that Robertson’s concerns about this equivalence may arise when multiple probability distributions are assumed to be uniformly distributed, after assuming that the marginal ...