Relevant document distribution estimation method for resource selection
It is shown that the CORI algorithm does not do well in environments with a mix of "small" and "very large" databases, and a new resource selection algorithm is proposed that uses information about database sizes as well as database contents.
Composite hashing with multiple information sources
- Dan Zhang, Fei Wang, Luo Si
- Computer Science
- Annual International ACM SIGIR Conference on…
- 24 July 2011
This work addresses the problem of incorporating features from different information sources into binary hashing codes efficiently and effectively, and proposes CHMIS-AW (CHMIS with Adjusted Weights), an algorithm for learning the codes.
Flexible Mixture Model for Collaborative Filtering
FMM extends existing partitioning/clustering algorithms for collaborative filtering by clustering both users and items together simultaneously without assuming that each user and item should only belong to a single cluster.
Mining contrastive opinions on political texts using cross-perspective topic model
- Yi Fang, Luo Si, Naveen Somasundaram, Zhengtao Yu
- Computer Science
- Web Search and Data Mining
- 8 February 2012
An extensive set of experiments has been conducted to evaluate the proposed unsupervised topic model for contrastive opinion modeling, which simulates the generative process of how opinion words occur in the documents of different collections.
A semisupervised learning method to merge search engine results
This article presents a semisupervised learning solution to the result merging problem and demonstrates that this method is more effective than the well-known CORI result-merging algorithm under a variety of conditions.
A statistical model for scientific readability
Experiments show that this new method of using statistical models to estimate readability performs better than the widely used Flesch-Kincaid readability formula.
A language modeling framework for resource selection and results merging
- Luo Si, Rong Jin, Jamie Callan, Paul Ogilvie
- Computer Science
- International Conference on Information and…
- 4 November 2002
This paper extends the language modeling approach to integrate resource selection, ad-hoc searching, and merging of results from different text databases into a single probabilistic retrieval model, designed primarily for Intranet environments.
An automatic weighting scheme for collaborative filtering
An optimization algorithm automatically computes weights for different items based on their ratings from training users. The weights create a clustered distribution of user vectors in the item space, bringing users with similar interests closer together and pushing users with different interests farther apart.
- K. Balog, Yi Fang, M. de Rijke, P. Serdyukov, Luo Si
- Computer Science
- Foundations and Trends in Information Retrieval
- 12 August 2012
This survey highlights advances in models and algorithms relevant to expertise retrieval as an emerging subdiscipline in information retrieval, draws connections among methods proposed in the literature, and summarizes them in five groups of basic approaches.
A Bayesian Approach toward Active Learning for Collaborative Filtering
This paper goes one step further by taking into account the posterior distribution of the estimated model, which results in a more robust active learning algorithm.