The Linear Combination Data Fusion Method in Information Retrieval

Shengli Wu, Yaxin Bi, Xiaoqin Zeng
In information retrieval, data fusion has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility since different weights can be assigned to different component systems so as to obtain better fusion results. However, how to obtain suitable weights for all the component retrieval systems is still an open… 
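As a rough illustration of the linear combination idea described in the abstract, the sketch below fuses two result lists by summing each document's scores, weighted per component system. It is a minimal sketch, not the authors' implementation: the function name `linear_combination`, the example runs, and the weights are hypothetical, and the scores are assumed to be already normalized to [0, 1].

```python
from collections import defaultdict


def linear_combination(runs, weights):
    """Fuse several result lists by a weighted sum of document scores.

    runs    : list of dicts mapping doc_id -> score (assumed normalized to [0, 1])
    weights : one weight per run (illustrative values, not trained weights)
    A document missing from a run contributes nothing for that run.
    """
    if len(runs) != len(weights):
        raise ValueError("need exactly one weight per run")
    fused = defaultdict(float)
    for run, weight in zip(runs, weights):
        for doc_id, score in run.items():
            fused[doc_id] += weight * score
    # Highest fused score first; ties broken by doc_id for determinism.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical component results for a single query.
run_a = {"d1": 1.0, "d2": 0.5, "d3": 0.0}
run_b = {"d2": 1.0, "d3": 0.5, "d4": 0.0}

ranking = linear_combination([run_a, run_b], [0.75, 0.25])
for doc_id, score in ranking:
    print(doc_id, score)
```

With these made-up inputs, the more heavily weighted system's top document stays first. How to choose the weights is the open question the abstract points to, and this sketch does not address it.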

Combining Retrieval Results for Balanced Effectiveness and Efficiency in the Big Data Search Environment

Using 3 groups of historical runs from TREC for the experiment, it is found that with the weights trained by weighted linear regression, the linear combination method can achieve good results in effectiveness and efficiency.

Fusion in Information Retrieval: SIGIR 2018 Half-Day Tutorial

The goal of this half-day, intermediate-level tutorial is to provide a methodological view of the theoretical foundations of fusion approaches, the numerous fusion methods that have been devised, and the variety of applications to which fusion techniques have been applied.

Application of Data Fusion in the Web Track

This paper uses data fusion to test how to improve the results from different information retrieval systems and finds that all four runs submitted are better than all component results with one exception.

Mixture model with multiple centralized retrieval algorithms for result merging in federated search

A mixture probabilistic model is proposed to learn more appropriate combination weights with respect to different types of information sources with some training data to deal with heterogeneous information sources.

Differential Evolution-Based Fusion for Results Diversification of Web Search

This paper proposes a differential evolution-based method to find optimal weights in the weight space for the linear combination method and shows that the proposed method is effective compared with the state-of-the-art techniques.

New Re-ranking Approach in Merging Search Results

Two methods of merging search results are compared: a) applying formulas to re-evaluate documents based on different combinations of returned rank order, document titles, and snippets; b) a Top-Down Re-ranking algorithm (TDR) that gradually downloads top documents from each source, calculates their scores, and adds them to the final list.



Assigning appropriate weights for the linear combination data fusion method in information retrieval

Improving high accuracy retrieval by eliminating the uneven correlation effect in data fusion

The experimental results show that all eight data fusion methods involved outperform the best component system on average and demonstrate that the data fusion technique in general is effective with accurate retrieval results.

Selecting the N-Top Retrieval Result Lists for an Effective Data Fusion

This paper explores the combination of only the n-top result lists as an alternative to the fusion of all available data, and describes a heuristic measure based on redundancy and ranking information to evaluate the quality of each result list, and to select the presumably n-best lists per query.

Applying statistical principles to data fusion in information retrieval

Shengli Wu. Computer Science. 2007 IEEE International Conference on Systems, Man and Cybernetics, 2007.

Automatic combination of multiple ranked retrieval systems

This work proposes a method by which the relevance estimates made by different experts can be automatically combined to result in superior retrieval performance and applies the method to two expert combination tasks.

Fusion Via a Linear Combination of Scores

This work presents a thorough analysis of the capabilities of the linear combination (LC) model for fusion of information retrieval systems and introduces d, the difference between the average score on relevant documents and the average score on nonrelevant documents, as a performance measure that not only allows mathematical reasoning about system performance but also allows the selection of weights that generalize well to new documents.
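The measure d mentioned in this summary can be illustrated with a small sketch that follows only the one-sentence definition given here: mean score on relevant documents minus mean score on nonrelevant documents. The function name `score_separation`, the example scores, and the relevance judgments are hypothetical, and any normalization or other details used in the cited paper are not reproduced.

```python
def score_separation(scores, relevant):
    """Return d: mean score of relevant docs minus mean score of nonrelevant docs.

    scores   : dict mapping doc_id -> score assigned by a (fused) system
    relevant : set of doc_ids judged relevant (hypothetical judgments)
    """
    rel = [s for doc_id, s in scores.items() if doc_id in relevant]
    non = [s for doc_id, s in scores.items() if doc_id not in relevant]
    if not rel or not non:
        raise ValueError("need at least one relevant and one nonrelevant document")
    return sum(rel) / len(rel) - sum(non) / len(non)


# Made-up scores and judgments for a single query.
scores = {"d1": 1.0, "d2": 0.5, "d3": 0.25, "d4": 0.25}
relevant = {"d1", "d2"}

d = score_separation(scores, relevant)
print(d)  # mean(1.0, 0.5) - mean(0.25, 0.25) = 0.75 - 0.25 = 0.5
```

A larger d means the system separates relevant from nonrelevant documents more widely on average, which is the intuition behind using it to compare candidate weight settings.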

An outranking approach for rank aggregation in information retrieval

This paper proposes a rank aggregation method within a multiple criteria framework using aggregation mechanisms based on decision rules identifying positive and negative reasons for judging whether a document should get a better rank than another.

Segmentation of Search Engine Results for Effective Data-Fusion

This work proposes a new fusion method that partitions the rank lists of document retrieval systems into chunks and shows that the proposed method produces higher average precision values than previous systems across a range of testbeds.

Generative model-based metasearch for data fusion in information retrieval

A novel approach to the fusion problem: generative model-based Metasearch (GeM), which achieves a final ranking by listing documents in decreasing probability of generation under the induced model using Bayesian parameter estimation.

Predicting the performance of linearly combined IR systems

A new technique for analyzing combination models that supports qualitative conclusions about which IR systems should be combined, using linear regression to accurately predict the performance of the combined system from quantitative measurements of individual component systems taken from TREC5.