The Linear Combination Data Fusion Method in Information Retrieval

Shengli Wu, Yaxin Bi, Xiaoqin Zeng
In information retrieval, data fusion has been investigated by many researchers. Previous investigation and experimentation demonstrate that the linear combination method is an effective data fusion method for combining multiple information retrieval results. One advantage is its flexibility since different weights can be assigned to different component systems so as to obtain better fusion results. However, how to obtain suitable weights for all the component retrieval systems is still an open… 
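The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes each component system returns a raw score per document, that scores are min-max normalized before fusion, and that a document missing from a system's list contributes zero for that system. The function names, the sample systems `A` and `B`, and the weights are all hypothetical; none of them come from the paper, and the paper's own weighting scheme is not reproduced here.

```python
from collections import defaultdict


def min_max_normalize(scores):
    """Scale one system's raw scores into [0, 1] (an assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally scored.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def linear_combination(results, weights):
    """Fuse per-system scores: fused(d) = sum_i weights[i] * normalized_score_i(d).

    Documents absent from a system's list contribute 0 for that system.
    Returns (doc, fused_score) pairs, best first; ties are broken by doc id.
    """
    fused = defaultdict(float)
    for system, scores in results.items():
        for doc, s in min_max_normalize(scores).items():
            fused[doc] += weights[system] * s
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical component results and weights, for illustration only.
results = {
    "A": {"d1": 3.0, "d2": 1.0, "d3": 2.0},
    "B": {"d2": 10.0, "d3": 30.0, "d4": 20.0},
}
weights = {"A": 0.5, "B": 1.0}

ranking = linear_combination(results, weights)
for doc, score in ranking:
    print(doc, score)
```

With equal weights this reduces to a plain sum of normalized scores; unequal weights let a more effective system count for more, which is the flexibility the abstract refers to.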

Combining Retrieval Results for Balanced Effectiveness and Efficiency in the Big Data Search Environment

Experiments with three groups of historical TREC runs show that, with weights trained by weighted linear regression, the linear combination method can achieve good results in both effectiveness and efficiency.

Fusion in Information Retrieval: SIGIR 2018 Half-Day Tutorial

The goal of this half-day, intermediate-level tutorial is to provide a methodological view of the theoretical foundations of fusion approaches, the numerous fusion methods that have been devised, and a variety of applications to which fusion techniques have been applied.

Application of Data Fusion in the Web Track

This paper uses data fusion to test how to improve the results from different information retrieval systems and finds that, with one exception, all four submitted runs are better than all component results.

New Re-ranking Approach in Merging Search Results

Two methods of merging search results are compared: a) applying formulas to re-evaluate documents based on different combinations of returned order ranks, document titles and snippets; b) a Top-Down Re-ranking (TDR) algorithm that gradually downloads top documents from each source, calculates their scores and adds them to the final list.

Mixture model with multiple centralized retrieval algorithms for result merging in federated search

To deal with heterogeneous information sources, a probabilistic mixture model is proposed that uses some training data to learn more appropriate combination weights for different types of information sources.

Differential Evolution-Based Fusion for Results Diversification of Web Search

This paper proposes a differential evolution-based method to find optimal weights in the weight space for the linear combination method and shows that the proposed method is effective compared with the state-of-the-art techniques.



Assigning appropriate weights for the linear combination data fusion method in information retrieval

Improving high accuracy retrieval by eliminating the uneven correlation effect in data fusion

The experimental results show that all eight data fusion methods involved outperform the best component system on average and demonstrate that the data fusion technique in general is effective with accurate retrieval results.

Selecting the N-Top Retrieval Result Lists for an Effective Data Fusion

This paper explores the combination of only the n-top result lists as an alternative to the fusion of all available data, and describes a heuristic measure based on redundancy and ranking information to evaluate the quality of each result list, and to select the presumably n-best lists per query.

Data Fusion with Correlation Weights

This paper focuses on the effect of correlation on data fusion for multiple retrieval results. If some of the retrieval results involved in data fusion correlate more strongly than the others, …

Applying statistical principles to data fusion in information retrieval

  • Shengli Wu
  • Computer Science
    2007 IEEE International Conference on Systems, Man and Cybernetics
  • 2007

Regression Relevance Models for Data Fusion

  • Shengli Wu, Y. Bi, S. McClean
  • Computer Science
    18th International Workshop on Database and Expert Systems Applications (DEXA 2007)
  • 2007
This paper investigates how to model the relationship between rank and probability of relevance in the resultant document list for data fusion, since reliable relevance scores are very often unavailable for component results.

Automatic combination of multiple ranked retrieval systems

This work proposes a method by which the relevance estimates made by different experts can be automatically combined to result in superior retrieval performance and applies the method to two expert combination tasks.

Fusion Via a Linear Combination of Scores

This paper presents a thorough analysis of the capabilities of the linear combination (LC) model for fusion of information retrieval systems and introduces d, the difference between the average score on relevant documents and the average score on nonrelevant documents, as a performance measure which not only allows mathematical reasoning about system performance, but also allows the selection of weights which generalize well to new documents.
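The measure d described in this summary can be sketched directly from its definition. The example below assumes a system's output is a mapping from document id to score and that relevance judgments are given as a set of relevant document ids. The function name `d_measure` and the sample data are hypothetical, and the sketch covers only the definition of d, not the paper's weight-selection procedure.

```python
from statistics import mean


def d_measure(scores, relevant):
    """Average score on relevant documents minus average score on nonrelevant ones."""
    rel = [s for doc, s in scores.items() if doc in relevant]
    non = [s for doc, s in scores.items() if doc not in relevant]
    if not rel or not non:
        raise ValueError("need at least one relevant and one nonrelevant document")
    return mean(rel) - mean(non)


# Hypothetical scores and judgments, for illustration only.
scores = {"d1": 0.75, "d2": 0.25, "d3": 0.5, "d4": 0.0}
relevant = {"d1", "d3"}

d = d_measure(scores, relevant)
print(d)
```

A larger d means the system separates relevant from nonrelevant documents more clearly in score space.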

Estimating probabilities for effective data fusion

The use of existing IR evaluation metrics is proposed as a substitution for probability calculations, and Mean Average Precision is used to demonstrate the effectiveness of this approach, with evaluation results demonstrating competitive performance when compared with related algorithms with more onerous requirements for training data.

An outranking approach for rank aggregation in information retrieval

This paper proposes a rank aggregation method within a multiple criteria framework using aggregation mechanisms based on decision rules identifying positive and negative reasons for judging whether a document should get a better rank than another.