Extended Boolean information retrieval

  • Gerard Salton, Edward A. Fox, Harry Wu
  • Commun. ACM
In conventional information retrieval, Boolean combinations of index terms are used to formulate the users' information requests. While any document is in principle retrievable by a Boolean query, the amount of output obtainable by Boolean processing is difficult to control, and the retrieved items are not ranked in any presumed order of importance to the user population. In the vector processing model of retrieval, the retrieved items are easily ranked in decreasing order of the query-record… 
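
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the toy collection, the function names, and the choice of cosine similarity as the query-record similarity measure are all assumptions made for illustration. It shows only that a strict Boolean AND returns an unranked set, while a vector match returns a ranked list.

```python
import math

# Hypothetical toy collection: document name -> {index term: weight}.
DOCS = {
    "d1": {"boolean": 1.0, "retrieval": 1.0},
    "d2": {"vector": 1.0, "retrieval": 1.0, "ranking": 1.0},
    "d3": {"boolean": 1.0, "vector": 1.0, "retrieval": 1.0},
}


def boolean_and(docs, terms):
    """Return the unranked set of documents that contain every query term."""
    return {name for name, vec in docs.items() if all(t in vec for t in terms)}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def vector_rank(docs, query):
    """Return (score, name) pairs in decreasing order of similarity to the query."""
    scored = [(cosine(query, vec), name) for name, vec in docs.items()]
    # Sort by descending score; break ties by name so the order is deterministic.
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


if __name__ == "__main__":
    # Boolean AND: a document either matches or it does not; no ordering.
    print(boolean_and(DOCS, ["boolean", "vector"]))
    # Vector model: every document gets a score and a rank.
    for score, name in vector_rank(DOCS, {"boolean": 1.0, "vector": 1.0}):
        print(f"{name}\t{score:.3f}")
```

In this toy run the Boolean query matches a single document and gives no ordering information, whereas the vector query scores all three documents, including the partial matches that the Boolean AND rejects.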

Trial2Vec: Zero-Shot Clinical Trial Document Similarity Search using Self-Supervision

Trial2Vec is a zero-shot clinical trial retrieval method that learns through self-supervision, without the need to annotate similar clinical trials. Visualization shows that its embeddings are medically interpretable, and it achieves a 15% average improvement over the best baselines on precision/recall for trial retrieval.

The Fact Extraction and VERification (FEVER) Shared Task

The first Fact Extraction and VERification (FEVER) Shared Task challenged participants to classify whether human-written factoid claims could be SUPPORTED or REFUTED using evidence retrieved from Wikipedia.

Word Similarity Based Model for Tweet Stream Prospective Notification

This work proposes an adaptation of the extended Boolean model based on word similarity to estimate the relevance score of tweets and takes advantage of the word2vec model to capture the similarity between query terms and tweet terms.

Learning metrics for content-based medical image retrieval

  • John Collins, K. Okada
  • Computer Science
    2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
  • 2013
A medical CBIR system that adapts its similarity metric from data by using information-theoretic metric learning, and that systematically compares the SIFT bag-of-words-based system with various plug-in similarity measures available in the literature.

Approches de recherche multimédia dans des documents semi-structurés : utilisation du contexte textuel et structurel pour la sélection d'objets multimédia

The joint evolution of user needs and electronic documents continually raises new problems in the field of Information Retrieval (IR). If we consider the…

Integrating Document Features for Expert Finding

Both document internal structure and the novel multiple-window-based approach for taking into account multiple levels of associations improve expert finding over the direct use of a number of well-known IR models, both with and without query expansion.

Integrating multiple windows and document features for expert finding

This work argues that expert finding is more sensitive to multiple levels of associations and document features that current expert finding systems insufficiently address, and proposes a novel approach that integrates the above-mentioned three aspects as well as a query expansion technique in a two-stage model for expert finding.

Word Vectors and Quantum Logic: Experiments with negation and disjunction

A calculus which combined the flexible geometric structure of vector models with the crisp efficiency of Boolean logic would be extremely beneficial for modelling natural language. With this goal

An improved text classification modelling approach to identify security messages in heterogeneous projects

Using harvested security keywords as features to train a text classification model improves classification models and generalises significantly to other projects. The work also introduces new and promising approaches to constructing models that can generalise across different independent projects.



The Elements Of Integration

On Query Formulation in Information Retrieval

Experimental evidence indicates that relevance feedback is a most promising strategy for query term weights and the advantages and disadvantages of the Boolean and vector representations of queries are demonstrated.

On the role of words and phrases in automatic text analysis

A model known as discrimination value analysis is introduced, which assigns an appropriate role in the indexing operation to the terms, term phrases, and thesaurus classes.

Boolean Query Formulation with Relevance Feedback

Methods are outlined in this study for the automatic generation of Boolean search statements based on the natural language texts of initially available search requests and of previously retrieved document excerpts identified as relevant by the user population.

Information retrieval: on-line

This text treats current developments in the design, operation and evaluation of information retrieval systems operating in an on-line, real-time, time-shared, interactive mode. Look for coverage

Information retrieval systems; characteristics, testing, and evaluation

The SMART Retrieval System—Experiments in Automatic Document Processing

A theory of indexing

  • G. Salton
  • Computer Science
    Regional conference series in applied mathematics
  • 1975

Relevance weighting of search terms

This paper examines statistical techniques for exploiting relevance information to weight search terms using information about the distribution of index terms in documents in general and shows that specific weighted search methods are implied by a general probabilistic theory of retrieval.