Tf–idf

Known as: Tfxidf, Tf×idf, TF * IDF 
In information retrieval, tf–idf, short for term frequency–inverse document frequency, is a numerical statistic that is intended to reflect how…
Wikipedia
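
To make the definition above concrete, here is a minimal sketch of one common tf–idf variant: term frequency is the term's count divided by the document length, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the term. The function name `tf_idf`, the sample documents, and the whitespace tokenization are illustrative assumptions, not taken from any of the papers listed on this page; other weighting schemes (smoothed idf, log-scaled tf) are also widely used.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, tokenized on whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "the cats and the dogs".split(),
]
scores = tf_idf(docs)
```

In this toy corpus, "the" occurs in every document, so its idf is ln(1) = 0 and its weight vanishes, while "cat", which occurs in only one document, receives a higher weight than "sat", which occurs in two.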

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2012
Data mining technology helps extract usable knowledge from large data sets. The process of data collection and data dissemination…
  • table 1
  • table 2
  • table 3
  • table 4
  • table 7
Highly Cited
2008
A novel probabilistic retrieval model is presented. It forms a basis to interpret the TF-IDF term weights as making relevance…
  • table I
  • figure 1
  • figure 2
  • figure 3
  • table II
Highly Cited
2008
This paper proposes two novel image similarity measures for fast indexing via locality sensitive hashing. The similarity measures…
  • figure 1
  • table 1
  • figure 2
Highly Cited
2008
In the realm of machine learning for text classification, TF-IDF is the most widely used representation for real-valued feature…
  • table 1
  • figure 1
  • figure 2
  • figure 3
  • figure 4
Highly Cited
2007
An increasing number of database applications today require sophisticated approximate string matching capabilities. Examples of…
  • table 1
  • figure 1
  • figure 3
  • figure 4
Highly Cited
2005
This paper presents a new improved term frequency/inverse document frequency (TF-IDF) approach which uses confidence, support and…
Highly Cited
2003
In this paper, we examine the results of applying Term Frequency Inverse Document Frequency (TF-IDF) to determine what words in a…
  • table 2
  • table 1
  • figure 1
  • figure 2
Highly Cited
2003
This paper presents a mathematical definition of the "probability-weighted amount of information" (PWI), a measure of specificity…
  • table 1
  • figure 1
  • figure 2
  • figure 3
  • table 2
Highly Cited
2000
  • D. Hiemstra
  • International Journal on Digital Libraries
  • 2000
  • Corpus ID: 5230471
Abstract. This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is…
  • table 1
  • table 2
  • table 3
  • table 5
  • table 4
Highly Cited
1975
In a document retrieval, or other pattern matching environment where stored entities (documents) are compared with each other or…
  • figure 1
  • figure 2
  • figure 4
  • table I
  • table II