Text segmentation

Known as: Chinese word segmentation, Word segmentation, Word splitting 
Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to mental…
Wikipedia

Papers overview

Highly Cited
2005
Many language processing tasks can be reduced to breaking the text into segments with prescribed properties. Such tasks include…
Highly Cited
2004
Automatically segmenting unstructured text strings into structured records is necessary for importing the information contained…
Highly Cited
2002
The Pk evaluation metric, initially proposed by Beeferman, Berger, and Lafferty (1997), is becoming the standard measure for…
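The Pk metric named in the entry above can be illustrated with a minimal sketch. It follows the commonly described form of the metric: slide a window of width k over the text and count the positions where the reference and the hypothesis disagree on whether the two window ends fall in the same segment. The function names `lengths_to_labels` and `pk`, the choice to represent a segmentation as a list of segment lengths, and the default for k (half the mean reference segment length) are assumptions made for this illustration; none of them comes from the truncated abstract.

```python
def lengths_to_labels(segment_lengths):
    """Expand segment lengths, e.g. [3, 3], into per-unit segment ids [0, 0, 0, 1, 1, 1]."""
    labels = []
    for seg_id, length in enumerate(segment_lengths):
        labels.extend([seg_id] * length)
    return labels


def pk(reference, hypothesis, k=None):
    """Hypothetical helper: Pk error between two segmentations given as lists of segment lengths.

    Lower is better; 0.0 means the two segmentations agree at every probed pair.
    """
    ref = lengths_to_labels(reference)
    hyp = lengths_to_labels(hypothesis)
    if len(ref) != len(hyp):
        raise ValueError("both segmentations must cover the same number of units")
    n = len(ref)
    if k is None:
        # Assumed convention: half the mean reference segment length, at least 1.
        k = max(1, round(n / len(reference) / 2))
    if not 0 < k < n:
        raise ValueError("window size k must satisfy 0 < k < number of units")

    errors = 0
    for i in range(n - k):
        same_in_ref = ref[i] == ref[i + k]
        same_in_hyp = hyp[i] == hyp[i + k]
        # A probe is an error when the two segmentations disagree on
        # whether units i and i + k belong to the same segment.
        if same_in_ref != same_in_hyp:
            errors += 1
    return errors / (n - k)


if __name__ == "__main__":
    print(pk([3, 3], [3, 3], k=2))  # identical segmentations
    print(pk([3, 3], [6], k=2))     # hypothesis misses the only boundary
```

Passing segment lengths rather than boundary indices keeps the sketch short; a real evaluation would also need to settle how units (words or sentences) are counted.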
Highly Cited
2000
This paper describes a method for linear text segmentation which is twice as accurate and over seven times as fast as the state…
Highly Cited
2000
Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to…
Highly Cited
1999
This paper introduces a new statistical approach to automatically partitioning text into coherent segments. The approach is based…
Highly Cited
1997
We investigate the problem of text segmentation by topic. Applications for this task include topic tracking of…
Highly Cited
1997
This paper introduces a new statistical approach to partitioning text automatically into coherent segments. Our approach enlists…
Highly Cited
1993
This paper proposes a new indicator of text structure, called the lexical cohesion profile (LCP), which locates segment…
Highly Cited
1992
There is considerable interest in designing automatic systems that will scan a given paper document and store it on electronic…