
Text segmentation

Known as: Chinese word segmentation, Word segmentation, Word splitting 
Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to mental… 
Wikipedia

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
2017
This paper presents the IMS contribution to the CoNLL 2017 Shared Task. In the preprocessing step we employed a CRF POS… 
2012
We address the issue of consuming heterogeneous annotation data for Chinese word segmentation and part-of-speech tagging. We… 
2012
We present our new alignment model Model 3P for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of… 
2010
Sentence-level aligned parallel texts are important resources for a number of natural language processing (NLP) tasks and… 
2005
This paper proposes a chunking strategy to detect unknown words in Chinese word segmentation. First, a raw sentence is pre… 
2001
  • Yasuto Ishitani
  • 2001
  • Corpus ID: 24214889
A new method for information extraction from document images is proposed in this paper as the basis for a document reader which… 
Highly Cited
1999
Information Extraction (IE) is the process of analyzing natural language text or speech, and collecting information about… 
1997
We investigate the effects of lexicon size and stopwords on Chinese information retrieval using our method of short-word… 
Highly Cited
1994
Block based algorithms have found widespread use in image and video compression. However, popular algorithms such as JPEG, which…