Semantic Scholar
Text segmentation
Known as: Chinese word segmentation, Word segmentation, Word splitting
Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to mental…
Wikipedia
Related topics
15 relations
Cluster analysis
Delimiter
Document classification
Hidden Markov model
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
IMS at the CoNLL 2017 UD Shared Task: CRFs and Perceptrons Meet Neural Networks
Anders Björkelund, Agnieszka Falenska, Xiang Yu, Jonas Kuhn
Conference on Computational Natural Language…, 2017
Corpus ID: 8525137
This paper presents the IMS contribution to the CoNLL 2017 Shared Task. In the preprocessing step we employed a CRF POS…
Reducing Approximation and Estimation Errors for Chinese Lexical Processing with Heterogeneous Annotations
Weiwei Sun, Xiaojun Wan
Annual Meeting of the Association for…, 2012
Corpus ID: 470570
We address the issue of consuming heterogeneous annotation data for Chinese word segmentation and part-of-speech tagging. We…
Word segmentation through cross-lingual word-to-phoneme alignment
Felix Stahlberg, Tim Schlippe, S. Vogel, Tanja Schultz
Spoken Language Technology Workshop, 2012
Corpus ID: 2310379
We present our new alignment model Model 3P for cross-lingual word-to-phoneme alignment, and show that unsupervised learning of…
Fast-Champollion: A Fast and Robust Sentence Alignment Algorithm
Peng Li, Maosong Sun, Ping Xue
International Conference on Computational…, 2010
Corpus ID: 6734393
Sentence-level aligned parallel texts are important resources for a number of natural language processing (NLP) tasks and…
Natural Language and Information Systems, 13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008, London, UK, June 24-27, 2008, Proceedings
E. Kapetanios, V. Sugumaran, M. Spiliopoulou
International Conference on Applications of…, 2008
Corpus ID: 35068266
A Chunking Strategy Towards Unknown Word Detection in Chinese Word Segmentation
Guodong Zhou
International Joint Conference on Natural…, 2005
Corpus ID: 9945296
This paper proposes a chunking strategy to detect unknown words in Chinese word segmentation. First, a raw sentence is pre…
Model-based information extraction method tolerant of OCR errors for document images
Yasuto Ishitani
Proceedings of Sixth International Conference on…, 2001
Corpus ID: 24214889
A new method for information extraction from document images is proposed in this paper as the basis for a document reader which…
Highly Cited
A Statistical Information Extraction System for Turkish
Gökhan Tür, Dilek Z. Hakkani-Tür, Kemal Oflazer
1999
Corpus ID: 13429290
Information Extraction (IE) is the process of analyzing natural language text or speech, and collecting information about…
Lexicon Effects on Chinese Information Retrieval
K. Kwok
Conference on Empirical Methods in Natural…, 1997
Corpus ID: 6614339
We investigate the effects of lexicon size and stopwords on Chinese information retrieval using our method of short-word…
Highly Cited
Text segmentation in mixed-mode images
N. Chaddha, Rosen Sharma, Avneesh Agrawal, Anoop Gupta
Proceedings of 28th Asilomar Conference on…, 1994
Corpus ID: 58358347
Block based algorithms have found widespread use in image and video compression. However, popular algorithms such as JPEG, which…