Semantic Scholar
Text segmentation
Known as: Chinese word segmentation, Word segmentation, Word splitting
Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics. The term applies both to mental…
Wikipedia
Related topics
15 relations
Cluster analysis
Delimiter
Document classification
Hidden Markov model
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
A Character-Aware Encoder for Neural Machine Translation
Zhen Yang, Wei Chen, Feng Wang, Bo Xu
International Conference on Computational…
2016
Corpus ID: 16307400
This article proposes a novel character-aware neural machine translation (NMT) model that views the input sequences as sequences…
Reducing Approximation and Estimation Errors for Chinese Lexical Processing with Heterogeneous Annotations
Weiwei Sun, Xiaojun Wan
Annual Meeting of the Association for…
2012
Corpus ID: 470570
We address the issue of consuming heterogeneous annotation data for Chinese word segmentation and part-of-speech tagging. We…
Exploiting Shared Chinese Characters in Chinese Word Segmentation Optimization for Chinese-Japanese Machine Translation
Chenhui Chu, Toshiaki Nakazawa, Daisuke Kawahara, S. Kurohashi
European Association for Machine Translation…
2012
Corpus ID: 3015371
Unknown words and word segmentation granularity are two main problems in Chinese word segmentation for Chinese-Japanese Machine…
Fast-Champollion: A Fast and Robust Sentence Alignment Algorithm
Peng Li, Maosong Sun, Ping Xue
International Conference on Computational…
2010
Corpus ID: 6734393
Sentence-level aligned parallel texts are important resources for a number of natural language processing (NLP) tasks and…
Natural Language and Information Systems, 13th International Conference on Applications of Natural Language to Information Systems, NLDB 2008, London, UK, June 24-27, 2008, Proceedings
E. Kapetanios, V. Sugumaran, M. Spiliopoulou
International Conference on Applications of…
2008
Corpus ID: 35068266
Exploiting Unlabeled Text with Different Unsupervised Segmentation Criteria for Chinese Word Segmentation
Zhao Hai, Chunyu Kit
2008
Corpus ID: 5944954
This paper presents a novel approach to improve Chinese word segmentation (CWS) that attempts to utilize unlabeled data such as…
Automatic multimedia indexing: combining audio, speech, and visual information to index broadcast news
K. Ohtsuki, K. Bessho, Y. Matsuo, S. Matsunaga, Y. Hayashi
IEEE Signal Processing Magazine
2006
Corpus ID: 17586703
This paper describes an indexing system that automatically creates metadata for multimedia broadcast news content by integrating…
Highly Cited
A Statistical Information Extraction System for Turkish
Gökhan Tür, Dilek Z. Hakkani-Tür, Kemal Oflazer
1999
Corpus ID: 13429290
Information Extraction (IE) is the process of analyzing natural language text or speech, and collecting information about…
Lexicon Effects on Chinese Information Retrieval
K. Kwok
Conference on Empirical Methods in Natural…
1997
Corpus ID: 6614339
We investigate the effects of lexicon size and stopwords on Chinese information retrieval using our method of short-word…
Highly Cited
Text segmentation in mixed-mode images
N. Chaddha, Rosen Sharma, Avneesh Agrawal, Anoop Gupta
Proceedings of 28th Asilomar Conference on…
1994
Corpus ID: 58358347
Block based algorithms have found widespread use in image and video compression. However, popular algorithms such as JPEG, which…