Semantic Scholar
Text corpus
Known as: Text corpora, Linguistic corpus, Text item
In linguistics, a corpus (plural corpora) or text corpus is a large and structured set of texts (nowadays usually electronically stored and processed…
Wikipedia
Related topics
50 relations
Amarna letter EA 256
Amarna letter EA 365
Amarna letters–localities and their rulers
Amebis
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2010
Syntax-to-Morphology Mapping in Factored Phrase-Based Statistical Machine Translation from English to Turkish
Reyyan Yeniterzi, Kemal Oflazer
Annual Meeting of the Association for…
2010
Corpus ID: 14292100
We present a novel scheme to apply factored phrase-based SMT to a language pair with very disparate morphological structures. Our…
Highly Cited
2007
Efficient Handling of N-gram Language Models for Statistical Machine Translation
Marcello Federico, M. Cettolo
WMT@ACL
2007
Corpus ID: 603858
Statistical machine translation, as well as other areas of human language processing, have recently pushed toward the use of…
Review
2007
A review of text and image retrieval approaches for broadcast news video
Rong Yan, Alexander Hauptmann
Information retrieval (Boston)
2007
Corpus ID: 16483842
The effectiveness of a video retrieval system largely depends on the choice of underlying text and image retrieval components…
Highly Cited
2005
Identifying Non-Referential it: A Machine Learning Approach Incorporating Linguistically Motivated Patterns
Adriane Boyd, Whitney Gegg-Harrison, D. Byron
Annual Meeting of the Association for…
2005
Corpus ID: 7045889
In this paper, we present a machine learning system for identifying non-referential it. Types of non-referential it are examined…
Highly Cited
2004
Error Mining for Wide-Coverage Grammar Engineering
Gertjan van Noord
Annual Meeting of the Association for…
2004
Corpus ID: 2040944
Parsing systems which rely on hand-coded linguistic descriptions can only perform adequately in as far as these descriptions are…
Highly Cited
2002
Patterns and meanings: Using corpora for English language research and teaching. By ALAN PARTINGTON. (Studies in corpus linguistics 2.) Amsterdam & Philadelphia: John Benjamins, 1998
D. Noel
2002
Corpus ID: 165815406
Highly Cited
2000
Exploring Automatic Word Sense Disambiguation with Decision Lists and the Web
Eneko Agirre, David Martínez
SAIC@COLING
2000
Corpus ID: 1238985
The most effective paradigm for word sense disambiguation, supervised learning, seems to be stuck because of the knowledge…
Review
1989
Book Reviews: Machine Translation: Linguistic Characteristics of MT Systems and General Methodology of Evaluation
R. Mccardell
International Conference on Computational Logic
1989
Corpus ID: 10496230
This book (which has been long in the making!) is a compilation of a large number of papers written over the years (1971-1981…
Highly Cited
1961
The Histology of the Neurosecretory System of the Adult Female Desert Locust, Schistocerca gregaria
K. C. Highnam
1961
Corpus ID: 31721805
The pars intercerebralis of the brain of the desert locust contains about 2,400 cells in two groups, which stain with chrome…
Highly Cited
1937
GRAVIMETRIC METHOD FOR THE DETERMINATION OF SODIUM PREGNANDIOL GLUCURONIDATE (AN EXCRETION PRODUCT OF PROGESTERONE)
E. Venning
1937
Corpus ID: 6783082