Text corpus
Known as: Text corpora, Linguistic corpus, Text item
In linguistics, a corpus (plural corpora) or text corpus is a large and structured set of texts (nowadays usually electronically stored and processed…
Wikipedia
Related topics
50 relations
Amarna letter EA 256
Amarna letter EA 365
Amarna letters–localities and their rulers
Amebis
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2007
Polish tagger TaKIPI: rule based construction and optimization
Maciej Piasecki
2007
Corpus ID: 16144878
A large number of different tags, limited corpora and the free word order are the main causes of low accuracy of tagging in…
Highly Cited
2007
Efficient Handling of N-gram Language Models for Statistical Machine Translation
Marcello Federico, M. Cettolo
WMT@ACL
2007
Corpus ID: 603858
Statistical machine translation, as well as other areas of human language processing, have recently pushed toward the use of…
Highly Cited
2002
DARPA communicator: cross-system results for the 2001 evaluation
M. Walker, Alexander I. Rudnicky, +12 authors, D. Stallard
Interspeech
2002
Corpus ID: 10318532
This paper describes the evaluation methodology and results of the 2001 DARPA Communicator evaluation. The experiment spanned 6…
Highly Cited
2002
Patterns and meanings: Using corpora for English language research and teaching. By ALAN PARTINGTON. (Studies in corpus linguistics 2.) Amsterdam & Philadelphia: John Benjamins, 1998
D. Noel
2002
Corpus ID: 165815406
Highly Cited
2000
Extended Models and Tools for High-performance Part-of-speech
Masayuki Asahara, Yuji Matsumoto
International Conference on Computational…
2000
Corpus ID: 6533697
Statistical part-of-speech (POS) taggers achieve high accuracy and robustness when based on large scale manually tagged corpora…
Highly Cited
2000
Exploring Automatic Word Sense Disambiguation with Decision Lists and the Web
Eneko Agirre, David Martínez
SAIC@COLING
2000
Corpus ID: 1238985
The most effective paradigm for word sense disambiguation, supervised learning, seems to be stuck because of the knowledge…
Review
1989
Book Reviews: Machine Translation: Linguistic Characteristics of MT Systems and General Methodology of Evaluation
R. Mccardell
International Conference on Computational Logic
1989
Corpus ID: 10496230
This book (which has been long in the making!) is a compilation of a large number of papers written over the years (1971-1981…
Highly Cited
1981
The Progressive Construction of Mind
R. Lawler
Cognitive Sciences
1981
Corpus ID: 18830574
We propose a vision of the structure of knowledge and processes of learning based upon the particularity of experience. Highly…
Highly Cited
1961
The Histology of the Neurosecretory System of the Adult Female Desert Locust, Schistocerca gregaria
K. C. Highnam
1961
Corpus ID: 31721805
The pars intercerebralis of the brain of the desert locust contains about 2,400 cells in two groups, which stain with chrome…
Highly Cited
1937
GRAVIMETRIC METHOD FOR THE DETERMINATION OF SODIUM PREGNANDIOL GLUCURONIDATE (AN EXCRETION PRODUCT OF PROGESTERONE)
E. Venning
1937
Corpus ID: 6783082