Text corpus
Known as: Text corpora, Linguistic corpus, Text item
In linguistics, a corpus (plural corpora) or text corpus is a large and structured set of texts (nowadays usually electronically stored and processed…
Wikipedia
Related topics
50 relations
Amarna letter EA 256
Amarna letter EA 365
Amarna letters–localities and their rulers
Amebis
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited
2006
Why Are They Excited? Identifying and Explaining Spikes in Blog Mood Levels
K. Balog, G. Mishne, M. de Rijke
Conference of the European Chapter of the…
2006
Corpus ID: 15213905
We describe a method for discovering irregularities in temporal mood patterns appearing in a large corpus of blog posts, and…
Highly Cited
2006
Morphological Richness Offsets Resource Demand – Experiences in Constructing a POS Tagger for Hindi
Smriti Singh, Kuhoo Gupta, Manish Shrivastava, P. Bhattacharyya
Annual Meeting of the Association for…
2006
Corpus ID: 1739705
In this paper we report our work on building a POS tagger for a morphologically rich language, Hindi. The theme of the research…
Highly Cited
2004
SpamBayes: Effective open-source, Bayesian based, email classification system
T. Meyer, Brendon Whateley
International Conference on Email and Anti-Spam
2004
Corpus ID: 2368172
This paper introduces the SpamBayes classification engine and outlines the most important features and techniques which…
Highly Cited
2000
Extended Models and Tools for High-performance Part-of-speech
Masayuki Asahara, Yuji Matsumoto
International Conference on Computational…
2000
Corpus ID: 6533697
Statistical part-of-speech (POS) taggers achieve high accuracy and robustness when based on large scale manually tagged corpora…
Highly Cited
2000
Exploring Automatic Word Sense Disambiguation with Decision Lists and the Web
Eneko Agirre, David Martínez
SAIC@COLING
2000
Corpus ID: 1238985
The most effective paradigm for word sense disambiguation, supervised learning, seems to be stuck because of the knowledge…
Highly Cited
2000
Searching the web by constrained spreading activation
F. Crestani, Puay Leng Lee
Information Processing & Management
2000
Corpus ID: 11268803
Highly Cited
1981
The Progressive Construction of Mind
R. Lawler
Cognitive Sciences
1981
Corpus ID: 18830574
We propose a vision of the structure of knowledge and processes of learning based upon the particularity of experience. Highly…
Highly Cited
1980
The primary visual pathway through the corpus callosum: morphological and functional aspects in the cat
G. M. Innocenti
1980
Corpus ID: 143268783
Highly Cited
1961
Occurrence of a Hyperglycæmic Factor in the Corpus Cardiacum of an Insect
J. E. Steele
Nature
1961
Corpus ID: 4148518
ALMOST two decades have elapsed since Abramowitz et al.1 first reported the occurrence of a ‘diabetogenic’ factor in the sinus…
Highly Cited
1937
GRAVIMETRIC METHOD FOR THE DETERMINATION OF SODIUM PREGNANDIOL GLUCURONIDATE (AN EXCRETION PRODUCT OF PROGESTERONE)
E. Venning
1937
Corpus ID: 6783082