Text corpus

Known as: Text corpora, Language corpus, Text item 
In linguistics, a corpus (plural corpora) or text corpus is a large and structured set of texts (nowadays usually electronically stored and processed…
Wikipedia

Topic mentions per year, 1955-2018 (chart: y-axis 0 to 1000, x-axis 1955 to 2017)

Papers overview

Semantic Scholar uses AI to extract papers important to this topic.
2014
A labeled text corpus made up of Turkish papers' titles, abstracts and keywords is collected. The corpus includes 35 number of…
2006
Web text has been successfully used as training data for many NLP applications. While most previous work accesses web text…
  • table 1
  • table 2
  • table 3
  • table 4
  • table 5
Highly Cited
2005
Text categorization or classification is the automated assigning of text documents to pre-defined classes based on their contents…
  • table 1
  • table 2
  • figure 1
  • figure 2
  • figure 3
Highly Cited
2005
An important approach to text mining involves the use of natural-language information extraction. Information extraction (IE…
Highly Cited
2005
This paper presents a part-of-speech tagger which is specifically tuned for biomedical text. We have built the tagger with…
  • table 1
  • table 2
  • table 3
  • table 4
  • table 5
Highly Cited
2001
Most work in statistical parsing has focused on a single corpus: the Wall Street Journal portion of the Penn Treebank. While this…
  • table 1
  • table 2
  • table 3
  • table 5
  • table 6
Highly Cited
1998
Corpus-based approaches to word sense identification have flexibility and generality but suffer from a knowledge acquisition…
  • table 1
  • figure 1
  • figure 2
  • table 3
  • table 4
Highly Cited
1996
Many corpus-based natural language processing systems rely on text corpora that have been manually annotated with syntactic or…
Review
1993
There is a growing consensus that significant, rapid progress can be made in both text understanding and spoken language…
  • table 1
  • table 2
  • table 3
  • figure 5
Review
1992
The DARPA Spoken Language System (SLS) community has long taken a leadership position in designing, implementing, and globally…
  • table 2