The Reuters Corpus Volume 1 - from Yesterday's News to Tomorrow's Language Resources

Tony Rose, Mark Stevenson, Miles Whitehead
Reuters, the global information, news and technology group, has for the first time made available, free of charge, large quantities of archived Reuters news stories for use by research communities around the world. The Reuters Corpus Volume 1 (RCV1) includes over 800,000 news stories typical of the annual English language news output of Reuters. This paper describes the origins of RCV1, the motivations behind its creation, and how it differs from previous corpora. In addition we discuss the…

Publications citing this paper.

Wikipedia-Based Hybrid Document Representation for Textual News Classification

2016 3rd International Conference on Soft Computing & Machine Intelligence (ISCMI) • 2016

Visual Classifier Training for Text Document Retrieval

IEEE Transactions on Visualization and Computer Graphics • 2012

Pattern co-occurrence matrix to reduce the low frequency problem and effective pattern discovery

2016 International Conference on Computing, Analytics and Security Trends (CAST) • 2016

A Scalable Meta-Classifier Combining Search and Classification Techniques for Multi-Level Text Categorization

International Journal of Computational Intelligence and Applications • 2015

Effective pattern discovery by cleaning patterns with pattern co-occurrence matrix and PDCS deploying approach

2014 First International Conference on Networks & Soft Computing (ICNSC2014) • 2014



