Text Genre Detection Using Common Word Frequencies

@inproceedings{Stamatatos2000TextGD,
  title={Text Genre Detection Using Common Word Frequencies},
  author={Efstathios Stamatatos and Nikos Fakotakis and George K. Kokkinakis},
  booktitle={COLING},
  year={2000}
}
In this paper we present a method for detecting the text genre quickly and easily following an approach originally proposed in authorship attribution studies which uses as style markers the frequencies of occurrence of the most frequent words in a training corpus (Burrows, 1992). In contrast to this approach we use the frequencies of occurrence of the most frequent words of the entire written language. Using as testing ground a part of the Wall Street Journal corpus, we show that the most…
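The style markers described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a fixed list of very frequent words and represents a text by the relative frequency of each one. The word list `COMMON_WORDS`, the tokenizer, and the function name `common_word_frequencies` are hypothetical; the abstract does not say which word list, how many words, or which classifier the paper uses.

```python
import re
from collections import Counter

# Hypothetical stand-in for "the most frequent words of the entire written
# language"; the abstract does not specify the list or its length.
COMMON_WORDS = ["the", "of", "and", "a", "in", "to", "is", "was", "it", "for"]


def common_word_frequencies(text, common_words=COMMON_WORDS):
    """Return the relative frequency of each common word in `text`."""
    # Simplistic tokenizer (an assumption): lowercase runs of letters/apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return [0.0] * len(common_words)
    counts = Counter(tokens)
    # One feature per common word: occurrences divided by total token count.
    return [counts[w] / len(tokens) for w in common_words]


features = common_word_frequencies(
    "The price of the stock rose, and it was the talk of the market."
)
```

A vector like `features` could then be passed to any standard classifier trained on texts with known genres; that step is left out here because the abstract does not describe it.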

Citations

Publications citing this paper.

Genre Classification on German Novels

2015 26th International Workshop on Database and Expert Systems Applications (DEXA) • 2015

Genre identification for office document search and browsing

International Journal on Document Analysis and Recognition (IJDAR) • 2011

Cross-Lingual Genre Classification

EACL • 2012

Retrieval Models for Genre Classification

Scandinavian J. Inf. Systems • 2008

KI 2004: Advances in Artificial Intelligence

Susanne Biundo, Thom Frühwirth, Oliver Günther
Lecture Notes in Computer Science • 2004

Semantic Scholar estimates that this publication has 175 citations based on the available data.


References

Publications referenced by this paper.

Not Unless You Ask Nicely: The Interpretative Nexus Between Analysis and Information

J. Burrows
Literary and Linguistic Computing, 7(2), pp. 91-109. • 1992

Variation Across Speech and Writing

D. Biber
Cambridge University Press. • 1988

Word-patterns and Story-shapes: The Statistical Analysis of Narrative Style

J. Burrows
Literary and Linguistic Computing, 2(2), pp. 61-70. • 1987

Some Simple Measures of Richness of Vocabulary

A. Honore
Association for Literary and Linguistic Computing Bulletin, 7(2), pp. 172-177. • 1979

On a Distribution Law for Word Frequencies

H. Sichel
Journal of the American Statistical Association, 70, pp. 542-547. • 1975
