Unsupervised Decomposition of a Document into Authorial Components

Moshe Koppel, Navot Akiva, Idan Dershowitz, Nachum Dershowitz
We propose a novel unsupervised method for separating out distinct authorial components of a document. In particular, we show that, given a book artificially “munged” from two thematically similar biblical books, we can separate out the two constituent books almost perfectly. This allows us to automatically recapitulate many conclusions reached by Bible scholars over centuries of research. One of the key elements of our method is exploitation of differences in synonym choice by different…
This paper has 47 citations.
