On the “Calligraphy” of Books

@article{Marinho2017OnT,
  title={On the “Calligraphy” of Books},
  author={Vanessa Queiroz Marinho and Henrique Ferraz de Arruda and Thales S. Lima and Luciano da Fontoura Costa and Diego Raphael Amancio},
  journal={ArXiv},
  year={2017},
  volume={abs/1705.10415}
}
Authorship attribution is a natural language processing task that has been widely studied, often by considering small order statistics. In this paper, we explore a complex network approach to assign the authorship of texts based on their mesoscopic representation, in an attempt to capture the flow of the narrative. Indeed, as reported in this work, such an approach allowed the identification of the dominant narrative structure of the studied authors. This has been achieved due to the ability of… 
5 Citations

Accessibility and Trajectory-Based Text Characterization
TLDR
This work adopts an extension to the mesoscopic approach to represent text narratives, in which only the recurrent relationships among tagged parts of speech are considered to establish connections among sequential pieces of text (e.g., paragraphs).
Text characterization based on recurrence networks
TLDR
An extension to the mesoscopic approach to represent text narratives is adopted, in which only the recurrent relationships among tagged parts of speech are considered to establish connections among sequential pieces of text (e.g., paragraphs).

References

Representation of texts as complex networks: a mesoscopic approach
TLDR
This work has devised a network model which is able to analyze documents in a multi-scale fashion, and shows that the mesoscopic structure of a document, modeled as a network, reveals many semantic traits of texts.
Text Authorship Identified Using the Dynamics of Word Co-Occurrence Networks
TLDR
This study introduces a methodology based on the dynamics of word co-occurrence networks representing written texts to classify a corpus of 80 texts by 8 authors, paving the way for a robust description of large texts in terms of small evolving networks.
Authorship attribution based on Life-Like Network Automata
TLDR
This work proposed a novel method to characterize text networks that considers both topological and dynamical aspects of networks, using concepts and methods from cellular automata theory, and devised a strategy to grasp informative spatio-temporal patterns from this model.
Authorship Attribution via Network Motifs Identification
TLDR
The goal of this paper is to apply the concept of motifs, recurrent interconnection patterns, in the authorship attribution task and show that motifs are able to distinguish the writing style of different authors.
A survey of modern authorship attribution methods
TLDR
A survey of recent advances of the automated approaches to attributing authorship is presented, examining their characteristics for both text representation and text classification.
Topic segmentation via community detection in complex networks
TLDR
This work proposes a novel network representation whose main purpose is to capture the semantic relationships of words in a simple way, and shows that the proposed representations favor the emergence of communities of semantically related words, a feature that may be used to identify relevant topics.
Authorship Attribution Using Word Network Features
TLDR
A set of novel features, derived from a word network representation of natural language text, is explored; these are suitable as features for machine-learning-based authorship attribution of documents.
Quantitative Authorship Attribution: An Evaluation of Techniques
TLDR
A comparison of thirty-nine different types of textual measurements commonly used in attribution studies is presented, in order to determine which are the best indicators of authorship.
Computational methods in authorship attribution
TLDR
Three scenarios are considered here for which solutions to the basic attribution problem are inadequate; for each, it is shown how machine learning methods can be adapted to handle the special challenges of that variant.
Using network science and text analytics to produce surveys in a scientific topic