Spectral Methods Cluster Words of the Same Class in a Syntactic Dependency Network

@article{FerreriCancho2007SpectralMC,
  title={Spectral Methods Cluster Words of the Same Class in a Syntactic Dependency Network},
  author={Ramon Ferrer-i-Cancho and Andrea Capocci and Guido Caldarelli},
  journal={Int. J. Bifurc. Chaos},
  year={2007},
  volume={17},
  pages={2453--2463}
}
We analyze here a particular kind of linguistic network where vertices represent words and edges stand for syntactic relationships between words. The statistical properties of these networks have recently been studied, and various features such as the small-world phenomenon and a scale-free distribution of degrees have been found. Our work focuses on four classes of words: verbs, nouns, adverbs and adjectives. Here, we use spectral methods to sort vertices. We show that the ordering clusters…
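
The idea of sorting vertices with a spectral method can be illustrated with a minimal sketch. It assumes one common variant, ordering vertices by the Fiedler vector (the eigenvector for the second-smallest eigenvalue of the graph Laplacian), which may differ from the procedure the authors actually use. The toy graph, the function name `fiedler_order`, and the iteration count are hypothetical and chosen only for illustration.

```python
import math


def fiedler_order(n, edges, iters=2000):
    """Order vertices 0..n-1 by the Fiedler vector of L = D - A.

    Assumption: undirected, unweighted, connected graph. This is an
    illustrative variant, not necessarily the method used in the paper.
    """
    # Build adjacency lists and vertex degrees.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    deg = [len(a) for a in adj]

    # All eigenvalues of L lie in [0, 2 * max degree], so shift*I - L
    # is positive definite and power iteration can be applied to it.
    shift = 2 * max(deg) + 1

    # Deterministic, non-constant start vector.
    x = [float(i) for i in range(n)]
    for _ in range(iters):
        # Deflate the constant vector (eigenvalue 0 of L).
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        # Multiply by (shift*I - L) = (shift - deg)*I + A.
        y = [(shift - deg[i]) * x[i] + sum(x[j] for j in adj[i])
             for i in range(n)]
        norm = math.sqrt(sum(yi * yi for yi in y))
        x = [yi / norm for yi in y]

    # Sorting by the eigenvector entries gives the spectral ordering.
    return sorted(range(n), key=lambda i: x[i]), x


# Hypothetical toy graph: two 4-vertex cliques joined by a single edge.
edges = [(u, v) for u in range(4) for v in range(u + 1, 4)]
edges += [(u, v) for u in range(4, 8) for v in range(u + 1, 8)]
edges.append((3, 4))

order, vec = fiedler_order(8, edges)
print(order)
```

In this toy case the vertices of each tightly connected group end up next to each other in the ordering. In the setting described above, the vertices would be words and the edges syntactic relationships between them.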

Function Nodes in Chinese Syntactic Networks
TLDR
Based on two syntactic dependency networks derived from two Chinese treebanks of different registers, a statistical study is conducted regarding word frequency and distributions and shows that all three function words are central nodes of the Chinese syntactic networks but have different status.
The effect of linguistic constraints on the large scale organization of language
TLDR
It is hypothesized that many of the network statistics reported here are in fact functions of the distribution of the underlying data from which the network is built and may not be indicative of the nature of the network concerned.
The Modular Community Structure of Linguistic Predication Networks
TLDR
A semantically motivated measure of predication strength is introduced to weight relevant predications observed in text and shows that predications do indeed form modular structures without any weighting and that using predication strength increases this modularity without discarding low-frequency items.
Unsupervised Parts-of-Speech Induction for Bengali
TLDR
A study of the word interaction networks of Bengali in the framework of complex networks reveals interesting insights into the morpho-syntax of the language, whereas clustering helps in the induction of the natural word classes leading to a principled way of designing POS tagsets.
How does language change as a lexical network? An investigation based on written Chinese word co-occurrence networks
TLDR
The hierarchy of Chinese lexical networks has indeed evolved over time at three different levels, and the connections of words at the micro level are continually weakening; the number of words in the meso-level communities has increased significantly; and the network is expanding at the macro level.

References

SHOWING 1-10 OF 99 REFERENCES
The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth
TLDR
A simple model for semantic growth is described, in which each new word or concept is connected to an existing network by differentiating the connectivity pattern of an existing node, which generates appropriate small-world statistics and power-law connectivity distributions.
Global organization of the Wordnet lexicon
  • M. Sigman, G. Cecchi
  • Computer Science
    Proceedings of the National Academy of Sciences of the United States of America
  • 2002
TLDR
A quantitative study of the graph structure of Wordnet to understand the global organization of the lexicon and shows that Wordnet has global properties common to many self-organized systems, and polysemy organizes the semantic graph in a compact and categorical representation, in a way that may explain the ubiquity of polysemy across languages.
Patterns in syntactic dependency networks.
TLDR
It is shown that different syntactic dependency networks (from Czech, German, and Romanian) share many nontrivial statistical patterns such as the small world phenomenon, scaling in the distribution of degrees, and disassortative mixing.
Topology of the conceptual network of language.
TLDR
This work maps out the conceptual network of the English language, with the connections being defined by the entries in a Thesaurus dictionary, and finds that this network presents a small-world structure, and appears to exhibit an asymptotic scale-free feature with algebraic connectivity distribution.
The small world of human language
TLDR
It is shown that graphs of words in sentences display two important features recently found in a disparate number of complex systems, the so-called small-world effect and a scale-free distribution of degrees.
Detecting communities in large networks
Method to find community structures based on information centrality.
TLDR
An algorithm of hierarchical clustering that consists in finding and removing iteratively the edge with the highest information centrality is developed that is very effective especially when the communities are very mixed and hardly detectable by the other methods.
Entropic Analysis of the Role of Words in Literary Texts
TLDR
It is shown that there is a quantitative relation between the role of content words in literary English and the Shannon information entropy defined over an appropriate probability distribution.
Euclidean distance between syntactically linked words.
  • R. Ferrer i Cancho
  • Computer Science
    Physical review. E, Statistical, nonlinear, and soft matter physics
  • 2004
TLDR
The Euclidean distance between syntactically linked words in sentences predicts, under ideal conditions, an exponential distribution of the distance between linked words, a trend that can be identified in real sentences.