Spectral Methods Cluster Words of the Same Class in a Syntactic Dependency Network
@article{FerreriCancho2007SpectralMC, title={Spectral Methods Cluster Words of the Same Class in a Syntactic Dependency Network}, author={Ramon Ferrer-i-Cancho and Andrea Capocci and Guido Caldarelli}, journal={Int. J. Bifurc. Chaos}, year={2007}, volume={17}, pages={2453-2463} }
We analyze here a particular kind of linguistic network where vertices represent words and edges stand for syntactic relationships between words. The statistical properties of these networks have been recently studied and various features such as the small-world phenomenon and a scale-free distribution of degrees have been found. Our work focuses on four classes of words: verbs, nouns, adverbs and adjectives. Here, we use spectral methods to sort vertices. We show that the ordering clusters…
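As a rough illustration of what sorting vertices with a spectral method can look like, the sketch below orders the vertices of a small undirected graph by the entries of the Fiedler vector (the eigenvector of the second-smallest eigenvalue of the graph Laplacian). This is an assumption made for illustration: the truncated abstract does not say which matrix or eigenvector the authors use. The function name `fiedler_order`, the toy graph, and the labels `a1`…`b3` are all hypothetical and are not taken from the paper's data.

```python
import numpy as np


def fiedler_order(adjacency, labels):
    """Return labels sorted by their entry in the Fiedler vector.

    Assumption: an undirected, connected graph given as a symmetric
    0/1 adjacency matrix. This is one common spectral ordering, not
    necessarily the one used in the paper.
    """
    A = np.asarray(adjacency, dtype=float)
    # Combinatorial Laplacian: degree matrix minus adjacency matrix.
    laplacian = np.diag(A.sum(axis=1)) - A
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    fiedler = eigvecs[:, 1]
    order = np.argsort(fiedler)
    return [labels[i] for i in order]


# Hypothetical toy graph: two triangles joined by the single edge a3-b1.
labels = ["a1", "a2", "a3", "b1", "b2", "b3"]
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

ordering = fiedler_order(adjacency, labels)
print(ordering)
```

In this toy case the two densely connected groups end up in contiguous stretches of the ordering. Which group comes first is arbitrary, because an eigenvector is only defined up to sign.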
40 Citations
Function Nodes in Chinese Syntactic Networks
- Computer Science
- 2016
Based on two syntactic dependency networks derived from two Chinese treebanks of different registers, a statistical study is conducted regarding word frequency and distributions and shows that all three function words are central nodes of the Chinese syntactic networks but have different status.
The effect of linguistic constraints on the large scale organization of language
- Computer Science
- ArXiv
- 2011
It is hypothesized that many of the network statistics studied here are in fact functions of the distribution of the underlying data from which the network is built and may not be indicative of the nature of the network concerned.
Relationships among the statistical parameters in evolving modern Chinese linguistic co-occurrence networks
- Mathematics
- Physica A: Statistical Mechanics and its Applications
- 2019
The Modular Community Structure of Linguistic Predication Networks
- Computer Science
- TextGraphs@EMNLP
- 2014
A semantically motivated measure of predication strength is introduced to weight relevant predications observed in text and shows that predications do indeed form modular structures without any weighting and that using predication strength increases this modularity without discarding low-frequency items.
Spectral analysis of Chinese language: Co-occurrence networks from four literary genres
- Computer Science
- 2016
Network measures: A new paradigm towards reliable novel word sense detection
- Computer Science
- Inf. Process. Manag.
- 2020
Unsupervised Parts-of-Speech Induction for Bengali
- Computer Science
- LREC
- 2008
A study of the word interaction networks of Bengali in the framework of complex networks reveals interesting insights into the morpho-syntax of the language, whereas clustering helps in the induction of the natural word classes leading to a principled way of designing POS tagsets.
How does language change as a lexical network? An investigation based on written Chinese word co-occurrence networks
- Computer Science
- PloS one
- 2018
The hierarchy of Chinese lexical networks has indeed evolved over time at three different levels, and the connections of words at the micro level are continually weakening; the number of words in the meso-level communities has increased significantly; and the network is expanding at the macro level.
Applications of graph theory to an English rhyming corpus
- Computer Science
- Comput. Speech Lang.
- 2011
References
The Large-Scale Structure of Semantic Networks: Statistical Analyses and a Model of Semantic Growth
- Computer Science
- Cogn. Sci.
- 2005
A simple model for semantic growth is described, in which each new word or concept is connected to an existing network by differentiating the connectivity pattern of an existing node, which generates appropriate small-world statistics and power-law connectivity distributions.
Global organization of the Wordnet lexicon
- Computer Science
- Proceedings of the National Academy of Sciences of the United States of America
- 2002
A quantitative study of the graph structure of Wordnet, aimed at understanding the global organization of the lexicon, shows that Wordnet has global properties common to many self-organized systems, and that polysemy organizes the semantic graph in a compact and categorical representation, in a way that may explain the ubiquity of polysemy across languages.
Patterns in syntactic dependency networks.
- Linguistics
- Physical review. E, Statistical, nonlinear, and soft matter physics
- 2004
It is shown that different syntactic dependency networks (from Czech, German, and Romanian) share many nontrivial statistical patterns such as the small world phenomenon, scaling in the distribution of degrees, and disassortative mixing.
Topology of the conceptual network of language.
- Computer Science
- Physical review. E, Statistical, nonlinear, and soft matter physics
- 2002
This work maps out the conceptual network of the English language, with the connections being defined by the entries in a Thesaurus dictionary, and finds that this network presents a small-world structure, and appears to exhibit an asymptotic scale-free feature with algebraic connectivity distribution.
The small world of human language
- Linguistics
- Proceedings of the Royal Society of London. Series B: Biological Sciences
- 2001
It is shown that graphs of words in sentences display two important features recently found in a disparate number of complex systems, the so-called small-world effect and a scale-free distribution of degrees.
Method to find community structures based on information centrality.
- Computer Science
- Physical review. E, Statistical, nonlinear, and soft matter physics
- 2004
A hierarchical clustering algorithm is developed that iteratively finds and removes the edge with the highest information centrality; it is very effective, especially when the communities are very mixed and hardly detectable by other methods.
Entropic Analysis of the Role of Words in Literary Texts
- Linguistics
- Adv. Complex Syst.
- 2002
It is shown that there is a quantitative relation between the role of content words in literary English and the Shannon information entropy defined over an appropriate probability distribution.
Euclidean distance between syntactically linked words.
- Computer Science
- Physical review. E, Statistical, nonlinear, and soft matter physics
- 2004
The Euclidean distance between syntactically linked words in sentences predicts, under ideal conditions, an exponential distribution of the distance between linked words, a trend that can be identified in real sentences.