Hierarchical Information Clustering by Means of Topologically Embedded Graphs

  • Won-min Song, Tiziana di Matteo, Tomaso Aste
  • PLoS ONE
We introduce a graph-theoretic approach to extract clusters and hierarchies in complex data-sets in an unsupervised and deterministic manner, without the use of any prior information. This is achieved by building topologically embedded networks containing the subset of most significant links and analyzing the network structure. For a planar embedding, this method provides both the intra-cluster hierarchy, which describes the way clusters are composed, and the inter-cluster hierarchy which… 

HiSCF: leveraging higher-order structures for clustering analysis in biological networks

The promising performance of HiSCF demonstrates that considering higher-order network motifs yields new insight into the analysis of biological networks, such as the identification of overlapping protein complexes and the inference of new signaling pathways, and also reveals the rich higher-order organizational structures present in biological networks.

Topological Strata of Weighted Complex Networks

This work introduces a novel method, based on persistent homology, to detect particular non-local structures, akin to weighted holes within the link-weight network fabric, that are invisible to existing methods. It creates the first bridge between network theory and algebraic topology, allowing the toolset of algebraic methods to be imported into the study of complex systems.

Exploring complex networks via topological embedding on surfaces.

It is proved, with examples, that topologically embedded graphs can be built so as to contain arbitrary complex networks as subgraphs; this method opens a new avenue for building geometrically embedded networks on hyperbolic manifolds.

Network Filtering for Big Data: Triangulated Maximally Filtered Graph

We propose a network-filtering method, the Triangulated Maximally Filtered Graph (TMFG), that provides an approximate solution to the Weighted Maximal Planar Graph problem. The underlying idea of

Learning Clique Forests

The algorithm, named Maximally Filtered Clique Forest (MFCF), produces a clique forest and an associated Markov Random Field by generalising Prim's minimum spanning tree algorithm, and it is shown that the MFCF outperforms the Graphical Lasso for a number of classes of matrices.
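For orientation, the baseline that this summary says is being generalised, Prim's minimum spanning tree algorithm, can be sketched as follows. This is only the textbook procedure on a toy graph, not the clique-forest method itself; the function name `prim_mst` and the `toy` weights are illustrative assumptions.

```python
import heapq

def prim_mst(weights):
    """Return the edges (u, v, w) of a minimum spanning tree.

    `weights` is a symmetric dict-of-dicts for a connected graph:
    weights[u][v] is the weight of edge (u, v).
    """
    start = next(iter(weights))
    visited = {start}
    # Heap of candidate edges (weight, u, v) leaving the visited set.
    heap = [(w, start, v) for v, w in weights[start].items()]
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < len(weights):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # edge no longer crosses the cut
        visited.add(v)
        tree.append((u, v, w))
        for nxt, w2 in weights[v].items():
            if nxt not in visited:
                heapq.heappush(heap, (w2, v, nxt))
    return tree

# Hypothetical toy graph, for illustration only.
toy = {
    "a": {"b": 1.0, "c": 4.0},
    "b": {"a": 1.0, "c": 2.0, "d": 5.0},
    "c": {"a": 4.0, "b": 2.0, "d": 3.0},
    "d": {"b": 5.0, "c": 3.0},
}
mst = prim_mst(toy)
```

The tree grows one vertex at a time by always taking the cheapest edge that leaves the current tree; the summary above describes a generalisation of this greedy growth from single edges to cliques.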

A Skeleton-based Community Detection Algorithm for Directed Networks

  • Hao Long, Tong Wu, Hongyan Yin
  • Computer Science
    2020 IEEE 3rd International Conference on Information Systems and Computer Aided Education (ICISCAE)
  • 2020
A novel skeleton-based community detection algorithm for directed networks is proposed. It extends the edge intensity defined for undirected graphs to directed ones and extracts the skeleton chain as a profile of the original directed network; with iterative splitting of the network skeleton and the extended intensity-based modularity, disjoint communities for directed networks can be accurately retrieved.

A Pólya urn approach to information filtering in complex networks

A filtering methodology is proposed that is inspired by the Pólya urn, a combinatorial model driven by a self-reinforcement mechanism; it relies on a family of null hypotheses that can be calibrated to assess which links are statistically significant with respect to a given network’s own heterogeneity.

Community detection for correlation matrices

This work introduces, via a consistent redefinition of null models based on random matrix theory, the appropriate correlation-based counterparts of the most popular community detection techniques. The resulting methods can filter out both unit-specific noise and system-wide dependencies, and the communities they produce are internally correlated and mutually anti-correlated.
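One way a random-matrix null model can separate noise from structure is through the Marchenko-Pastur bounds on the eigenvalues of a correlation matrix built from uncorrelated series. The sketch below assumes N series of T observations with T > N, and it treats the single largest eigenvalue as the system-wide mode; both the helper names and that three-way split are illustrative assumptions, not the cited work's exact procedure.

```python
import math

def marchenko_pastur_bounds(n_series, n_obs):
    """Eigenvalue bounds expected for a purely random correlation matrix.

    Assumes unit-variance, uncorrelated series and n_obs > n_series.
    """
    q = n_series / n_obs
    lower = (1.0 - math.sqrt(q)) ** 2
    upper = (1.0 + math.sqrt(q)) ** 2
    return lower, upper

def split_spectrum(eigenvalues, n_series, n_obs):
    """Split eigenvalues into noise, candidate group modes, and a global mode.

    Assumption for illustration: the largest eigenvalue above the upper
    bound is the system-wide mode, the others above it are group modes.
    """
    _, upper = marchenko_pastur_bounds(n_series, n_obs)
    ordered = sorted(eigenvalues)
    noise = [lam for lam in ordered if lam <= upper]
    signal = [lam for lam in ordered if lam > upper]
    return noise, signal[:-1], signal[-1:]

# Hypothetical spectrum for 100 series observed 400 times.
bounds = marchenko_pastur_bounds(100, 400)
noise, groups, market = split_spectrum([0.5, 1.0, 2.0, 3.0, 20.0], 100, 400)
```

Eigenvalues inside the bounds are compatible with unit-specific noise, while the largest one typically reflects a dependency shared by all units; what remains in between is the part a community detection step would work on.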

Multi-resolution functional summarization and alignment of biological network models

This dissertation aims to build frameworks that allow biologists to rapidly visualize the processes that govern biological systems via: 1) functional organization within a biological network (intra-system processes), and 2) functional relationships between biological networks (inter-system processes).

Multiscale Embedded Gene Co-expression Network Analysis

A new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) is developed by introducing quality control of co-expression similarities, parallelizing embedded network construction, and developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs).



Community structure in social and biological networks

  • M. Girvan, M. Newman
  • Computer Science
    Proceedings of the National Academy of Sciences of the United States of America
  • 2002
This article proposes a method for detecting communities, built around the idea of using centrality indices to find community boundaries, and tests it on computer-generated and real-world graphs whose community structure is already known and finds that the method detects this known structure with high sensitivity and reliability.
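The idea of using a centrality index to locate community boundaries can be sketched with shortest-path edge betweenness: edges that carry many shortest paths are removed until the graph splits. The code below is a minimal pure-Python sketch for small unweighted, undirected graphs; the function names and the toy graph (two triangles joined by one bridge) are assumptions for illustration.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path edge betweenness (Brandes-style accumulation)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s, counting shortest paths and recording predecessors.
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate path dependencies from the farthest nodes to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    # Every pair of endpoints was counted once from each side.
    return {e: b / 2.0 for e, b in score.items()}

def components(adj):
    """Connected components as a list of node sets."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in part:
                part.add(v)
                stack.extend(adj[v] - part)
        seen |= part
        parts.append(part)
    return parts

def split_once(adj):
    """Remove highest-betweenness edges until one more component appears."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}
    target = len(components(adj)) + 1
    while len(components(adj)) < target:
        scores = edge_betweenness(adj)
        if not scores:
            break  # no edges left to remove
        u, v = tuple(max(scores, key=scores.get))
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge 2-3.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
parts = split_once(toy)
```

On this toy graph the bridge carries every shortest path between the two triangles, so it is removed first and the two triangles are returned as separate communities. Betweenness is recomputed after each removal, which is what makes the approach costly on large graphs.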

Resolution limit in community detection

It is found that modularity optimization may fail to identify modules smaller than a scale which depends on the total size of the network and on the degree of interconnectedness of the modules, even in cases where modules are unambiguously defined.
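A small worked example makes this limit concrete. The sketch below uses a commonly cited illustration, a ring of identical cliques joined by single links, and compares the modularity of "one community per clique" with "adjacent cliques merged in pairs". The specific sizes (30 cliques of 5 nodes) and the helper name `modularity` are choices made here for illustration.

```python
def modularity(communities, total_links):
    """Modularity from (internal links, total degree) of each community."""
    return sum(
        l / total_links - (d / (2 * total_links)) ** 2
        for l, d in communities
    )

n_cliques, clique_size = 30, 5
internal = clique_size * (clique_size - 1) // 2   # links inside one clique
total_links = n_cliques * (internal + 1)          # plus one ring link per clique
degree = 2 * internal + 2                         # each clique touches two ring links

# Partition A: every clique is its own community.
q_single = modularity([(internal, degree)] * n_cliques, total_links)

# Partition B: adjacent cliques merged in pairs (the joining link becomes internal).
q_pairs = modularity([(2 * internal + 1, 2 * degree)] * (n_cliques // 2), total_links)

print(round(q_single, 4), round(q_pairs, 4))
```

Although each clique is an unambiguous module, the merged partition scores higher, so a modularity maximiser would prefer it.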

The use of dynamical networks to detect the hierarchical organization of financial market sectors

Two kinds of filtered networks: minimum spanning trees (MSTs) and planar maximally filtered graphs (PMFGs) are constructed from dynamical correlations computed over a moving window. We study
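A correlation computed over a moving window can be sketched in a few lines. The example below assumes Pearson correlation and the mapping d = sqrt(2(1 - rho)) from correlation to distance, which is a common choice for correlation-based networks but is not stated in the summary above; the function names and the toy series are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def rolling_distance(x, y, window):
    """Distance sqrt(2 * (1 - rho)) for each position of a moving window."""
    out = []
    for start in range(len(x) - window + 1):
        rho = pearson(x[start:start + window], y[start:start + window])
        # Clamp at zero to guard against rounding when rho is very close to 1.
        out.append(math.sqrt(max(0.0, 2.0 * (1.0 - rho))))
    return out

# Hypothetical toy series: perfectly correlated at first, then diverging.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 6.0, 8.0, 7.0, 6.0]
distances = rolling_distance(x, y, window=4)
```

Repeating this for every pair of series yields one distance matrix per window position, from which a filtered network such as an MST can be rebuilt at each time step.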

A tool for filtering information in complex systems.

A technique to filter complex data sets by extracting a subgraph of representative links; it is especially suitable for correlation-based graphs, giving filtered graphs that preserve the hierarchical organization of the minimum spanning tree but contain a larger amount of information in their internal structure.

Clustering cancer gene expression data: a comparative study

This study presents the first large-scale analysis of seven different clustering methods and four proximity measures for the analysis of 35 cancer gene expression data sets and reveals that the finite mixture of Gaussians, followed closely by k-means, exhibited the best performance in terms of recovering the true structure of the data sets.

Biased percolation on scale-free networks.

By presenting an extension of the Fortuin-Kasteleyn construction, it is found that biased percolation is well described by the q → 1 limit of the q-state Potts model with inhomogeneous couplings.

Clustering of the SOM easily reveals distinct gene expression patterns: results of a reanalysis of lymphoma study

By using SOM as an intermediate step to analyze genome-wide gene expression data, the gene expression patterns can more easily be revealed and the "expression display" by the SOM component plane summarises the complicated data in a way that allows the clinician to evaluate the classification options rather than giving a fixed diagnosis.

A general co-expression network-based approach to gene expression analysis: comparison and applications

It is demonstrated that the novel approach is very effective in discovering the modular structures in microarray data, both for genes and for samples, and may be applied to large data sets where the number of clusters is difficult to estimate.

Self-organized network evolution coupled to extremal dynamics

The interplay between topology and dynamics in complex networks is a fundamental but widely unexplored problem. Here, we study this phenomenon on a prototype model in which the network is shaped by a

Data clustering: a review

An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.