Corpus ID: 239009840

Detecting Modularity in Deep Neural Networks

@article{Hod2021DetectingMI,
  title={Detecting Modularity in Deep Neural Networks},
  author={Shlomi Hod and Stephen Casper and Daniel Filan and Cody Wild and Andrew Critch and Stuart J. Russell},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.08058}
}
A neural network is modular to the extent that parts of its computational graph (i.e. structure) can be represented as performing some comprehensible subtask relevant to the overall task (i.e. functionality). Are modern deep neural networks modular? How can this be quantified? In this paper, we consider the problem of assessing the modularity exhibited by a partitioning of a network’s neurons. We propose two proxies for this: importance, which reflects how crucial sets of neurons are to network… 
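The abstract is cut off above, but the first proxy, importance, lends itself to a lesion-style measurement: ablate a candidate set of neurons and record how much task performance drops. Below is a minimal PyTorch sketch of that idea, not the paper's exact procedure; the zero-ablation choice and the `model`/`layer`/`neuron_idx` arguments are illustrative assumptions.

```python
import torch

def importance(model, layer, neuron_idx, eval_loader, device="cpu"):
    """Lesion test: zero-ablate a set of neurons and measure the accuracy
    drop. A larger drop suggests a more important partition."""
    def accuracy():
        correct = total = 0
        model.eval()
        with torch.no_grad():
            for x, y in eval_loader:
                pred = model(x.to(device)).argmax(dim=1)
                correct += (pred == y.to(device)).sum().item()
                total += y.numel()
        return correct / total

    baseline = accuracy()

    def ablate(module, inputs, output):
        output[:, neuron_idx] = 0.0  # zero out the chosen units
        return output

    handle = layer.register_forward_hook(ablate)
    lesioned = accuracy()
    handle.remove()
    return baseline - lesioned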
1 Citation

Graph Modularity: Towards Understanding the Cross-Layer Transition of Feature Representations in Deep Neural Networks
TLDR
It is demonstrated that modularity can be used to identify and locate redundant layers in DNNs, which provides theoretical guidance for layer pruning, and a layer-wise pruning method based on modularity is proposed.

References

SHOWING 1-10 OF 42 REFERENCES
Interpreting Layered Neural Networks via Hierarchical Modular Representation
TLDR
The application of a hierarchical clustering method to a trained network reveals a tree-structured relationship among hidden layer units, based on their feature vectors defined by their correlation with the input and output dimension values.
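A rough sketch of that procedure follows; the correlation-based feature vectors match the summary, while the Ward linkage and cluster count are assumptions, not necessarily the paper's exact choices.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_units(acts, io_vals, n_clusters=4):
    """Cluster hidden units by their correlations with input/output values.
    acts: (n_samples, n_units); io_vals: (n_samples, n_io_dims)."""
    a = (acts - acts.mean(0)) / (acts.std(0) + 1e-8)
    b = (io_vals - io_vals.mean(0)) / (io_vals.std(0) + 1e-8)
    feats = a.T @ b / len(acts)           # row i: unit i's correlation vector
    tree = linkage(feats, method="ward")  # tree-structured unit hierarchy
    return fcluster(tree, n_clusters, criterion="maxclust")
```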
Clusterability in Neural Networks
The learned weights of a neural network have often been considered devoid of scrutable internal structure. In this paper, however, we look for structure in the form of clusterability: how well a…
Deep learning systems as complex networks
TLDR
This article proposes to study deep belief networks using techniques commonly employed in the study of complex networks, in order to gain some insights into the structural and functional properties of the computational graph resulting from the learning process.
Frivolous Units: Wider Networks Are Not Really That Wide
TLDR
This work identifies two distinct types of "frivolous" units that proliferate when the network's width is increased: prunable units which can be dropped out of the network without significant change to the output and redundant units whose activities can be expressed as a linear combination of others.
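The redundancy criterion described here is straightforward to operationalize: a unit is redundant if its activations can be linearly reconstructed from the other units' activations. A numpy sketch with an arbitrary R² threshold (the threshold value is an assumption):

```python
import numpy as np

def redundant_units(acts, r2_thresh=0.99):
    """Flag units whose activations are (nearly) a linear combination of
    the other units' activations. acts: (n_samples, n_units)."""
    acts = acts - acts.mean(0)  # center so no intercept term is needed
    flags = []
    for i in range(acts.shape[1]):
        y = acts[:, i]
        X = np.delete(acts, i, axis=1)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        r2 = 1 - resid.var() / (y.var() + 1e-12)
        flags.append(r2 >= r2_thresh)
    return np.array(flags)
```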
Network Dissection: Quantifying Interpretability of Deep Visual Representations
TLDR
This work uses the proposed Network Dissection method to test the hypothesis that interpretability is an axis-independent property of the representation space, then applies the method to compare the latent representations of various networks when trained to solve different classification problems.
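At the core of Network Dissection is an alignment score between a unit and a human-labeled concept: the IoU between the unit's thresholded activation map and the concept's segmentation mask. A sketch of that score; the quantile-based threshold is simplified relative to the paper's per-unit calibration.

```python
import numpy as np

def unit_concept_iou(act_maps, concept_masks, quantile=0.995):
    """Score one unit against one concept, Network-Dissection style.
    act_maps: (n_images, H, W) unit activations (upsampled to mask size);
    concept_masks: (n_images, H, W) binary concept segmentations."""
    thresh = np.quantile(act_maps, quantile)  # top-activation threshold
    unit_masks = act_maps > thresh
    inter = np.logical_and(unit_masks, concept_masks).sum()
    union = np.logical_or(unit_masks, concept_masks).sum()
    return inter / max(union, 1)
```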
Modular Networks: Learning to Decompose Neural Computation
TLDR
This work proposes a training algorithm that flexibly chooses neural modules based on the data to be processed, and applies modular networks both to image recognition and language modeling tasks, where it achieves superior performance compared to several baselines.
On the importance of single directions for generalization
TLDR
It is found that class selectivity is a poor predictor of task importance, suggesting not only that networks which generalize well minimize their dependence on individual units by reducing their selectivity, but also that individually selective units may not be necessary for strong network performance.
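"Class selectivity" here is the standard index built from class-conditional mean activations; the sketch below shows how it is commonly computed, though details may differ from the paper.

```python
import numpy as np

def class_selectivity(acts, labels):
    """Per-unit selectivity: (mu_max - mu_rest) / (mu_max + mu_rest), where
    mu_max is the highest class-conditional mean activation and mu_rest is
    the mean over the remaining classes. Assumes non-negative (e.g.,
    post-ReLU) activations and at least two classes.
    acts: (n_samples, n_units); labels: (n_samples,)."""
    classes = np.unique(labels)
    means = np.stack([acts[labels == c].mean(0) for c in classes])  # (C, U)
    mu_max = means.max(0)
    mu_rest = (means.sum(0) - mu_max) / (len(classes) - 1)
    return (mu_max - mu_rest) / (mu_max + mu_rest + 1e-12)
```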
Are Neural Nets Modular? Inspecting Their Functionality Through Differentiable Weight Masks
Neural networks (NNs) whose subnetworks implement reusable functions are expected to offer numerous advantages, e.g., compositionality through efficient recombination of functional building blocks, …
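The method named in the title can be sketched as follows: freeze the trained weights, attach a learnable logit to each weight, and train only the (relaxed) mask on a subtask. A simplified PyTorch sketch using a sigmoid relaxation rather than the paper's sampled binary masks:

```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Frozen linear layer with a learnable, differentiable weight mask."""
    def __init__(self, linear: nn.Linear):
        super().__init__()
        # Freeze the trained parameters (assumes the layer has a bias).
        self.weight = nn.Parameter(linear.weight.detach(), requires_grad=False)
        self.bias = nn.Parameter(linear.bias.detach(), requires_grad=False)
        # Positive init so the mask starts near 1 (sigmoid(2) ~ 0.88).
        self.logits = nn.Parameter(torch.full_like(self.weight, 2.0))

    def forward(self, x):
        mask = torch.sigmoid(self.logits)  # in (0, 1); threshold after training
        return nn.functional.linear(x, self.weight * mask, self.bias)
```

Training only `logits` on a subtask and then thresholding the mask exposes which weights that subtask actually relies on.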
Finding and evaluating community structure in networks.
  • M. Newman, M. Girvan
  • Physical Review E, 2004
TLDR
It is demonstrated that the algorithms proposed are highly effective at discovering community structure in both computer-generated and real-world network data, and can be used to shed light on the sometimes dauntingly complex structure of networked systems.
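For reference, the modularity score those algorithms optimize is the standard Newman-Girvan quantity, where e_ij is the fraction of network edges connecting community i to community j and a_i is its row sum:

```latex
% Newman-Girvan modularity: high Q means many more within-community
% edges than expected for a random assignment of edges.
Q = \sum_i \left( e_{ii} - a_i^2 \right),
\qquad a_i = \sum_j e_{ij}
```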
Learning Multiple Layers of Features from Tiny Images
TLDR
It is shown how to train a multi-layer generative model that learns to extract meaningful features which resemble those found in the human visual cortex, using a novel parallelization algorithm to distribute the work among multiple machines connected on a network.