Corpus ID: 231942749

LambdaNetworks: Modeling Long-Range Interactions Without Attention

@article{Bello2021LambdaNetworksML,
  title={LambdaNetworks: Modeling Long-Range Interactions Without Attention},
  author={Irwan Bello},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.08602}
}
We present lambda layers – an alternative framework to self-attention – for capturing long-range interactions between an input and structured contextual information (e.g. a pixel surrounded by other pixels). Lambda layers capture such interactions by transforming available contexts into linear functions, termed lambdas, and applying these linear functions to each input separately. Similar to linear attention, lambda layers bypass expensive attention maps, but in contrast, they model both…
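
The abstract's description maps to a short tensor program: the context is projected into keys and values, the softmax-normalized keys and the values form a content lambda shared by every query, and learned position embeddings combined with the values form per-query-position lambdas. The sketch below illustrates this in PyTorch; the single-head formulation, the omission of the paper's normalization layers, the global (non-local) position embeddings, and the hyperparameter names are simplifying assumptions for illustration, not the reference implementation.

```python
# Minimal single-head lambda layer sketch (content + position lambdas).
# Assumptions: one head, no batch norm, global position embeddings,
# sequence lengths <= n_positions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LambdaLayerSketch(nn.Module):
    def __init__(self, dim, dim_k=16, dim_v=None, n_positions=64):
        super().__init__()
        dim_v = dim_v or dim
        # Queries come from the input; keys and values come from the context.
        self.to_q = nn.Linear(dim, dim_k, bias=False)
        self.to_k = nn.Linear(dim, dim_k, bias=False)
        self.to_v = nn.Linear(dim, dim_v, bias=False)
        # Learned embeddings for each (query position, context position) pair.
        self.pos_emb = nn.Parameter(
            torch.randn(n_positions, n_positions, dim_k) * 0.02
        )

    def forward(self, x, context=None):
        # x, context: (batch, length, dim); context defaults to x.
        context = x if context is None else context
        q = self.to_q(x)                           # (b, n, k)
        k = F.softmax(self.to_k(context), dim=1)   # normalize over context positions
        v = self.to_v(context)                     # (b, m, v)

        # Content lambda: a single (k x v) linear map shared by all queries.
        content_lambda = torch.einsum('bmk,bmv->bkv', k, v)
        content_out = torch.einsum('bnk,bkv->bnv', q, content_lambda)

        # Position lambdas: one (k x v) linear map per query position,
        # built from position embeddings instead of keys.
        n, m = x.shape[1], context.shape[1]
        e = self.pos_emb[:n, :m]                   # (n, m, k)
        position_lambdas = torch.einsum('nmk,bmv->bnkv', e, v)
        position_out = torch.einsum('bnk,bnkv->bnv', q, position_lambdas)

        return content_out + position_out


# Usage: a (2, 64, 128) input yields a (2, 64, 128) output with dim_v left at its default.
layer = LambdaLayerSketch(dim=128, dim_k=16, n_positions=64)
y = layer(torch.randn(2, 64, 128))
```

Note that no attention map over (n, m) pairs is ever materialized: the context is summarized into lambdas of size (k, v), which is what gives the layer its memory advantage over self-attention.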

Citations

An Attention Free Transformer
MLP-Mixer: An all-MLP Architecture for Vision
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification
Refiner: Refining Self-attention for Vision Transformers (Daquan Zhou, Yujun Shi, +6 authors Jiashi Feng. ArXiv, 2021)
KVT: k-NN Attention for Boosting Vision Transformers
RegionViT: Regional-to-Local Attention for Vision Transformers
Revisiting ResNets: Improved Training and Scaling Strategies
