PyTorch: An Imperative Style, High-Performance Deep Learning Library
TLDR
This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.
End-to-End Object Detection with Transformers
TLDR
This work presents a new method that views object detection as a direct set prediction problem, and demonstrates accuracy and run-time performance on par with the well-established and highly-optimized Faster RCNN baseline on the challenging COCO object detection dataset.
Training data-efficient image transformers & distillation through attention
TLDR
This work produces a competitive convolution-free transformer by training on ImageNet only, and introduces a teacher-student strategy specific to transformers that relies on a distillation token ensuring that the student learns from the teacher through attention.
MLPerf Inference Benchmark
TLDR
This paper presents the benchmarking method for evaluating ML inference systems, MLPerf Inference, and prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures.
Crafting a multi-task CNN for viewpoint estimation
TLDR
This paper compares CNN approaches in a unified setting, analyzes in detail the key factors that impact performance, and presents a new joint training method with the detection task, demonstrating its benefit.
Deep Exemplar 2D-3D Detection by Adapting from Real to Rendered Views
This paper presents an end-to-end convolutional neural network (CNN) for 2D-3D exemplar detection. We demonstrate that the ability to adapt the features of natural images to better align with those ...
Frame Interpolation with Multi-Scale Deep Loss Functions and Generative Adversarial Networks
TLDR
A multi-scale generative adversarial network for frame interpolation (FIGAN) that is jointly supervised at different levels with a perceptual loss function that consists of an adversarial and two content losses to improve the quality of synthesised intermediate video frames.
Convolutional Neural Networks for joint object detection and pose estimation: A comparative study
TLDR
It is shown that a classification approach on discretized viewpoints achieves state-of-the-art performance for joint object detection and pose estimation, and significantly outperforms existing baselines on this benchmark.
Automatic 3D Car Model Alignment for Mixed Image-Based Rendering
TLDR
A method is proposed that automatically identifies stock 3D models, aligns them in the 3D scene and performs morphing to better capture image contours, and shows significant improvement in image quality for free-viewpoint IBR, especially when moving far from the captured viewpoints.
The Vision Behind MLPerf: Understanding AI Inference Performance
TLDR
MLPerf is an ML benchmark standard driven by academia and industry. It establishes a standard benchmark suite with proper metrics and benchmarking methodologies to level the playing field for ML system performance measurement of different ML inference hardware, software, and services.