• Publications
Rethinking the Inception Architecture for Computer Vision
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, …
  • 8,486
  • 1,362
  • Open Access
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the …
  • 18,729
  • 1,159
  • Open Access
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Very deep convolutional networks have been central to the largest advances in image recognition performance in recent years. One example is the Inception architecture that has been shown to achieve …
  • 4,669
  • 622
  • Open Access
Probabilistic Linear Discriminant Analysis
Linear dimensionality reduction methods, such as LDA, are often used in object recognition for feature extraction, but do not address the problem of how to use these features for recognition. In this …
  • 277
  • 60
  • Open Access
Deep Convolutional Ranking for Multilabel Image Annotation
Multilabel image annotation is one of the most important challenges in computer vision with many real-world applications. While existing work usually uses conventional visual features for multilabel …
  • 302
  • 49
  • Open Access
No Fuss Distance Metric Learning Using Proxies
We address the problem of distance metric learning (DML), defined as learning a distance consistent with a notion of semantic similarity. Traditionally, for this problem supervision is expressed in …
  • 186
  • 36
  • Open Access
Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models
  • S. Ioffe
  • Computer Science, Mathematics
  • NIPS
  • 1 February 2017
Batch Normalization is quite effective at accelerating and improving the training of deep models. However, its effectiveness diminishes when the training minibatches are small, or do not consist of …
  • 219
  • 34
  • Open Access
Improved Consistent Sampling, Weighted Minhash and L1 Sketching
  • S. Ioffe
  • Computer Science
  • IEEE International Conference on Data Mining
  • 13 December 2010
We propose a new Consistent Weighted Sampling method, where the probability of drawing identical samples for a pair of inputs is equal to their Jaccard similarity. Our method takes deterministic …
  • 123
  • 26
  • Open Access
Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming
We introduce a new policy iteration method for dynamic programming problems with discounted and undiscounted cost. The method is based on the notion of temporal differences, and is primarily geared …
  • 115
  • 21
  • Open Access
Probabilistic Methods for Finding People
Finding people in pictures presents a particularly difficult object recognition problem. We show how to find people by finding candidate body segments, and then constructing assemblies of segments …
  • 255
  • 11
  • Open Access