As deep nets are increasingly used in applications suited for mobile devices, a fundamental dilemma becomes apparent: the trend in deep learning is to grow models to absorb ever-increasing data set sizes; however, mobile devices are designed with very little memory and cannot store such large models. We present a novel network architecture, HashedNets, that …
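The abstract is cut off before the mechanism, but the published idea behind HashedNets is a hashing trick: every virtual connection is hashed into one of K buckets, and all connections in a bucket share a single trainable weight. The sketch below is a minimal illustration of that weight sharing only; the bucket function, sizes, and variable names are assumptions, not the paper's implementation (which also applies a sign hash to the shared weights).

```python
import hashlib
import numpy as np

def bucket(i, j, K, seed=0):
    """Hash connection (i, j) into one of K shared-weight buckets."""
    h = hashlib.md5(f"{seed}:{i},{j}".encode()).digest()
    return int.from_bytes(h[:4], "little") % K

def virtual_weights(n_in, n_out, K, rng):
    """Expand K real parameters into a full n_in x n_out weight matrix."""
    w = 0.1 * rng.standard_normal(K)      # the only trainable parameters
    V = np.empty((n_in, n_out))
    for i in range(n_in):
        for j in range(n_out):
            V[i, j] = w[bucket(i, j, K)]
    return V, w

rng = np.random.default_rng(0)
V, w = virtual_weights(n_in=256, n_out=128, K=1024, rng=rng)
# 256 * 128 = 32768 virtual weights backed by only 1024 real parameters.
```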
The goal of machine learning is to develop predictors that generalize well to test data. Ideally, this is achieved by training on very large (infinite) training data sets that capture all variations in the data distribution. In the case of finite training data, an effective solution is to extend the training set with artificially created examples, which …
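The abstract is truncated before it names a corruption model, so the sketch below makes an assumption: blankout (dropout-style) noise, one distribution commonly used in this line of work on corrupted features. It materializes the corrupted copies explicitly; marginalizing over the corruption analytically, rather than sampling, is the refinement such papers pursue.

```python
import numpy as np

def corrupt_blankout(X, q, rng):
    """Blankout corruption: zero each feature independently with probability q."""
    mask = rng.random(X.shape) >= q
    return X * mask

def augment(X, y, n_copies, q, rng):
    """Extend (X, y) with n_copies corrupted versions of every example."""
    Xs = [X] + [corrupt_blankout(X, q, rng) for _ in range(n_copies)]
    ys = [y] * (n_copies + 1)
    return np.vstack(Xs), np.concatenate(ys)

rng = np.random.default_rng(0)
X, y = rng.standard_normal((100, 20)), rng.integers(0, 2, 100)
Xa, ya = augment(X, y, n_copies=4, q=0.3, rng=rng)   # 500 examples from 100
```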
In this paper, we introduce two novel metric learning algorithms, χ²-LMNN and GB-LMNN, which are explicitly designed to be non-linear and easy-to-use. The two approaches achieve this goal in fundamentally different ways: χ²-LMNN inherits the computational benefits of a linear mapping from linear metric learning, but uses a non-linear χ²-distance to …
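The χ²-distance referenced here is the standard histogram distance, χ²(x, y) = ½ Σᵢ (xᵢ − yᵢ)² / (xᵢ + yᵢ), defined on non-negative vectors that sum to one. A minimal sketch (the epsilon guard is an implementation convenience, not part of the definition):

```python
import numpy as np

def chi2_distance(x, y, eps=1e-12):
    """Chi-squared histogram distance: 0.5 * sum (x_i - y_i)^2 / (x_i + y_i)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return 0.5 * np.sum((x - y) ** 2 / (x + y + eps))

# Histograms (non-negative, summing to one), e.g. bag-of-words frequencies.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
print(chi2_distance(p, q))   # 0.5 * (0.01/0.9 + 0.01/0.7) ~ 0.0127
```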
Gradient Boosted Regression Trees (GBRT) are the current state-of-the-art learning paradigm for machine-learned web-search ranking, a domain notorious for very large data sets. In this paper, we propose a novel method for parallelizing the training of GBRT. Our technique parallelizes the construction of the individual regression trees and operates using …
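The abstract cuts off before the algorithmic details, so the sketch below shows only the general histogram-merge pattern commonly used to parallelize split-finding in regression trees: each worker computes sufficient statistics (counts and label sums per feature bin) on its data shard, and the merged statistics score every candidate split. It is illustrative, not necessarily the paper's exact scheme.

```python
import numpy as np

def local_histogram(x, y, bin_edges):
    """Per-worker sufficient statistics: count and label sum per feature bin."""
    bins = np.digitize(x, bin_edges)
    n = np.zeros(len(bin_edges) + 1)
    s = np.zeros(len(bin_edges) + 1)
    np.add.at(n, bins, 1.0)
    np.add.at(s, bins, y)
    return n, s

def best_split(hists, bin_edges):
    """Merge worker histograms and score each bin boundary by squared-error gain."""
    n = sum(h[0] for h in hists)
    s = sum(h[1] for h in hists)
    nl, sl = np.cumsum(n)[:-1], np.cumsum(s)[:-1]
    nr, sr = n.sum() - nl, s.sum() - sl
    valid = (nl > 0) & (nr > 0)
    gain = np.where(valid,
                    sl**2 / np.maximum(nl, 1) + sr**2 / np.maximum(nr, 1),
                    -np.inf)
    k = int(np.argmax(gain))
    return bin_edges[k], gain[k]    # threshold: x < edge goes left

rng = np.random.default_rng(0)
x, y = rng.random(10_000), rng.random(10_000)
edges = np.linspace(0, 1, 33)[1:-1]                  # shared bin boundaries
shards = np.array_split(np.arange(x.size), 4)        # 4 simulated workers
hists = [local_histogram(x[i], y[i], edges) for i in shards]
print(best_split(hists, edges))
```

Because only fixed-size histograms cross worker boundaries, the communication cost is independent of the number of training examples, which is what makes this pattern attractive for very large data sets.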
Convolutional neural networks (CNNs) are increasingly used in many areas of computer vision. They are particularly attractive because of their ability to "absorb" great quantities of labeled data through millions of parameters. However, as model sizes increase, so do the storage and memory requirements of the classifiers. We present a novel network …
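A quick parameter-counting sketch makes the storage claim concrete; the AlexNet-style fully connected layer used below is a familiar example chosen for illustration, not taken from this paper.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases for a k x k convolution layer."""
    return c_in * c_out * k * k + c_out

def fc_params(n_in, n_out):
    """Weights plus biases for a fully connected layer."""
    return n_in * n_out + n_out

# Fully connected layers dominate the footprint: an AlexNet-style layer
# mapping a 6 x 6 x 256 feature map to 4096 units alone holds
layer = fc_params(6 * 6 * 256, 4096)
print(layer, "params, ~", round(layer * 4 / 2**20), "MB at fp32")   # ~37.8M, ~144 MB
```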
We present Stochastic Neighbor Compression (SNC), an algorithm to compress a dataset for the purpose of k-nearest neighbor (kNN) classification. Given training data, SNC learns a much smaller synthetic data set that minimizes the stochastic 1-nearest neighbor classification error on the training data. This approach has several appealing properties: due to …
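Assuming the stochastic-neighbor formulation the name suggests (a softmax over negative scaled squared distances, as in stochastic neighbor embedding), the objective SNC minimizes can be sketched as below; variable names and the γ parameterization are assumptions. The synthetic prototypes Z would then be optimized by gradient descent on this loss.

```python
import numpy as np

def snc_loss(X, y, Z, zy, gamma):
    """Expected stochastic 1-NN error of training set (X, y) under
    synthetic prototypes Z with labels zy."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)   # squared distances
    logits = -gamma**2 * d2
    logits -= logits.max(axis=1, keepdims=True)           # stabilize softmax
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)                     # p(neighbor = j | x_i)
    p_correct = (p * (zy[None, :] == y[:, None])).sum(axis=1)
    return 1.0 - p_correct.mean()
```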
In this paper, we evaluate the performance of various parallel optimization methods for Kernel Support Vector Machines on multicore CPUs and GPUs. In particular, we provide the first comparison of algorithms with explicit and implicit parallelization. Most existing parallel implementations for multi-core or GPU architectures are based on explicit …
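To make the explicit/implicit distinction concrete, here is a sketch using an RBF kernel computation, a core kernel-SVM workload: "implicit" parallelism comes from the threaded BLAS inside a single matrix product, while "explicit" parallelism is the row loop an implementation would hand out to worker threads or GPU blocks. Function names are illustrative, not from the paper.

```python
import numpy as np

def rbf_kernel_implicit(X, Z, gamma):
    """Implicit parallelism: one BLAS-backed matrix product does the heavy
    lifting; the multithreading lives inside the linear-algebra library."""
    d2 = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def rbf_kernel_explicit(X, Z, gamma):
    """Explicit parallelism: each row is an independent unit of work that an
    implementation would assign to a thread or CUDA block."""
    K = np.empty((X.shape[0], Z.shape[0]))
    for i in range(X.shape[0]):
        K[i] = np.exp(-gamma * ((X[i] - Z) ** 2).sum(1))
    return K
```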
Strata-Gem utilizes mission trees to perform risk assessments by linking an organization's objectives to the IT assets that implement them. Critical states are identified, indicating goals a potential attacker can achieve to prevent each asset from completing its objectives. Those goals are then used as states to drive attack and fault tree analysis …
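Strata-Gem's actual representation is not given in the truncated abstract; purely as a hypothetical sketch of the objective-to-asset linkage it describes, a mission tree might look like this, with leaf objectives and their supporting assets yielding the candidate critical states.

```python
from dataclasses import dataclass, field

@dataclass
class MissionNode:
    """Hypothetical mission-tree node: an objective implemented by IT assets
    and/or decomposed into sub-objectives."""
    objective: str
    assets: list = field(default_factory=list)     # IT assets implementing it
    children: list = field(default_factory=list)   # sub-objectives

    def critical_states(self):
        """Leaf (objective, asset) pairs an attacker could target."""
        if not self.children:
            return [(self.objective, a) for a in self.assets]
        return [s for c in self.children for s in c.critical_states()]

mission = MissionNode("process payroll", children=[
    MissionNode("authenticate staff", assets=["LDAP server"]),
    MissionNode("issue payments", assets=["payment gateway", "ERP database"]),
])
print(mission.critical_states())
```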
The goal of machine learning is to develop predictors that generalize well to test data. Ideally, this is achieved by training on an almost infinitely large training data set that captures all variations in the data distribution. In practical learning settings, however, we do not have infinite data and our predictors may overfit. Overfitting may be …
Covariance matrices are an effective way to capture global spread across local interest points in images. Often, these image descriptors are more compact, robust and informative than, for example, bags of visual words. However, they are symmetric and positive definite (SPD) and therefore live on a non-Euclidean Riemannian manifold, which gives rise to …
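The abstract breaks off before the paper's own remedy, so the sketch below shows one standard way to respect the SPD geometry, the log-Euclidean metric, included as an assumed illustration rather than this paper's method: map each matrix through the matrix logarithm and compare in the resulting flat space.

```python
import numpy as np

def covariance_descriptor(F, eps=1e-6):
    """Covariance of d-dimensional local features F (n x d), nudged to stay SPD."""
    C = np.cov(F, rowvar=False)
    return C + eps * np.eye(C.shape[0])

def spd_log(A):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.T

def log_euclidean_distance(A, B):
    """Log-Euclidean distance: Frobenius norm between matrix logarithms."""
    return np.linalg.norm(spd_log(A) - spd_log(B), "fro")

rng = np.random.default_rng(0)
A = covariance_descriptor(rng.standard_normal((500, 5)))
B = covariance_descriptor(rng.standard_normal((500, 5)))
print(log_euclidean_distance(A, B))
```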