This work shows that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data rather than cubically, which allows for a previously intractable degree of parallelism.
A machine learning-based approach to lossy image compression which outperforms all existing codecs, while running in real-time, and supplementing the approach with adversarial training specialized towards use in a compression setting.
This work proposes spectral pooling, which performs dimensionality reduction by truncating the representation in the frequency domain, and demonstrates the effectiveness of complex-coefficient spectral parameterization of convolutional filters.
This work proposes a novel approach explicitly designed to address a number of subtle yet important issues which have stymied earlier DML algorithms, which maintains an explicit model of the distributions of the different classes in representation space and employs this knowledge to adaptively assess similarity, and achieve local discrimination by penalizing class distribution overlap.
It is shown that in standard architectures, the representational capacity of the network tends to capture fewer degrees of freedom as the number of layers increases, retaining only a single degree of freedom in the limit.
This work presents a new algorithm for video coding, learned end-to-end for the low-latency mode, which outperforms all existing video codecs across nearly the entire bitrate range, and is the first ML-based method to do so.
Nested dropout, a procedure for stochastically removing coherent nested sets of hidden units in a neural network, is introduced and it is rigorously shown that the application of nested dropout enforces identifiability of the units, which leads to an exact equivalence with PCA.
The deep density model (DDM) is introduced, a new approach to density estimation that exploits insights from deep learning to construct a bijective map to a representation space, under which the transformation of the distribution of the data is approximately factorized and has identical and known marginal densities.
This work proposes several novel ideas for learned video compression which allow for improved performance for the low-latency mode (I- and P-frames only) along with a considerable increase in computational efficiency, and introduces a flexible-rate framework allowing a single model to cover a large and dense range of bitrates.
A novel web-tool is built which allows users to paint spatio-temporal importance maps over videos and is used to demonstrate that the tool can indeed be used to generate videos which, at the same bitrate, look perceptually better through a subjective study — and are 1.9 times more likely to be preferred by viewers.