Corpus ID: 58004637

Neural network gradient-based learning of black-box function interfaces

Authors: Alon Jacovi, Guy Hadash, Einat Kermany, B. Carmeli, Ofer Lavi, George Kour and Jonathan Berant
Deep neural networks work well at approximating complicated functions when provided with data and trained by gradient descent. At the same time, there is a vast number of existing functions that programmatically solve different tasks in a precise manner, eliminating the need for training. In many cases, it is possible to decompose a task into a series of functions, for some of which we may prefer to use a neural network to learn the functionality, while for others the preferred method…
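The core idea can be illustrated with a minimal sketch (this is not the paper's architecture; the polynomial surrogate and all names here are illustrative assumptions): query the non-differentiable black box in the forward pass, but route gradients through a differentiable surrogate fitted to the black box's input/output behavior.

```python
import numpy as np

def black_box(x):
    # A non-differentiable "program": rounds to the nearest integer.
    return np.round(x)

# Fit a differentiable surrogate (here, a cubic polynomial by least
# squares) to input/output samples of the black box.
xs = np.linspace(-2.0, 2.0, 401)
ys = black_box(xs)
surrogate = np.poly1d(np.polyfit(xs, ys, deg=3))
surrogate_grad = surrogate.deriv()

# During training, the real black box supplies the exact forward value,
# while the surrogate supplies an approximate gradient for backprop.
x = 0.8
forward_value = black_box(x)          # exact, non-differentiable output
approx_gradient = surrogate_grad(x)   # usable in gradient descent

print(forward_value, approx_gradient)
```

The surrogate's gradient is only an approximation, but it is nonzero almost everywhere, unlike the true gradient of a step-like function.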
REST: Performance Improvement of a Black Box Model via RL-based Spatial Transformation
This work aims to improve robustness by adding a REST module in front of any black box and training only the REST module, without retraining the original black-box model, in an end-to-end manner; it empirically shows that the method has an advantage in generalization to geometric transformations and in sample efficiency.
Stealing Black-Box Functionality Using The Deep Neural Tree Architecture
This paper makes a substantial step towards cloning the functionality of black-box models by introducing a machine learning (ML) architecture named Deep Neural Trees (DNTs), and proposes to train the DNT with an active learning algorithm for faster, more sample-efficient training.
Differentiable Signal Processing With Black-Box Audio Effects
A data-driven approach to automating audio signal processing by incorporating stateful, third-party audio effects as layers within a deep neural network; it can enable new automatic audio-effects tasks and can yield results comparable to a specialized, state-of-the-art commercial solution for music mastering.
A novel residual whitening based training to avoid overfitting.
In this paper we demonstrate that training models to minimize the autocorrelation of the residuals, as an additional penalty, prevents overfitting of machine learning models. We use different…
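The penalty described above can be sketched as follows (a minimal illustration; the function names and the restriction to lag-1 autocorrelation are simplifying assumptions, not the paper's exact formulation):

```python
import numpy as np

def lag1_autocorrelation(residuals):
    # Sample autocorrelation of the residuals at lag 1.
    r = residuals - residuals.mean()
    return np.sum(r[:-1] * r[1:]) / np.sum(r * r)

def penalized_loss(y_true, y_pred, weight=1.0):
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)
    # Penalize correlated residuals: white (uncorrelated) residuals
    # suggest the model has captured the structure rather than overfit.
    return mse + weight * lag1_autocorrelation(residuals) ** 2

rng = np.random.default_rng(0)
y = rng.normal(size=100)
loss_white = penalized_loss(y, np.zeros(100))            # white residuals
loss_corr = penalized_loss(np.cumsum(y), np.zeros(100))  # autocorrelated
print(loss_white, loss_corr)
```

White residuals incur almost no penalty, while strongly autocorrelated residuals (here, a random walk) are penalized heavily.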
Improved Modeling of Complex Systems Using Hybrid Physics/Machine Learning/Stochastic Models
Combining domain-knowledge models with neural models has been challenging. End-to-end trained neural models often perform better (lower mean square error) than domain-knowledge models or…
A Primer for Neural Arithmetic Logic Modules
Focusing on the shortcomings of NALU, an in-depth analysis is provided to reason about the design choices of recent units and to highlight inconsistencies in a fundamental experiment that prevent direct comparison across papers.
Learning Representations by Humans, for Humans
A new, complementary approach to interpretability is proposed, in which machines are not considered as experts whose role it is to suggest what should be done and why, but rather as advisers that are effective for human decision-making.
InverSynth: Deep Estimation of Synthesizer Parameter Configurations From Audio Signals
InverSynth is an automatic method for tuning synthesizer parameters to match a given input sound; based on strided convolutional neural networks, it is capable of inferring the synthesizer's parameter configuration from the input spectrogram and even from the raw audio.
Unknown-box Approximation to Improve Optical Character Recognition Performance
The proposed approach approximates the gradient of a particular OCR engine to train a preprocessor module that applies pixel-level manipulation to the document image, improving OCR accuracy by up to 46% over the baseline.
Early Prediction of Breast Cancer Recurrence for Patients Treated with Neoadjuvant Chemotherapy: A Transfer Learning Approach on DCE-MRIs
A transfer learning approach to give an early prediction of three-year Breast Cancer Recurrence (BCR) for patients undergoing NACT, using DCE-MRI exams from I-SPY1 TRIAL and BREAST-MRI-NACT-Pilot public databases is proposed.


Reinforcement Learning Neural Turing Machines
The RL-NTM is the first model that can, in principle, learn programs of unbounded running time; a simple technique for numerically checking arbitrary implementations of models that use Reinforce is also developed, which may be of independent interest.
Hybrid computing using a neural network with dynamic external memory
This work presents a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory in a conventional computer.
Asynchronous Methods for Deep Reinforcement Learning
A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Adam: A Method for Stochastic Optimization
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
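Adam's update rule is compact enough to sketch in full (a minimal illustration on a toy quadratic, not a production optimizer; the function name and hyperparameter defaults follow the paper's recommendations except for the learning rate):

```python
import numpy as np

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    # Minimal Adam loop: exponential moving averages of the gradient (m)
    # and its elementwise square (v), with bias correction for both.
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)      # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)      # bias-corrected second moment
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x_star = adam_minimize(lambda x: 2 * (x - 3), x0=[0.0])
print(x_star)  # close to 3
```

The per-parameter scaling by the second-moment estimate is what makes the effective step size roughly invariant to the gradient's magnitude.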
Rethinking the Inception Architecture for Computer Vision
This work explores ways to scale up networks that aim to utilize the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.
Attention is All you Need
A new simple network architecture, the Transformer, based solely on attention mechanisms and dispensing with recurrence and convolutions entirely, is proposed; it generalizes well to other tasks, as shown by applying it successfully to English constituency parsing with both large and limited training data.
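The Transformer's core operation, scaled dot-product attention, can be sketched in a few lines (a minimal single-head illustration without masking or learned projections):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Three queries attend over four key/value pairs of dimension 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, weights = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Each output row is a convex combination of the value rows, with mixing weights determined by query-key similarity.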
Identity Mappings in Deep Residual Networks
The propagation formulations behind the residual building blocks suggest that the forward and backward signals can be directly propagated from one block to any other block, when using identity mappings as the skip connections and after-addition activation.
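The propagation property can be checked numerically: with identity skips, each block computes x_{l+1} = x_l + F_l(x_l), so any deeper activation equals a shallower one plus a sum of residuals (a toy sketch with random linear residual functions, not the paper's networks):

```python
import numpy as np

rng = np.random.default_rng(0)
# Residual functions F_l of a toy network: small random linear maps.
Fs = [rng.normal(scale=0.1, size=(4, 4)) for _ in range(5)]

x = rng.normal(size=4)
h = x
residual_sum = np.zeros(4)
for F in Fs:
    residual = F @ h          # F_l(x_l)
    residual_sum += residual
    h = h + residual          # identity skip: x_{l+1} = x_l + F_l(x_l)

# Any deeper activation equals the input plus the accumulated residuals:
# x_L = x_0 + sum_l F_l(x_l)
print(np.allclose(h, x + residual_sum))  # True
```

Because the identity term is preserved exactly, the backward pass likewise carries an unattenuated gradient from any deep block straight back to shallower ones.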
Sequence to Sequence Learning with Neural Networks
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short term dependencies between the source and the target sentence which made the optimization problem easier.
Making Neural Programming Architectures Generalize via Recursion
This work proposes augmenting neural architectures with a key abstraction: recursion, and implements recursion in the Neural Programmer-Interpreter framework on four tasks, demonstrating superior generalizability and interpretability with small amounts of training data.
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning
Very deep convolutional networks have been central to the largest advances in image recognition performance in recent years. One example is the Inception architecture, which has been shown to achieve…