Optimizing neural network hyperparameters with Gaussian processes for dialog act classification

@article{Dernoncourt2016OptimizingNN,
  title={Optimizing neural network hyperparameters with Gaussian processes for dialog act classification},
  author={Franck Dernoncourt and J. Y. Lee},
  journal={2016 IEEE Spoken Language Technology Workshop (SLT)},
  year={2016},
  pages={406--413}
}
Systems based on artificial neural networks (ANNs) have achieved state-of-the-art results in many natural language processing tasks. Although ANNs do not require manually engineered features, they have many hyperparameters to be optimized, and the choice of hyperparameters significantly impacts model performance. However, ANN hyperparameters are typically chosen by manual, grid, or random search, which either requires expert experience or is computationally expensive. Recent approaches…
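The Gaussian-process-based search the abstract alludes to can be sketched in a few lines. This is a minimal toy illustration, assuming scikit-learn and a made-up one-dimensional objective standing in for validation accuracy as a function of the log learning rate; the paper's actual setup (many hyperparameters, real dialog-act models) differs.

```python
# Sketch of GP-based hyperparameter search with expected improvement (EI).
# The objective below is a hypothetical stand-in, not the paper's benchmark.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(log_lr):
    # Toy "validation accuracy" as a function of log10(learning rate),
    # peaking at log_lr = -3 (i.e. learning rate 1e-3).
    return -(log_lr + 3.0) ** 2 + 0.9

rng = np.random.default_rng(0)
bounds = (-6.0, 0.0)
X = rng.uniform(*bounds, size=3).reshape(-1, 1)   # a few random initial trials
y = np.array([objective(x[0]) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
grid = np.linspace(*bounds, 200).reshape(-1, 1)   # candidate hyperparameters

for _ in range(10):
    gp.fit(X, y)                                   # fit surrogate to trials so far
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.max()
    # Expected improvement over the best observed value (maximization).
    z = (mu - best) / np.maximum(sigma, 1e-9)
    ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
    x_next = grid[np.argmax(ei)]                   # most promising candidate
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

best_log_lr = X[np.argmax(y), 0]
print(best_log_lr)
```

Each iteration spends one model evaluation where the surrogate predicts the largest expected gain, which is why this family of methods tends to need far fewer trials than grid or random search.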
Meta Learning for Hyperparameter Optimization in Dialogue System
A meta-learning approach that carries out multi-fidelity Bayesian optimization of dialogue-system hyperparameters based on a deep Q network, where a two-level recurrent neural network (RNN) is developed for sequential learning and optimization.
Evaluation of the Influences of Hyper-Parameters and L2-Norm Regularization on ANN Model for MNIST Recognition
A performance-evaluation architecture based on normalized scoring to quantify model performance, showing that a small learning rate and a small batch size not only affect the model's convergence speed but can also make the model difficult to converge.
Hybrid Stochastic GA-Bayesian Search for Deep Convolutional Neural Network Model Selection
Proposes an automated system that combines the advantages of evolutionary processes and state-of-the-art Bayesian optimization, yielding overall classification-accuracy improvements over several well-established techniques and significant computational cost reductions compared to brute-force computation.
Bayesian Optimization for Selecting Efficient Machine Learning Models
Presents a unified Bayesian optimization framework for jointly optimizing models for both prediction effectiveness and training efficiency, proposing an objective that captures the trade-off between these two metrics and demonstrating how they can be jointly optimized in a principled way.
Sequential short-text classification with neural networks
This thesis introduces several algorithms for sequential short-text classification that outperform state-of-the-art approaches, and proposes several natural language processing methods based on artificial neural networks to facilitate the completion of systematic reviews.
Hyper-parameter Optimisation by Restrained Stochastic Hill Climbing
HORSHC, the novel optimisation approach proposed in this paper, is shown to significantly outperform the NE control algorithm while remaining computationally comparable in terms of both time and complexity.
Neural Networks for Joint Sentence Classification in Medical Paper Abstracts
Presents an ANN architecture that combines the effectiveness of typical ANN models for classifying sentences in isolation with the strength of structured prediction, outperforming state-of-the-art results on two different datasets for sequential sentence classification in medical abstracts.
Normality Testing for Vectors on Perceptron Layers
The normality of the probability distribution of vectors on perceptron layers was examined with a multivariate normality test; the hypothesis of a Gaussian distribution was rejected, with none of the vector sets passing the normality criteria.
Information extraction with neural networks
This thesis proposes the first de-identification system based on artificial neural networks (ANNs), which achieves state-of-the-art results without any human-engineered features, and presents an ANN architecture for relation extraction that ranked first in the SemEval-2017 task 10 (ScienceIE) for relation extraction in scientific articles.
Application and Evaluation of Artificial Neural Networks in Solvency Capital Requirement Estimations for Insurance Products (Mattias Nilsson and Erik Sandberg, 2018)
The least squares Monte Carlo (LSMC) approach is commonly used in the estimation of the solvency capital requirement (SCR), as a more computationally efficient alternative to a full nested Monte Carlo…

References

Showing 1-10 of 37 references.
Practical Bayesian Optimization of Machine Learning Algorithms
Describes new algorithms that take into account the variable cost of learning-algorithm experiments and can leverage the presence of multiple cores for parallel experimentation, showing that the proposed algorithms improve on previous automatic procedures and can reach or surpass human expert-level optimization for many algorithms.
Algorithms for Hyper-Parameter Optimization
Contributes novel techniques for building response-surface models P(y|x) in which many elements of a hyper-parameter assignment x are known to be irrelevant given particular values of other elements.
A Convolutional Neural Network for Modelling Sentences
Describes a convolutional architecture, the Dynamic Convolutional Neural Network (DCNN), adopted for the semantic modelling of sentences; it induces a feature graph over the sentence that explicitly captures short- and long-range relations.
Recurrent neural network based language model
Results indicate that a roughly 50% reduction in perplexity can be obtained by using a mixture of several RNN language models, compared to a state-of-the-art backoff language model.
Practical Recommendations for Gradient-Based Training of Deep Architectures (Yoshua Bengio, in Neural Networks: Tricks of the Trade, 2012)
This chapter describes elements of the practice used to successfully and efficiently train and debug large-scale, often deep, multi-layer neural networks, and closes with open questions about the training difficulties observed with deeper architectures.
Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks
Presents a model based on recurrent and convolutional neural networks that incorporates the preceding short texts and achieves state-of-the-art results on three different datasets for dialog act prediction.
Recurrent Neural Networks for Word Alignment Model
A word alignment model based on a recurrent neural network (RNN), in which an unlimited alignment history is represented by recurrently connected hidden layers; it outperforms the feed-forward neural-network-based model as well as IBM Model 4 on Japanese-English and French-English word alignment tasks.
Natural Language Processing (Almost) from Scratch
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity…
Convolutional Neural Networks for Sentence Classification
The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, including sentiment analysis and question classification, and a simple modification of the architecture is proposed to allow the use of both task-specific and static vectors.
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Introduces a Sentiment Treebank that includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality, and introduces the Recursive Neural Tensor Network.