Corpus ID: 16867372

Feedforward Sequential Memory Neural Networks without Recurrent Feedback

@article{Zhang2015FeedforwardSM,
  title={Feedforward Sequential Memory Neural Networks without Recurrent Feedback},
  author={Shiliang Zhang and Hui Jiang and Si Wei and Lirong Dai},
  journal={ArXiv},
  year={2015},
  volume={abs/1510.02693}
}
We introduce a new structure for memory neural networks, called feedforward sequential memory networks (FSMN), which can learn long-term dependency without using recurrent feedback. The proposed FSMN is a standard feedforward neural network equipped with learnable sequential memory blocks in the hidden layers. In this work, we have applied FSMN to several language modeling (LM) tasks. Experimental results have shown that the memory blocks in FSMN can learn effective representations of long…
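The abstract's central idea lends itself to a small illustration. The sketch below, in plain NumPy, implements a scalar-coefficient memory block as a learnable tap-delay filter over a layer's hidden activations; the memory order, the coefficient values and the toy dimensions are assumptions for illustration, not the paper's settings.

    import numpy as np

    def fsmn_memory_block(h, a):
        """Sketch of a scalar-coefficient FSMN memory block.

        h : (T, D) hidden activations of one layer over T time steps.
        a : (N + 1,) learnable tap coefficients over the current and N past frames.
        Returns h_tilde with h_tilde[t] = sum_i a[i] * h[t - i] (history zero-padded).
        """
        T, D = h.shape
        N = len(a) - 1
        padded = np.vstack([np.zeros((N, D)), h])       # zero-pad the history
        h_tilde = np.zeros_like(h)
        for t in range(T):
            window = padded[t:t + N + 1][::-1]          # h[t], h[t-1], ..., h[t-N]
            h_tilde[t] = a @ window
        return h_tilde

    # Toy usage: 6 frames of 4-dimensional activations, memory order N = 3.
    h = np.random.randn(6, 4)
    a = np.random.randn(4)
    print(fsmn_memory_block(h, a).shape)                # (6, 4)

In the paper's formulation the memory output is combined with the ordinary hidden activation when computing the next layer, and the tap coefficients are learned jointly with the rest of the network by back-propagation; the explicit loop above is only for readability.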

Citations

Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency
TLDR: Experimental results show that FSMNs significantly outperform conventional recurrent neural networks (RNNs) in modeling sequential signals such as speech or language, and can be learned much more reliably and quickly than RNNs or LSTMs because of their inherently non-recurrent model structure.
Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition
TLDR: This work proposes a compact feedforward sequential memory network (cFSMN) by combining FSMN with low-rank matrix factorization, and makes a slight modification to the encoding method used in FSMNs in order to further simplify the network architecture.
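The low-rank factorization mentioned in this summary can be illustrated by replacing one dense weight matrix with the product of two thin matrices; the layer width of 2048 and rank of 256 below are made-up numbers, not values from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical dense hidden-layer weight matrix.
    W = rng.standard_normal((2048, 2048))

    # Low-rank approximation W ≈ U @ V via truncated SVD, keeping rank r = 256.
    r = 256
    U_full, s, Vt = np.linalg.svd(W, full_matrices=False)
    U = U_full[:, :r] * s[:r]       # (2048, r)
    V = Vt[:r, :]                   # (r, 2048)

    # Parameters drop from 2048 * 2048 to 2 * 2048 * r.
    print(W.size, U.size + V.size)  # 4194304 vs. 1048576

Roughly speaking, cFSMN obtains this saving architecturally, by inserting a small linear projection layer and applying the memory block to the projected vectors, rather than by factorizing trained weights; the parameter arithmetic is the same either way.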
Gating Recurrent Enhanced Memory Neural Networks on Language Identification
TLDR: The experimental results confirm that the proposed GREMN has strong sequential-modeling and generalization ability: about a 5% relative equal error rate (EER) reduction is obtained compared with similarly sized gated RNNs, and a 38.5% performance improvement is observed over a conventional i-vector based LID system.
FPGA architecture for feed-forward sequential memory network targeting long-term time-series forecasting
TLDR: A field-programmable gate array (FPGA) architecture is proposed for the FSMN, and it is shown that resource usage does not increase exponentially as the network scale increases.
Global context-dependent recurrent neural network language model with sparse feature learning
TLDR: A new language model is proposed that captures the global context using only the words within the current sequence, incorporating all the words preceding and following the target, without resorting to additional information summarized from other sequences.
Deep-FSMN for Large Vocabulary Continuous Speech Recognition
TLDR: An improved feedforward sequential memory network (FSMN) architecture, Deep-FSMN (DFSMN), is presented; it introduces skip connections between the memory blocks of adjacent layers, which enable information flow across layers and thus alleviate the gradient vanishing problem when building very deep structures.
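A minimal sketch of the skip-connection idea described above, under the simplifying assumption (mine, not the paper's exact equations) that each layer's memory output is simply added to the memory output of the layer below:

    import numpy as np

    def memory_block(h, a):
        """Causal tap-delay memory: out[t] = sum_i a[i] * h[t - i]."""
        T, D = h.shape
        N = len(a) - 1
        padded = np.vstack([np.zeros((N, D)), h])
        return np.stack([a @ padded[t:t + N + 1][::-1] for t in range(T)])

    def dfsmn_stack(x, layers):
        """Stack of (W, a) layers whose memory outputs are linked by skip connections."""
        prev_m = None
        out = x
        for W, a in layers:
            h = np.maximum(out @ W, 0.0)        # feedforward layer with ReLU
            m = memory_block(h, a)
            if prev_m is not None and prev_m.shape == m.shape:
                m = m + prev_m                  # identity skip between memory blocks
            prev_m = m
            out = m
        return out

    # Toy usage: 3 layers of width 8, memory order 2, over 5 frames.
    rng = np.random.default_rng(1)
    layers = [(rng.standard_normal((8, 8)) * 0.1, rng.standard_normal(3) * 0.1)
              for _ in range(3)]
    x = rng.standard_normal((5, 8))
    print(dfsmn_stack(x, layers).shape)         # (5, 8)

The point of the identity path is that gradients can reach the lower memory blocks without being attenuated by the intermediate transformations, which is what the summary credits for alleviating vanishing gradients in very deep stacks.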
Future Context Attention for Unidirectional LSTM Based Acoustic Model
TLDR: A novel architecture called attention-based LSTM is proposed, which employs context-dependent scores or context-dependent weights to encode temporal future-context information with the help of an attention mechanism for a unidirectional LSTM based acoustic model.
On training bi-directional neural network language model with noise contrastive estimation
  • Tianxing He, Yu Zhang, J. Droppo, Kai Yu
  • Computer Science
  • 2016 10th International Symposium on Chinese Spoken Language Processing (ISCSLP)
  • 2016
TLDR: It is shown that the NCE-trained bi-directional NNLM behaves well in a sanity test and outperforms the model trained by conventional maximum likelihood training on the rescoring task.
Deep Feed-Forward Sequential Memory Networks for Speech Synthesis
TLDR: Both objective and subjective experiments show that, compared with the BLSTM TTS method, the DFSMN system can generate synthesized speech of comparable quality while drastically reducing model complexity and speech generation time.
Compact Feedforward Sequential Memory Networks for Small-footprint Keyword Spotting
Due to limited resources on devices and complicated scenarios, a compact model with high precision, low computational cost and low latency is expected for small-footprint keyword spotting tasks. …

References

Showing 1–10 of 17 references
Learning Longer Memory in Recurrent Neural Networks
TLDR: This paper shows that learning longer-term patterns in real data, such as natural language, is perfectly possible using gradient descent, by using a slight structural modification of the simple recurrent neural network architecture.
Extensions of recurrent neural network language model
TLDR: Several modifications of the original recurrent neural network language model are presented, including approaches that yield a more than 15-fold speedup for both the training and testing phases, as well as possibilities for reducing the number of parameters in the model.
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
TLDR: Advanced recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the recently proposed gated recurrent unit (GRU), are evaluated on sequence modeling tasks, and the GRU is found to be comparable to the LSTM.
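For reference, the gating mechanism that this evaluation concerns can be written down in a few lines; the GRU step below follows the commonly cited update equations, with made-up dimensions and random weights.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def gru_step(x, h_prev, params):
        """One GRU update: gates decide how much of the previous state to keep."""
        Wz, Uz, Wr, Ur, Wh, Uh = params
        z = sigmoid(x @ Wz + h_prev @ Uz)              # update gate
        r = sigmoid(x @ Wr + h_prev @ Ur)              # reset gate
        h_tilde = np.tanh(x @ Wh + (r * h_prev) @ Uh)  # candidate state
        return (1.0 - z) * h_prev + z * h_tilde

    # Toy usage: input dimension 3, hidden dimension 4, 5 time steps.
    rng = np.random.default_rng(2)
    shapes = [(3, 4), (4, 4), (3, 4), (4, 4), (3, 4), (4, 4)]
    params = [rng.standard_normal(s) * 0.1 for s in shapes]
    h = np.zeros(4)
    for x in rng.standard_normal((5, 3)):
        h = gru_step(x, h, params)
    print(h.shape)                                     # (4,)

The update gate z interpolates between keeping the old state and writing the candidate, which is the mechanism this reference evaluates against the LSTM cell.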
Recurrent neural network based language model
TLDR: Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets
TLDR: The limitations of standard deep learning approaches are discussed, and it is shown that some of these limitations can be overcome by learning how to grow the complexity of a model in a structured way.
Learning long-term dependencies with gradient descent is difficult
TLDR: This work shows why gradient-based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases, and exposes a trade-off between efficient learning by gradient descent and latching on to information for long periods.
The Fixed-Size Ordinally-Forgetting Encoding Method for Neural Network Language Models
TLDR: Experimental results have shown that, without using any recurrent feedback, FOFE-based FNN LMs can significantly outperform not only standard fixed-input FNN LMs but also the popular recurrent neural network (RNN) LMs.
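The FOFE scheme in this reference can be sketched directly from its definition: a word history is encoded by summing one-hot vectors with an exponential forgetting factor, giving a fixed-size code for a variable-length sequence. The toy vocabulary and forgetting factor below are illustrative.

    import numpy as np

    def fofe_encode(word_ids, vocab_size, alpha=0.7):
        """Fixed-size ordinally-forgetting encoding: z_t = alpha * z_{t-1} + e(w_t)."""
        z = np.zeros(vocab_size)
        for w in word_ids:
            e = np.zeros(vocab_size)
            e[w] = 1.0
            z = alpha * z + e
        return z

    # Toy vocabulary {0: "<s>", 1: "the", 2: "cat", 3: "sat"}; encode the history "<s> the cat".
    print(fofe_encode([0, 1, 2], vocab_size=4))   # [0.49 0.7  1.   0.  ]

More recent words receive larger weights, so order information is retained in the code, which is what lets a plain feedforward NNLM condition on an arbitrarily long history.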
Generating Sequences With Recurrent Neural Networks
This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time. The approach …
End-To-End Memory Networks
TLDR: A neural network with a recurrent attention model over a possibly large external memory is trained end-to-end and hence requires significantly less supervision during training, making it more generally applicable in realistic settings.
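The attention-over-memory step that this reference describes can be reduced to a few lines for a single hop; the dot-product match and the made-up dimensions are illustrative simplifications.

    import numpy as np

    def memory_read(query, memory_keys, memory_values):
        """Soft attention over an external memory (one hop)."""
        scores = memory_keys @ query          # (M,) match score per memory slot
        p = np.exp(scores - scores.max())
        p /= p.sum()                          # softmax addressing weights
        return p @ memory_values              # weighted sum of value vectors

    # Toy usage: 6 memory slots, embedding dimension 5.
    rng = np.random.default_rng(3)
    keys = rng.standard_normal((6, 5))
    values = rng.standard_normal((6, 5))
    query = rng.standard_normal(5)
    print(memory_read(query, keys, values).shape)   # (5,)

Because every operation is differentiable, the whole read can be trained end-to-end with the rest of the network, which is the property highlighted in the summary.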
Understanding the difficulty of training deep feedforward neural networks
TLDR: The objective is to better understand why standard gradient descent from random initialization does so poorly with deep neural networks, in order to explain recent relative successes and help design better algorithms in the future.