Towards Fuzzy Domain Ontology Based Concept Map Generation for E-Learning
TLDR
The main contribution of this paper is a novel concept map generation mechanism, underpinned by a fuzzy domain ontology discovery algorithm, that automatically constructs a concept map from the messages posted to an online discussion board.
Convolutional gated recurrent neural network incorporating spatial features for audio tagging
TLDR
This paper proposes using a convolutional neural network (CNN) to extract robust features from mel-filter banks, spectrograms, or even raw waveforms for audio tagging, and evaluates the proposed methods on Task 4 of the Detection and Classification of Acoustic Scenes and Events 2016 (DCASE 2016) challenge.
Attention and Localization Based on a Deep Convolutional Recurrent Model for Weakly Supervised Audio Tagging
TLDR
A weakly supervised method is proposed that not only predicts the tags but also indicates the temporal locations of the acoustic events that occur; the attention scheme is found to be effective in identifying the important frames while ignoring the unrelated ones.
Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging
TLDR
A shrinking deep neural network (DNN) framework incorporating unsupervised feature learning is proposed to handle the multilabel classification task, together with a symmetric or asymmetric deep denoising auto-encoder (syDAE or asyDAE) that generates new data-driven features from the logarithmic Mel-filter bank features.
Hierarchical Learning for DNN-Based Acoustic Scene Classification
TLDR
Two hierarchical learning methods are proposed to improve on the baseline performance of a deep neural network (DNN)-based acoustic scene classification framework by incorporating the hierarchical taxonomy information of environmental sounds.
Robust Speaker Recognition Using Speech Enhancement And Attention Model
TLDR
The results show that the proposed approach, which uses speech enhancement and multi-stage attention models, outperforms two strong baselines that do not use them in most of the acoustic conditions in the authors' experiments.
Chinese Spelling Check System Based on N-gram Model
TLDR
A model based on a joint bi-gram and trigram language model (LM), Chinese word segmentation, and dynamic programming is presented to increase efficiency, and a smoothing technique is employed to address the sparseness of n-grams in the training data.
Tennis Ball Tracking Using a Two-Layered Data Association Approach
TLDR
This paper proposes a two-layered data association method to improve the robustness of tennis ball tracking and demonstrates that this approach outperforms the current state-of-the-art approach.
Detection of ball hits in a tennis game using audio and visual information
TLDR
This paper uses Gaussian mixture models to estimate the times of hits from the audio information, and then integrates the audio and visual information in a probabilistic framework to improve the detection of ball hit events in tennis games.
...