In this paper, we study the use of two kinds of kernel-based discriminative models, namely support vector machine (SVM) and deep neural network (DNN), for speaker verification. We treat the verification task as a binary classification problem, in which a pair of utterances, each represented by an i-vector, is assumed to belong to either the…
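The pairwise formulation in the abstract above can be illustrated with a minimal sketch. The abstract is truncated, so the details here are assumptions rather than the authors' method: the pair encoding (element-wise product concatenated with absolute difference), the function names `pair_features` and `build_pairs`, and the toy 3-dimensional vectors are all hypothetical. The sketch only shows how labeled i-vectors could be turned into binary-labeled pairs that an SVM or DNN might then be trained on.

```python
from itertools import combinations

def pair_features(x, y):
    """Encode a pair of i-vectors as one symmetric feature vector.

    Assumed encoding (not taken from the paper): element-wise product
    followed by element-wise absolute difference.
    """
    prod = [a * b for a, b in zip(x, y)]
    diff = [abs(a - b) for a, b in zip(x, y)]
    return prod + diff

def build_pairs(ivectors, speakers):
    """Build (features, label) examples; label 1 means same speaker, 0 otherwise."""
    data = []
    for i, j in combinations(range(len(ivectors)), 2):
        label = 1 if speakers[i] == speakers[j] else 0
        data.append((pair_features(ivectors[i], ivectors[j]), label))
    return data

# Toy 3-dimensional stand-ins for i-vectors (real ones have hundreds of dimensions).
ivectors = [[1.0, 0.0, 2.0], [0.9, 0.1, 2.1], [-1.0, 3.0, 0.0]]
speakers = ["spk_a", "spk_a", "spk_b"]
pairs = build_pairs(ivectors, speakers)
```

Because the encoding is symmetric, swapping the two utterances in a pair yields the same features, so a downstream classifier does not need to see both orderings.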
This study investigates the performance of ensemble classification techniques in classifying emotion and autism spectrum disorders from speech utterances. We first explore the performance of three well-known machine learning techniques, namely support vector machines (SVM), deep neural networks (DNN), and k-nearest neighbours (KNN), with acoustic…
Anchorperson segment detection enables efficient video content indexing for information retrieval. Anchorperson detection based on audio analysis has gained popularity due to lower computational complexity and satisfactory performance. This paper presents a robust framework using a hybrid I-vector and deep neural network (DNN) system to perform anchorperson…
Alternative features were derived from an extracted temporal envelope bank (TBANK). These simplified temporal representations were investigated in alignment procedures to generate frame-level training labels for deep neural networks (DNNs). TBANK features improved temporal alignments both for supervised training and for context-dependent tree building.
Traditionally, the cerebellar model articulation controller (CMAC) has been used in motor control, inverted pendulum robots, and nonlinear channel equalization. In this study, we investigate the capability of CMAC for speech enhancement. We construct a CMAC-based supervised speech enhancement system, which includes offline and online phases. In the offline phase, a…
OBJECTIVE: This study focuses on recognizing the first (S1) and second (S2) heart sounds based only on acoustic characteristics; assumptions about the individual durations of S1 and S2 and the S1-S2 and S2-S1 time intervals are not used in the recognition process. The main objective is to investigate whether reliable S1 and S2 recognition performance can…
Raw temporal features were derived from an extracted temporal envelope bank (referred to as “Tbank”). Tbank features were used with deep neural networks (DNNs) to greatly increase the amount of detailed information about the past that is carried forward to help interpret the future.