In this paper, we propose a fast method to recognize human actions which accounts for intra-class variability in the way an action is performed. We propose the use of a low-dimensional feature vector which consists of (a) the projections of the width profile of the actor onto an “action basis” and (b) simple spatio-temporal features. The action ...
In this paper, we propose an efficient model for automatic transcription of polyphonic music. The model extends the shift-invariant probabilistic latent component analysis method and uses pre-extracted and pre-shifted note templates from multiple instruments. Thus, the proposed system can efficiently transcribe polyphonic music, while taking into account ...
We address the problem of audio analytics with respect to efficient modeling of audio classes and continuous decoding of an audio stream to automatically segment and label it, as required in audio indexing. We propose the use of left-to-right HMMs and ergodic HMMs to model definite- and indefinite-duration audio classes, respectively, and Viterbi ...
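To make the two HMM topologies and Viterbi decoding mentioned in this abstract concrete, here is a minimal sketch in Python. It is not the authors' system: the three-state model, the transition values, and the per-frame emission likelihoods are all made-up illustrative numbers, and `safe_log` and `viterbi` are hypothetical helper names. A left-to-right model forbids backward transitions (zeros below the diagonal), while an ergodic model allows every transition.

```python
import math

def safe_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(init, trans, emit):
    """Most likely state path for one observation sequence.

    init[s]     : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[t][s]  : likelihood of frame t under state s (assumed precomputed)
    """
    n_states = len(init)
    score = [safe_log(init[s]) + safe_log(emit[0][s]) for s in range(n_states)]
    backpointers = []
    for t in range(1, len(emit)):
        new_score, ptr = [], []
        for s in range(n_states):
            # Best predecessor of state s at frame t.
            best = max(range(n_states),
                       key=lambda r: score[r] + safe_log(trans[r][s]))
            ptr.append(best)
            new_score.append(score[best] + safe_log(trans[best][s])
                             + safe_log(emit[t][s]))
        score = new_score
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

# Left-to-right topology: a state can only repeat or advance by one.
left_to_right = [[0.5, 0.5, 0.0],
                 [0.0, 0.5, 0.5],
                 [0.0, 0.0, 1.0]]
# Ergodic topology: every state can reach every other state.
ergodic = [[0.6, 0.2, 0.2],
           [0.2, 0.6, 0.2],
           [0.2, 0.2, 0.6]]

init = [1.0, 0.0, 0.0]           # illustrative: always start in state 0
emit = [[0.8, 0.1, 0.1],         # illustrative frame likelihoods
        [0.7, 0.2, 0.1],
        [0.1, 0.8, 0.1],
        [0.1, 0.1, 0.8],
        [0.1, 0.2, 0.7]]

path = viterbi(init, left_to_right, emit)
print(path)
```

Passing `ergodic` instead of `left_to_right` decodes the same frames without the ordering constraint, which is the sense in which an ergodic model suits classes with no fixed internal progression.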
In this paper, we investigate the use of Music Language Models (MLMs) for improving Automatic Music Transcription performance. The MLMs are trained on sequences of symbolic polyphonic music from the Nottingham dataset. We train Recurrent Neural Network (RNN)-based models, as they are capable of capturing complex temporal structure present in symbolic music ...
A framework is proposed for generating interesting and musically similar variations of a given monophonic melody. The focus is on rock/pop guitar and bass-guitar melodies, with the aim of eventual extensions to other instruments and musical styles. It is demonstrated here how learning musical style from segmented audio data can be formulated as an ...
We propose a novel technique for audio analytics and audio indexing using template-based modeling of audio classes set in a one-pass dynamic programming (DP) continuous decoding framework. We propose the use of concatenation costs in the one-pass DP recursions to reduce so-called incursion errors; we also propose selection of variable-length templates for modeling ...
We are interested in modelling musical pitch sequences in melodies in symbolic form. The task here is to learn a model that predicts the probability distribution over the possible pitch values of the next note in a melody, given the notes leading up to it. For this task, we propose the Recurrent Temporal Discriminative Restricted Boltzmann Machine ...
The multiple viewpoints representation is an event-based representation of symbolic music data which offers a means for the analysis and generation of notated music. Previous work using this representation has predominantly relied on n-gram and variable-order Markov models for music sequence modelling. Recently, the efficacy of a class of distributed models, ...
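As a rough illustration of the n-gram baseline this abstract refers to, the sketch below trains a bigram (first-order Markov) model on pitch sequences and returns a smoothed probability distribution over the next pitch. It is an assumption-laden toy, not code from the paper: the MIDI pitch numbers, the add-alpha smoothing, and the function names `train_bigram` and `next_pitch_distribution` are all chosen here for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(melodies):
    """Count how often each pitch follows each other pitch."""
    counts = defaultdict(Counter)
    for melody in melodies:
        for prev, nxt in zip(melody, melody[1:]):
            counts[prev][nxt] += 1
    return counts

def next_pitch_distribution(counts, prev, alphabet, alpha=1.0):
    """Add-alpha smoothed distribution over the next pitch given `prev`."""
    row = counts.get(prev, Counter())
    total = sum(row.values()) + alpha * len(alphabet)
    return {p: (row[p] + alpha) / total for p in alphabet}

# Illustrative training data: two short melodies as MIDI pitch numbers.
melodies = [[60, 62, 64, 62, 60],
            [60, 62, 60]]
alphabet = sorted({p for m in melodies for p in m})

counts = train_bigram(melodies)
dist = next_pitch_distribution(counts, prev=62, alphabet=alphabet)
print(dist)
```

Smoothing keeps unseen continuations from receiving zero probability. A variable-order model would extend the same idea by backing off between contexts of different lengths.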