Principles of computerized tomographic imaging
Principles of Computerized Tomographic Imaging provides a tutorial overview of tomographic imaging, covering the mathematical principles and theory and how to apply the theory to problems in medical imaging and other fields.
CNN architectures for large-scale audio classification
- Shawn Hershey, S. Chaudhuri, K. Wilson
- Computer Science, IEEE International Conference on Acoustics…
- 29 September 2016
This work uses various CNN architectures to classify the soundtracks of a dataset of 70M training videos with 30,871 video-level labels, and investigates varying the size of both the training set and the label vocabulary. It finds that analogs of the CNNs used in image classification do well on the authors' audio classification task, and that larger training and label sets help up to a point.
Construction and evaluation of a robust multifeature speech/music discriminator
A real-time computer system capable of distinguishing speech signals from music signals over a wide range of digital audio input is constructed, and extensive data are provided on system performance and on the cross-validated training/test setup used to evaluate it.
Attentional Selection in a Cocktail Party Environment Can Be Decoded from Single-Trial EEG.
Single-trial unaveraged EEG data are shown to be decodable to determine attentional selection in a naturalistic multispeaker environment, and a significant correlation is found between the EEG-based measure of attention and performance on a high-level attention task.
Video Rewrite: driving visual speech with audio
Video Rewrite is the first facial-animation system to automate all the labeling and assembly tasks required to resync existing footage to a new soundtrack.
Content-Based Music Information Retrieval: Current Directions and Future Challenges
- M. Casey, R. Veltkamp, Masataka Goto, M. Leman, Christophe Rhodes, M. Slaney
- Art, Proceedings of the IEEE
- 14 March 2008
The problems of content-based music information retrieval are outlined and the state-of-the-art methods using audio cues and other cues are explored, and some of the major challenges for the coming years are identified.
An Efficient Implementation of the Patterson-Holdsworth Auditory Filter Bank
- M. Slaney
- Computer Science
Previous work is extended by deriving an even more efficient implementation of the Gammatone filter bank, and by showing the MATLAB™ code to design and implement an ERB filter bank based on Gammatone filters.
Limitations of Imaging with First-Order Diffraction Tomography
In this paper, the results of computer simulations used to determine the domains of applicability of the first-order Born and Rytov approximations in diffraction tomography for cross-sectional (or…
Locality-Sensitive Hashing for Finding Nearest Neighbors [Lecture Notes]
This lecture note describes a technique known as locality-sensitive hashing (LSH) that allows one to quickly find similar entries in large databases. This approach belongs to a novel and interesting…
Collaborative Filtering and the Missing at Random Assumption
Experimental results are presented showing that incorporating an explicit model of the missing data mechanism can lead to significant improvements in prediction performance on the random sample of ratings.