• Corpus ID: 212464098

Voice User Interface Using Hidden Markov Model for Word Formation

@inproceedings{Joshi2015VoiceUI,
  title={Voice User Interface Using Hidden Markov Model for Word Formation},
  author={Pranali Joshi and Ravi P. Patki},
  year={2015}
}
Voice can be a powerful tool for human-computer interaction because it is the fundamental means of human communication. With the rapid growth of hands-free computer interfaces, the need for voice recognition techniques has increased greatly. Voice applications based on voice interfaces, voice recognition, and voice dialogue management can help users stay focused on their current work without extra effort from their hands or eyes, and without extra learning time. Audio I/O devices such as a… 

References

The self-taught vocal interface
  • B. Ons, J. Gemmeke, H. V. hamme
  • Computer Science
    2014 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA)
  • 2014
TLDR
This work evaluates the feasibility of a self-taught VUI on a new database called domotica-3, which contains dysarthric speech with typical commands in a home automation setting, and proposes the use of a multi-layered semantic frame structure and demonstrates its effectiveness in boosting overall performance.
Self-improvement of voice interface with user-input spoken query at early stage of commercialization
TLDR
This paper analyzes error patterns in user-input spoken queries, categorizes these error patterns, defines a quantitative measurement for each category of error patterns and proposes a filter-based approach over this quantitative measurement.
A Dual-Mode Human Computer Interface Combining Speech and Tongue Motion for People with Severe Disabilities
TLDR
Preliminary evaluation results indicated that the dTDS headset, combined with a commercially available speech recognition (SR) software, can provide end users with significantly higher performance than either unimodal forms based on the tongue motion or speech alone, particularly in completing tasks that require both pointing and text entry.
Audiovisual Voice Activity Detection Based on Microphone Arrays and Color Information
TLDR
A new approach to lip-based visual and audiovisual voice activity detection is presented, in which a late fusion approach is employed to combine the result of each modality (audio and video).
Pocketsphinx: A Free, Real-Time Continuous Speech Recognition System for Hand-Held Devices
TLDR
This paper presents a preliminary case study on the porting and optimization of CMU Sphinx-II, a popular open-source large vocabulary continuous speech recognition (LVCSR) system, to hand-held devices; the result is believed to be the first hand-held LVCSR system available under an open-source license.
Mmdagent—A fully open-source toolkit for voice interaction systems
TLDR
The development of an open-source toolkit is described, which makes it possible to explore a wide variety of aspects of speech interaction in spoken dialog systems and speech interfaces.
The Development and Evaluation of an Eyes-Free Interaction Model for Mobile Reading Devices
TLDR
A model of a voice user interface (VUI) for a mobile reading device is presented, and probabilities of grammar token usage for the commands that comprise the VUI are derived empirically.
Synthetic speech detection using temporal modulation feature
TLDR
From the synthetic speech detection results, the modulation features provide complementary information to magnitude/phase features, and the best detection performance is obtained by fusing phase modulation features and phase features, yielding an equal error rate.
Poster: Say it to see it: A speech based immersive model retrieval system
TLDR
This work develops a novel interface for the insertion of models into a virtual scene through the use of voice, 3D visuals, and a 3D input device enabling models to be acquired and inserted in the scene during a user's virtual session.
Text dependent speaker recognition using shifted MFCC
TLDR
A novel approach to recognize/identify speakers by including a new set of features and using Gaussian mixture models (GMMs) so as to incorporate accent information in the recognition algorithm.