Kemal Sönmez

We describe SRI’s large vocabulary conversational speech recognition system as used in the March 2000 NIST Hub-5E evaluation. The system performs four recognition passes: (1) bigram recognition with phone-loop-adapted, within-word triphone acoustic models, (2) lattice generation with transcription-mode-adapted models, (3) trigram lattice recognition with…
Giving feedback on the degree of nativeness of a student’s speech is an important aspect of computer-aided language learning. This task has been addressed by many studies focusing on the segmental assessment of the speech signal. To better model human nativeness scores, other aspects of speech should also be considered, such as prosody. This study examines…
We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topic units. The approach combines hidden Markov models, statistical language models, and prosody-based decision trees. Lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech…
Introduction: State-of-the-art automatic speech recognition (ASR) systems continue to improve, and yet there remain many tasks for which the technology is inadequate. The core acoustic operation has essentially remained the same for decades: a single feature vector (derived from the power spectral envelope over a 20-30 ms window, stepped forward by ~10 ms…
Cancer development is presumed to be an evolutionary process that is influenced by genetic background and environment. In laboratory animals, genetics and environment are variables that can largely be held constant. In humans, it is possible to compare independent tumours that have developed in the same patient, effectively constraining genetic and…
The design of novel antimicrobial peptides (AMPs) is an important problem given the rise of drug-resistant bacteria. However, the large size of the sequence search space, combined with the time required to experimentally test or simulate AMPs at the molecular level makes computational approaches based on sequence analysis attractive. We propose a method for…
Tumorigenesis is a multi-step process, involving the acquisition of multiple oncogenic mutations that transform cells, resulting in systemic dysregulation that enables proliferation, invasion, and other cancer hallmarks. The goal of precision medicine is to identify therapeutically-actionable mutations from large-scale omic datasets. However, the…
Mobile phone usage is spreading rapidly worldwide and has changed greatly as a result of new developments and innovations in mobile phone technology. This project evaluates voice versus keypad as a means of entering and editing text. In other words, messages can be typed by voice. The project will make use of a dictating-machine prototype…
We present a conditional distribution learning formulation for real-time signal processing with neural networks based on a recent extension of maximum likelihood theory, partial likelihood (PL) estimation, which allows for (i) dependent observations and (ii) sequential processing. For a general neural network conditional distribution model, we establish a…
BACKGROUND: Computational prediction of Transcription Factor Binding Sites (TFBS) from sequence data alone is difficult and error-prone. Machine learning techniques utilizing additional environmental information about a predicted binding site (such as distances from the site to particular chromatin features) to determine its occupancy/functionality class…